Updating the metagenomics toolbox

With many experiments involving tens or hundreds of such runs, data volumes can quickly overwhelm the storage capacities and analysis capabilities of individual researchers. Alongside this, the choice of analysis tools, reference databases and software settings can profoundly influence taxonomic classification and function prediction (15–17). As a result, it is hard to make meaningful comparisons between the analysis results of two datasets that have been processed using different pipelines. EBI Metagenomics (https://ac.uk/metagenomics/) aims to address many of these issues as a freely available hub for the analysis, exploration and archiving of metagenomic data. In common with analysis platforms such as MG-RAST (18) and IMG/M (19), EBI Metagenomics provides standardised processing and analysis pipelines that allow functional and taxonomic analyses of user-submitted sequences. It also offers a variety of analytical and visualization tools to support examination and comparison of datasets. Through partnership with the European Nucleotide Archive (ENA), EBI Metagenomics also has a unique archiving remit.

The sequences themselves, meanwhile, tend to be relatively short: for typical paired-end runs on Illumina, the dominant sequencing platform for metagenomics, they range from approximately 100 to 500 bp (with a mean of ∼230 bp) following merging and quality trimming. This can pose a problem when trying to determine the functional activity encoded within a metagenome. Assembly of sequences into longer contigs helps to address this problem, allowing more detailed functional annotation. In addition, the generation of longer assemblies enables detection of larger and more complex genomic features, such as operons and CRISPRs, and allows inference of function based upon genome context. We also describe the addition of metagenomic assembly as a new analysis service. Our pilot studies have produced over 2400 assemblies from datasets in the public domain.

