Handbook of Molecular Microbial Ecology II: Metagenomics in Different Habitats
Ebook, 1,755 pages, 19 hours

About this ebook

The premier two-volume reference on revelations from studying complex microbial communities in many distinct habitats

Metagenomics is an emerging field that has changed the way microbiologists study microorganisms. It involves the genomic analysis of microorganisms by extraction and cloning of DNA from a group of microorganisms, or the direct use of the purified DNA or RNA for sequencing, which allows scientists to bypass the usual protocol of isolating and culturing individual microbial species. This method is now used in laboratories across the globe to study microorganism diversity and for isolating novel medical and industrial compounds.

Handbook of Molecular Microbial Ecology is the first comprehensive two-volume reference to cover unculturable microorganisms in a large variety of habitats, which could not previously have been analyzed without metagenomic methodology. It features review articles as well as a large number of case studies, based largely on original publications and written by international experts. This second volume, Metagenomics in Different Habitats, covers such topics as:

  • Viral genomes

  • Metagenomics studies in a variety of habitats, including marine environments and lakes, soil, and human and animal digestive tracts

  • Other habitats, including those involving microbiome diversity in human saliva and functional intestinal metagenomics; diversity of archaea in terrestrial hot springs; and microbial communities living at the surface of building stones

  • Biodegradation

  • Biocatalysts and natural products

A special feature of this book is the highlighting of the databases and computer programs used in each study; they are listed along with their sites in order to facilitate the computer-assisted analysis of the vast amount of data generated by metagenomic studies. Such studies in a variety of habitats are described here, which present a large number of different system-dependent approaches in greatly differing habitats.

Handbook of Molecular Microbial Ecology II is an invaluable reference for researchers in metagenomics, microbial ecology, microbiology, and environmental microbiology; those working on the Human Microbiome Project; microbial geneticists; and professionals in molecular microbiology and bioinformatics.

Language: English
Publisher: Wiley
Release date: Oct 14, 2011
ISBN: 9781118010532
    Book preview

    Handbook of Molecular Microbial Ecology II - Frans J. de Bruijn

    Chapter 1

    Introduction

    Frans J. de Bruijn

    In this second volume of Handbook of Molecular Microbial Ecology, examples are given of metagenomic studies in a large variety of habitats and using diverse techniques. Part 1 discusses the Metagenomics of Viral Genomes, with an Introduction in Chapter 2 and sample Chapters on viruses in soil and aquatic environments, modern stromatolites and thrombolites, Yellowstone hot springs, human specimens, and plants. In the last case, Chapter 7 proposes next-generation sequencing and metagenomic analysis as a novel, universal diagnostic tool in plant virology. Various methods are described to generate viral metagenomes and to assemble and analyze them.

    In Part 2, the soil habitat is the subject, with an Introduction in Chapter 9. Topics addressed include methods for soil DNA and RNA isolation and purification for multiple metagenomics applications (Chapters 10 and 11). These techniques are essential for construction of (large insert) clone libraries (see also Volume I, Chapter 22) and random DNA sequencing studies. New approaches to retrieve full-length functional genes and phylogenetic analysis of bacterial populations or major soil-borne lineages, as well as rare members of the soil biosphere (Chapter 15), using the 16S rRNA gene and metagenomic libraries, are presented. The soil antibiotic resistome is also analyzed (Chapter 17).

    In Part 3, chapters on the digestive tract are presented with an Introduction in Chapter 18, which includes the human gut microbiota, and the possible correlation of human disease with human microbiota (Chapters 18–21). Some of these studies are part of consortia such as The Human Microbiome Project and The Human Gut Microbiome Initiative, which are discussed in Volume I, Chapter 35. This part is complemented with chapters addressing the metagenomics of termite guts and buffalo rumens (Chapters 22 and 23).

    In Part 4, the metagenomics of microbiota in marine and lake habitats is the subject of study. Microbial diversity in the deep sea, deep sediments, and the underexplored rare biosphere, as well as in lakes, is investigated (Chapters 24–31). Genomic adaptations in marine organisms (Chapter 26), the ecological genomics of marine Picocyanobacteria (Chapter 30) and the diversity and role of bacterial integron/gene cassette metagenomes in extreme environments (Chapter 31; see also Volume I, Chapter 26) are highlighted. These studies are complemented by a metatranscriptome analysis of complex marine microbial communities (Chapter 27; see also Volume I, Chapters 62–64).

    In Part 5, metagenomic analysis of microbes in a wide range of habitats is presented, ranging from gutless marine worms, human saliva, an acid mine drainage environment, terrestrial hot springs, deep-sea hydrothermal vents, biogas plants, and vesicomyid host clams to the surface of building stones (Chapters 32–42). The purpose of this section is to expose the reader to many different habitats and metagenomic approaches to study them.

    In Part 6, studies on the application of metagenomics to the discovery of biodegradation genes/enzymes from different habitats, such as aromatic degradation pathway genes, benzoate degradation genes, an alcohol/aldehyde dehydrogenase gene, and alkane hydroxylase genes are presented (Chapters 43–46).

    In Part 7, the metagenomic discovery of several novel natural products by different methods is presented. An overview of functional metagenomics and its industrial relevance is presented in Chapter 47. The examples shown include a cold-active lipase, coenzyme B12-dependent glycerol dehydratase- and diol dehydrogenases, biomedicals, and antibiotics, as well as the discovery, development and commercialization of Pyrophage 3173 DNA Polymerase (Chapters 48–53).

    These parts are not meant to be all-inclusive, but should provide the reader with examples of different approaches to use in their own systems/habitats, as well as references back to the original publication(s) from which each chapter was derived and to the extensive literature on the topic.

    Volume II ends with a summary section comprising a perspective on the future of the omics and single-cell analysis (Chapter 54), as well as an article on the birthday of Darwin's On the Origin of Species and the relevance of Darwin's work to today's molecular methods and species concepts (Chapter 55).

    Part 1

    VIRAL GENOMES

    Chapter 2

    Viral Metagenomics

    Shannon J. Williamson

    2.1 Introduction

    The term metagenomics was coined by the soil microbial ecologist Dr. Jo Handelsman in 1998 [Handelsman et al., 1998]. Metagenomics, or community genomics, refers to the study of the genomic contents of microbes extracted directly from the environment. The establishment of metagenomic techniques was an important breakthrough in microbial ecology because microbes that can be cultivated in the laboratory are thought to account for less than 1% of those that exist in many environments [Whitman et al., 1998]. Over the past decade, metagenomics has developed into an emerging field of study for researchers specializing in diverse disciplines, and metagenomes have been created from the simplest biological complexes (i.e., viruses, the subject of this review) as well as from assemblages of eukaryotes. Metagenomic sequence data are typically used to address the following two fundamental questions: Who is there? and What are they doing? The taxonomic and functional data from metagenomic studies have revolutionized our understanding of the diversity of microbes and the roles they play in their communities.

    Viruses are abundant and ubiquitous biological components of every biome on Earth and outnumber all cellular forms of life. Viruses have been the subject of scrutiny for nearly 120 years, but the field of viral metagenomics is relatively young, with the first marine viral metagenome published in 2002 [Breitbart et al., 2002]. Since this time, the number of published viral metagenomes (viromes) has exploded, and the adoption of next-generation sequencing technologies by researchers has resulted in a relative deluge of information on the genomic contents of viral communities inhabiting a diverse range of environments (see Chapter 3 in this volume for a comprehensive list of viromes). Analysis of viromes has enabled a deeper understanding of virus community dynamics including genotypic and taxonomic diversity, functional capacity, biogeography, and evolution. Viral metagenomics is also a powerful tool for viral discovery and has been used in this capacity to reveal the presence of novel DNA and RNA-containing viruses in a variety of samples, ranging from seawater to domesticated plants. This chapter reviews the technical and applied aspects of viral metagenomics, highlighting examples from both natural and human-derived environments.

    2.2 Experimental Approaches and Sequencing Technologies

    Due to their ubiquitous nature, it is possible to collect viruses from almost all types of biological samples. Indeed, viral metagenomic approaches have been applied to samples collected from a diverse range of environments: from marine waters and sediments to the human gut to a vineyard [Breitbart et al., 2002, 2003, 2004a, 2008; Angly et al., 2006; Fierer et al., 2007; Desnues et al., 2008; see also Chapter 5, Vol. II; Kim et al., 2008; McDaniel et al., 2008; Schoenfeld et al., 2008; see also Chapter 6, Vol. II; Vega Thurber et al., 2008; Williamson et al., 2008; Djikeng et al., 2009; Nakamura et al., 2009; Coetzee et al., 2010]. However, the techniques used to extract virus particles vary and are dependent on the type of sample under study. While the isolation and concentration of viruses from aquatic ecosystems is rather straightforward, more complex matrices (such as soils, sediments, tissues, and clinical samples) present a greater challenge. Collection and purification of virus particles is followed by the targeted extraction of viral nucleic acids, either DNA or RNA, with nuclease-mediated destruction of nontargeted molecules. Alternatively, all types of viral nucleic acids (double-stranded or single-stranded DNA or total RNA) can be purified from a sample simultaneously using hydroxyapatite chromatography [Andrews-Pfannkoch et al., 2010].

    Amplification of viral DNA is often performed in order to (1) provide sufficient quantities of nucleic acid for library construction and sequencing, (2) produce unmodified copies of viral DNA [Breitbart et al., 2002], and (3) purify the DNA of potential contaminants that may interfere with downstream molecular applications [Thurber et al., 2009]. Amplification can be accomplished using a linker-mediated approach [Andrews-Pfannkoch et al., 2010] or by multiple displacement amplification using the phi29 DNA polymerase [Thurber et al., 2009]. Lastly, depending on the sequencing technology to be employed, clone-dependent or clone-independent libraries will be constructed in preparation for sequencing. For a more thorough description of the protocols used for generating viromes, see Chapter 3 in this volume.

    There are currently four options when selecting a sequencing platform for metagenomic studies: di-deoxy sequencing (Sanger), pyrosequencing (454-Roche), SOLiD™ (Applied Biosystems), and Illumina® (formerly known as Solexa). Each technology has pros and cons with respect to sequencing performance, including overall cost, read length, error rates, and total capacity (see Chapter 18, Vol. I). To date, only Sanger and 454 pyrosequencing have been utilized in viral metagenomic studies. Sanger sequencing, the only option available when the first viral metagenomic study was undertaken [Breitbart et al., 2002], still affords the longest read lengths of all available sequencing technologies [Wommack et al., 2008]. However, the sheer volume of data, increasing read lengths, and cost advantage afforded by pyrosequencing have resulted in a sharp decline in Sanger-based metagenomic projects in the past several years.

    2.3 Environmental Studies

    The diversity of natural ecosystems on our planet presents unparalleled opportunities for viral metagenomic studies, and a tremendous amount of data on virus communities inhabiting a multitude of environments has been produced over a relatively short time span. The majority of viral metagenomic studies have focused on dsDNA-containing viruses, although targeted studies of viruses with alternate nucleic acid types are increasing [Culley et al., 2006; Ng et al., 2009a,b; Andrews-Pfannkoch et al., 2010]. Due to the extensive nature of environmental viral metagenomic studies, it is prohibitive to discuss them all in detail. Therefore, this part of the chapter will highlight the significant observations generated from studies conducted on samples collected from aquatic, terrestrial, and extreme ecosystems. Figure 2.1 shows the distribution of environments from which viral metagenomes have been produced. Most studies have focused on the marine environment, although numerous hypersaline viromes have also been created. A viral metagenome has even been generated from a vineyard, effectively establishing a connection between the disciplines of viral ecology, plant pathology, and oenology [Coetzee et al., 2010].

    Figure 2.1 Distribution of environmental viral metagenomes.

    Despite the disparate physical and chemical factors that characterize the environments shown in Figure 2.1, the resultant viromes generally share the following three characteristics: (1) a high incidence of unknown sequences, (2) a high level of genotypic diversity, and (3) evidence of functional and metabolic plasticity. The preponderance of unknown sequences, or sequences that share no significant similarity to other sequences in public databases, suggests that environmental viruses are the most uncharacterized and genetically novel biological components on our planet. The high number of estimated virus genotypes in several environments undoubtedly contributes to the novelty of viral metagenomic data as well as substantial evolutionary divergence from viruses within public databases (see Chapter 4 in this Volume for additional information). Functional profiling of environmental metagenomes has perhaps revealed the most intriguing observations with respect to virus–host dynamics, adaptation, and evolution. A multitude of studies have now demonstrated that the adoption of environmentally relevant host genes by viruses, predominantly phage, is a common occurrence (see Rohwer and Thurber (2009) for a review of this topic; also see Dinsdale et al. [2008] and Williamson et al. [2008a]). Expression of host-derived genes is hypothesized to increase viral fitness by prolonging the life of the host while increasing replication efficiency. This phenomenon also has implications for host adaptation to new environmental challenges as well as for the evolution of the shared host and viral gene pool.
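
    As a rough illustration of how the share of "unknown" sequences in a virome might be tallied, the sketch below counts reads that have no database hit at or below an E-value cutoff. The function name, the toy read identifiers, and the cutoff of 1e-3 are assumptions made for this example only; they are not taken from any of the studies cited above.

```python
from typing import Dict, Iterable


def unknown_fraction(
    read_ids: Iterable[str],
    best_evalue: Dict[str, float],
    max_evalue: float = 1e-3,  # assumed cutoff, chosen only for illustration
) -> float:
    """Return the fraction of reads with no hit at or below max_evalue.

    best_evalue maps a read ID to the E-value of its best database hit;
    reads that are absent from the mapping had no hit at all.
    """
    ids = list(read_ids)
    if not ids:
        raise ValueError("no reads supplied")
    unknown = 0
    for rid in ids:
        evalue = best_evalue.get(rid)
        if evalue is None or evalue > max_evalue:
            unknown += 1
    return unknown / len(ids)


# Toy data (hypothetical): five reads, three of which lack a significant hit.
reads = ["r1", "r2", "r3", "r4", "r5"]
hits = {"r1": 1e-30, "r3": 0.5, "r4": 1e-5}
frac = unknown_fraction(reads, hits)
print(f"unknown fraction: {frac:.2f}")
```

    In practice the reported fraction depends strongly on the chosen cutoff and on the reference database, so values from different studies are only loosely comparable.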

    2.3.1 Aquatic Environments: Marine

    Viral metagenomics has its roots in the marine environment, with the first study focusing on viral communities collected from two bodies of water off of Southern California [Breitbart et al., 2002]. Since this time, many viromes have been created from marine-related material including planktonic samples [Breitbart et al., 2002, 2004; Angly et al., 2006; Culley et al., 2006; Bench et al., 2007; Sharon et al., 2007, 2009; McDaniel et al., 2008; Williamson et al., 2008a], sediments [Breitbart et al., 2004], and marine animals [Vega Thurber et al., 2008; Ng et al., 2009a,b], representing dsDNA-, ssDNA-, and RNA-containing viruses. While most viral metagenomic studies have been performed on purified virus particles, analyses of viral sequences present within microbial metagenomes have also been reported [Venter et al., 2004; DeLong et al., 2006; Williamson et al., 2008b].

    Assembly-based estimations of dsDNA viral genotypic diversity vary substantially across marine ecosystems. For example, viruses collected from the Arctic Ocean are significantly less diverse than those collected from coastal waters off of British Columbia (∼500 genotypes vs. ∼130,000 genotypes) [Angly et al., 2006]. The upwelling regime that occurs off of the west coast of Canada was suggested as a possible explanation for the elevated levels of viral diversity in this area. Conversely, constraints on microbial diversity at high latitudes were likely responsible for depressed levels of viral diversity in the Arctic [Angly et al., 2006]. Viral communities extracted from marine sediment appear to be the most diverse, with up to an estimated 1 million genotypes per kilogram of sediment [Breitbart et al., 2004; Edwards and Rohwer, 2005]. This extremely high level of diversity may be in response to autochthonous microbial productivity in addition to allochthonous inputs from the overlying seawater. Metagenomic analysis of RNA-containing viruses inhabiting coastal marine ecosystems also revealed an unexpectedly diverse viral community, although the total number of genotypes was not estimated and therefore cannot be directly compared to other studies [Culley et al., 2006]. The RNA virome contained a diverse array of novel viral sequences that were only distantly related to known positive-sense ssRNA viral families.

    Whole community sequencing of marine phage genomes using different approaches has resulted in conflicting theories regarding the biogeographical distribution of marine viruses. Metagenomic analysis of the viral particles collected from four oceanic regions suggested that marine viral species are globally distributed; yet the relative abundance of genotypes fluctuates between specific ecosystems [Angly et al., 2006]. These observations were based on how well viral sequences originating from different regions assembled with one another and subsequent estimations of richness, evenness, and abundance of genotypes. Despite the co-occurrence of phages in different oceanic regions, phylogenetic differences were also noted, suggesting geographical specificity. Alternatively, evaluation of viral sequences present within the microbial size fraction of metagenomic data collected during the Global Ocean Sampling (GOS) Expedition indicated that of the tailed phage families, only myoviruses were ubiquitously distributed [Williamson et al., 2008b]. Assembled contigs that were attributed to podo- and siphoviruses were found to be more geographically isolated. While no significant correlations were found between the distribution of tailed phages and the environmental parameters that were measured, myoviruses appeared to be the most prevalent in tropical oligotrophic waters while podoviruses were more abundant in temperate coastal regions. It is likely that these conflicting theories in part stem from the different methods that were used to assess the occurrence and distribution of phages on a global basis.
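
    The terms richness and evenness used above can be made concrete with a small calculation. The sketch below computes genotype richness, the Shannon index, and Pielou's evenness from a list of genotype abundances. It is only a generic illustration of these quantities with made-up counts; it is not the assembly-based estimation procedure used in the cited studies, and the function name is hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def diversity_summary(counts: Sequence[float]) -> Tuple[int, float, Optional[float]]:
    """Return (richness, Shannon index H', Pielou evenness J').

    Evenness is undefined for a single genotype and is returned as None.
    """
    present = [c for c in counts if c > 0]
    total = sum(present)
    if total == 0:
        raise ValueError("at least one genotype must have a positive count")
    richness = len(present)
    proportions = [c / total for c in present]
    shannon = -sum(p * math.log(p) for p in proportions)
    evenness = shannon / math.log(richness) if richness > 1 else None
    return richness, shannon, evenness


# Two hypothetical communities with the same richness but different evenness.
even_community = [25, 25, 25, 25]
skewed_community = [97, 1, 1, 1]

even_r, even_h, even_j = diversity_summary(even_community)
skew_r, skew_h, skew_j = diversity_summary(skewed_community)
print(even_r, round(even_h, 3), round(even_j, 3))
print(skew_r, round(skew_h, 3), round(skew_j, 3))
```

    Two communities can therefore contain the same genotypes (equal richness) and still differ markedly in how abundance is distributed among them, which is the distinction drawn above between global distribution and locally fluctuating relative abundance.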

    Sequencing of individual viral genomes and viromes from the marine environment has unearthed a diverse range of virus-encoded cellular genes. The abundance and widespread global distribution of virus-encoded host genes was initially a shock to the microbial ecology research community. However, the high level of functional diversity witnessed within marine viromes is now becoming the norm (for a review of this topic, see Rohwer and Thurber (2009)). The complete genome sequences of several phages infective for two major cyanobacterial groups in the marine environment, Prochlorococcus and Synechococcus, offer perhaps the most striking example of how the presence of cellular genes within viral genomes can fundamentally alter our understanding of the importance of viruses to globally important biogeochemical processes [Mann et al., 2003, 2005; Sullivan et al., 2003, 2005, 2006; Lindell et al., 2004; Mann, 2005; Bryan et al., 2008]. The initial finding that cyanophages often carry photosystem I and II genes [Mann et al., 2003; Sullivan et al., 2006; Weigele et al., 2007; Sharon et al., 2009] has been followed by the discovery of a diverse array of cellular genes involved in metabolic and cellular processes ranging from phosphorus and carbon metabolism to nucleotide metabolism, vitamin B12 biosynthesis, antibiotic biosynthesis, virulence, and perhaps even regulation of programmed cell death [Mann, 2005; Sullivan et al., 2005; Weigele et al., 2007; Bryan et al., 2008]. The observed occurrence (and in some cases expression [Lindell et al., 2005, 2007; Clokie et al., 2006]) of host genes suggests that these phages may influence the short-term adaptation of their hosts. In essence, these phages appear to be extending the lifespan of their hosts in an effort to increase replication efficiency.

    Metagenomic investigation of marine ecosystems has revealed an impressive abundance and distribution of cellular genes in viral communities [Sharon et al., 2007, 2009; Dinsdale et al., 2008; Williamson et al., 2008b]. Moreover, the metagenomic profiling of nine biomes, including the marine environment [Dinsdale et al., 2008], suggests that microbial and even viral metagenomes are predictive of the biogeochemical conditions that characterize a particular environment. Together, these genomic and metagenomic investigations have revealed an intriguing first glimpse into the genetic details behind the impact of viral-encoded cellular genes on ecosystem processes of global importance.

    2.3.2 Aquatic Environments: Freshwater

    In contrast to the marine environment, only a few viral metagenomic studies have focused on freshwater ecosystems. Similar to marine studies, the viral community sequence data produced from a recreational lake in Maryland and a temporally ice-covered lake in Antarctica are also largely novel [Djikeng et al., 2009; Lopez-Bueno et al., 2009]. However, commonalities with marine viromes are more or less restricted to genetic novelty. Analysis of water samples collected from Lake Needwood in Maryland revealed the presence of predominantly RNA viral families that are known to infect a variety of higher organisms including plants, algae, insects, birds, and mammals, with only one family specific to bacteria [Djikeng et al., 2009]. Homologues to known pathogens of plants (e.g., Banana virus), insects (e.g., insect paralysis viruses), and mammals (e.g., circoviruses) were identified within assembled data, perhaps reflective of the various types of land usages and organisms that surround the lake. The Antarctic lake study conducted by Lopez-Bueno and co-workers produced significantly different results than most marine viromes, with an overrepresentation of eukaryotic viruses rather than phage [Lopez-Bueno et al., 2009]. The authors of this study witnessed a seasonal shift in viral community structure as the lake transitioned from an ice-covered to an open water system, with eukaryotic ssDNA viruses dominating the former state and dsDNA phycoviruses (algal viruses) dominating the latter. The abundance of ssDNA viral families that infect eukaryotes such as mammals, birds, and plants was unusual due to the absence of these organisms in and around the lake, suggesting that these viruses have evolved to infect different types of hosts. The emergence of phycoviruses in the lake following the summer thaw was most likely in response to a bloom of prasinophytes, a group of green algae.

    2.3.3 Terrestrial Environments

    Far fewer efforts have been directed toward investigating viral communities using metagenomic approaches in terrestrial environments. With respect to soil, this may be due in part to the difficulty in extracting a representative virus community [Williamson et al., 2003] and the daunting task of examining what is postulated to be the most diverse environment on Earth [Vogel et al., 2009]. For a more in-depth discussion of viruses in soils, see Chapter 4 in this volume. Metagenomic analysis of viruses that infect plants faces similar challenges with respect to virus extraction, along with low virus titers [Lapidot et al., 2001]. However, metagenomic analyses of viruses from various terrestrial ecosystems have been produced, including desert, prairie, and rainforest soils [Fierer et al., 2007], rice paddy soil [Kim et al., 2008], and plants [Adams et al., 2009]. The first viral metagenomic investigation of soil, conducted by Fierer et al. [2007], was a relatively small study, with fewer than 5000 Sanger sequences analyzed across three soil types. In addition to targeted dsDNA virus community sequencing, the diversity of Bacteria, Archaea, and Fungi was also investigated through small-subunit rRNA amplification and sequencing. Estimations of virus diversity varied widely between soil types, ranging from ∼1000 genotypes in the desert sample to greater than 100 million genotypes in the rainforest sample. While the majority of viral sequences were unique and minimal overlap with marine and fecal viral metagenomes occurred, similarity to known phages of soil microbes was also observed.

    In a different soil study targeting ssDNA viruses [Kim et al., 2008], a high level of genetic novelty was also observed with 60% of the sequences deemed unique. Known sequences were distantly related to several families of ssDNA viruses that infect eukaryotic organisms and were believed to originate from the plants resident to the sample plot (rice and others) as well as wild bird feces and composted manure. Viral metagenomic approaches have also been used as a diagnostic tool in plant virology [Adams et al., 2009]. For more details on this topic, see Chapter 7 in this volume. The authors of this study were able to successfully recover and detect seeded and novel RNA viral genomes from cDNA libraries that were dominated by a plant host signal using a combination of subtractive hybridization and pyrosequencing.

    2.3.4 Extreme Environments

    Viruses are abundant components of many extreme ecosystems such as hypersaline environments [Porter et al., 2007], deep-sea hydrothermal vents [Williamson et al., 2008a] and terrestrial hot springs [Breitbart et al., 2004b; Schoenfeld et al., 2008; see also Chapter 6, Vol. II]. Viruses (predominantly phage) are often the dominant microbial predators in these environments because many of these systems are too hostile to support robust populations of protistan grazers. Several viromes have been created from extreme environments, particularly from hypersaline samples [Dinsdale et al., 2008; Rodriguez-Brito et al., 2010] (Fig. 2.1). Rodriguez-Brito et al. [2010] created 16 viral and 13 microbial metagenomes from four aquatic samples of increasing salinity, collected at different points in time, in an effort to elucidate virus-host community dynamics. The authors of this study observed temporal fluctuations in the viral genotypes and microbial strains present in each ecosystem, which was indicative of Kill-the-Winner behavior. However, these small-scale fluctuations did not result in major taxonomic shifts or changes in functional capacity, suggesting a high level of overall stability in the microbial community.

    Metagenomic approaches were also used to investigate the biodiversity and biogeography of viruses present within microbialites, which are organosedimentary deposits produced by microbes and were abundant components of ancient aquatic environments [Kempe et al., 1991; Desnues et al., 2008]. For a detailed account of this study, refer to Chapter 5 in this volume. Analysis of metagenomic data indicated that the viruses recovered from stromatolites and thrombolites were highly novel, with <3% of sequences resembling other known sequences and little overlap observed with other environmental viral and microbial metagenomes. Cross-comparison of the three viral metagenomes indicated that the microbialites harbored their own distinct viral communities, with almost no evidence of common viruses. The authors also discovered that ssDNA microphages heavily dominated one of the phage communities, and subsequent PCR-based testing of a variety of environmental samples indicated that these phages were unique to the stromatolite.

    Metagenomic analysis of viruses inhabiting high-temperature aquatic environments has also been pursued [Schoenfeld et al., 2008; Williamson et al., 2008a]. Schoenfeld et al. [2008] were the first to examine the diversity of terrestrial hot spring viruses through community sequence analysis. Refer to Chapter 6 in this volume for a more in-depth discussion of this topic. Viral metagenomes were created from two mildly alkaline hot springs (74°C and 93°C) located in Yellowstone National Park. Despite the apparent physical differences between the hot springs, ∼25% of viral sequences were shared between the two libraries, suggesting that intermingling of the viral populations occurs underground. Assembly-based estimations of virus diversity using similar stringency levels (>95% identity) as other viral metagenomic studies [Breitbart et al., 2002; Angly et al., 2006] suggested that hot spring viruses were nearly as diverse as estuarine viruses (∼1000 genotypes). However, the estimated number of viral genotypes decreased considerably (∼300–500 genotypes) when assembly parameters were relaxed (50% identity). In another study, metagenomic analysis of whole virioplankton assemblages and induced prophage inhabiting deep-sea diffuse-flow hydrothermal vents [Williamson et al., 2008a] suggested that, compared to other Sanger-based viromes, vent virioplankton communities are among the most genetically novel on Earth.
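
    The dependence of genotype estimates on the identity threshold can be illustrated with a toy example. The sketch below greedily groups equal-length sequences, treating two sequences as the same genotype when their position-by-position identity reaches a chosen percentage. This is a deliberately simplified stand-in, not the assembly software or parameters used in the studies above; the sequences and function names are invented for the example.

```python
from typing import List, Sequence


def percent_identity_at_least(a: str, b: str, min_percent: int) -> bool:
    """True if equal-length sequences a and b match at >= min_percent of positions."""
    if len(a) != len(b):
        raise ValueError("this toy comparison requires equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    # Integer comparison avoids floating-point edge cases at the threshold.
    return matches * 100 >= min_percent * len(a)


def count_genotypes(seqs: Sequence[str], min_percent: int) -> int:
    """Greedy grouping: a sequence starts a new group unless it matches a representative."""
    representatives: List[str] = []
    for seq in seqs:
        if not any(percent_identity_at_least(seq, rep, min_percent) for rep in representatives):
            representatives.append(seq)
    return len(representatives)


# Hypothetical 20-base sequences.
toy_reads = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",  # 95% identical to the first read
    "TTTTTTTTACGTACGTACGT",  # 70% identical to the first read
    "GGGGGGGGGGGGGGGGGGGG",  # 25% identical to the first read
]

strict = count_genotypes(toy_reads, 95)
relaxed = count_genotypes(toy_reads, 50)
print(strict, relaxed)
```

    Relaxing the threshold merges more divergent sequences into the same group, so the number of inferred genotypes can only stay the same or fall in this toy setting, mirroring the direction of the change reported for the hot spring viromes.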

    2.4 Clinical Studies

    Viral metagenomic approaches have also been applied to human-derived samples, although the number of clinically related studies is dwarfed by those conducted on viruses from the environment. To date, viral metagenomic information has been produced from feces, blood, sputum, nasopharyngeal samples, cancerous tumors, and transplanted organs [Breitbart et al., 2003, 2008; Allander et al., 2005; Breitbart and Rohwer, 2005; Zhang et al., 2006; Feng et al., 2008; Palacios et al., 2008; Blinkova et al., 2009; Nakamura et al., 2009; Willner et al., 2009; Li et al., 2010] (see Fig. 2.2). These studies contribute to a larger effort that is underway to characterize the distribution and evolution of the human microbiome (i.e., the collection of microbes inhabiting the human body) [Turnbaugh et al., 2007; see also Chapters 18–21, Vol. II].

    Figure 2.2 Distribution of clinical viral metagenomes.

    Similar to cellular studies, the majority of human-related viromes have been created from fecal samples [Breitbart et al., 2003; Zhang et al., 2006; Breitbart et al., 2008; Blinkova et al., 2009; Li et al., 2010]. Examination of the microbes present in feces can provide insight into the health and function of the human gut. The human gut is home to diverse consortia of cellular microbes and viruses [Qin et al., 2010; Furuse et al., 1983; Cornax et al., 1994; Eckburg et al., 2005; Turnbaugh et al., 2009] and the interactions between these organisms likely influence the stability of the gut environment. Viromes have been produced from fecal material collected from healthy and diseased adults and children as well as from one infant.

    Initial investigations conducted by Breitbart et al. [2003] and Zhang et al. [2006] focused on samples collected from a limited number of healthy adult individuals and specifically targeted either DNA- or RNA-containing viruses, respectively. While the majority of the DNA virus sequences were unique (59%), a significant proportion (91.5%) of the RNA virus sequences demonstrated similarity to sequences within GenBank, suggesting a lower level of RNA viral diversity within the human gut. Based on the assembly of DNA virus sequences, Breitbart et al. [2003] estimated that the fecal virome contained ∼1900 virus genotypes. In contrast, only 42 viral species were identified in the RNA fecal virome [Zhang et al., 2006]. Both viromes were depleted in animal virus signatures. Rather, the DNA virome was composed mainly of phage sequences, and the RNA virome was enriched in sequences similar to plant pathogens. The predominance of phage in the human gut is expected due to the high abundance of potential bacterial host cells [Gill et al., 2006]. However, the discovery of abundant populations of RNA viral plant pathogens in fecal samples was unexpected and suggested that these viruses originated from contaminated foods [Zhang et al., 2006].

    Analysis of a DNA fecal virome produced from a 1-week-old healthy infant also revealed a high level of genetic novelty, with 66% of the sequences demonstrating no similarity to other public sequences [Breitbart et al., 2008]. Similar to the DNA virome produced from the healthy adult, the majority of identifiable viral sequences could be attributed to phage. The most notable difference between the adult- and infant-derived viromes was the diversity of the viral community. Only eight viral genotypes were present in the infant gut virome compared to ∼1900 present in the adult. Furthermore, the infant gut virome was dominated by a genotype that comprised ∼40% of the total community, while the adult gut virome was much more even in this respect. Microarray analysis of the infant gut viral community demonstrated that substantial changes occurred in just one week, suggesting that coincident changes were occurring in the microbial community.
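
    The contrast between a community dominated by one genotype and a more even one can be made concrete with a standard ecological evenness metric. The sketch below computes Shannon diversity and Pielou evenness in Python. It is only an illustration: it is not the assembly-based estimation used in the cited studies, and the abundance values are hypothetical, chosen so that one of eight genotypes accounts for 40% of the community.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H' = -sum(p_i * ln p_i) over genotypes with nonzero abundance."""
    total = sum(abundances)
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)


def pielou_evenness(abundances):
    """Pielou evenness J' = H' / ln(S), where S is the number of genotypes present."""
    richness = sum(1 for a in abundances if a > 0)
    if richness < 2:
        return 0.0  # evenness is not meaningful for fewer than two genotypes
    return shannon_index(abundances) / math.log(richness)


# Hypothetical infant-like community: 8 genotypes, the most abundant at 40%.
# How the remaining 60% is divided is an assumption, NOT data from the study.
infant_like = [40, 20, 15, 10, 6, 4, 3, 2]

# Hypothetical perfectly even community with the same richness, for comparison.
even = [1] * 8

print("infant-like evenness:", round(pielou_evenness(infant_like), 3))
print("even community evenness:", round(pielou_evenness(even), 3))
```

    An evenness of 1 indicates that all genotypes are equally abundant; values fall toward 0 as a single genotype comes to dominate.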

    Most recently, viral metagenomic techniques were used by several researchers to specifically investigate the presence of potential pathogens in the stools of healthy and diseased adults and children and to serve as a mechanism of virus discovery [Finkbeiner et al., 2008; Blinkova et al., 2009; Nakamura et al., 2009; Li et al., 2010]. Finkbeiner and colleagues [Finkbeiner et al., 2008] used metagenomic approaches to characterize the eukaryotic viral communities present in children with diarrhea. Both DNA- and RNA-containing enteric viruses that are known to cause diarrhea were present in the pediatric samples, including rotavirus, adenovirus, calicivirus, and astrovirus. Novel subtypes of additional viral families that are not normally associated with gastroenteritis were also detected. In a study conducted by Blinkova et al. [2009], metagenomic sequencing led to the discovery of novel strains of potentially pathogenic RNA cardioviruses that were consistently shed by both healthy Pakistani and Afghan children and those with nonpolio acute flaccid paralysis (AFP). A similar study by Li et al. [2010] resulted in the discovery of a new genus of human-related DNA circoviruses (Cyclovirus), detected in adults and children from Pakistan, Nigeria, Tunisia, and the United States. Due to the prevalence of circoviruses in farm animals, the authors proposed a transmission route facilitated by meat consumption and contact with animal and/or human fecal material [Li et al., 2010]. In another study, metagenomic analysis of total RNA extracted from fecal samples collected from adults and children during an outbreak of norovirus, a pathogen that causes acute gastroenteritis, confirmed the presence of this virus at high coverage [Nakamura et al., 2009]. An abundance of plant-related RNA virus sequences was also detected, including the pepper mild mottle virus, reinforcing the observations made by Zhang and co-workers in a previous study of RNA viruses in human feces [Zhang et al., 2006]. For a more detailed discussion of this work, refer to Chapter 8 in this volume.

    To date, only one small-scale metagenomic investigation of viruses in human blood has been reported [Breitbart and Rohwer, 2005]. Twenty-two sequences from healthy blood donors demonstrated significant similarity to known viruses, and three novel ssDNA anellovirus-like sequences were detected. Due to the worldwide prevalence of anelloviruses in healthy blood donors and their potential association with post-transfusion hepatitis, the authors analyzed two of the novel sequences further. Phylogenetic analysis of the divergent sequences confirmed their relationship to previously identified anelloviruses and highlighted the diversity within this viral family.

    Viral metagenomic studies of the human respiratory tract have been conducted on nasopharyngeal and sputum samples collected from healthy and diseased adults and children [Allander et al., 2005; Nakamura et al., 2009; Willner et al., 2009; see also Chapter 8 in this volume]. Allander et al. [2005] detected seven types of human-specific viruses in a collection of pooled nasopharyngeal samples including previously uncharacterized corona- and parvoviruses. The metagenomic information was used to identify samples that were positive for the presence of parvovirus via PCR, and complete genome sequences were produced for two novel species of the Bocavirus genus. Nakamura et al. [2009] analyzed total RNA from three nasopharyngeal samples using viral metagenomic approaches and recovered the partial genomes of Influenza A variants. The human endogenous retrovirus HCML-ARV and the WUV polyomavirus, which has been implicated in childhood respiratory disease, were detected in two out of the three samples at a much lower frequency than influenza.

    The association of viruses and cystic fibrosis (CF) was recently reported on by Willner et al. [2009]. In this study, sputum samples were collected from healthy individuals as well as those with CF and viral metagenomic data from DNA-containing viruses was produced. While the overwhelming majority of sequences were unknown (>90%), diversity estimates for the viral assemblages were quite low compared to other human-derived viromes (e.g., fecal samples), with an average richness of 175 genotypes. Taxonomic and functional profiling of the viral communities revealed the presence of core viral taxa (primarily phage) and metabolic profiles that were distinct to CF and non-CF samples with minor exceptions in taxonomic composition. The authors attributed the observed differences in the functional potential of the two types of viromes to the dissimilar types of environments that the viruses inhabit and the specific types of adaptive strategies they have developed in order to persist [Willner et al., 2009].

    Viral metagenomics has proven to be a powerful tool for the discovery of viruses in clinical samples. In addition to the few examples previously discussed, examination of viral metagenomic data produced from cancerous tumors and transplanted organs has also led to the discovery of candidate viral pathogens [Feng et al., 2008; Palacios et al., 2008]. In a study conducted by Feng et al. [2008], cDNA libraries were created from a collection of human Merkel cell carcinoma (MCC) tumors. Two novel polyomavirus-like transcript sequences were identified, representing a previously unknown human polyomavirus that was subsequently named Merkel cell polyomavirus (MCV). PCR profiling of an additional 10 MCC tumors resulted in the detection of MCV in 80% of cases, compared to a <20% detection rate in control tissue samples. Furthermore, MCV DNA integration within tumor genomes was found in six out of eight MCC tumor samples, suggesting a link between MCV infection and MCC tumor proliferation. In a similar study performed by Palacios et al. [2008], cDNA libraries were created from RNA that was extracted from the brain, cerebrospinal fluid (CSF), serum, kidney, and liver of an individual who received a kidney transplant and from the CSF and serum of an individual who received a liver transplant from the same donor. The donor and organ recipients all died of illnesses involving brain hemorrhage and encephalopathy. Metagenomic analysis resulted in the identification of novel sequences related to Old World arenaviruses, specifically lymphocytic choriomeningitis virus. RNA arenaviruses are endemic to rodents, and transmission to humans has been associated with illnesses including meningitis [Palacios et al., 2008]. The new virus was detected in transplanted organs as well as in the blood and CSF of the recipients via PCR, and arenavirus-specific antigens were also detected in the transplanted organs. Furthermore, antibodies to the virus were detected in the donor as well as in the serum of one of the recipients, suggesting recent infection.

    2.5 Summary

    The field of viral metagenomics is continually evolving and has already revolutionized our perception of the influence of viruses on the population biology of host organisms. The capabilities and challenges associated with viral metagenomics will continue to change and grow as next-generation sequencing platforms are fully adopted and the newest sequencing technologies are put into practice. It is apparent from the examples provided in this chapter that viral metagenomics is one of our most powerful tools for virus discovery, for elucidating the role that viruses play in disease, and for understanding their influence on the adaptation and evolution of hosts. There is little doubt that future efforts in this area will facilitate new paradigm shifts in a multitude of scientific disciplines.

    2.5.1 Acknowledgments

    This work was supported by the Office of Science (BER), U.S. Department of Energy, Cooperative Agreement No. DE-FC02-02ER63453.

    References

    Adams IP, Glover RH, Monger WA, Mumford R, Jackeviciene E, et al. 2009. Next-generation sequencing and metagenomic analysis: A universal diagnostic tool in plant virology. Mol. Plant Pathol. 10:537–545.

    Allander T, Tammi MT, Eriksson M, Bjerkner A, Tiveljung-Lindell A, et al. 2005. Cloning of a human parvovirus by molecular screening of respiratory tract samples. Proc. Natl. Acad. Sci. USA 102:12891–12896.

    Andrews-Pfannkoch C, Fadrosh DW, Thorpe J, Williamson SJ. 2010. Hydroxyapatite-mediated separation of dsDNA, ssDNA and RNA genotypes from natural viral assemblages and metagenomic library construction. Appl. Environ. Microbiol. 5039–5045.

    Angly FE, Felts B, Breitbart M, Salamon P, Edwards RA, et al. 2006. The marine viromes of four oceanic regions. PLoS Biol. 4:e368.

    Bench SR, Hanson TE, Williamson KE, Ghosh D, Radosevich M, et al. 2007. Metagenomic characterization of Chesapeake Bay virioplankton. Appl. Environ. Microbiol. 73:7629–7641.

    Blinkova O, Kapoor A, Victoria J, Jones M, Wolfe N, et al. 2009. Cardioviruses are genetically diverse and cause common enteric infections in South Asian children. J. Virol. 83:4631–4641.

    Breitbart M, Rohwer F. 2005. Method for discovering novel DNA viruses in blood using viral particle selection and shotgun sequencing. Biotechniques 39:729–736.

    Breitbart M, Salamon P, Andresen B, Mahaffy J, Segall A, et al. 2002. Genomic analysis of uncultured marine viral communities. Proc. Natl. Acad. Sci. USA 99:14250–14255.

    Breitbart M, Hewson I, Felts B, Mahaffy J, Nulton J, et al. 2003. Metagenomic analyses of an uncultured viral community from human feces. J. Bacteriol. 185:6220–6223.

    Breitbart M, Felts B, Kelley S, Mahaffy J, Nulton J, et al. 2004a. Diversity and population structure of a near-shore marine-sediment viral community. Proc. R. Soc. Lond. B Biol. Sci. 271:565–574.

    Breitbart M, Wegley L, Leeds S, Schoenfeld T, Rohwer F. 2004b. Phage community dynamics in hot springs. Appl. Environ. Microbiol. 70:1633–1640.

    Breitbart M, Haynes M, Kelley S, Angly F, Edwards RA, et al. 2008. Viral diversity and dynamics in an infant gut. Res. Microbiol. 159:367–373.

    Bryan MJ, Burroughs NJ, Spence EM, Clokie MR, Mann NH, et al. 2008. Evidence for the intense exchange of MazG in marine cyanophages by horizontal gene transfer. PLoS ONE 3:e2048.

    Clokie MR, Shan J, Bailey S, Jia Y, Krisch HM, et al. 2006. Transcription of a ‘photosynthetic’ T4-type phage during infection of a marine cyanobacterium. Environ. Microbiol. 8:827–835.

    Coetzee B, Freeborough MJ, Maree HJ, Celton JM, Rees DJ, et al. 2010. Deep sequencing analysis of viruses infecting grapevines: Virome of a vineyard. Virology 400:157–163.

    Cornax R, Morinigo MA, Gonzalez-Jaen F, Alonso MC, Borrego JJ. 1994. Bacteriophages presence in human faeces of healthy subjects and patients with gastrointestinal disturbances. Zentralbl. Bakteriol. 281:214–224.

    Culley AI, Lang AS, Suttle CA. 2006. Metagenomic analysis of coastal RNA virus communities. Science 312:1795–1798.

    DeLong EF, Preston CM, Mincer T, Rich V, Hallam SJ, et al. 2006. Community genomics among stratified microbial assemblages in the ocean's interior. Science 311:496–503.

    Desnues C, Rodriguez-Brito B, Rayhawk S, Kelley S, Tran T, et al. 2008. Biodiversity and biogeography of phages in modern stromatolites and thrombolites. Nature 452:340–343.

    Dinsdale EA, Edwards RA, Hall D, Angly F, Breitbart M, et al. 2008. Functional metagenomic profiling of nine biomes. Nature 455:830.

    Djikeng A, Kuzmickas R, Anderson NG, Spiro DJ. 2009. Metagenomic analysis of RNA viruses in a fresh water lake. PLoS ONE 4:e7264.

    Eckburg PB, Bik EM, Bernstein CN, Purdom E, Dethlefsen L, et al. 2005. Diversity of the human intestinal microbial flora. Science 308:1635–1638.

    Edwards R, Rohwer F. 2005. Viral metagenomics. Nat. Rev. Microbiol. 3:504–510.

    Feng H, Shuda M, Chang Y, Moore PS. 2008. Clonal integration of a polyomavirus in human Merkel cell carcinoma. Science 319:1096–1100.

    Fierer N, Breitbart M, Nulton J, Salamon P, Lozupone C, et al. 2007. Metagenomic and small-subunit rRNA analyses reveal the genetic diversity of bacteria, archaea, fungi, and viruses in soil. Appl. Environ. Microbiol. 73:7059–7066.

    Finkbeiner SR, Allred AF, Tarr PI, Klein EJ, Kirkwood CD, et al. 2008. Metagenomic analysis of human diarrhea: Viral detection and discovery. PLoS Pathog. 4:e1000011.

    Furuse K, Osawa S, Kawashiro J, Tanaka R, Ozawa A, et al. 1983. Bacteriophage distribution in human faeces: Continuous survey of healthy subjects and patients with internal and leukaemic diseases. J. Gen. Virol. 64(Pt 9): 2039–2043.

    Gill SR, Pop M, Deboy RT, Eckburg PB, Turnbaugh PJ, et al. 2006. Metagenomic analysis of the human distal gut microbiome. Science 312:1355–1359.

    Handelsman J, Rondon MR, Brady SF, Clardy J, Goodman RM. 1998. Molecular biological access to the chemistry of unknown soil microbes: A new frontier for natural products. Chem. Biol. 5:R245–R249.

    Kempe S, Kazmierczak J, Landmann G, Konuk T, Reimer A, et al. 1991. Largest known microbialites discovered in Lake Van, Turkey. Nature 349:605–608.

    Kim K-H, Chang H-W, Nam Y-D, Roh SW, Kim M-S, et al. 2008. Amplification of uncultured single-stranded DNA viruses from rice paddy soil. Appl. Environ. Microbiol. 74:5975–5985.

    Lapidot M, Friedmann M, Pilowsky M, Ben-Joseph R, Cohen S. 2001. Effect of host plant resistance to tomato yellow leaf curl virus (TYLCV) on virus acquisition and transmission by its whitefly vector. Phytopathology 91:1209–1213.

    Li L, Kapoor A, Slikas B, Bamidele OS, Wang C, et al. 2010. Multiple diverse circoviruses infect farm animals and are commonly found in human and chimpanzee feces. J. Virol. 84:1674–1682.

    Lindell D, Sullivan MB, Johnson ZI, Tolonen AC, Rohwer F, et al. 2004. Transfer of photosynthesis genes to and from Prochlorococcus viruses. Proc. Natl. Acad. Sci. USA 101:11013–11018.

    Lindell D, Jaffe JD, Johnson ZI, Church GM, Chisholm SW. 2005. Photosynthesis genes in marine viruses yield proteins during host infection. Nature 438:86–89.

    Lindell D, Jaffe JD, Coleman ML, Futschik ME, Axmann IM, et al. 2007. Genome-wide expression dynamics of a marine virus and host reveal features of co-evolution. Nature 449:83–86.

    Lopez-Bueno A, Tamames J, Velazquez D, Moya A, Quesada A, et al. 2009. High diversity of the viral community from an Antarctic lake. Science 326:858–861.

    Mann NH. 2005. The third age of phage. PLoS Biol. 3:753–755.

    Mann NH, Cook A, Millard A, Bailey S, Clokie M. 2003. Marine ecosystems: Bacterial photosynthesis genes in a virus. Nature 424:741.

    Mann NH, Clokie MRJ, Millard A, Cook A, Wilson WH, et al. 2005. The genome of S-PM2, a photosynthetic T4-type bacteriophage that infects marine Synechococcus strains. J. Bacteriol. 187:3188–3200.

    McDaniel L, Breitbart M, Mobberley J, Long A, Haynes M, et al. 2008. Metagenomic analysis of lysogeny in Tampa Bay: Implications for prophage gene expression. PLoS ONE 3:e3263.

    Nakamura S, Yang CS, Sakon N, Ueda M, Tougan T, et al. 2009. Direct metagenomic detection of viral pathogens in nasal and fecal specimens using an unbiased high-throughput sequencing approach. PLoS ONE 4:e4219.

    Ng TFF, Manire C, Borrowman K, Langer T, Ehrhart L, et al. 2009a. Discovery of a novel single-stranded DNA virus from a sea turtle fibropapilloma by using viral metagenomics. J. Virol. 83:2500–2509.

    Ng TFF, Suedmeyer WK, Wheeler E, Gulland F, Breitbart M. 2009b. Novel anellovirus discovered from a mortality event of captive California sea lions. J. Gen. Virol. 90:1256–1261.

    Palacios G, Druce J, Du L, Tran T, Birch C, et al. 2008. A new arenavirus in a cluster of fatal transplant-associated diseases. N. Engl. J. Med. 358:991–998.

    Porter K, Russ BE, Dyall-Smith ML. 2007. Virus–host interactions in salt lakes. Curr. Opin. Microbiol. 10:418–424.

    Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, et al. 2010. A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464:59–65.

    Rodriguez-Brito B, Li L, Wegley L, Furlan M, Angly F, et al. 2010. Viral and microbial community dynamics in four aquatic environments. ISME J. 739–751.

    Rohwer F, Thurber RV. 2009. Viruses manipulate the marine environment. Nature 459:207–212.

    Schoenfeld T, Patterson M, Richardson PM, Wommack KE, Young M, et al. 2008. Assembly of viral metagenomes from Yellowstone hot springs. Appl. Environ. Microbiol. 74:4164–4174.

    Sharon I, Tzahor S, Williamson S, Shmoish M, Man-Aharonovich D, et al. 2007. Viral photosynthetic reaction center genes and transcripts in the marine environment. ISME J. 1:492–501.

    Sharon I, Alperovitch A, Rohwer F, Haynes M, Glaser F, et al. 2009. Photosystem I gene cassettes are present in marine virus genomes. Nature 461:258–262.

    Sullivan MB, Waterbury JB, Chisholm SW. 2003. Cyanophages infecting the oceanic cyanobacterium Prochlorococcus. Nature 424:1047–1051.

    Sullivan MB, Coleman ML, Weigele P, Rohwer F, Chisholm SW. 2005. Three Prochlorococcus cyanophage genomes: Signature features and ecological interpretations. PLoS Biol. 3:e144.

    Sullivan MB, Lindell D, Lee JA, Thompson LR, Bielawski JP, et al. 2006. Prevalence and evolution of core photosystem II genes in marine cyanobacterial viruses and their hosts. PLoS Biol. 4:e234.

    Thurber RV, Haynes M, Breitbart M, Wegley L, Rohwer F. 2009. Laboratory procedures to generate viral metagenomes. Nat. Protoc. 4:470–483.

    Turnbaugh PJ, Ley RE, Hamady M, Fraser-Liggett CM, Knight R, et al. 2007. The human microbiome project. Nature 449:804–810.

    Turnbaugh PJ, Hamady M, Yatsunenko T, Cantarel BL, Duncan A, et al. 2009. A core gut microbiome in obese and lean twins. Nature 457:480–484.

    Vega Thurber RL, Barott KL, Hall D, Liu H, Rodriguez-Mueller B, et al. 2008. Metagenomic analysis indicates that stressors induce production of herpes-like viruses in the coral Porites compressa. Proc. Natl. Acad. Sci. USA 105:18413–18418.

    Venter JC, Remington K, Heidelberg JF, Halpern AL, Rusch D, et al. 2004. Environmental genome shotgun sequencing of the Sargasso Sea. Science 304:66–74.

    Vogel TM, Simonet P, Jansson JK, Hirsch PR, Tiedje JM, et al. 2009. TerraGenome: A consortium for the sequencing of a soil metagenome. Nat. Rev. Microbiol. 7:252–252.

    Weigele PR, Pope WH, Pedulla ML, Houtz JM, Smith AL, et al. 2007. Genomic and structural analysis of Syn9, a cyanophage infecting marine Prochlorococcus and Synechococcus. Environ. Microbiol. 9:1675–1695.

    Whitman WB, Coleman DC, Wiebe WJ. 1998. Prokaryotes: The unseen majority. Proc. Natl. Acad. Sci. USA 95:6578–6583.

    Williamson KE, Wommack KE, Radosevich M. 2003. Sampling natural viral communities from soil for culture-independent analyses. Appl. Environ. Microbiol. 69:6628–6633.

    Williamson SJ, Cary SC, Williamson KE, Helton RR, Bench SR, et al. 2008a. Lysogenic virus–host interactions predominate at deep-sea diffuse-flow hydrothermal vents. ISME J. 2:1112–1121.

    Williamson SJ, Rusch DB, Yooseph S, Halpern AL, Heidelberg KB, et al. 2008b. The Sorcerer II Global Ocean Sampling Expedition: Metagenomic characterization of viruses within aquatic microbial samples. PLoS ONE 3:e1456.

    Willner D, Furlan M, Haynes M, Schmieder R, Angly FE, et al. 2009. Metagenomic analysis of respiratory tract DNA viral communities in cystic fibrosis and non-cystic fibrosis individuals. PLoS ONE 4:e7370.

    Wommack KE, Bhavsar J, Ravel J. 2008. Metagenomics: Read length matters. Appl. Environ. Microbiol. 74:1453–1463.

    Zhang T, Breitbart M, Lee WH, Run JQ, Wei CL, et al. 2006. RNA viral community in human feces: Prevalence of plant pathogenic viruses. PLoS Biol. 4:108–118.

    Chapter 3

    Methods in Viral Metagenomics

    Rebecca Vega Thurber

    3.1 Introduction

    Viruses are ubiquitous and abundant (∼10³¹ on planet Earth) biological entities that likely infect and disrupt all cellular organisms (see Chapters 2, 4, and 5 in this volume). Viral metagenomics, the characterization and evaluation of viral consortia from environmental samples, has shown that viruses are unexpectedly diverse. More than 5000 viral genotypes or species have been detected in 100 L of seawater, and ∼1 million species have been found in 1 kg of sediment [Breitbart et al., 2002, 2004; Angly et al., 2006]. Viromes collected from across the world have also shown that viral species are globally distributed (everything is everywhere) but that the relative abundance of each species is restricted by local selection (for review see Srinivasiah et al. [2008] and Vega Thurber [2009]). Lastly, viromics has shown that viral functional diversity, as well as its influence on host adaptation, has been vastly underestimated (for review see Rohwer and Vega Thurber [2009]).

    Some physical characteristics (e.g., capsid durability) make viruses amenable to purification. However, other aspects of their biology significantly limit viral observation, maintenance, and manipulation, including: (1) the wide range of viral particle sizes, shapes, densities, and sensitivities; (2) viral decay; and (3) variation in viral genome type (DNA vs. RNA and single- vs. double-stranded) and length. For example, some archaeal and eukaryotic viruses (e.g., fuselloviruses, asfarviruses, and iridoviruses) can withstand temperatures above 55°C for extended periods of time [Fauquet et al., 2005]; but depending on the prevailing abiotic and biotic conditions, some marine viral particles decay on the order of hours (∼2.5–12 h) to days [Suttle and Feng, 1992].

    Additionally, the majority of viruses cannot be grown in pure culture because their hosts are recalcitrant to cultivation. The lack of a single phylogenetic marker amongst viral families requires that alternative approaches to gene marker-based phylogenetics (e.g., the 16S rRNA gene sequence commonly used for Bacteria and Archaea; see Chapter 15, Vol. I) be used for the evaluation of viral consortia in environmental samples [Rohwer and Edwards, 2002]. Finally, the disparate phenetic parameters (e.g., genome type, host range, and morphological and physical characteristics, as well as genetic and genomic sequence similarities) used in viral taxonomy have generated polyphyletic viral families and obscured the evolutionary relationships between taxa and sequences [Lawrence et al., 2002]. Therefore, any researcher interested in generating and analyzing viromes must confront a suite of both methodological and analytical challenges.

    Unlike cellular organisms, viruses contain either DNA or RNA genomes, and the type of genome (dsDNA, ssDNA, dsRNA, +ssRNA, or −ssRNA) they contain is a defining viral taxonomic characteristic [Fauquet et al., 2005]. Both DNA and RNA viromes from vastly different sample types have been collected and analyzed (Table 3.1), including (1) clinical samples such as tissues, blood, sputum, stool, and gut contents and (2) environmental samples such as freshwater and saltwater, animals, plants, fungal tissues, rocks, sediments, soil, filtered air, and sewage [Breitbart et al., 2002, 2003, 2004a,b; Culley and Welschmeyer, 2002; Breitbart and Rohwer, 2005; Williamson et al., 2005; Zhang et al., 2006; Bench et al., 2007; Melcher et al., 2008; Vega Thurber et al., 2008; Lopez-Bueno et al., 2009].

    Table 3.1 Examples of Virome Projects to Date

    This chapter provides protocols for generating DNA or RNA viromes, including how to isolate, concentrate, purify, extract, and, if necessary, amplify viral nucleic acids from environmental samples. In addition, multiple ways to eliminate contaminating cells, nuclei, and free nucleic acids, as well as to validate the removal of these contaminants, are described. For full details on these methods, please refer to the original publication [Vega Thurber et al., 2009]. Lastly, a brief section on the challenges of virome bioinformatics analysis is presented.

    It should be noted that the viral composition of some samples will necessitate modifications to the viral concentration and extraction protocols presented here. Although the majority of viruses are less than 250 nm, some viruses are physically larger (∼720-nm particle) than certain bacterial taxa [Raoult and Forterre, 2008]. Many of these giant viruses also contain relatively long genomes (e.g., ∼1 Mb) that are highly chimeric in comparison to other viral groups [Claverie et al., 2006; Filee et al., 2007, 2008; Raoult and Forterre, 2008]. To isolate such large nucleocytoplasmic viruses, see the methods in Koonin (2005) and Lopez-Bueno et al. [2009]. Projects on long filamentous viruses, which can reach lengths in excess of 2 μm, will also require modifications to the protocol presented in this work. Lastly, a few groups of viruses are smaller than 20 nm (circoviruses and nanoviruses) and will require alternate approaches for isolation.

    Viral particles can be enveloped and/or have various structural modifications that are sensitive to particular steps in the protocols. Extrapolated from the 8th Report of the International Committee on the Taxonomy of Viruses, a list of observed viral buoyancies and sensitivities to different compounds was previously compiled [Vega Thurber et al., 2009]. Ultimately, if researchers are interested in a particular viral family, they should consult the literature prior to attempting virome generation.

    3.2 Methods

    3.2.1 Preservation of Environmental Samples for Virome Generation

    The methods used to isolate and store samples prior to virome generation can significantly impact the stability of viral particles and, hence, the quality of the viromes created from the presented protocol. The destruction of microbial and host cells is a needed but delicate issue that should be considered when preparing samples for virome construction. If samples cannot be processed immediately, subsequent cell growth can contaminate the samples. For example, if seawater is collected but not properly treated or stored, then microbial growth occurring between sample collection and virome generation can lead to phage bursts and, ultimately, datasets that do not accurately reflect the sample composition at the time of collection. Chloroform (2–5%) is often added to virome samples to prevent unintended cellular growth (e.g., in Vega Thurber et al. [2008]). Solvent addition will, however, alter the infectivity and buoyant density of some viral particles [Fauquet et al., 2005].

    Samples can also be frozen at −80°C to preserve viral particles and prevent cellular growth, although some viral capsids may burst when samples are thawed for virome generation. Addition of buffer or solutions that stabilize nucleic acids (e.g., RNALater from Ambion) prior to or in place of freezing will prevent loss of viral DNA or RNA [Uhlenhaut and Kracht, 2005].

    3.2.2 Concentration of Viral Particles Using Filtration

    Viral particles can be isolated and concentrated using size fractionation methods such as impact and tangential filtration devices. Impact filters remove particles greater than a given pore size and can be used on small volumes (e.g., <100 ml). For an excellent review on these and other methods to isolate viral particles, see Lawrence and Steward (2010).

    Tangential-flow filtration (TFF) is used to isolate viral particles from a variety of liquid samples and is particularly useful for dilute water samples [Suttle et al., 1991]. For an extensive review of this approach for isolating viral particles, see Wommack et al. (2010). Briefly, TFFs concentrate viral particles from a sample into a small final volume by retaining larger particles and removing excess liquid (i.e., filtrate). A backpressure is used to force the sample through the pores of hollow fiber filters; particles that are small enough to be pushed through the filter pores are discarded. The retained sample constituents (which are larger than the filter pore size) are collected into a reservoir basin and then cycled through the filters additional times. To ensure that these larger particles do not rupture as they are pushed against the filter pores, an oil gauge is used to maintain the pressure within the system below 10 psi.
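
    The arithmetic behind this concentration step can be sketched in a few lines of Python. This is a minimal illustration, not part of the published protocol: the function names, the starting and final volumes, the starting particle count, and the recovery fraction are all hypothetical values chosen for the example.

```python
def concentration_factor(initial_volume_l, retentate_volume_l):
    """Fold-concentration achieved when a sample is reduced to a smaller retained volume."""
    if initial_volume_l <= 0 or retentate_volume_l <= 0:
        raise ValueError("volumes must be positive")
    if retentate_volume_l > initial_volume_l:
        raise ValueError("retained volume cannot exceed the starting volume")
    return initial_volume_l / retentate_volume_l


def retentate_particles_per_ml(initial_particles_per_ml, initial_volume_l,
                               retentate_volume_l, recovery=1.0):
    """Expected particle concentration in the retained volume.

    `recovery` is the assumed fraction of particles retained (1.0 = no loss);
    real recovery must be measured and is not specified in the text.
    """
    if not 0.0 <= recovery <= 1.0:
        raise ValueError("recovery must be between 0 and 1")
    fold = concentration_factor(initial_volume_l, retentate_volume_l)
    return initial_particles_per_ml * fold * recovery


# Hypothetical example: 100 L of water reduced to 0.5 L.
print(concentration_factor(100, 0.5))

# Hypothetical starting count of 1e7 particles per ml with 50% assumed recovery.
print(retentate_particles_per_ml(1e7, 100, 0.5, recovery=0.5))
```

    In practice, the particle counts before and after concentration would be measured directly rather than predicted, so a calculation like this serves only as a planning aid.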

    There are several benefits to using TFFs. The filter surface area allows large volumes (tens to hundreds of liters) of filtrate to pass through rapidly, and the system is less prone to clogging than impact filters [Ludwig and Oshaughnessey, 1989; Kuwabara and Harvey, 1990]. TFFs also have the added benefit of generating viral-particle-free filtrate that is useful for many downstream applications. For example, casual surface viruses can be rinsed from the surfaces of host tissues prior to particle isolation using 100- to 300-kD TFF filtrate [Wommack et al., 2010]. Additionally, TFF viral-particle-free filtrate can also be used in the generation of medium densities for ultracentrifugation (see Section 3.2.3). As with any step in this protocol, researchers should verify that viral particles are absent from TFF filtrate prior to its use for rinsing samples or creating solutions (see Section 3.2.4). Although TFF is a useful tool, the cost of tangential filters, which can exceed several hundred dollars, can be prohibitive.

    3.2.3 Concentration of Viral Particles by Centrifugation

    Viral particles can be concentrated using a variety of centrifugation methods, including differential pelleting and zonal/density-gradient ultracentrifugation (see Lawrence and Steward (2010)). Viral particles may be damaged during differential pelleting, and therefore only density gradient ultracentrifugation is discussed here. Both isopycnic and zonal ultracentrifugation are well-established methods that can be uniquely augmented to isolate and purify viral particles based on their buoyant properties. In density gradient ultracentrifugation, media that can reach a variety of densities are spun within the same container. As a result, viral particles of various densities are concentrated within the media fractions that have the corresponding densities. Media that can reach high densities, such as salts of alkali metals (e.g., CsCl) or small hydrophilic organic compounds (e.g., sucrose), are therefore commonly used components of this technique. Depending on the ultimate use of viral particles (e.g., shotgun sequencing vs. viral culturing studies), the media types selected for use in density gradient ultracentrifugation are critical, because each medium has different effects on particle integrity and infectivity [Lawrence and Steward, 2010].

    The media, centrifugation speed, and rotor type, as well as the gradient design (types and/or number of buoyant layers) used, depend on the physical properties of the target viruses [Fauquet et al., 2005; Lawrence and Steward, 2010]. In order to optimize viral particle isolation from a new sample type, researchers should compare results from numerous media and densities. Regardless of the media selected, gradients should be made from the same buffer type that is present in the original sample (e.g., seawater, phosphate-buffered saline), and buffers used to produce a gradient must first be purified with a 0.02-μm filter to ensure that external viruses and/or other microbes do not contaminate the resulting fractions. An overlayered discontinuous isopycnic gradient using 1- to 2-ml aliquots of each CsCl density (e.g., 1.7, 1.5, 1.35, and 1.2 g ml−1) is often an excellent starting point for new viral metagenomics projects [Vega Thurber et al., 2009].
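
    As a rough planning aid for the layered gradient described above, the sketch below estimates how much of a dense CsCl stock to mix with filtered buffer to approximate each layer density. It assumes that volumes are additive on mixing, which is only an approximation for concentrated salt solutions. The function name, the 1.7 g ml−1 stock, the 1.0 g ml−1 buffer density, and the 10-ml batch size are all hypothetical values chosen for illustration. Measured densities (e.g., by weighing a known volume) should take precedence over these estimates.

```python
def stock_volume_for_density(target, stock, buffer, final_volume_ml):
    """Estimate the volume (ml) of dense stock needed to reach a target density.

    Assumes volumes are additive on mixing (an approximation), so density is
    treated as a volume-weighted average of stock and buffer densities.
    """
    if not buffer <= target <= stock:
        raise ValueError("target density must lie between buffer and stock densities")
    fraction = (target - buffer) / (stock - buffer)
    return fraction * final_volume_ml


# Hypothetical inputs: 1.7 g/ml CsCl stock, buffer taken as 1.0 g/ml,
# and 10 ml of each layer solution prepared.
layer_densities = [1.7, 1.5, 1.35, 1.2]
plan = {
    d: round(stock_volume_for_density(d, stock=1.7, buffer=1.0, final_volume_ml=10.0), 2)
    for d in layer_densities
}

for density, stock_ml in plan.items():
    print(f"{density} g/ml: {stock_ml} ml stock + {round(10.0 - stock_ml, 2)} ml buffer")
```

    A buffer such as seawater is slightly denser than 1.0 g ml−1, so the buffer value should be replaced with the density of the buffer actually used.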

    After ultracentrifugation, targeted harvest of each density layer is performed using various methods [Lawrence and Steward, 2010]. Each density fraction is then visually examined for the presence of viral particles and contaminating cells, nuclei, or debris. Below is an example of a viral particle concentration protocol to isolate marine phages using isopycnic density gradients with CsCl as a medium and a swinging bucket rotor [Vega Thurber et al., 2009].

    1. Deposit each layer by decreasing density (e.g., 1.7, 1.5, 1.35, and then 1.2 g ml−1 CsCl) in clear ultracentrifugation tubes (e.g., Beckman Coulter tubes), marking each interface on the outside of the tube. If tubes are not completely filled, they can collapse during centrifugation. Consult Fauquet et al. (2005) for buoyant densities and media types (e.g., CsCl versus sucrose) appropriate for various viral types. For explicit details on different approaches to gradient layering, sample loading, and isolation of gradient fractions, see Lawrence and Steward (2010).

    2. Precisely balance paired tubes and place correspondingly into the partnered rotor (e.g., swinging bucket rotor). Screw on the lids for centrifugation.

    3. Hold the centrifuge at 4°C and spin tubes at an appropriate speed for a sufficient length of time (e.g., 2 h at 22,000 rpm or ~60,000 × g) to separate viral particles into different buoyant layers.

    4. Place a sterile narrow-gauge (16- to 18-gauge) needle on a sterile Luer-Lok syringe (e.g., 3–10 ml). Different approaches to the isolation of gradient fractions can be found in Lawrence and Steward (2010).

    5. Remove tubes from rotor and place over collection vessels (e.g., 50-ml centrifuge tubes).

    6. Place the needle, with the mouth facing upward, just below the density interface of interest (marked in step 1).

    7. Trying not to disturb the gradients, slowly pierce the tube until the needle is midway through the tube. Take precautions to avoid needlestick injuries.

    8. Slowly withdraw the plunger of the syringe and pull the desired volume of sample into the syringe barrel. The amount collected should be proportional to the size of the sample loaded. Some rotors hold larger centrifugation tubes, and therefore the volume of each density layer will vary from rotor to rotor.

    9. Transfer the collected virion layer into a collection vessel (e.g., 50-ml centrifuge tube).

    10. Using a fresh syringe and needle for each subsequent virion layer, repeat steps 6–9 until all virion layers of interest are collected.
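
    Step 3 quotes both a rotor speed and a relative centrifugal force, and the relationship between the two depends on the rotor radius. The sketch below uses the standard conversion RCF = 1.118 × 10−5 × r × rpm², with r in centimeters. The function names and the 11.0-cm radius are hypothetical; the actual radius should be taken from the rotor manufacturer's documentation.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) for a given speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return (rcf / (1.118e-5 * radius_cm)) ** 0.5


# Hypothetical rotor radius of 11.0 cm; check the rotor manual for the real value.
example_rcf = rcf_from_rpm(22_000, radius_cm=11.0)
print(f"22,000 rpm at 11.0 cm is about {example_rcf:,.0f} x g")

# Speed required to reach 60,000 x g on the same hypothetical rotor.
print(f"60,000 x g needs about {rpm_from_rcf(60_000, radius_cm=11.0):,.0f} rpm")
```

    Because force scales with the square of the speed, a protocol specified in × g transfers between rotors more reliably than one specified in rpm alone.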

    3.2.4 Verification of Virome Purity

    When targeting viruses to produce a metagenome, any remaining nontarget cells, nuclei, and free nucleic acids will contaminate the resulting virome. With a few exceptions, eukaryotic and microbial genomes are significantly longer than viral genomes; if not eliminated, these nontarget genomes will represent a disproportionately large amount of sequence data within the viromes they contaminate. Metagenomic library construction and analysis require considerable time, effort, and financing; therefore sequence contamination should be avoided at all costs. Researchers must verify that eukaryotic and microbial cells are destroyed and/or removed before viral nucleic acid extraction, unless viral nucleic acids are to be used for PCR-based phylogenetics. In this latter case, absolute purity of viromes is not essential.
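
    A small calculation illustrates why a few contaminating cells can dominate a virome. The sketch below assumes that sequence output is proportional to the total base pairs contributed by each source, ignoring extraction and amplification biases. The genome sizes (50 kb for a phage, 4 Mb for a bacterium) and the ratio of one cell per 100 viral particles are hypothetical round numbers, not measurements.

```python
def contaminant_sequence_fraction(n_viral, viral_genome_bp, n_cells, cell_genome_bp):
    """Fraction of total DNA (by base pairs) that comes from contaminating cells.

    Assumes sequence yield is proportional to DNA content, with no extraction
    or amplification bias.
    """
    viral_bp = n_viral * viral_genome_bp
    cellular_bp = n_cells * cell_genome_bp
    return cellular_bp / (viral_bp + cellular_bp)


# Hypothetical example: 1 bacterial cell (4 Mb) per 100 phage particles (50 kb each).
fraction = contaminant_sequence_fraction(
    n_viral=100, viral_genome_bp=50_000, n_cells=1, cell_genome_bp=4_000_000
)
print(f"Contaminant share of sequence data: {fraction:.0%}")
```

    Under these assumptions, a sample in which cells are only about 1% of the particles would still yield a large share of nonviral sequence.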

    To verify the purity of concentrated samples, researchers should visually inspect their recovered viral particle concentrates. This can be done using nucleic acid staining and epifluorescence microscopy or flow cytometry. These techniques will not be discussed here, but discussions on the methods can be found in several publications [Noble and Fuhrman, 1998; Marie et al., 1999; Bettarel et al., 2000; Wen et al., 2004; Patel et al., 2007; Vega Thurber et al., 2009]. After nucleic acid extraction (see Section 3.2.6), researchers should also reconfirm the absence of contaminating DNA using PCR of common bacterial, archaeal, and eukaryotic marker genes (e.g., 16S and 18S rDNA), which viruses lack.

    If contaminating debris or cells remain after filtration and/or ultracentrifugation, the protocol provided here can be amended with several additional steps. A second round of ultracentrifugation can be performed, but this may result in additional loss of viral particles and/or be cost prohibitive. If large volumes (>50
