Genomic and Precision Medicine: Infectious and Inflammatory Disease
Ebook, 1,018 pages, 10 hours


About this ebook

Genomic and Precision Medicine: Infectious and Inflammatory Disease, Third Edition, provides current clinical solutions on the application of genome discovery on a broad spectrum of disease categories in IMD - including asthma, obesity and multiple sclerosis. Each chapter is organized to cover the application of genomics and personalized medicine tools and technologies, along with information on a) Risk Assessment and Susceptibility, b) Diagnosis and Prognosis, c) Pharmacogenomics and Precision Therapeutics, and d) Emerging and Future Opportunities in the field.

  • Offers comprehensive coverage of infectious and inflammatory disease genomics
  • Provides succinct commentary and key learning points to assist providers with the implementation of genomic and personalized medicine
  • Presents an up-to-date overview on major opportunities for genomic and personalized medicine
  • Includes case studies that highlight the practical use of genomics in the management of patients
Language: English
Release date: Aug 15, 2019
ISBN: 9780128015582

    Book preview

    Genomic and Precision Medicine - Geoffrey S. Ginsburg


    Chapter 1

    Overview: Genomic and precision medicine for infectious and inflammatory disease

    Christopher W. Woods a,b; Ephraim L. Tsalik c,d

    a Professor of Medicine and Global Health, Duke Center for Applied Genomics & Precision Medicine, Duke University School of Medicine

    b Chief, Infectious Disease Division, Durham VA Health Care System, United States

    c Emergency Department Service, Durham VA Health Care System, Durham, NC, United States

    d Duke Center for Applied Genomics & Precision Medicine, Duke University School of Medicine, United States

    Abstract

    A century of advances in infectious disease diagnosis, treatment, and prevention changed the face of medicine and global health. However, challenges persist including high mortality from sepsis, emerging antimicrobial resistance, and globalization that increases pandemic risks. More recently, similar achievements are being made in the field of autoimmune and inflammatory diseases, but gains are slowed by difficulties with non-specific, clinical diagnoses. These challenges can be mitigated through implementation of the tools of precision medicine. This volume covers a broad range of topics where use of these tools holds promise for an exciting era of rapid progress.

    Keywords

    Genomics; Precision; Infection; Inflammatory; Autoimmune; Classifier

    The era of systems biology has leveraged the emergence of genomics, transcriptomics, proteomics, metabolomics, lipidomics, and even microbiomics, opening up new possibilities for understanding the host response to disease, identifying therapeutic targets, and developing diagnostic tools that do not fit the traditional biomarker paradigm. Access to large electronic health databases with granular clinical metadata complements these biological data. To best use these complex, rich datasets, biostatisticians and computational biologists have revolutionized approaches by developing methods to reduce data dimensionality, match phenotype to molecular changes, compare groups and make predictions. Furthermore, the discovery of rare and common genetic variants that may contribute to disease susceptibility and response to therapy have encouraged a precision approach to diagnosis, treatment, and understanding of prognosis for individual patients.

    In particular, this burgeoning of information has led to the hope of improved diagnostics with the ability to distinguish among diseases with similar clinical phenotypes representing heterogeneous etiologies or pathophysiological processes. Similarly, these data may hold the key to improved prognostics to tailor treatment and assign clinical resources more appropriately. This is especially important for infectious and inflammatory disease states. Various statistical algorithms have been utilized to construct these disease classifiers, including sparse factor modeling, Bayesian constructions of the elastic net, sparse principal component analysis, and the molecular distance to health [1]. When derived on an appropriate patient population and then tested in new cohorts, these classifiers demonstrate improved performance when compared to the single analyte biomarkers that are available. In particular, unbiased approaches that utilize machine learning to identify candidate biomarkers have improved classifier sensitivity and specificity.
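    Because classifier performance is described here in terms of sensitivity and specificity, a small worked example may help fix the two definitions. The sketch below is illustrative only: the function name and the ten-patient validation cohort are hypothetical and are not taken from any study cited in this chapter.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary labels.

    True means "disease present"; False means "disease absent".
    """
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth and predicted:
            tp += 1          # diseased, classifier positive
        elif truth and not predicted:
            fn += 1          # diseased, classifier negative
        elif not truth and not predicted:
            tn += 1          # not diseased, classifier negative
        else:
            fp += 1          # not diseased, classifier positive
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased subject")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical validation cohort: 4 infected and 6 uninfected subjects.
truth = [True, True, True, True, False, False, False, False, False, False]
calls = [True, True, True, False, False, False, False, False, True, False]

sens, spec = sensitivity_specificity(truth, calls)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

    In this made-up cohort the classifier misses one infected subject and falsely flags one uninfected subject, so both measures fall below 1. The same calculation applies whether the calls come from a single analyte or from a multi-gene signature.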

    Particular attention should be paid to study design for the development of classifiers in both infectious and inflammatory diseases. Previously, many classifiers have been generated relative to healthy controls. Furthermore, patient cohorts for biomarker development often only include patients with phenotypic extremes, such as confirmed microbial etiology and clearly uninfected patients or classical presentations of inflammatory disease states such as lupus nephritis. These clearly defined populations facilitate biomarker development, but bias analysis by only capturing two ends of the clinical disease spectrum. Selection of patients on this basis is also constrained to patient populations in which pathogen-based testing is successful and may exclude some patient populations. Also, the use of omics has the potential to generate a large number of biomarkers from a small patient cohort, which may not extend to a broader population. While being scientifically useful, these signatures will need to be refined and tested in broader, more relevant patient populations to be clinically applied.

    Defining clinical challenges for infectious diseases

    Precision medicine approaches are relevant for most infectious diseases. A classic pharmacogenomic example is the use of the HLA-B*5701 genotype associated with hypersensitivity to abacavir [2]. However, many of the more vexing issues of infectious diseases require acute disease management decisions. This impacts the ultimate design of precision medicine tools. Some relevant examples are listed below.

    Sepsis

    Severe infections can lead to the physiological state known as sepsis, which, despite advanced supportive care measures, still carries a mortality of 17–26% [3]. The management of sepsis remains challenging due to its heterogeneous nature, including factors such as the causative pathogen, site of infection, clinical management, and many other identified and unidentified variables interacting to define clinical severity and outcomes. Furthermore, clinical trials of sepsis have been plagued by failures, with diversity in host response likely playing a key role. This highly variable mix of patient, pathogen, and clinical factors means that sepsis is not a single disease. Yet, sepsis treatment is largely uniform and far from personalized. This is likely the major reason that sepsis clinical trials research is considered the graveyard for pharmaceutical and biotechnology industries. However, the use of omics-derived classifiers to allow early definition of pathogen class, host factors, and risk groups might begin to define sepsis sub-categories [4,5]. Doing so can aid in new sepsis clinical trials research and can also help re-evaluate failed studies that could have been effective for certain sepsis sub-types. The ultimate goal is to personalize sepsis diagnosis and treatment, first in clinical research and then in clinical care. A similar approach to sepsis prognosis may lead to an early awareness of the risk of progression to severe sepsis and death, which could prioritize patients to a more aggressive treatment and monitoring arm [6].

    Antimicrobial resistance and stewardship

    With the increase in antibacterial resistance and a decreasing antibacterial pipeline, there is a need for coordinated efforts to promote appropriate use of antibacterial agents. Such stewardship encourages the appropriate use of antimicrobials by promoting the selection of the optimal drug regimen. In particular, a reliable bacterial versus viral host response diagnostic could prevent millions of unnecessary prescriptions for antibiotics. Precision medicine can help solve the crisis of antimicrobial resistance by changing the way antibacterial agents are prescribed and developed. Ultimately, important tools for combatting resistance are rapid, inexpensive, accurate, point of care diagnostics to guide antimicrobial decision making. Such early disease diagnostics not only can help reduce antibiotic overuse, but also encourage clinicians to identify other causes for the inflammatory response when infection is absent.

    Clinical trial efficiency

    An additional benefit of precision medicine for bacterial infection is that improved diagnostics will help drive new drug development by facilitating clinical trials for new antimicrobials, particularly agents to be reserved for resistant pathogens or agents that are intended for use only against a limited range of pathogens. Without the benefit of rapid, point of care diagnostics, clinical trials for bacterial infections generally enroll patients with relevant clinical symptoms, but only a small percentage of those enrolled patients are infected with the target pathogen (e.g., a particular species or multi-drug resistant pathogens). Targeted therapy will support new business models for antibacterial agents, focusing treatments for the right patients at the right time. For some clinical trials, rapid diagnostics can decrease the cost of a clinical trial, which in turn helps industry maintain research and development for new agents. Precision medicine could also harmonize regulatory guidelines by increasing the comparability of patient populations.
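    The enrollment problem described above comes down to simple arithmetic. The sketch below assumes a hypothetical trial that needs a fixed number of patients infected with the target pathogen. The function name and all numbers are illustrative assumptions, not figures from this chapter.

```python
import math
from fractions import Fraction


def patients_to_enroll(evaluable_needed, target_pathogen_fraction):
    """Estimate how many patients must be enrolled so that, on average,
    `evaluable_needed` of them carry the target pathogen.

    `target_pathogen_fraction` is the assumed share of enrolled patients
    who are infected with the target pathogen (0 < fraction <= 1).
    """
    if not 0 < target_pathogen_fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    # Use exact rational arithmetic so that, e.g., 0.10 behaves as 1/10.
    fraction = Fraction(target_pathogen_fraction).limit_denominator(1000)
    return math.ceil(evaluable_needed / fraction)


# Hypothetical scenario: 200 evaluable patients are required.
# Symptom-based enrollment: assume 10% carry the target pathogen.
print(patients_to_enroll(200, 0.10))
# Enrollment guided by a rapid diagnostic: assume 80% carry it.
print(patients_to_enroll(200, 0.80))
```

    Under these assumed fractions, symptom-based enrollment needs 2,000 patients and diagnostic-guided enrollment needs 250. This is the kind of reduction that could lower trial cost.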

    Pre-symptomatic disease detection

    The genomic revolution has also provided an opportunity to harness the host response to assess the earliest responses to pathogen exposure. Through the use of novel study designs such as human challenge studies with unattenuated live virus, clinical and molecular data can be mined to derive signatures of early signs of infection, even before the onset of symptoms [7,8]. These signatures can be validated in real life models of disease transmission through the use of index:cluster studies. Specifically, viral-exposed subjects can be monitored for emergent disease as demonstrated by the presence of pre-symptomatic signatures which would allow for early isolation and treatment when indicated. One can envision the screening of concentrated groups of people to exonerate them from illness before high-value deployment such as among military personnel.

    Vaccines

    Vaccine development is limited by reliance on animal models that may not always predict human responses. Overall, modeling age-specific human immunity outside the body, coupled with big data approaches, could help developers formulate vaccines for distinct populations. Research teams leading Precision Vaccines Programs are learning how vaccines work in different populations and working to develop new versions that optimally target the most vulnerable. To speed vaccine development, model systems have been developed using human monocytes, which are important contributors to innate immune responses to vaccine antigens or adjuvants. Adjuvant-type vaccines have been shown to turn up or down a different set of biological pathways at different times of life. These differential responses provide clues to how effective a vaccine may be in preventing infections in specific age groups. Further, investigators have shown that proteins expressed by adult or newborn cell types could function as potential predictors of adverse effects such as swelling or fever, leading to new ways to create smarter, more tolerable formulations.

    Precision population health

    Although not directly related to genomic medicine, the emergence of big data techniques combined with geographic information systems allows for refined approaches to identify vulnerable or at-risk populations. These precision public health approaches allow for more efficient and equitable distribution of resources, particularly in resource-limited settings.

    Defining challenges for inflammatory diseases

    Historically, inflammatory diseases were synonymous with rheumatic diseases. These syndromes were highly variable, even among patients who seemingly had the same condition such as systemic lupus erythematosus. Over time, diagnostic criteria were developed to allow patients to be assigned to specific disease states. For example, systemic lupus erythematosus has gone through a series of diagnostic criteria, being updated periodically to reflect new tests and new understanding. The most recently issued diagnostic criteria were offered in 2012 by the Systemic Lupus International Collaborating Clinics (SLICC) [9]. Those criteria require the presence of at least 4 of 17 components, including at least 1 of 11 clinical criteria and 1 of 6 immunologic criteria, or that the patient has biopsy-proven SLE nephritis in conjunction with antinuclear antibodies or anti-double-stranded DNA antibodies. This approach allows for the standardization of clinical treatment as well as research in an otherwise heterogeneous disease. However, this standardization also neutralizes the differences between individuals and can hinder personalized care for patients with lupus. Two patients with SLE may not have any overlapping signs, symptoms, or laboratory abnormalities yet are treated as having the same disease, particularly in the context of research.
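    The SLICC rule as summarized above can be written as a short decision function. This is a minimal sketch of the counting logic only, using the thresholds stated in the text. It does not enumerate the individual criteria, and the function and parameter names are hypothetical.

```python
def meets_slicc_2012(clinical_count, immunologic_count,
                     biopsy_proven_nephritis=False,
                     ana_positive=False, anti_dsdna_positive=False):
    """Apply the 2012 SLICC rule as summarized in the text.

    clinical_count:    number of the 11 clinical criteria present
    immunologic_count: number of the 6 immunologic criteria present
    """
    if not 0 <= clinical_count <= 11:
        raise ValueError("clinical_count must be between 0 and 11")
    if not 0 <= immunologic_count <= 6:
        raise ValueError("immunologic_count must be between 0 and 6")

    # Route 1: at least 4 of 17 components, with at least one from each group.
    by_count = (clinical_count + immunologic_count >= 4
                and clinical_count >= 1
                and immunologic_count >= 1)

    # Route 2: biopsy-proven nephritis plus ANA or anti-dsDNA antibodies.
    by_biopsy = biopsy_proven_nephritis and (ana_positive or anti_dsdna_positive)

    return by_count or by_biopsy


print(meets_slicc_2012(3, 1))   # three clinical plus one immunologic criterion
print(meets_slicc_2012(4, 0))   # four clinical criteria, no immunologic criterion
print(meets_slicc_2012(0, 0, biopsy_proven_nephritis=True, ana_positive=True))
```

    The first and third calls satisfy the rule and the second does not. Because only counts matter, two patients can both qualify through entirely different combinations of criteria.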

    Similar heterogeneity is evident across the spectrum of rheumatic diseases. For example, juvenile idiopathic arthritis encompasses a multitude of conditions. A scheme offered by the International League of Associations for Rheumatology includes systemic arthritis, polyarthritis, oligoarthritis (persistent and extended), enthesitis-related arthritis, and psoriatic arthritis [10]. However, children may meet criteria for more than one subclass, may change from one class to another over time, and can be further stratified by the presence of other conditions such as fibromyalgia [11].

    Beyond these diseases, which classically fall under the rheumatology umbrella, we are learning that many other disease states have an inflammatory component as an intrinsic aspect of their pathology. For example, coronary artery disease, cancer, osteoarthritis, and others have been shown to have inflammatory mediators that are central to pathogenesis [12–14]. This has given rise to a number of immunomodulatory strategies with variable results so far.

    Our recognition of the importance of inflammation in a large diversity of diseases creates both challenges and opportunities. The challenges inherent in this scenario stem largely from the heterogeneity among individuals and even within individuals over time. This complicates efforts to diagnose, treat, and predict outcomes. This heterogeneity, which is not adequately accounted for in current algorithms, runs counter to the goals of precision medicine. However, exciting efforts are underway to embrace this heterogeneity, allowing for more personalized understanding of disease with direct bearing on diagnosis, treatment, and prognosis. Models enabling precision medicine in rheumatic disease may incorporate demographic information (gender, race, ethnicity, age), molecular data (genetics, transcriptomics, functional immunology), and disease specific variables (duration, severity, activity, historical response to treatment). Once defined, these models can be applied to disease prediction, prevention, treatment personalization, and improved patient participation.

    The tools needed to enable these advances in precision medicine are becoming more prevalent and robust. They stem in part from large patient data repositories such as the Childhood Arthritis and Rheumatology Research Alliance (CARRA). CARRA has more than 50 enrollment sites throughout North America and Puerto Rico that enroll and gather information about children with JIA, SLE, and dermatomyositis, with additional rheumatic diseases to follow. Similar registries have been created for lupus and rheumatoid arthritis (Corrona), among others. Whereas these registries initially focused on patient-reported and routine clinical measures, there is a nascent movement toward more comprehensive biological phenotyping such as genomics, transcriptomics, proteomics, and immune profiling. One innovative example was an effort by DxTerity Diagnostics, which used social media to reach thousands of patients with SLE in the LIFT study. Upon completing an online consent form, subjects were mailed a kit to collect a blood sample, which was then used to generate transcriptomic data. Bypassing the need for enrollment in clinical environments substantially reduced costs and time. Results from this study may help identify SLE subtypes that cluster together based on gene expression, which may itself define distinct disease states, link biomarkers to therapy, and improve prognosis of disease progression.

    These new directions that aggregate clinical data from thousands of patients combined with systems biology measurements create hope and promise for a new era in precision medicine. They come at a time when the number of immunomodulatory agents, particularly biologics, continues to escalate. Yet clinicians have very little to guide their decisions regarding what drug to start, when to switch, and how long to treat.

    Summary

    The chapters presented in this volume will expand on these themes. The authors will explore various infectious and rheumatic diseases as well as conditions that are increasingly recognized as inflammatory in nature. Challenges inherent to the diagnosis and management of these diseases will be reviewed as well as strategies to address shortcomings with a vision for how more precise care may be delivered to the individual, but also for improvement of population health.

    References

    [1] Yang W.E., Woods C.W., Tsalik E.L. Host-based diagnostics for detection and prognosis of infectious diseases. In: Sails A., Tang Y.-W., eds. Methods in microbiology. vol. 42. Elsevier Ltd; 2015:465–500.

    [2] Mallal S., Phillips E., Carosi G., et al. HLA-B*5701 screening for hypersensitivity to abacavir. N Engl J Med. 2008;358(6):568–579.

    [3] Fleischmann C., Scherag A., Adhikari N.K., et al. Assessment of global incidence and mortality of hospital-treated sepsis. Current estimates and limitations. Am J Respir Crit Care Med. 2016;193(3):259–272.

    [4] Tsalik E.L., Henao R., Nichols M., et al. Host gene expression classifiers diagnose acute respiratory illness etiology. Sci Transl Med. 2016;8(322):322ra311.

    [5] Sweeney T.E., Azad T.D., Donato M., et al. Unsupervised analysis of transcriptomics in bacterial sepsis across multiple datasets reveals three robust clusters. Crit Care Med. 2018;46:915–925.

    [6] Sweeney T.E., Perumal T.M., Henao R., et al. A community approach to mortality prediction in sepsis via gene expression analysis. Nat Commun. 2018;9(1):694.

    [7] McClain M.T., Nicholson B.P., Park L.P., et al. A genomic signature of influenza infection shows potential for presymptomatic detection, guiding early therapy, and monitoring clinical responses. Open Forum Infect Dis. 2016;3(1):ofw007.

    [8] Woods C.W., McClain M.T., Chen M., et al. A host transcriptional signature for presymptomatic detection of infection in humans exposed to influenza H1N1 or H3N2. PLoS One. 2013;8(1):e52198.

    [9] Petri M., Orbai A.M., Alarcon G.S., et al. Derivation and validation of the systemic lupus international collaborating clinics classification criteria for systemic lupus erythematosus. Arthritis Rheum. 2012;64(8):2677–2686.

    [10] Petty R.E., Southwood T.R., Baum J., et al. Revision of the proposed classification criteria for juvenile idiopathic arthritis: Durban, 1997. J Rheumatol. 1998;25(10):1991–1994.

    [11] Nordal E., Zak M., Aalto K., et al. Ongoing disease activity and changing categories in a long-term nordic cohort study of juvenile idiopathic arthritis. Arthritis Rheum. 2011;63(9):2809–2818.

    [12] Libby P. Inflammation in atherosclerosis. Nature. 2002;420(6917):868–874.

    [13] Kroemer G., Senovilla L., Galluzzi L., Andre F., Zitvogel L. Natural and therapy-induced immunosurveillance in breast cancer. Nat Med. 2015;21(10):1128–1138.

    [14] Liu-Bryan R., Terkeltaub R. Emerging regulators of the inflammatory process in osteoarthritis. Nat Rev Rheumatol. 2015;11(1):35–44.

    Chapter 2

    The human genome as a foundation for genomic and precision health

    Huntington F. Willard    Director & Principal, Geisinger National Precision Health, North Bethesda, Maryland, United States

    Abstract

    Understanding the organization, variation, and expression of the human genome is central to the principles of genomic and precision health. Based on the availability of a reference sequence of the human genome, on an emerging appreciation of the extent of genome variation among different individuals and populations, and on a growing understanding of the role of genome variation in disease, it is now possible to begin to exploit the impact of that variation on human health on a broad scale. The comparison of individual genomes underlies the conclusion that virtually every individual has his or her own unique constitution of gene products, produced in response to the combined inputs of the genome sequence and one's particular set of environmental exposures and experiences. This awareness is reminiscent of what the British physician Archibald Garrod termed chemical individuality over a century ago and provides a conceptual foundation for the practice of genomic and precision health.

    Keywords

    Human genome; Genome sequencing; Genome variation; Precision health

    Introduction

    That genetic variation can influence health and disease has been a central, if not yet fully practiced, principle of medicine for over a hundred years. What has limited application of this principle until recently has been the special nature and presumed rarity of clinical circumstances or conditions to which genetic variation was relevant. Now, however—with the availability of a reference sequence of the human genome and a rapidly growing number of genome sequences from both asymptomatic and symptomatic individuals, with emerging appreciation of the extent of genome variation among different individuals and different populations worldwide, and with a growing understanding of the role of common as well as rare variation in disease—we are increasingly able to exploit the impact of that variation on human health on a broad scale, in the context of genomic medicine and precision health [1–3].

    Variation in the human genome has long been the cornerstone of the field of human genetics (Box 1), and its study led to the establishment of the medical specialty of medical genetics some four decades ago. The general nature and frequency of gene variants in the human genome became apparent with work over 50 years ago on the incidence of polymorphic protein variants in populations of healthy individuals, work that is the conceptual forerunner to the much larger and detailed efforts that mark modern human genetics and genomics. Such data underlie the conclusion that virtually every individual has his or her own unique constitution of gene products, the implications of which provide a foundation for what today we call personalized medicine or precision health as a modern application of what the British physician Archibald Garrod called chemical individuality in the very early years of the last century [2].

    Box 1

    Genetics and genomics in precision medicine and health

    Throughout this and the many other chapters in this volume, the terms genetics and genomics are used repeatedly, both as nouns and in their adjectival forms. While these terms seem similar, they in fact describe quite distinct (though frequently overlapping) approaches in biology and in medicine. Having said that, there are inconsistencies in the way the terms are used, even by those who work in the field.

    Here, we provide operational definitions to distinguish the various terms and the subfields of medicine to which they contribute.

    The field of genetics is the scientific study of heredity and of the genes that provide the physical, biological, and conceptual bases for heredity and inheritance. To say that something (a trait, a disease, a code, or information) is genetic refers to its basis in genes and in DNA.

    Heredity refers to the familial phenomenon whereby traits (including clinical traits) are transmitted from generation to generation, due to the transmission of genes from parent to child. A disease that is said to be inherited or hereditary is certainly genetic; however, not all genetic diseases are hereditary (witness cancer, which is always a genetic disease, but is only occasionally an inherited disease).

    Genomics is the scientific study of a genome or genomes. A genome is the complete DNA sequence, referring to the entire genetic information of a gamete, an individual, a population, or a species. As such, it is a subfield of genetics when describing an approach taken to study genes. The word genome originated as an analogy with the century-old term chromosome, referring to the physical entities (visible under the microscope) that carry genes from one cell to its daughter cells or from one generation to the next. Genomics gave birth to a series of other -omics that refer to the comprehensive study of the full complement of genome products—for example, proteins (hence, proteomics), transcripts (transcriptomics), or metabolites (metabolomics). The essential feature of the -omes is that they refer to the complete collection of genes or their derivative proteins, transcripts, or metabolites, not just to the study of individual entities. The distinguishing characteristics of genomics and the other omics are their comprehensiveness and scale, their integration with and dependence on technology development, an emphasis on rapid data release and availability, and an awareness of the policy and ethical implications of such work in research, in the practice of medicine, and increasingly in the social arena [2, 3].

    By analogy with genetics and genomics, epigenetics and epigenomics refer to the study of factors that affect gene (or, more globally, genome) function, but without an accompanying change in genes or the genome. The epigenome is the comprehensive set of epigenetic changes in a given individual, tissue, tumor, or population. It is the paired combination of the genome and the epigenome that appears to best characterize and determine one's phenotype.

    Medical Genetics is the application of genetics to medicine with a particular emphasis on inherited disease. Medical genetics is a broad and varied field, encompassing many different subfields, including clinical genetics, biochemical genetics, cytogenetics, molecular genetics, the genetics of common diseases, and genetic counseling. Medical Genetics and Genomics is one of 24 medical specialties recognized by the American Board of Medical Specialties, the medical organization overseeing physician certification in the United States.

    Genetic Medicine is a term used to refer to the application of genetic principles to the practice of medicine and thus overlaps medical genetics. However, genetic medicine is somewhat broader, as it is not limited to the specialty of Medical Genetics and Genomics, but is relevant to health professionals in many, if not all, specialties and subspecialties. Both medical genetics and genetic medicine approach clinical care largely through consideration of individual genes and their effects on patients and their families.

    By contrast, Genomic Medicine refers to the use of large-scale genomic information and to consideration of the full extent of an individual's genome and other omes in the practice of medicine and medical decision making. The principles and approaches of genomic medicine are relevant well beyond the traditional purview of medical genetics and include, as examples, gene expression profiling to characterize tumors or to define prognosis in cancer, genotyping variants in the set of genes involved in drug metabolism or action to determine an individual's correct therapeutic dosage, scanning the entire genome for millions of variants that influence one's susceptibility to disease, or analyzing multiple protein or RNA biomarkers to detect exposure to potential pathogens, to monitor therapy and to provide predictive information in presymptomatic individuals.

    Finally, Precision Medicine (or, increasingly, Precision Health) refers to the rapidly advancing field of health care that is informed by each person's unique clinical, genetic, genomic, and environmental information [1]. The goals of precision health/medicine are to take advantage of a molecular understanding of disease—and a vast array of data from sources that range from genomes to Electronic Medical Records to mobile health devices—to optimize preventive health care strategies and drug therapies while people are still well or at the earliest stages of disease. Because these factors are different for every person, the nature of disease, its onset, its course, and how it might respond to drug or other interventions are as individual as the people who have them and the communities in which they live. In order for precision medicine/health to be used by health care providers and their patients, these findings, analyzed with significant input from the data sciences, must be translated into precision diagnostic tests and targeted therapies, using implementation tools that are compatible with the modern health care environment [4]. Since the overarching goal is to optimize medical care and outcomes for each individual, treatments, medication types and dosages, and/or prevention strategies may differ from person to person—resulting in customization and precise targeting of patient care.

    In this introductory chapter, the organization, variation, and expression of the human genome are presented as a foundation for the chapters to follow on approaches in translational genomics and on the principles of genomic and precision health as applied to inflammatory and infectious disease.

    The human genome

    The typical human genome consists of approximately 3 billion (3 × 10⁹) base pairs of DNA, divided among the 24 types of nuclear chromosomes (22 autosomes, plus the sex chromosomes, X and Y) and the much smaller mitochondrial chromosome (Table 1).

    Table 1

    a From Ensembl, database GRCh38.p12 (version 95.38, accessed February 2019).

    b Mb, megabase pairs.

    c Protein-coding genes only.

    d For example, SNPs, substitutions, in/dels.

    e For example, CNVs, inversions.

    Individual chromosomes can best be visualized and studied at metaphase in dividing cells, and karyotyping of patient chromosomes has been a valuable and routine clinical laboratory procedure for a half century, albeit at levels of resolution that fall well short of most pathogenic DNA variants (Fig. 1). The ultimate resolution, of course, comes from direct DNA sequence analysis, and an increasing number of new technologies have facilitated comparisons of individual genomes with the reference human genome sequence, enabling clinical sequencing of patient samples to search for novel variants or mutations that might be of clinical importance, whether in a diagnostic or screening context [5].

    Fig. 1 Spectrum of resolution in chromosome and genome analysis. The typical resolution and range of effectiveness are given for various diagnostic approaches used routinely in clinical and research practice. FISH, fluorescence in situ hybridization. Source: From Nussbaum RL, McInnes RR, Willard HF. Genetics in medicine. 8th ed. Philadelphia, PA: W.B. Saunders, Co.; 2016. 546 pp, with permission.

    Genes in the human genome

    While the human genome contains an estimated 20,000 protein-coding genes, the coding segments of those genes—the exons—comprise only about 1–2% of the genome; most of the genome consists of DNA that lies between genes, far from genes, or in vast areas spanning several million base pairs (Mb) that appear to contain no genes at all. Despite much progress in gene identification and genome annotation, it is nearly certain that there are some genes, including clinically relevant genes, that are currently undetected or that display characteristics that we do not currently recognize as being associated with genes. Nonetheless, the statement that the vast majority of the genome consists of spans of DNA that are non-genic, of no known function, and of uncertain clinical relevance remains true and remains a challenge for interpreting the clinical relevance of genetic variants across the genome [6].
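
    As a rough check on these proportions, the sketch below converts the 1–2% exonic fraction into megabases and computes an average gene density, using only the rounded estimates quoted in this chapter (approximately 3 billion base pairs and 20,000 protein-coding genes). The function and variable names are illustrative, and the results are order-of-magnitude figures rather than annotated values.

```python
# Back-of-envelope arithmetic using the rounded figures quoted in the text.
GENOME_SIZE_BP = 3_000_000_000   # ~3 billion base pairs (reference genome, rounded)
PROTEIN_CODING_GENES = 20_000    # estimated number of protein-coding genes
BP_PER_MB = 1_000_000

def exonic_span_mb(percent_of_genome: int) -> float:
    """Amount of exonic DNA, in Mb, for a given percentage of the genome."""
    return GENOME_SIZE_BP * percent_of_genome / 100 / BP_PER_MB

low_mb = exonic_span_mb(1)
high_mb = exonic_span_mb(2)
genome_mb = GENOME_SIZE_BP / BP_PER_MB
print(f"Exonic DNA: about {low_mb:.0f}-{high_mb:.0f} Mb out of {genome_mb:.0f} Mb")

# Genome-wide average density of protein-coding genes (genes per Mb).
mean_density = PROTEIN_CODING_GENES / genome_mb
print(f"Average density: about {mean_density:.1f} protein-coding genes per Mb")
```

    On these assumptions, exons occupy roughly 30–60 Mb, and the genome-wide average is about 6–7 protein-coding genes per Mb.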

    In addition to being relatively sparse in the genome, genes are distributed quite nonrandomly along the different human chromosomes. Some chromosomes are relatively gene-rich, while others are quite gene-poor, ranging from approximately 3 genes/Mb of DNA to > 20 genes/Mb (excluding the Y chromosome and the tiny mitochondrial chromosome). And even within a chromosome, genes tend to cluster in certain regions and not in others, a point of clear clinical significance when evaluating genome integrity, dosage, or arrangement in different patient samples.

    Coding and noncoding genes

    There are a number of different types of gene in the human genome. Most genes known or thought to be clinically relevant are protein-coding and are transcribed into messenger RNAs that are ultimately translated into their respective proteins; their products comprise the list of enzymes, structural proteins, receptors, and regulatory proteins that are found in various human tissues and cell types. However, there are additional genes whose functional product appears to be the RNA itself. At least some of these so-called noncoding RNAs (ncRNAs) have a range of functions in the cell, but most do not as yet have any identified function. Overall, the genes whose transcripts make up the collection of ncRNAs could represent as many as half of the total of ~40,000 identified human genes (Table 1).

    Variation in the human genome

    With completion of the initial reference human genome sequence some 20 years ago, attention has turned to the discovery and cataloguing of variation in that sequence among different individuals (including both healthy individuals and those with various diseases) from different populations worldwide [4, 7–11]. Any given individual carries millions of sequence variants that are known to exist in multiple forms (i.e., are polymorphic) in our species. In addition, there are countless very rare variants, many of which probably exist in only a single or a few individuals. In fact, given the number of individuals in our species, essentially each and every base pair in the human genome is expected to vary in someone somewhere around the globe. It is for this reason that the original human genome sequence is considered only a reference sequence, derived as a consensus of the limited number of genomes whose sequencing was part of the Human Genome Project, but likely identical to no individual's genome.

    Types of variation

    Early estimates were that any two randomly selected individuals have sequences that are 99.9% identical or, viewed another way, that an individual genome would be heterozygous at approximately 3–5 million positions, with different bases at the maternally and paternally inherited copies of that particular sequence position. The majority of these differences involve simply a single unit in the DNA code and are referred to as single-nucleotide variants (SNVs) or polymorphisms (SNPs) (Table 1). The remaining variation consists of insertions or deletions (in/dels) of (usually) short sequence stretches, variation in the number of copies of repeated elements, or inversions in the order of sequences at a particular locus in the genome (Fig. 2). Any and all of these types of variation can influence disease and thus must be accounted for in any attempt to understand the contribution of genetics to clinical medicine and to precision health (Table 2).
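
    The link between "99.9% identical" and "approximately 3 million positions" is simple arithmetic, sketched below with the rounded genome size used in this chapter. The names are illustrative, and the calculation gives only the lower end of the 3–5 million range quoted above.

```python
# Convert a percent-identity estimate into an expected count of differing positions.
GENOME_SIZE_BP = 3_000_000_000   # ~3 billion base pairs, rounded
IDENTICAL_PER_1000 = 999         # 99.9% identity, expressed per thousand positions

def expected_differences(genome_size_bp: int, identical_per_1000: int) -> int:
    """Number of positions expected to differ between two genome copies."""
    differing_per_1000 = 1000 - identical_per_1000
    return genome_size_bp * differing_per_1000 // 1000

diffs = expected_differences(GENOME_SIZE_BP, IDENTICAL_PER_1000)
spacing_bp = GENOME_SIZE_BP // diffs
print(f"Expected differing positions: {diffs:,}")
print(f"On average, one difference every {spacing_bp:,} bp")
```

    That is, a 0.1% difference across 3 billion base pairs corresponds to about 3 million positions, or roughly one difference per 1000 bp.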

    Fig. 2 Schematic representation of different types of structural polymorphism in the human genome, leading to deletions, duplications, inversions, and CNV changes relative to the reference arrangement. Source: From Estivill X, Armengol L. Copy number variants and common disorders: filling the gaps and exploring complexity in genome-wide association studies. PLoS Genet 2007;3:e190, with permission.

    Table 2

    a bp, base pair; kb, kilobase pair; Mb, megabase pair.

    Copy number variation

    Over the past decade, increasing attention has focused on the prevalence of structural variants in the genome, which, in any given genome, collectively account for far more variation in genome sequence (expressed in terms of the amount of genomic DNA affected) than do SNVs [12]. The most common type of structural variation involves changes in the local copy number of sequences (including genes) in the genome. This variation is based on blocks of different sequences that are present in multiple copies, often with extraordinarily high sequence conservation, in many different locations around the genome. Rearrangements between such duplicated segments are a source of significant variation between individuals in the number of copies of these DNA sequences and these are generally referred to as copy number variants (CNVs) (Fig. 2). When the duplicated regions contain genes, genomic rearrangements can result in the deletion of the region (and the genes) between the copies and thus give rise to disease. It is of considerable ongoing interest to evaluate the role of CNVs and other structural variants in the etiology of a range of clinical conditions.

    De novo mutations

    While much emphasis is placed on inherited genome variation, each such variant had to originate as a de novo or new change occurring in germ cells at some point in time—whether 9 months ago (in the case of a newborn infant's genome) or up to 100,000 years ago (in the case of ancient polymorphisms). At whatever point a de novo change occurred, such a variant would be extremely rare in the population (occurring just once), and its ultimate frequency in the population over time depends on chance and on the principles of Mendelian inheritance and population genetics. The ability to sequence genomes directly provides a robust method for measuring mutation rates genome-wide, by, for example, comparing the sequence of an offspring's genome (or a portion of that genome) with that of his or her parents [13].
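
    The parent-offspring comparison can be illustrated with a minimal sketch: a candidate de novo variant is one observed in the child but in neither parent. The variant records below are invented for illustration, and the function name is hypothetical. Real analyses must also account for sequencing error, coverage, and genotype quality, none of which is modeled here.

```python
from typing import NamedTuple, Set

class Variant(NamedTuple):
    """A simplified variant record; all values used below are made up."""
    chrom: str
    pos: int
    ref: str
    alt: str

def candidate_de_novo(child: Set[Variant],
                      mother: Set[Variant],
                      father: Set[Variant]) -> Set[Variant]:
    """Return variants seen in the child that are absent from both parents."""
    return child - (mother | father)

# Hypothetical call sets for a parent-offspring trio.
mother = {Variant("chr1", 1000, "A", "G"), Variant("chr3", 3000, "C", "T")}
father = {Variant("chr1", 1000, "A", "G"), Variant("chr5", 5000, "T", "C")}
child = {
    Variant("chr1", 1000, "A", "G"),   # inherited (present in both parents)
    Variant("chr5", 5000, "T", "C"),   # inherited from the father
    Variant("chr2", 2000, "G", "T"),   # absent from both parents
}

de_novo = candidate_de_novo(child, mother, father)
for v in sorted(de_novo):
    print(f"Candidate de novo variant: {v.chrom}:{v.pos} {v.ref}>{v.alt}")
```

    Counting such candidates across many trios, after filtering out technical artifacts, is what allows a mutation rate to be estimated directly.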

    Such studies have shown that every individual carries an estimated 30–70 new mutations per genome that were not present in the genomes of her or his parents. This rate, however, is known to vary from gene to gene around the genome and from individual to individual and is dependent on the age of the parents [14, 15]. Overall, the mutation rate, combined with considerations of population growth and dynamics, predicts that there must be an enormous number of relatively new (and thus rare) mutations in the current worldwide population of 7 billion individuals [12, 13, 16].
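
    To give a sense of scale, the sketch below turns the figures above into a per-base-pair rate and a worldwide total. It assumes, for illustration only, that the 30–70 new mutations are spread over the roughly 6 billion base pairs of a diploid genome and that the rate is uniform, which the text notes is not the case. The names are illustrative.

```python
# Rough scale of de novo mutation, using the rounded figures quoted in the text.
HAPLOID_GENOME_BP = 3_000_000_000
DIPLOID_GENOME_BP = 2 * HAPLOID_GENOME_BP   # assumption: mutations counted over both copies
WORLD_POPULATION = 7_000_000_000

def mutation_summary(new_mutations_per_person: int) -> dict:
    """Per-bp rate and population totals for a given per-person mutation count."""
    total_new = new_mutations_per_person * WORLD_POPULATION
    return {
        "per_bp_per_generation": new_mutations_per_person / DIPLOID_GENOME_BP,
        "total_new_in_population": total_new,
        "mean_new_per_reference_position": total_new / HAPLOID_GENOME_BP,
    }

for n in (30, 70):
    s = mutation_summary(n)
    print(f"{n} new mutations per person: "
          f"about {s['per_bp_per_generation']:.1e} per bp per generation, "
          f"{s['total_new_in_population']:.1e} new mutations worldwide, "
          f"about {s['mean_new_per_reference_position']:.0f} per reference position")
```

    Under these simplifying assumptions, the living population alone carries on the order of 10¹¹ new mutations, an average of tens to more than a hundred per position in the reference sequence. This is only an average, since actual mutations are unevenly distributed across the genome.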

    Conceptually similar studies have explored de novo mutations in CNVs, where the generation of a new length variant depends on recombination, rather than on errors in DNA synthesis to generate a new base pair. Indeed, the measured rate of formation of new CNVs is orders of magnitude higher than that of base substitutions [17, 18].

    Variation in individual genomes

    The most extensive current inventory of the amount and type of variation to be expected in any given genome comes from the direct analysis of individual diploid human genomes. Any given human genome typically carries about 5 million SNVs, many of which are previously unknown. This suggests that the number of SNVs described for our species is still incomplete, although presumably the fraction of such novel SNVs will decrease as more and more genomes from more individuals and from more populations are sampled.

    Typically, each genome carries thousands of nonsynonymous SNVs (variants that encode a different amino acid) in thousands of protein-coding genes around the genome. These measurements underscore the potential impact of gene and genome variation on human biology and on medicine. A comprehensive and influential international study compared several thousand genome sequences from the 1000 Genomes Project [19]. This study documented that each genome carries 100 or more likely loss-of-function mutations, about 10,000 nonsynonymous changes, and some 500,000 variants that overlap known gene regulatory regions. Ongoing studies continually extend and refine these data in the context of individual populations [10, 11].
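
    The distinction between synonymous and nonsynonymous changes can be made concrete with a small sketch that applies a single-base substitution to a codon and compares the encoded amino acids. Only a handful of entries from the standard genetic code are included, and the function name and category labels are illustrative rather than taken from any annotation tool.

```python
# A deliberately partial excerpt of the standard genetic code (DNA codons).
CODON_TABLE = {
    "GAA": "Glu", "GAG": "Glu",
    "GTA": "Val", "GTG": "Val",
    "TAA": "Stop", "TAG": "Stop",
}

def classify_snv(ref_codon: str, position: int, alt_base: str) -> str:
    """Classify a single-base change at `position` (0-2) within `ref_codon`."""
    if ref_codon[position] == alt_base:
        raise ValueError("alt_base must differ from the reference base")
    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]   # KeyError if the codon is outside this excerpt
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "Stop":
        return "stop-gained"          # a likely loss-of-function change
    return "nonsynonymous"

print(classify_snv("GAG", 2, "A"))   # GAG -> GAA, Glu -> Glu
print(classify_snv("GAG", 1, "T"))   # GAG -> GTG, Glu -> Val
print(classify_snv("GAG", 0, "T"))   # GAG -> TAG, Glu -> Stop
```

    The three calls illustrate, in order, a synonymous change, a nonsynonymous change, and a premature stop codon of the kind counted among likely loss-of-function mutations.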

    These and other findings also indicate—perhaps surprisingly—that thousands of genes in the human genome are highly tolerant to many mutations that appear likely to result in a loss of function [7, 9, 11, 20, 21]. Within the clinical setting, this awareness has important implications for the interpretation of data from sequencing of patient material, particularly when predicting the impact of mutations in genes of currently unknown function and whether they increase or decrease risk relative to the population as a whole [22, 23].

    Notwithstanding the remarkable amount of information on genomes and genome variation gathered over the past decade and its readiness for at least some evidence-driven applications in clinical care, it is clear that we are overall still in a mode of discovery; no doubt many millions of additional SNVs and other variants remain to be uncovered, as does the degree to which any of them might impact an individual's phenotype in the context of wellness and health care. The broad question of "what is normal?" (an essential concept in human biology and in clinical medicine) remains open when it comes to the human genome [24].

    Variation in populations

    Leveraging key technological developments (including whole-exome and whole-genome sequencing) that have greatly increased the throughput of genotyping on a genome-wide scale, a growing number of large-scale projects have gathered genotypic information on millions of SNVs and structural variants in up to many hundreds of thousands of individuals from hundreds of populations worldwide [9, 12, 19, 25]. Two major conclusions emerge: First, some 85–90% of the common variation found in our species is shared among different population groups; a relative minority of common variants are specific to or highly enriched/depleted in genomes from a particular population. And second, most variable sites in the genome are rare, not common, and are private to specific populations or even families rather than ancient and shared among populations [16].

    These findings reflect an explosion of population growth from an ancestral population of likely fewer than 10,000 individuals, with Eurasians diverging from an ancestral African population an estimated 38–64 thousand years ago. It is now possible to trace or reconstruct the history and the genetic/geographic origins of many population groups around the world. These findings are of innate interest to specific groups, but also have profound implications for health care delivery to different groups of individuals worldwide characterized by different DNA variants and thus different susceptibility to different medical conditions, notably evident in the global diversity of inflammatory and infectious diseases, as explored in this volume.

    Expression of the human genome

    A key question in exploring the function of the human genome is how proper expression of our 20,000–40,000 genes is determined, how it can be influenced by genetic variation or by environmental exposures or inputs, and by what mechanisms such alterations in gene expression can lead to pathology evident in the practice of clinical medicine. The control of gene activity (in development, in different tissues and cell lineages, during the cell cycle, and during the lifetime of an individual, both in sickness and in health) is determined by a complex interplay of genetic, epigenetic, and environmentally influenced features [2].

    By genetic features, we here refer to those found in the genome sequence (see Box 1), which plays a role, of course, in determining the identity of each gene, its particular forms (termed alleles), its level of expression (requiring a consideration of various regulatory elements), and its particular genomic landscape (three-dimensional configuration, base composition, chromatin composition). By epigenetic features, we here mean the packaging of the DNA into chromatin, in which it is complexed with a variety of histones as well as innumerable non-histone proteins that influence the accessibility and activity of genes and other genomic sequences. The structure of chromatin, unlike the genome sequence itself, is highly dynamic and underlies the control of gene expression that profoundly shapes both cellular and organismal function.

    It has been appreciated for decades that there is high variability in gene expression levels among individuals. Much of this is due to differences among the genes themselves, a result consistent with local sequence variation influencing the expression of such genes. It is likely that the ongoing discovery of regulatory variants will correlate with variation in these patterns of gene expression [20], with an anticipated but as-yet-unknown degree of impact on human disease.

    Genes, genomes, and disease

    In the context of genomic medicine and precision health, an overriding question is to what extent variation in the sequence and/or expression of one's genome influences the likelihood of disease onset, determines or signals the natural history of disease, and/or provides clues relevant to the management of wellness or disease. As just discussed, variation in one's constitutional genome can have a number of different direct or indirect effects on gene expression, thus contributing to the likelihood of disease.

    Comprehensive catalogues of genomic and other omic data, from sequence to functional elements encoded in the genome, to interacting networks of RNAs and proteins, and to metabolites, carbohydrates, and small molecules in a variety of cell and tissue types, are emerging (Table 3) [26–29]. The integrative nature of physiology and medicine, aided in major part by advances in the data sciences, lends itself well to omic approaches that gather comprehensive datasets that can be queried informatically for patterns that promise to reveal distinctive insights about health or disease.

    Table 3
