Proteomic and Metabolomic Approaches to Biomarker Discovery
Ebook, 1,304 pages, 15 hours

About this ebook

Proteomic and Metabolomic Approaches to Biomarker Discovery, Second Edition covers techniques from both proteomics and metabolomics and includes all steps involved in biomarker discovery, from study design to study execution. The book describes methods and presents a standard operating procedure for sample selection, preparation and storage, as well as data analysis and modeling. This new standard effectively eliminates the differing methodologies used in studies and creates a unified approach. Readers will learn the advantages and disadvantages of the various techniques discussed, as well as potential difficulties inherent to all steps in the biomarker discovery process.

This second edition has been fully updated and revised to address recent advances in MS and NMR instrumentation, high-field NMR, proteomics and metabolomics for biomarker validation, clinical assays of biomarkers and clinical MS and NMR, identifying microRNAs and autoantibodies as biomarkers, MRM-MS assay development, top-down MS, glycosylation-based serum biomarkers, cell surface proteins in biomarker discovery, lipidomics for cancer biomarker discovery, and strategies to design studies to identify predictive biomarkers in cancer research.

  • Addresses the full range of proteomic and metabolomic methods and technologies used for biomarker discovery and validation
  • Covers all steps involved in biomarker discovery, from study design to study execution
  • Serves as a vital resource for biochemists, biologists, analytical chemists, bioanalytical chemists, clinical and medical technicians, researchers in pharmaceuticals and graduate students
Language: English
Release date: Oct 24, 2019
ISBN: 9780128197882

    Book preview

    Proteomic and Metabolomic Approaches to Biomarker Discovery - Haleem J. Issaq

    Chapter 1

    Biomarker discovery: Study design and execution

    Haleem J. Issaq (a); Timothy D. Veenstra (b)

    (a) Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD, United States

    (b) Department of Applied Sciences, Maranatha Baptist University, Watertown, WI, United States

    Abstract

    Disease produces specific changes in the chemical and biochemical profiles of biological fluids and tissues before clinical symptoms develop. Proteomic and metabolomic analyses of biological samples can reveal changes in the abundance of proteins and metabolites that can function as useful diagnostic and prognostic clinical tests. To become a clinically approved test, a potential biomarker should be confirmed and validated. Confirmation and validation apply to both the analytical methodology and the candidate biomarker. A search of the scientific and medical literature indicates that many studies have reported the discovery of potential biomarkers without proper validation. In this chapter, the discussion centers on biomarker study design and execution and points out the steps needed for successful biomarker discovery.

    Keywords

    Metabolomics; Proteomics; Biomarkers; Mass spectrometry; Nuclear magnetic resonance spectroscopy; Chromatography; Electrophoresis

    Outline

    Introduction

    Definitions

    Biomarker

    Sensitivity

    Specificity

    Positive predictive value (PPV)

    Negative predictive value (NPV)

    Proteomics

    Metabolomics

    Profiling

    The current state of biomarker discovery

    Study design and execution

    Study design

    Study execution

    Personnel and instrumentation

    Errors in study design

    The sample

    Errors in study execution

    Sample preparation

    Methods of analysis

    Number of replicates

    Effect of mass spectrometer type on the results

    Effect of separation instrumentation on the results

    Errors in measurements

    Personnel and experimental validation

    Specificity of proteins as biomarkers

    Published results comparison

    Statistical data analysis

    Recommendations

    Concluding remarks and recommendations

    Acknowledgments

    References

    Acknowledgments

    This project has been funded in whole or in part with federal funds from the National Cancer Institute, National Institutes of Health, under Contract HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the United States Government.

    Introduction

    Diseases result in specific changes in the chemical and biochemical profiles of biological fluids and tissues. These changes can be detected by analyzing the samples for genes, proteins, and small molecules (metabolites). Proteomic and metabolomic analyses provide the opportunity to detect diseases as they occur, while genetic analyses identify individuals with predispositions to certain diseases and determine long-term risk. Therefore, direct measurement of genes, metabolites, and protein expression is essential for understanding biological processes in disease and normal states.¹ Molecules produced by the body's metabolic processes may help distinguish between two different sample sets obtained from, for example, cancer-bearing and non-cancer-bearing individuals. The distinguishing compounds are known as biomarkers. Since the majority of published studies deal with cancer biomarker discovery, the discussion in this chapter is limited to cancer biomarkers, although the principles are applicable to other diseases.

    A biomarker is a substance that is overexpressed in biological fluids or tissues in patients with a certain disease. A biomarker can include patterns of single-nucleotide polymorphisms (SNPs), DNA methylation, or changes in mRNA, protein, or metabolite abundances, provided that these patterns can be shown to correlate with the characteristics of the disease.² Biomarkers are used to examine the biological behavior of a disease and predict the clinical outcome. The biomarker should be the result of the disease and not due to environmental conditions or biological perturbations. To be clinically acceptable, a diagnostic biomarker should ideally have a sensitivity and specificity of 100% and be measurable in a specimen collected noninvasively (urine) or semi-invasively (blood). In addition, the test should be accurate, economical, easy to perform, and reproducible by different technicians across different laboratories. Fig. 1 describes the characteristics of an ideal diagnostic method. Although some biomarkers have been approved by the Food and Drug Administration as qualitative tests for monitoring specific cancers (e.g., nuclear matrix protein-22 for bladder cancer), the majority of discovered potential biomarkers (proteins or metabolites) are unfortunately not sensitive and/or specific enough to be used for population screening. However, the search for disease biomarkers remains an active area of research; for example, a search for cancer biomarkers using the PubMed search engine returned 21,833 hits for 2017 and 14,451 hits for 2018. The search for biomarkers is also not limited to cancer but extends to almost every known medical condition.

    Fig. 1 Description of ideal methods for disease diagnosis.

    Definitions

    Biomarker

    A biomarker is an objectively measured substance that indicates the presence of an abnormal condition within a patient. A biomarker can be gene-based (e.g., a SNP), protein-based (e.g., prostate-specific antigen (PSA)), or metabolite-based (e.g., glucose or cholesterol), provided that it has been shown to correlate with the characteristics of a specific disease.² In clinical and medical settings, a biomarker is used for early disease detection, monitoring response to therapy, and predicting the clinical outcome. Biomarkers can be categorized according to their clinical applications. Diagnostic markers are used to initially define the histopathological classification and stage of the disease, while prognostic markers can predict the development of disease and the prospect of recovery. Predictive markers can be used, on a case-by-case basis, to select the correct therapeutic procedure. It should be confirmed that a potential biomarker is indeed specific to the disease state and is not simply a function of the variability within the biological samples of patients due to differences in diet, genetic background, lifestyle, age, sex, ethnicity, etc. In summary, a biomarker is an agent that can be used to screen for cancer, predict prognosis and response to therapy, assess response to therapy, and monitor for disease recurrence.

    Sensitivity

    Sensitivity of a test or marker is defined as the percentage of positive samples (individuals with the disease) that a model correctly identifies as positive (true positives). The false-negative rate is the percentage of patients with the disease for whom the test is negative.

    Specificity

    Specificity is the percentage of negative samples (individuals without the disease) that a model correctly identifies as negative (true negatives). A false positive is an individual without the disease in whom the test is positive.

    Positive predictive value (PPV)

    PPV is defined as the percentage of individuals with a positive test in whom the disease is present.

    Negative predictive value (NPV)

    NPV is defined as the percentage of individuals with a negative test in whom the disease is not present.
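
    The four definitions above can be sketched from the counts of a two-by-two classification table. This is a minimal illustration, not code from the book; the class name `ConfusionCounts`, the function names, and the example counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a test result with the true disease status."""
    tp: int  # disease present, test positive (true positive)
    fn: int  # disease present, test negative (false negative)
    tn: int  # disease absent, test negative (true negative)
    fp: int  # disease absent, test positive (false positive)


def sensitivity(c: ConfusionCounts) -> float:
    """Percentage of diseased individuals with a positive test."""
    return 100.0 * c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Percentage of disease-free individuals with a negative test."""
    return 100.0 * c.tn / (c.tn + c.fp)


def ppv(c: ConfusionCounts) -> float:
    """Percentage of test-positive individuals who have the disease."""
    return 100.0 * c.tp / (c.tp + c.fp)


def npv(c: ConfusionCounts) -> float:
    """Percentage of test-negative individuals who are disease-free."""
    return 100.0 * c.tn / (c.tn + c.fn)


# Hypothetical study: 100 patients and 100 controls (illustrative numbers only).
example = ConfusionCounts(tp=80, fn=20, tn=90, fp=10)
for name, fn_ in [("sensitivity", sensitivity), ("specificity", specificity),
                  ("PPV", ppv), ("NPV", npv)]:
    print(f"{name}: {fn_(example):.1f}%")
```

    Note that sensitivity and specificity are computed within the diseased and disease-free groups, whereas PPV and NPV are computed within the test-positive and test-negative groups, so the latter two change with the proportion of diseased individuals in the cohort.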

    Proteomics

    Proteomics is the study of all proteins in a biological sample. The complexity and dynamic concentration range of the proteins that comprise the proteome make the detection and quantitation of each protein extremely challenging, if not impossible.

    Metabolomics

    Metabolomics, also known as metabonomics, is the study of the complete set of small molecules (less than 1500 Da) found within a biological system, for the understanding of biological processes in normal and disease states. Direct quantitative measurements of metabolite levels in urine, serum, plasma, and tissue are essential, but extremely difficult because of the complexity and dynamic concentration range of the metabolites in a biological sample. Where the two terms are distinguished, metabolomics is the qualitative and quantitative measurement of all metabolites in a system, while metabonomics is the comparison of metabolite levels (profiles) found in two different samples, healthy and diseased.

    Profiling

    Profiling is the detection of panels of biomarkers (proteins or metabolites) that may provide higher sensitivities and specificities for disease diagnosis than a single marker affords. Proteomic and metabolomic pattern analysis relies on comparing differences in the relative abundance of a number of polypeptides/proteins and metabolites [mass-to-charge ratio (m/z) and intensity] within the mass spectra or NMR spectra of two sample sets.

    The current state of biomarker discovery

    Examination of the scientific and medical literature clearly indicates that presently most protein and metabolite biomarkers are inadequate to replace an existing clinical test, or that their only utility is for detecting advanced-stage cancers, where the survival rate is low. Many molecular biomarkers have been suggested for the detection of cancer and other diseases; however, none possess the required sensitivity and specificity. The state of biomarker research may be illustrated by using bladder cancer biomarkers as an example. Bladder cancer is selected because of its recurring nature and the need for monitoring every 3–6 months, which make it a very expensive disease to treat. It is disheartening that a lot of effort and funds have been spent on finding a biomarker for bladder cancer without resulting in an acceptable test to replace cystoscopy, voided urine cytology, and imaging studies, the current standards of care for the detection and monitoring of bladder tumors. A literature search indicates the presence of many molecular biomarkers for bladder cancer³; however, none of the molecular markers have proven to be sensitive and specific enough to replace cystoscopy.⁴ Another reason why most published proteomic and metabolomic studies have not provided results that have progressed from the laboratory to the clinic is that the majority of studies stopped at the discovery phase, i.e., at preclinical exploratory studies that identify potentially useful markers without validation.

    The following biomarkers have been approved by the United States Food and Drug Administration (FDA) as qualitative tests for bladder cancer: nuclear matrix protein (NMP22) with 56% sensitivity; bladder tumor antigen (BTAstat) with 58% sensitivity; and UroVysion with 36%–65% sensitivity,⁵ while hyaluronic acid and hyaluronidase measurements have a sensitivity of 92%.⁶ While it is obvious that none of these markers are sensitive enough to be recommended for population screening for bladder cancer, they might be used to monitor recurrence of the disease.

    To put these levels of sensitivity in perspective, it is worth discussing prostate cancer, in which the levels of circulating PSA are used as a diagnostic test. For men aged 50 and older, PSA levels ≥ 4 ng/mL may indicate the presence of prostate cancer. The diagnostic value of PSA, which has a sensitivity of 86%, specificity of 33%, and PPV of 41%, is not satisfactory, and doctors’ recommendations for PSA screening vary. However, the FDA has approved the use of the PSA test along with a digital rectal exam.
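
    Unlike sensitivity and specificity, PPV depends on how common the disease is among the people tested. The sketch below applies Bayes' rule to the sensitivity (86%) and specificity (33%) quoted above; the prevalence values are assumptions chosen only for illustration, since the text does not state the prevalence in the underlying cohort, and the function names are hypothetical.

```python
def ppv_from_prevalence(sens: float, spec: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity, and prevalence.

    All arguments are fractions between 0 and 1.
    """
    true_pos = sens * prevalence                   # diseased and test positive
    false_pos = (1.0 - spec) * (1.0 - prevalence)  # healthy but test positive
    return true_pos / (true_pos + false_pos)


def npv_from_prevalence(sens: float, spec: float, prevalence: float) -> float:
    """Negative predictive value from sensitivity, specificity, and prevalence."""
    true_neg = spec * (1.0 - prevalence)           # healthy and test negative
    false_neg = (1.0 - sens) * prevalence          # diseased but test negative
    return true_neg / (true_neg + false_neg)


# Sensitivity and specificity as quoted in the text for PSA.
PSA_SENS, PSA_SPEC = 0.86, 0.33

# Assumed prevalences (not from the text), to show the trend only.
for prev in (0.05, 0.20, 0.35):
    print(f"prevalence {prev:.0%}: "
          f"PPV {ppv_from_prevalence(PSA_SENS, PSA_SPEC, prev):.1%}, "
          f"NPV {npv_from_prevalence(PSA_SENS, PSA_SPEC, prev):.1%}")
```

    The lower the prevalence, the lower the PPV for a fixed sensitivity and specificity, which is one reason a marker that performs acceptably in a referred population can perform poorly in population screening.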

    Does changing the PSA cutoff give higher sensitivity and specificity? Thompson et al.⁸ reported that for detecting any prostate cancer, PSA cutoff values of 1.1, 2.1, 3.1, and 4.1 ng/mL yielded sensitivities of 83.4%, 52.6%, 32.2%, and 20.5%, and specificities of 38.9%, 72.5%, 86.7%, and 93.8%, respectively. The authors reported that there is no PSA cut point with simultaneously high sensitivity and high specificity for monitoring healthy men for prostate cancer. In addition, the majority of PSA elevations between 4 and 10 ng/mL are due to prostatic hyperplasia rather than malignancy, leading to many unnecessary biopsies.
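
    The trade-off in the reported cutoff data can be made explicit with a single summary number per cutoff. The sketch below uses Youden's J statistic (sensitivity + specificity − 1), which is a choice made here for illustration and is not a measure used in the text; the sensitivity and specificity values are those quoted above from Thompson et al.

```python
# Cutoffs (ng/mL) with sensitivity and specificity (%) as quoted in the text.
PSA_CUTOFFS = [1.1, 2.1, 3.1, 4.1]
SENSITIVITY = [83.4, 52.6, 32.2, 20.5]
SPECIFICITY = [38.9, 72.5, 86.7, 93.8]


def youden_j(sens_pct: float, spec_pct: float) -> float:
    """Youden's J on a 0-1 scale: 0 is an uninformative test, 1 a perfect one."""
    return (sens_pct + spec_pct - 100.0) / 100.0


j_values = [youden_j(s, sp) for s, sp in zip(SENSITIVITY, SPECIFICITY)]

for cutoff, s, sp, j in zip(PSA_CUTOFFS, SENSITIVITY, SPECIFICITY, j_values):
    print(f"cutoff {cutoff} ng/mL: sensitivity {s}%, specificity {sp}%, J = {j:.3f}")

# Cutoff with the largest J among the four reported values.
best_cutoff = PSA_CUTOFFS[j_values.index(max(j_values))]
print("largest J at cutoff:", best_cutoff)
```

    Every cutoff in the list gives a J well below 1, which is consistent with the authors' conclusion that no single cut point combines high sensitivity with high specificity.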

    Cancer antigen CA 15-3 and cancer antigen CA 27.29 are two well-known biomarkers for monitoring breast cancer. CA 15-3 is a blood test given during or after treatment for breast cancer. It is most useful for monitoring advanced breast cancer and response to treatment. CA 15-3 and CA 27.29 are not screening tests; they are tumor marker tests that are helpful in tracking cancers that overproduce CA 15-3 and CA 27.29. Only about 30% of patients with localized breast cancer will have increased levels of CA 15-3,⁹ while many patients with liver and breast diseases show elevated levels. Another malignancy of the reproductive system, ovarian cancer, is detected using a combination of pelvic examination, transvaginal ultrasonography, and laparoscopy.¹⁰ There is no specific and sensitive diagnostic test for ovarian cancer. Although cancer antigen 125 (CA 125) is used to distinguish between benign and malignant diseases,¹⁰ it is not a reliable biomarker because it is affected by other factors and 20% of ovarian cancer patients do not express CA 125.¹¹

    These selected examples show that to date there are no 100% sensitive and specific biomarkers for different types of cancer. Does a combination of biomarkers give better sensitivity and specificity? The answer is yes. For example, Hortsmann et al.¹² studied the effect of using a combination of bladder cancer biomarkers on sensitivity and specificity. Although none of the combinations resulted in 100% sensitivity and specificity, the sensitivity improved over that of a single biomarker.
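
    One way to see why a panel can raise sensitivity is to model a combined test that is called positive when any single marker is positive. The sketch below is an idealized calculation that assumes the markers err independently, which real markers rarely do, and it is not the analysis performed in the cited study. The sensitivities are the NMP22 and BTAstat values quoted earlier; the specificities are hypothetical placeholders because the text does not give them.

```python
import math
from typing import Sequence


def combined_sensitivity_any_positive(sensitivities: Sequence[float]) -> float:
    """Sensitivity of an 'any marker positive' rule, assuming independence.

    A diseased patient is missed only if every marker misses.
    """
    return 1.0 - math.prod(1.0 - s for s in sensitivities)


def combined_specificity_any_positive(specificities: Sequence[float]) -> float:
    """Specificity of an 'any marker positive' rule, assuming independence.

    A healthy subject is correctly called negative only if every marker is negative.
    """
    return math.prod(specificities)


# Sensitivities quoted in the text (NMP22, BTAstat).
panel_sens = [0.56, 0.58]
# Hypothetical specificities, for illustration only.
panel_spec = [0.85, 0.80]

print(f"combined sensitivity: {combined_sensitivity_any_positive(panel_sens):.1%}")
print(f"combined specificity: {combined_specificity_any_positive(panel_spec):.1%}")
```

    Under these assumptions the combined sensitivity exceeds that of either marker alone, while the combined specificity falls below that of either marker, so a panel trades one property against the other unless the decision rule is chosen carefully.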

    The question that needs to be addressed is why these and other potential biomarkers have failed to achieve adequate sensitivity and specificity and are not accepted as clinical tests. The answer is not an easy one, because we are dealing with detecting cancer at an early stage in humans who differ in age, sex, and ethnicity. An important fact also needs to be considered: in biomarker studies, the aim is to find a protein or a metabolite (that is probably at an extremely low concentration level) among thousands of proteins and metabolites. This aim is extremely challenging. Examination of the scientific and medical literature clearly indicates that presently most protein and metabolite biomarkers are inadequate to replace an existing clinical test. One of the major reasons that proteomic and metabolomic studies over the past decade have failed to discover molecules to replace existing clinical tests is errors in study design and/or experimental execution.

    Study design and execution

    The search for biomarkers for any disease and especially cancer requires careful consideration of different aspects of a study before its initiation. These include study design, experimental execution, personnel, and instrumentation.

    Study design

    The design of a biomarker discovery project should consider the following: what disease to study; the number of patients and matched controls; the sex, age, and ethnicity of the patients selected; the type of samples (tissue, blood, serum, plasma); what class of molecule(s) to search for (proteins, metabolites, or nucleotides); and whether the goal of the search is a profile or a single discriminating molecule. If the search is for a cancer biomarker, the study should also specify the type of cancer (bladder, breast, prostate, etc.) and preferably the stage of the cancer.

    Study execution

    Study execution deals with experimental parameters that can affect the results and need to be considered. These parameters include sample collection, handling and storage conditions, sample preparation, method of analysis, number of replicates, and data analysis.

    Personnel and instrumentation

    The requirements of a biomarker discovery study include, first and foremost, a budget, followed by an adequate number of patients and healthy subjects (controls), clinicians (physicians, surgeons, pathologists, and technicians), modern instrumentation, and competent analytical chemists, biochemists, and bioinformaticists.

    Errors in study design

    The current procedure for a proteomic or metabolomic study in search of biomarkers is depicted in Fig. 2. As the figure indicates, a specimen (urine, blood, or tissue) is taken from two groups: diseased patients and healthy subjects. The specimens are analyzed, the results are compared, and the discriminating factors are determined.

    Fig. 2 General procedure for biomarker discovery using HPLC/MS and statistical data analysis.

    The sample

    Selection and preparation of the sample in biomarker discovery is a crucial step in the success of finding a disease biomarker. There are multiple decisions that should be considered prior to initiating the search because they can affect the integrity of the results. These include:

    1. Cancer type and stage

    2. Sample type

    3. Selection of patients and controls

    4. Number of patient and control samples

    5. Ethnicity, sex, and age of patients and controls

    6. Sample collection, handling, and storage

    7. Method of sample analysis

    8. Type of sample

    Each of these steps should be given careful consideration prior to the initiation of any study. They are discussed separately below, pointing out their influence on a successful outcome.

    Cancer type and stage

    The first step in any search for a biomarker is to decide which disease condition to study. In this chapter, the discussion is limited to cancer since it is a very complicated and devastating disease that affects thousands of people regardless of age, gender, or ethnicity. Also, early detection of cancer means a higher survival rate and less suffering. The next decision is therefore which cancer type to study, and whether to analyze all stages together as one experiment or each stage separately. It is preferable to carry out the experiment on each stage separately in order to find out at what stage the biomarker (protein or metabolite) can be detected. Such findings will be clinically beneficial.

    Sample type

    After the decision has been made as to which cancer and stage to study, the next decision is related to the type of sample: tissue, blood, urine, cells, or other fluid. An important objective of biomarker research is to find a biomarker using a noninvasive (urine, tears, saliva) or minimally invasive (serum or plasma) sample, and to avoid, if at all possible, using invasive procedures (tissue and cerebrospinal fluid). A search of the biomedical literature indicates that the most commonly used samples for cancer biomarker discovery are urine, blood (serum and plasma), and tissues.¹³ Blood is preferable to urine because blood flows throughout the body and its composition is stable and reflects the state of the body at the time of collection. Although urine is easily accessible, its composition is subject to variation and dilution. Urine, however, is a preferable specimen when studying bladder cancer, especially transitional cell carcinoma, because whatever is shed, leaked, or secreted from the tumor will be found in the urine. Also, the amount of urine produced in 24 h is less than the amount of blood circulating in the body, so the biomarker molecules get more diluted in blood than in urine. The best sample for a successful search for a biomarker of a solid tumor, although invasive and not easily accessible, is tumor tissue and its adjacent normal tissue. Blood contains larger amounts of proteins than urine. Also, blood contains albumin, which makes analysis of the blood proteome difficult because it can mask low-abundance proteins, and its removal may cause the loss of interacting proteins.

    Careful consideration should be given to how a blood specimen is analyzed: should the sample be analyzed as whole blood or converted to serum or plasma prior to analysis? Serum and plasma were found to be mutually incompatible for proteome comparison.¹⁴ A large number of peptides, many of them in rather high abundance, are present only in serum and are not detectable in plasma.¹⁵

    The metabolite profiles of plasma and serum are also different.¹⁶ Plasma is the liquid portion of unclotted blood that is left behind after all the various cell types are removed. To prepare plasma, blood is withdrawn from the patient into a vial in the presence of an anticoagulant and the sample is centrifuged to remove cellular elements. The most commonly used anticoagulants are heparin, ethylenediamine tetraacetic acid (EDTA), and sodium citrate. Serum is blood plasma without fibrinogen or the other clotting factors. It is prepared by collecting blood in the absence of any anticoagulant. Under these conditions, a fibrin clot forms. This clot is then removed using centrifugation, leaving behind serum.¹⁷ Removal of the clot results in lower protein content in serum than in plasma.

    Selection of patients and controls

    Subjects selected for a study should be checked by a physician to ensure the presence or absence of the disease. Tissue samples should be examined by a pathologist prior to analysis. Blood can be analyzed as serum or plasma. Is there a difference between analyzing serum and analyzing plasma? A recent metabolomics study showed obvious differences in the GC/MS chromatograms of plasma and serum taken from the same healthy human subjects.¹⁶ Of the 72 compounds identified across the samples, only 36 were common to serum and plasma. The results also indicated that some of the 36 common metabolites had different concentrations in serum and plasma. These results highlight the difficulty of comparing interlaboratory results obtained with different sample types. Generally, the number of patients and control subjects in published studies is too small to give statistically meaningful results. For cancer biomarker discovery, biofluids and tissues are collected from a group of patients at different cancer stages and compared to those from a group of healthy persons. The effect of cancer stage on the sensitivity of a single biomarker should be taken into consideration, as was pointed out in a recent study.¹²

    Number of samples

    The number of samples in biomarker discovery should be adequate to give statistically significant differences between two sets of samples: cancer and control. The number may vary from 25 to 100 samples in a set; the larger the number of samples, the more reliable the statistical results. However, for an epidemiological or validation study, the number of diseased samples and controls should be in the hundreds. Unfortunately, most published biomarker discovery studies tested a limited number of clinical samples.
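
    A rough sense of how group size relates to the size of the difference being sought can be obtained from a standard power calculation. The sketch below uses the normal-approximation formula for comparing the means of two equal-sized groups, n = 2(z₁₋α/₂ + z_power)² / d², where d is the standardized effect size. This formula, the default significance level and power, and the function name are assumptions introduced here for illustration; the text does not prescribe a method for choosing sample numbers.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size: float, alpha: float = 0.05,
                      power: float = 0.80) -> int:
    """Approximate samples per group for a two-sided, two-group comparison.

    effect_size is the difference in group means divided by the common
    standard deviation (Cohen's d). Uses the normal approximation.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1.0 - alpha / 2.0)  # two-sided test
    z_power = standard_normal.inv_cdf(power)
    n = 2.0 * (z_alpha + z_power) ** 2 / effect_size ** 2
    return math.ceil(n)


# Illustrative effect sizes (assumed values): large, medium, and smaller.
for d in (0.8, 0.5, 0.3):
    print(f"effect size {d}: about {samples_per_group(d)} samples per group")
```

    Under these assumptions, only large differences between cancer and control groups are detectable with a few tens of samples per set; smaller differences call for considerably more, before any correction for testing many proteins or metabolites at once.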

    Ethnicity, sex, and age

    To date, a study is normally carried out using biofluids or tissues collected from patients and healthy subjects of different ages, sex, and race. Using samples from patients and controls that are of different ages and sex can influence the results. A recent study of 269 subjects (131 males and 138 females) evaluated the effects of age, sex, and race on plasma metabolites.¹⁸ The subjects were of Caucasian, African American, and Hispanic descent and ranged in age from 20 to 65 years. They were divided into three age groups: 20–35, 36–50, and 51–65. Using GC/MS and HPLC/MS methods, more than 300 metabolites were detected, of which more than 100 were associated with age, many fewer with sex, and fewer still with race.¹⁸

    Attention should therefore be paid to the selection of patients and controls for a biomarker study, which should not include (a) widely different ages; (b) a mix of men and women; or (c) different ethnicities.

    Sample collection, handling, and storage

    Samples are collected from persons who have had a physical exam by a physician, who determines whether the person of interest has the disease or is healthy. Samples should be collected in clean freezer-type tubes and stored in a freezer immediately until time of analysis. Hsieh et al. showed that using different blood collection tubes affects the observable proteome of serum and plasma.¹³ At the time of analysis, samples should be thawed on ice or at room temperature and prepared according to the selected method of analysis. The history of the sample is very important; blood and tissue samples used in search of biomarkers may have been obtained from sample storage banks without proper collection or storage, and without information about the age and condition of the patient and, in the case of cancer, the stage of the disease. The storage periods may also differ. A lack of consistency in sample selection, collection, handling, and storage can doom any study to failure before data collection.

    One issue that is of constant concern in the analysis of serum or plasma samples is the method of collection, preparation, and storage. Sample collection, handling, and storage have a great impact on the sensitivity, selectivity, and reproducibility of any given analysis. Detailed information on clinical and pathological parameters should be secured before samples are collected. Specimens should be collected by trained personnel. Blood samples should immediately be converted to serum or plasma and stored in the freezer at −80°C until time of analysis to prevent any enzymatic activity. Two studies have shown a significant effect of freeze/thaw cycles on the proteome profile of serum/plasma.¹⁹,²⁰ Also, factors in the preparation of serum, such as the anticoagulant used, the clotting time allowed, and the length of time before centrifugation, had a significant effect on the serum proteome. A few studies have shown that sampling procedures (i.e., fasting, time the sample was acquired from the patient, etc.) had the greatest effects on proteome profiling, while handling procedures and storage conditions had relatively minor effects.²¹ However, everyone agrees that standardized protocols for sample collection, handling, storage, and analysis are required, since the issue is not which procedure is better but rather the use of standardized procedures to obtain comparable and reproducible results between different laboratories.²²,²³

    Detailed information about specimen collection and handling procedures can be found in the two standard operating procedures in this book and at the Food and Drug Administration's web page.²⁴

    Method of sample analysis

    Selection of the method of sample preparation and analysis plays an important role in determining the accuracy of the results, as discussed later.

    Type of sample

    Selection of the sample for biomarker detection involves an important decision: tissue or a fluid (blood, plasma, urine, etc.). There are advantages and limitations to each type of sample. Tissue requires a surgical procedure, while a fluid is easier to collect. The biomarker is more concentrated in the tumor tissue than in the patient's fluids.

    Errors in study execution

    Study execution deals with many experimental parameters that should be carefully considered for a successful experiment with meaningful and reproducible results.

    Sample preparation

    Preparation of the sample for proteomic and metabolomic analysis can introduce errors that will affect the quality of the final results. The search for biomarkers in biological samples involves different steps depending on the sample type, on whether the analysis is for metabolites or proteins, and on whether it is targeted or global (profiling). Extraction of metabolites from blood, urine, or tissue for a global study is not an easy task. It may require multiple extraction procedures using different solvent systems. It is not always possible to extract all the metabolites from a sample with a single solvent, since metabolites have different chemical and physical properties and are present over a wide dynamic concentration range. For details, see the chapters on sample preparation for proteomics and metabolomics.

    Preparation of a blood sample for a proteomic study is more complicated than preparation of urine, as urine contains fewer proteins and cells, and the high-abundance proteins must be depleted from blood prior to HPLC/MS/MS analysis. Approximately 99% of the protein content of blood (both serum and plasma) is made up of only about 20 proteins.²⁵ While depletion of these proteins allows the detection of low-abundance proteins, it may also remove proteins that are bound to these 20 proteins, resulting in the loss of potentially important information.²⁶ Tissues are homogenized first, after which metabolites and proteins are extracted. Incomplete homogenization can lead to losses that affect the accuracy of the results. For a detailed discussion, see the chapters on sample preparation for metabolomics and proteomics.

    Methods of analysis

    Choosing the optimal analysis method is critical in proteomics and metabolomics studies. For example, analyzing the plasma proteome involves protein precipitation and solubilization; therefore, the downstream fractionation method must be either electrophoresis or a liquid-phase method.

    Three different approaches for the global analysis of serum proteins have been used: global serum proteome analysis using two- and three-dimensional HPLC/MS²⁷,²⁸; analysis of low-molecular-weight proteins/peptides²⁹; and investigation of proteins and peptides that are bound to high-abundance serum proteins.³⁰ Unfortunately, studies have shown that analysis of the plasma proteome by groups using different methods resulted not only in different numbers of protein identifications but also in poor overlap between the results.³¹

    Common methods for analysis of a metabolome include GC/MS, HPLC/MS, and CE/MS. Which technique to use depends on the compounds of interest, and each technique has its advantages and limitations. Buscher et al.³² compared the three techniques using a mixture of metabolites covering the pentose phosphate pathway, the tricarboxylic acid cycle, redox metabolism, amino acids, glycolysis, and nucleotides. Of 75 intermediate standard metabolites, 64 were detected by CE, 42 by GC, and 65 by LC; only 33 were detected by all three methods. A combination of LC and GC detected 70 metabolites, and all 75 metabolites were detected only when the three methods were combined. These results show that the method of analysis is an important part of biomarker discovery.
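
    The idea of combining platforms to widen metabolite coverage can be sketched with simple set operations. The example below is only an illustration: the platform contents and metabolite names are hypothetical placeholders, not the 75 standards or the detection results of the cited study.

```python
from itertools import combinations

# Hypothetical detections per platform (placeholder names, not study data).
detected = {
    "CE": {"citrate", "malate", "ATP", "glutamate", "NADH"},
    "GC": {"citrate", "malate", "glutamate", "ribose"},
    "LC": {"citrate", "ATP", "glutamate", "NADH", "ribose-5-phosphate"},
}


def coverage(platforms):
    """Return the metabolites seen by at least one of the given platforms."""
    return set().union(*(detected[p] for p in platforms))


# Metabolites that every platform detects.
common = set.intersection(*detected.values())
print("detected by every platform:", sorted(common))

# Coverage of each single platform and of each combination of platforms.
for size in (1, 2, 3):
    for combo in combinations(sorted(detected), size):
        print(combo, len(coverage(combo)))
```

    With real data, the same comparison shows at a glance which metabolites would be missed if only one platform were used.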

    Number of replicates

    Analytical chemistry teaches us that a sample should be analyzed in triplicate and that the mean and standard deviation should be reported. Unfortunately, most published proteomic and metabolomic studies analyze each sample only once, which does not permit the error in the measurement to be calculated. Proteomic analysis of a biological sample involves depletion of high-molecular-weight proteins, digestion, fractionation, and HPLC/MS analysis. Each of these steps can introduce an error. The greatest error is introduced by the final step, the HPLC/MS/MS analysis. It has been pointed out³³,³⁴ that, to extract the largest number of protein identifications, the sample should be analyzed at least in triplicate, because the complexity of a digest of an entire proteome is such that the analysis, even with a high-resolution LC/MS system, exceeds the system's peak capacity.³⁵ This observation was illustrated by Dr. Sam Hanash and his coworkers in the analysis of a plasma sample using HPLC/MS/MS. Repeat runs resulted in the identification of 32% and 36% more peptides and proteins, respectively.
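
    As a minimal sketch of what triplicate analysis makes possible, the following computes the mean, the sample standard deviation, and the coefficient of variation for one analyte. The three intensity values are made up for illustration, and the function name is hypothetical.

```python
from statistics import mean, stdev


def summarize_replicates(values):
    """Return mean, sample standard deviation, and CV (%) of replicate runs."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return {"mean": m, "sd": s, "cv_percent": 100.0 * s / m}


# Hypothetical peak intensities for one analyte measured in triplicate.
replicates = [10.0, 12.0, 11.0]
summary = summarize_replicates(replicates)
print(summary)
```

    A single run provides the first value only; the standard deviation and the coefficient of variation are undefined without replicates.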

    Effect of mass spectrometer type on the results

    In proteomic and metabolomic studies, the mass spectrometer plays a central role and the selection of the instrument can affect the results. Gika et al.³⁶ coupled a single ultrahigh-pressure liquid chromatography instrument (UPLC) to a triple quadrupole linear ion trap (Q-TRAP) and a hybrid quadrupole time-of-flight (Qq-TOF) mass spectrometer using both positive and negative electrospray ionization (ESI) to study the metabolic profile of rat urine. The flow from the UPLC column was split equally and the streams of eluent were simultaneously directed to the inlets of the two mass spectrometers. Data from both mass spectrometers were subjected to multivariate statistical analysis. After applying the same data extraction software, a number of ions were found to be unique to either data set.

    The study clearly indicates that not all ions were detected using a Qq-TOF or Q-TRAP. The authors concluded that “Given the design differences between instruments this is perhaps not that surprising a finding but nevertheless it raises important questions about how to evaluate data from different laboratories produced on different mass spectrometers even when (nominally) the same sample processing and chromatography have been used.”³⁶

    In another study, Elias et al.³⁴ compared the results of triplicate measurements of the yeast proteome by LC-MS/MS using linear ion trap (LTQ) and Qq-TOF mass spectrometers. The data were searched using both Mascot and SEQUEST. The results from the two instruments were different with each search engine providing a different number of identifications. From the LTQ data, 666 and 644 identifications were exclusive to Mascot and SEQUEST, respectively, while 4056 proteins were identified using both algorithms. For the Qq-TOF data, 1012 and 510 identifications were exclusive to Mascot and SEQUEST, respectively, while 1955 proteins were identified using both algorithms.³⁴
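
    The counts quoted above can be turned into overlap fractions with a few lines of arithmetic. This sketch assumes that the "exclusive" and "both" counts are disjoint, so that their sum is the total number of identifications for each instrument; the function name is hypothetical.

```python
def overlap_summary(only_a, only_b, both):
    """Return (total identifications, fraction found by both search engines)."""
    total = only_a + only_b + both
    return total, both / total


# Counts as quoted in the text: exclusive to Mascot, exclusive to SEQUEST, shared.
ltq_total, ltq_shared = overlap_summary(666, 644, 4056)
qtof_total, qtof_shared = overlap_summary(1012, 510, 1955)

print(f"LTQ:    {ltq_total} identifications, {ltq_shared:.1%} shared")
print(f"Qq-TOF: {qtof_total} identifications, {qtof_shared:.1%} shared")
```

    Under that assumption, roughly three quarters of the LTQ identifications but only a little over half of the Qq-TOF identifications were common to both search engines.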

    Effect of separation instrumentation on the results

    The most commonly used analytical methods for finding potential biomarkers are SDS PAGE, HPLC/MS, and GC/MS. SDS PAGE is used only for the fractionation and separation of proteins. GC is a simple, relatively economical, and fast technique that possesses high resolving power and reproducibility. It is excellent for the separation of volatile compounds; however, it is not suitable for the separation of proteins. Although GC using a single column can achieve high-resolution separations, two-dimensional (2D) GC is the preferred procedure for the comprehensive separation of a metabolomics mixture.³⁷

    HPLC has been used in both metabolomic and proteomic studies in search of biomarkers. Increased resolution in HPLC is achieved by using smaller packing particles (i.e., ≤ 2 μm) and high pressure (UPLC). Wilson et al.³⁸ reported that UPLC offered significant advantages over conventional reversed-phase HPLC (up to 4000 psi): it more than doubled the peak capacity and gave an approximately 10-fold increase in speed and a 3- to 5-fold increase in sensitivity compared with a conventional 3.5-μm stationary phase. Although UPLC MS/MS using a single column possesses high resolving power, two-dimensional (2D) HPLC is the preferred procedure for the comprehensive separation of the proteome.³⁷

    Errors in measurements

    One cannot ignore the experimental and human errors in the measurement of proteins and metabolites in complex mixtures. In a recent metabolomic study using GC/MS to search for amino acid markers in the urine of 11 bladder cancer patients and 8 controls, the error of the reported results was extremely high, ranging from 4% to 93% for the patients with bladder cancer and from 6% to 94% for the controls.³⁹ With such high errors and such overlap between cancer patients and controls, the method is neither specific nor sensitive, and it cannot be used for population studies or to replace a clinical test. Therefore, attention should be paid to eliminating human and experimental errors, which arise from sample collection, sample preparation, and analysis.

    Personnel and experimental validation

    Research can be done correctly only by trained and competent personnel using validated and proven methods. Therefore, to minimize errors, trained personnel should be involved in every aspect of the research, from sample collection, handling, and storage to sample analysis and data processing.

    Specificity of proteins as biomarkers

    The search for a protein biomarker in a biofluid or tissue is like searching for a needle in a haystack; moreover, the search may result in multiple proteins that are each involved in more than one pathological condition. Determining which one is important is not an easy task. A single biomarker protein may be associated with multiple cancers and diseases. For example, a urine proteomic study revealed 26 proteins that were overexpressed in bladder cancer.⁴⁰ A search using Ingenuity Pathway Systems indicated that each of these proteins is involved in multiple cancers and diseases, suggesting that any of these proteins would result in a biomarker with low sensitivity and specificity. As an example, annexin A1 is involved in cardiovascular disease, endocrine system disorders, gastrointestinal disease, hematological disease, immunological disease, metabolic disease, organismal injury and abnormalities, reproductive system disease, and respiratory disease, in addition to cancer. Annexin A1 protein is reported to be downregulated in ductal carcinoma⁴¹ and squamous cell carcinoma⁴² and upregulated in bladder cancer.⁴⁰ In human laryngeal tumors, annexin A1 was upregulated in the nuclei and cytoplasmic granule matrix from larynx mast cells, and downregulated in larynx epithelial cells.⁴³

    Another example is carcinoembryonic antigen (CEA), which is used mainly to monitor the treatment of cancer patients, especially those with colon cancer. A PubMed search using CEA and cancer indicates that CEA is used as a marker for cancers of the lung, breast, rectum, liver, pancreas, stomach, and ovary. Also, not all cancers produce CEA. Increased CEA levels can indicate some noncancer-related conditions such as inflammation, cirrhosis, rectal polyps, emphysema, ulcerative colitis, peptic ulcer, and benign breast disease. CEA is not recommended for screening a general population. These results indicate that selecting a protein as a biomarker of a single pathological condition is not an easy task.

    Published results comparison

    As mentioned earlier, comparison of results from different sources is challenging due to differing sample preparation and experimental procedures. Another aspect is how the data are examined. The following example illustrates this point. Sreekumar et al.,⁴⁴ in a study published in Nature, identified sarcosine as a potential biomarker for prostate cancer using metabolomics. In a follow-up study, Jentzmik et al.⁴⁵ stated that “Our study diminish the hope that the ratio of sarcosine to creatinine will become a successful indicator for prostate cancer management.” That might be the case if the two findings were directly comparable. However, Sreekumar et al.⁴⁴ compared the ratio of sarcosine to alanine, while Jentzmik et al.⁴⁵ compared the ratio of sarcosine to creatinine.

    Statistical data analysis

    Multivariate statistical analysis is generally employed to analyze nuclear magnetic resonance (NMR) or MS data to discriminate between two data sets. Metabolomic and proteomic analysis of biological systems using NMR, GC/MS, CE/MS, and HPLC/MS, like genomic and transcriptomic analysis, produces a wealth of information that can be overwhelming and is time consuming, if not virtually impossible, to analyze manually. For any meaningful interpretation of the data, appropriate statistical tools must be employed to convert the large raw data sets into a useful, understandable, and workable format. Different multidimensional and multivariate statistical analyses and pattern-recognition programs have been developed to distill the large amounts of data in an effort to interpret the complex metabolic pathway information from the measurements and to search for the discriminating features between two data sets.⁴⁶ The most popular multivariate statistical methods are principal component analysis (PCA),⁴⁷ partial least squares discriminant analysis (PLS-DA),⁴⁸ and support vector machines (SVM).⁴⁹ Mahadevan et al.⁵⁰ compared PLS-DA multivariate analysis with SVM for the analysis of NMR data. Their results indicated that SVM was superior to PLS-DA in terms of predictive accuracy with the least number of features. Van et al.⁴⁹ used two-dimensional total correlation spectroscopy NMR and statistical analysis to compare the global metabolic profiles of urine obtained from wild-type and ABCC6-knockout mice. Three statistical methods were used to analyze the NMR spectra: PCA, PLS-DA, and orthogonal PLS-DA (OPLS-DA). PLS-DA and OPLS-DA gave almost identical results, while PCA gave slightly different results. However, all three methods successfully discriminated between the two groups.
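
    To make the idea of PCA concrete, the sketch below finds the first principal component of a tiny data matrix by power iteration on the covariance matrix and projects each sample onto it. It is a simplified, pure-Python illustration, not the software used in the cited studies; the intensity values and the function name are invented for the example, and real studies would use an established statistics package.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loading vector, sample scores) for the first principal component."""
    n, p = len(rows), len(rows[0])
    # Mean-center every feature (column).
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: repeated multiplication converges to the top eigenvector.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Scores are the projections of the centered samples onto the loading vector.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Hypothetical intensities of three features in three controls and three cases.
controls = [[1.0, 2.0, 1.1], [1.2, 2.1, 0.9], [0.9, 1.9, 1.0]]
cases = [[3.0, 0.5, 1.0], [3.2, 0.4, 1.1], [2.9, 0.6, 0.9]]

loadings, scores = first_principal_component(controls + cases)
control_scores, case_scores = scores[:3], scores[3:]
print("loadings:", [round(x, 2) for x in loadings])
print("control scores:", [round(s, 2) for s in control_scores])
print("case scores:", [round(s, 2) for s in case_scores])
```

    In this toy data set the two groups fall on opposite sides of zero along the first component, which is the kind of separation a PCA scores plot is used to reveal. PCA is unsupervised, so it does not use the group labels; PLS-DA, OPLS-DA, and SVM do.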

    Issaq et al.⁵¹ used PCA and OPLS-DA to analyze HPLC/MS data obtained from the urine of 41 bladder cancer patients and 48 healthy volunteers. The PCA analysis resulted in two separate groups corresponding to normal and cancer urines, and correctly classified 40 of 41 bladder cancer patients and 46 of 48 healthy volunteers. OPLS-DA confirmed the PCA results in terms of sensitivity and specificity and improved on them: it correctly classified 48 of 48 healthy and 41 of 41 bladder cancer urines.⁵¹
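
    The classification counts above translate directly into sensitivity and specificity. The short sketch below does the arithmetic, treating bladder cancer as the positive class; the function names are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased subjects that are correctly classified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of healthy subjects that are correctly classified."""
    return true_neg / (true_neg + false_pos)


# PCA: 40 of 41 cancer urines and 46 of 48 healthy urines classified correctly.
pca_sens = sensitivity(40, 1)
pca_spec = specificity(46, 2)

# OPLS-DA: 41 of 41 cancer urines and 48 of 48 healthy urines classified correctly.
opls_sens = sensitivity(41, 0)
opls_spec = specificity(48, 0)

print(f"PCA:     sensitivity {pca_sens:.1%}, specificity {pca_spec:.1%}")
print(f"OPLS-DA: sensitivity {opls_sens:.1%}, specificity {opls_spec:.1%}")
```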

    Recommendations

    Caution: Biological fluids and tissues should be handled carefully using safe procedures.

    Sample vials taken out of the freezer should be checked for breakage prior to defrosting. Samples should be thawed at room temperature and not by heating or by placing them in a hot water bath. Standard operating procedures should be followed in the same manner for all samples in a study. Urine specimens may contain different amounts of analytes; therefore, peak intensities should be normalized and aligned. To prevent loss of sample and information, the minimum number of sample-handling steps should be used. In global metabolic studies, different solvents should be used to maximize analyte extraction. The use of an internal standard is advised.
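
    One simple way to normalize peak intensities, shown here only as a sketch, is to divide each peak by the total intensity of its sample. This is one common choice among several (normalization to creatinine or to an internal standard are alternatives) and is not prescribed by the text; the peak names and values are hypothetical.

```python
def normalize_to_total(peaks):
    """Scale peak intensities so that they sum to 1 within a sample."""
    total = sum(peaks.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return {name: intensity / total for name, intensity in peaks.items()}


# Two hypothetical urine samples with the same profile at different dilutions.
dilute = {"peak_1": 100.0, "peak_2": 300.0, "peak_3": 600.0}
concentrated = {"peak_1": 200.0, "peak_2": 600.0, "peak_3": 1200.0}

norm_dilute = normalize_to_total(dilute)
norm_concentrated = normalize_to_total(concentrated)
print(norm_dilute)
print(norm_concentrated)
```

    After normalization the two samples have the same relative profile, so a difference in urine concentration alone no longer appears as a difference between samples.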

    Concluding remarks and recommendations

    Can metabolomic and proteomic studies lead to a cancer biomarker? In short, yes. The ultimate diagnostic biomarker for any disease is one that gives 100% sensitivity and specificity. It seems that this level of accuracy is more an ideal than an attainable goal for discovering biomarkers using metabolomics and proteomics. That does not mean that the search for biomarkers should be stopped; on the contrary, the search should be intensified because of the benefits of detecting cancer or any disease at an early stage. The failure to find sensitive and specific metabolic and proteomic biomarkers for cancer may be attributed to several factors: the small number of samples analyzed; lack of information on the history of the samples; case and control specimens that are not age and sex matched; limited metabolomic and proteomic coverage; and failure to follow clear standard operating procedures for sample selection, collection, storage, handling, analysis, and data interpretation. Also, most studies to date used serum, plasma, urine, or tissue from cancer patients and controls. A sounder approach is to search for proteins in the cancer tissue first, then look for the discriminating proteins in the blood or urine, as suggested by Zhang and Chan.⁵² Johann et al.,⁵³ in a study of renal cell carcinoma, collected cancer tissue, adjacent normal tissue, and preoperative blood from the same patient. To search for a biomarker, the proteomes extracted from the tissues and preoperative plasma were analyzed using 2D liquid chromatography-mass spectrometry (LC-MS). They identified proteins that were present in the tumor but not in the normal tissue, and discriminating proteins found in the tumor tissue were also found in the preoperative plasma. In a recent study of kidney cancer, Ganti et al.⁵⁴ performed a simultaneous multiple-matrix (tissue, blood, and urine) metabolomics analysis. The HPLC/MS and GC/MS analysis resulted in the identification of 267 metabolites in tissue, 246 in serum, and 267 in urine, of which 89 were common to the three matrices. The results also indicated that serum analysis is a more accurate proxy for tissue changes than urine.

    When all these factors are resolved, we firmly believe that the search will result in a sensitive and specific biomarker for cancer. It is only a matter of time and effort. Although a single discovered biomarker may not have 100% sensitivity and specificity, it is possible that a combination of biomarkers will minimize the number of false positives and false negatives in population screening. All that is needed to discover more sensitive and specific biomarkers is to correct the mistakes of the past. We believe that with further advancements in MS, separation technologies, NMR (specifically for metabolomics), and the use of reproducible and accurate analytical procedures, more sensitive biomarkers will be discovered.

    References

    1 Bocket C., Coleman M., Collins B., et al. Photoaptamer arrays applied to multiplexed proteomic analysis. Proteomics. 2004;4(3):609–618.

    2 MacNeil J.S. Better biomarkers for the diagnostics labyrinth. Genome Technol. 2004;24–33.

    3 Apolo A.B., Milowsky M., Bajorin D.F. Clinical states model for biomarkers in bladder cancer. Future Oncol. 2009;5:977–992.

    4 Lintula S., Hotakainen K. Developing biomarkers for improved diagnosis and treatment outcome monitoring of bladder cancer. Expert Opin Biol Ther. 2010;10:1169–1180.

    5 Glas A.S., Roos D., Deutekom M., et al. Tumor markers in the diagnosis of primary bladder cancer. A systematic review. J Urol. 2003;169:1975–1982.

    6 Villicana P., Whiting B., Goodison S., Rosser C.J. Urine-based assays for the detection of bladder cancer. Biomark Med. 2009;3:265.

    7 National Cancer Institute. NCI 2009, prostate-specific antigen (PSA) test. http://www.cancer.gov/cancertopics/factsheet/Detection/PSA.

    8 Thompson I.M., Ankerst D.P., Chi C.A., et al. Operating characteristics of prostate-specific antigen in men with an initial PSA level of 3.0 ng/ml or lower. JAMA. 2005;294:66–70.

    9 Ebeling F.G., Stieber P., Untch M., et al. Serum CEA and CA 15-3 as prognostic factors in primary breast cancer. Br J Cancer. 2002;86:1217–1222.

    10 van Nagell Jr. J.R., DePriest P.D., Reedy M.B., et al. The efficacy of transvaginal sonographic screening in asymptomatic women at risk for ovarian cancer. Gynecol Oncol. 2000;77:350–356.

    11 Niloff J.M., Knapp R.C., Schaetzl E., et al. CA125 antigen levels in obstetric and gynecologic patients. Obstet Gynecol. 1984;64:703–707.

    12 Horstmann M., Patschan O., Hennenlotter J., et al. Combinations of urine-based tumor markers in bladder cancer surveillance. Scand J Urol Nephrol. 2009;43:461–466.

    13 PubMed search. 2012.

    14 Hsieh S.Y., Chen R.K., Pan Y.H., et al. Systematical evaluation of the effects of sample collection procedures on low-molecular-weight serum/plasma proteome profiling. Proteomics. 2006;6:3189–3198.

    15 Tammen H., Schulte I., Hess R., et al. Peptidomic analysis of human blood specimens: comparison between plasma specimens and serum by differential peptide display. Proteomics. 2005;5:3414–3422.

    16 Liu L., Aa J., Wang G., et al. Differences in metabolite profile between blood plasma and serum. Anal Biochem. 2010;406:105–112.

    17 Issaq H.J., Xiao Z., Veenstra T.D. Serum and plasma proteomics. Chem Rev. 2007;107(8):3601–3620.

    18 Lawton K.A., Berger A., Mitchell M., et al. Analysis of the adult human plasma metabolome. Pharmacogenomics. 2008;9:383–397.

    19 West-Nielsen M., Hogdall E.V., Marchiori E., et al. Sample handling for mass spectrometric proteomic investigations of human sera. Anal Chem. 2005;77(16):5114–5123.

    20 Baumann S., Ceglarek U., Fiedler G.M., et al. Standardized approach to proteome profiling of human serum based on magnetic bead separation and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Clin Chem. 2005;51:973–980.

    21 Banks R.E., Stanley A.J., Cairns D.A., et al. Influences of blood sample processing on low-molecular-weight proteome identified by surface-enhanced laser desorption/ionization mass spectrometry. Clin Chem. 2005;51:1637–1649.

    22 Zolg W. The proteomic search for diagnostic biomarkers: lost in translation? Mol Cell Proteomics. 2006;5:1720–1726.

    23 Tuck M.K., Chan D.W., Chia D., et al. Standard operating procedures for serum and plasma collection: early detection research network consensus statement standard operating procedure integration working group. J Proteome Res. 2009;8:113–117.

    24 http://www.fda.gov/cdrh/clia.

    25 The Plasma Proteome Institute. http://www.plasmaproteome.org.

    26 Zhou M., Lucas A., Chan K.C., et al. An investigation into the human serum interactome. Electrophoresis. 2004;25:1289–1298.

    27 Xiao Z., Conrads T.P., Lucas D.A., et al. Direct ampholyte-free liquid-phase isoelectric peptide focusing: application to the human serum proteome. Electrophoresis. 2004;25:128–133.

    28 Chan K.C., Lucas D.A., Hise D., et al. Analysis of the human serum proteome. Clin Proteomics. 2004;1:101–112.

    29 Tirumalai R.S., Chan K.C., Prieto D.A., et al. Characterization of the low molecular weight human serum proteome. Mol Cell Proteomics. 2003;2:1096–1103.

    30 Anderson N.L., Polanski M., Pieper R., et al. The human plasma proteome: a nonredundant list developed by combination of four separate sources. Mol Cell Proteomics. 2004;3:311–326.

    31 Buscher J.M., Czernik D., Ewald J.C., et al. Cross-platform comparison of methods for quantitative metabolomics of primary metabolism. Anal Chem. 2009;81:2135–2143.

    32 Liu H., Sadygov R.G., Yates J.R. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal Chem. 2004;76:4193–4201.

    33 Elias J., Haas W., Faherty B.K., Gygi S.P. Comparative evaluation of mass spectrometry platforms used in large-scale proteomics investigations. Nat Methods. 2005;2:667–675.

    34 Faca V., Pitteri J., Newcomb L., et al. Contribution of protein fractionation to depth of analysis of the serum and plasma proteomes. J Proteome Res. 2007;6:3558–3565.

    35 Gika H.G., Theodoridis G.A., Earll M., et al. Does the mass spectrometer define the marker? A comparison of global metabolite profiling data generated simultaneously via UPLC-MS on two different mass spectrometers. Anal Chem. 2010;82:8226–8234.

    36 Liu Z.Y., Phillips J.B. Comprehensive two-dimensional gas chromatography using an on-column thermal modulator interface. J Chromatogr Sci. 1991;29:227–231.

    37 Wilson I.D., Nicholson J.K., Castro-Perez J., et al. High resolution ultra performance liquid chromatography coupled to oa-TOF mass spectrometry as a tool for differential metabolic pathway profiling in functional genomic studies. J Proteome Res. 2005;4:591–598.

    38 Kim J.W., Lee G., Moon S.M., et al. Metabolomic screening and star pattern recognition by urinary amino acid profile analysis from bladder cancer patients. Metabolomics. 2010;6:202–206.

    39 Issaq H.J. Role of separation science in biomarker discovery: opportunities and pitfalls. In: Pittsburgh conference on analytical chemistry and applied spectroscopy; 2011.

    40 Kim K., Aronov P., Zakharkin S.O., et al. Urine metabolomics analysis for kidney cancer detection and biomarker discovery. Mol Cell Proteomics. 2009;8:558–570.

    41 Kind T., Tolstikov V., Fiehn O., Weiss R.H. A comprehensive urinary metabolomic approach for identifying kidney cancer. Anal Biochem. 2007;363:185–195.

    42 Perroud B., Lee J., Valkova N., Dhirapong A., et al. Pathway analysis of kidney cancer using proteomics and metabolic profiling. Mol Cancer. 2006;5:64.

    43 Sreekumar A., Poisson L.M., Rajendiran T.M., et al. Metabolomic profiles delineate potential role for sarcosine in prostate cancer progression. Nature. 2009;457:910–914.

    44 Jentzmik F., Stephan C., Miller K., et al. Sarcosine in urine after digital rectal examination fails as a marker in prostate cancer detection and identification of aggressive tumours. Eur Urol. 2010;58:12–18.

    45 Nicholson J.K., Lindon J.C., Holmes E. ‘Metabonomics’: understanding the metabolic responses of living systems to pathophysiological stimuli via multivariate statistical analysis of biological NMR spectroscopic data. Xenobiotica. 1999;29:1181–1189.

    46 Holmes E., Antti H. Chemometric contributions to the evolution of metabonomics: mathematical solutions to characterizing and interpreting complex biological NMR spectra. Analyst. 2002;127:1549–1557.

    47 Keun H., Ebbels T., Antti H., et al. Improved analysis of multivariate data by variable stability scaling: application to NMR-based metabolic profiling. Anal Chim Acta. 2003;490:265–276.

    48 Vapnik V. Estimation of dependences based on empirical data. New York: Springer-Verlag; 1982.

    49 Mahadevan S., Shah S.L., Marrie T.J., Slupsky C.M. Analysis of metabolomic data using support vector machines. Anal Chem. 2008;80:7562–7570.

    50 Van Q.N., Issaq H.J., Jiang Q., et al. Comparison of 1D and 2D NMR spectroscopy for metabolic profiling. J Proteome Res. 2008;7:630–639.

    51 Issaq H.J., Nativ O., Waybright T., et al. Detection of bladder cancer in human urine by metabolomic profiling using high performance liquid chromatography/mass spectrometry. J Urol. 2008;179:2422–2426.

    52 Zhang H., Chan D.W. Cancer biomarker discovery in plasma using a tissue-targeted proteomic approach. Cancer Epidemiol Biomarkers Prev. 2007;16:1915–1917.

    53 Johann Jr. D.J., Wei B.R., Prieto D.A., et al. Combined blood/tissue analysis for cancer biomarker discovery: application to renal cell carcinoma. Anal Chem. 2010;82(5):1584–1588.

    54 Ganti S., Taylor S.L., Aboud O.A., et al. Kidney tumor biomarkers revealed by simultaneous multiple matrix metabolomics analysis. Cancer Res. 2012;72:347–349.

    Chapter 2

    Proteomic and mass spectrometry technologies for biomarker discovery

    Andrei P. Drabovichᵃ; Maria P. Pavlouᵇ; Ihor Batruchᵃ; Eleftherios P. Diamandisᵃ,ᵇ,ᶜ,ᵈ

    a Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, ON, Canada

    b Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada

    c Department of Clinical Biochemistry, University Health Network, Toronto, ON, Canada

    d Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Toronto, ON, Canada

    Abstract

    Protein biomarker discovery is a multiphase process that includes biomarker candidate identification, verification, and validation. In this chapter, we review sample types and analytical methods used at each phase of protein biomarker discovery. The chapter also describes the biomarker development pipeline while focusing on mass spectrometry (MS) as a key technique for qualitative and quantitative analysis of proteins. Details on proteomic sample preparation protocols, protein identification and quantification approaches, posttranslational modification analysis, and MS instrumentation are provided. Proteomics-specific requirements for biomarker verification and validation phases are reviewed along with the major limitations of MS for biomarker discovery and development.

    Keywords

    Proteomics; Biomarker discovery; Biomarker verification; Biomarker validation; Mass spectrometry; Selected reaction monitoring

    Outline

    Introduction

    Protein biomarker discovery and development pipeline

    Proteomic samples

    Protein identification using mass spectrometry

    Protein digestion

    Protein and peptide separation techniques

    Protein and peptide ionization techniques

    Mass spectrometry instrumentation

    Deconvolution and database search of tandem mass spectra

    Posttranslational modifications as disease biomarkers

    Protein quantification using mass spectrometry

    Label-free quantification

    Metabolic and enzymatic labeling

    Chemical labeling

    Selected reaction monitoring assays

    Separation and enrichment strategies for quantification of low-abundance proteins

    Biomarker verification

    Biomarker validation

    Limitations of mass spectrometry for protein biomarker discovery

    Conclusions and future outlook: Integrated biomarker discovery platform

    References

    Abbreviations

    Da      Daltons

    ELISA   enzyme-linked immunosorbent assay

    ESI     electrospray ionization

    FDA     the U.S. Food and Drug Administration

    FWHM    full width at half maximum

    LC      liquid chromatography

    m/z     mass-to-charge ratio

    MALDI   matrix-assisted laser desorption/ionization

    MS      mass spectrometry/spectrometer

    MS1     mass spectrum collected for all precursor ions in sample prior to fragmentation

    MS/MS   tandem mass spectrometry, or mass spectrum collected for fragment ions

    PTM     posttranslational modification

    SILAC   stable isotope labeling by amino acids in cell culture

    SRM     selected reaction monitoring

    TOF     time-of-flight mass spectrometry

    XIC     extracted ion chromatogram

    Introduction

    Proteomics is defined as a large-scale study of protein expression, structure, and function in time and space. Relative to genome, transcriptome, or metabolome analysis, the large diversity of protein sequences and multiple posttranslational modifications (PTMs) make proteome analysis an even more challenging undertaking. Unlike the genome, the proteome is dynamic; a static set of genes may result in different proteomic phenotypes depending on the developmental stage of an organism and environmental factors. The dynamic nature of the proteome results in a wide range of protein reference values in healthy individuals, thus complicating the clinical applications of proteomics.

    The last two decades have seen impressive progress in proteomics, mainly due to significant advances in mass spectrometry (MS), high-throughput antibody production, and bioinformatics and biostatistics algorithms. The Human Proteome Project was launched in September 2010 with the goal of identifying and characterizing at least one protein product for each of the estimated 20,300 protein-coding genes.¹ Disease-driven initiatives of the Human Proteome Project lay the foundation for clinical and diagnostic applications of proteins, such as the development of disease biomarkers.

    Protein biomarker discovery and development pipeline

    Development of protein biomarkers is a multiple-phase procedure, analogous to the drug development process. The biomarker development pipeline includes the formulation of a specific clinical question, identification of proteins, selection of biomarker candidates, verification of candidates in an independent cohort of samples, rigorous validation of candidates, development and validation of a clinical assay, and finally assay approval by regulatory health agencies, such as the U.S. Food and Drug Administration (FDA) or the European Medicines Agency (Fig. 1). The cost of a biomarker development study is estimated to be in the range of 10% of that of an entire drug development study. In addition, the timeline from discovery to clinical assay may span many years. For example, the cancer biomarker HE4 was discovered in 2000,² but its clinical assay was not approved until 2008.³

    Fig. 1 The proteomic biomarker development pipeline. As biomarker candidates proceed through the pipeline, the number of clinical samples increases, while analytical technologies change from complex and low-throughput mass spectrometry methods to straightforward and high-throughput immunoaffinity assays.

    Prior to the launch of a biomarker discovery study, one should first consider unmet clinical needs, decide whether a diagnostic molecule has the potential to answer a specific clinical question with a certain confidence, and predict whether the answer would aid physicians’ decision making. It should be acknowledged that the clinical decision will be made based on a biomarker's performance in combination with noninvasive medical imaging techniques, such as magnetic resonance imaging (MRI) and/or ultrasound. A high area under the receiver operating characteristic (ROC) curve may not be the sole requirement for a biomarker's successful use in clinics. Instead, based on disease character and the cost of the follow-up examination, biomarkers with either higher sensitivity or higher specificity may be
