Proteomic and Metabolomic Approaches to Biomarker Discovery
Ebook, 1,225 pages, 9 hours


About this ebook

Proteomic and Metabolomic Approaches to Biomarker Discovery demonstrates how to leverage biomarkers to improve accuracy and reduce errors in research. Disease biomarker discovery is one of the most vibrant and important areas of research today, as the identification of reliable biomarkers has an enormous impact on disease diagnosis, selection of treatment regimens, and therapeutic monitoring. Various techniques are used in the biomarker discovery process, including techniques used in proteomics, the study of the proteins that make up an organism, and metabolomics, the study of chemical fingerprints created from cellular processes.

Proteomic and Metabolomic Approaches to Biomarker Discovery is the only publication that covers techniques from both proteomics and metabolomics and includes all steps involved in biomarker discovery, from study design to study execution. The book describes methods, and presents a standard operating procedure for sample selection, preparation, and storage, as well as data analysis and modeling. This new standard effectively eliminates the differing methodologies used in studies and creates a unified approach. Readers will learn the advantages and disadvantages of the various techniques discussed, as well as potential difficulties inherent to all steps in the biomarker discovery process.

A vital resource for biochemists, biologists, analytical chemists, bioanalytical chemists, clinical and medical technicians, researchers in pharmaceuticals, and graduate students, Proteomic and Metabolomic Approaches to Biomarker Discovery provides the information needed to reduce clinical error in the execution of research.

  • Describes the use of biomarkers to reduce clinical errors in research
  • Includes techniques from a range of biomarker discoveries
  • Covers all steps involved in biomarker discovery, from study design to study execution
Language: English
Release date: May 20, 2013
ISBN: 9780123947956

    Proteomic and Metabolomic Approaches to Biomarker Discovery - Haleem J. Issaq

    1

    Biomarker Discovery

    Study Design and Execution

    Haleem J. Issaq and Timothy D. Veenstra,    Laboratory of Proteomics and Analytical Technologies, Advanced Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA

    Outline

    Abbreviations

    Introduction

    Definitions

    Biomarker

    Sensitivity

    Specificity

    Positive Predictive Value (PPV)

    Negative Predictive Value (NPV)

    Proteomics

    Profiling

    Metabolomics

    The Current State of Biomarker Discovery

    Study Design and Execution

    Study Design

    Study Execution

    Personnel and Instrumentation

    Errors in Study Design

    The Sample

    Cancer Type and Stage

    Sample Type

    Selection of Patients and Controls

    Number of Samples

    Ethnicity, Sex, and Age

    Sample Collection, Handling, and Storage

    Method of Sample Analysis

    Errors in Study Execution

    Sample Preparation

    Methods of Analysis

    Number of Replicates

    Effect of Mass Spectrometer Type on the Results

    Effect of Separation Instrumentation on the Results

    Errors in Measurements

    Personnel and Experimental Validation

    Specificity of Proteins as Biomarkers

    Published Results Comparison

    Statistical Data Analysis

    Recommendations

    Concluding Remarks and Recommendations

    Acknowledgments

    References

    Abbreviations

    LC: liquid chromatography
    HPLC: high performance LC
    UPLC: ultra-high-pressure LC
    CE: capillary electrophoresis
    GC: gas chromatography
    MS: mass spectrometry
    SDS-PAGE: sodium dodecyl sulfate-polyacrylamide gel electrophoresis

    Acknowledgments

    This project has been funded in whole or in part with federal funds from the National Cancer Institute, National Institutes of Health, under Contract HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the United States Government.

    Introduction

    Diseases result in specific changes in the molecular profiles of biological fluids and tissue. These changes can be detected by analyzing the genetic, proteomic, or metabolomic composition of samples. Proteomic and metabolomic analysis provides the opportunity to detect diseases as they occur; genetic analyses identify individuals with predispositions to certain diseases and aid in the determination of long-term risk. Therefore, direct measurement of genes, proteins, and metabolites is essential for the understanding of biological processes in disease and normal states.¹ Molecules produced by the body’s metabolic processes may distinguish between two different sample sets obtained from, for example, cancer and non-cancer-bearing individuals. These distinguishing compounds are known as biomarkers. Because the majority of published studies deal with the discovery of cancer biomarkers, the discussions in this chapter are limited to cancer biomarker discovery. Many of the methods described, however, are also applicable to other diseases.

    A biomarker is a substance that is overexpressed in biological fluids or tissues in patients with a certain disease. A biomarker can include patterns of single-nucleotide polymorphisms (SNPs), DNA methylation, or changes in mRNA, protein, or metabolite abundances. The important point is that these patterns correlate with the characteristics of the disease.² Biomarkers are used to examine the biological behavior of a disease and predict the clinical outcome. The biomarker should be disease specific and not due to environmental conditions or biological perturbations. To be clinically acceptable, a diagnostic biomarker should possess a sensitivity and specificity as close to 100% as possible and be measurable in a specimen collected noninvasively (urine) or semi-invasively (blood). In addition, the test should be accurate, economical, easy to perform, and reproducible by different technicians across different laboratories. The characteristics of ideal diagnostic methods are summarized in Figure 1. Although some biomarkers have been approved by the Food and Drug Administration (FDA) as qualitative tests for monitoring specific cancers (e.g., nuclear matrix protein-22 for bladder cancer), the majority of discovered potential biomarkers (proteins or metabolites) are not sensitive and/or specific enough to be used for population screening.

    FIGURE 1 Description of ideal methods for disease diagnosis.

    Definitions

    Biomarker

    A biomarker is an objectively measured molecule or substance that indicates the presence of an abnormal condition within a patient. A biomarker can be a gene (e.g., SNP), protein (e.g., prostate-specific antigen), or metabolite (e.g., glucose, cholesterol, etc.) that has been shown to correlate with the characteristics of a specific disease.² A biomarker in clinical and medical settings can be used for many purposes, including early disease detection, monitoring response to therapy, and predicting clinical outcome. Biomarkers can be categorized according to their clinical applications. In cancer, diagnostic markers are used to initially define the histopathological classification and stage of the disease, and prognostic markers can predict the development of disease and the prospect of recovery. Predictive markers can be used, case by case, to select the correct therapeutic procedure. It should be confirmed that a potential biomarker is indeed specific to the disease state and is not a function of the variability within the biological samples of patients due to differences in diet, genetic background, lifestyle, age, sex, ethnicity, and so on.

    Sensitivity

    Sensitivity of a test or marker is defined as the percentage of positive samples (individuals with the disease) that a model correctly identifies as positive (true positives). The false negative rate is the percentage of patients with the disease for whom the test is negative.

    Specificity

    Specificity is defined as the percentage of negative samples (individuals without the disease) that a model correctly identifies as negative (true negatives). The false positive rate is the percentage of individuals without the disease in whom the test is positive.

    Positive Predictive Value (PPV)

    Positive predictive value (PPV) is defined as the percentage of individuals with a positive test in whom the disease is actually present.

    Negative Predictive Value (NPV)

    Negative predictive value (NPV) is defined as the percentage of individuals with a negative test in whom the disease is actually absent.
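
    The four definitions above can be sketched as simple ratios of counts from a 2×2 table of test result versus disease status. The class name, function name, and the counts below are hypothetical and chosen only for illustration; they are not taken from any study discussed in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a test result with the true disease status."""
    tp: int  # diseased, test positive (true positives)
    fp: int  # not diseased, test positive (false positives)
    tn: int  # not diseased, test negative (true negatives)
    fn: int  # diseased, test negative (false negatives)


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV, and NPV as fractions (0 to 1)."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of diseased who test positive
        "specificity": c.tn / (c.tn + c.fp),  # share of non-diseased who test negative
        "ppv": c.tp / (c.tp + c.fp),          # share of positive tests that are correct
        "npv": c.tn / (c.tn + c.fn),          # share of negative tests that are correct
    }


# Hypothetical counts: 100 patients and 100 controls.
counts = ConfusionCounts(tp=80, fp=30, tn=70, fn=20)
metrics = diagnostic_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

    Note that sensitivity and specificity are computed within the diseased and non-diseased groups, whereas PPV and NPV are computed within the test-positive and test-negative groups, so the predictive values depend on how many diseased individuals are in the tested population.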

    Proteomics

    Proteomics is the study of all proteins in a biological sample. The complexity and dynamic concentration range of the proteins, along with the dynamic nature of the proteins that constitute the proteome, make the detection and quantitation of each protein virtually impossible. In general, most biomarker discovery studies aim to characterize as many proteins as possible.

    Profiling

    Profiling is the detection of panels of biomarkers (proteins or metabolites) that may provide higher sensitivities and specificities for disease diagnosis than is afforded with a single marker. Proteomic and metabolomic pattern analysis relies on comparison of differences in relative abundance of a number of polypeptides/proteins and metabolites (mass-to-charge ratio [m/z] and intensity) within the mass spectrum or the nuclear magnetic resonance (NMR) spectrum of two sample sets.

    Metabolomics

    Metabolomics, also known as metabonomics, is the study of a complete set of small molecules (less than 1,500 Daltons [Da]) found within a biological system for the understanding of biological processes in normal and disease states. Direct quantitative measurements of metabolite expressions in urine, serum, plasma, and tissue are essential but extremely difficult due to the complexity and concentration dynamic range of the metabolites in a biological sample. The difference between metabolomics and metabonomics is that metabolomics is the qualitative and quantitative measurement of all metabolites in a system, and metabonomics is the comparison of metabolite levels (profiles) found in two different samples: healthy and diseased.

    The Current State of Biomarker Discovery

    Examination of the scientific and medical literature clearly indicates that most protein and metabolite biomarkers presently in use are inadequate to replace an existing clinical test, or their only utility is for detecting advanced stage cancers, for which the survival rate is low. Many molecular biomarkers have been suggested for the detection of cancer and other diseases; however, none possess the required sensitivity and specificity. The state of biomarker research may be illustrated by using bladder cancer biomarkers as an example. Bladder cancer is selected because of its recurring nature and the three- to six-month monitoring requirements, making it a very expensive disease to treat. It is disheartening that a lot of effort and funds have been spent on finding a biomarker for bladder cancer without resulting in an acceptable test to replace cystoscopy, voided urine cytology, and imaging studies—the current standards of care for the detection and monitoring of bladder tumors. A literature search indicates the presence of many molecular biomarkers for bladder cancer³; however, none of the molecular markers have proven to be sensitive and specific enough to replace cystoscopy.⁴ Another reason why most published proteomic and metabolomic studies have not provided results that have progressed from the laboratory to the clinic is that the majority of studies stopped at the discovery phase and never progressed onto the necessary verification or validation phases.

    The following biomarkers have been approved by the U.S. FDA as qualitative tests for bladder cancer: nuclear matrix protein (NMP22) with 56% sensitivity; bladder tumor antigen (BTAstat) with 58% sensitivity; and UroVysion with 36% to 65% sensitivity.⁵ Hyaluronic acid and hyaluronidase measurements have a sensitivity of 92%.⁶ Although none of these markers are sensitive enough to be recommended for population screening for bladder cancer, they might be used to monitor the recurrence of the disease.

    To put these levels of sensitivity in perspective, it is worth discussing prostate cancer, in which the level of circulating prostate-specific antigen (PSA) is used as a diagnostic test. For men age 50 and older, PSA levels ≥4 ng/mL may indicate the presence of prostate cancer. The sensitivity, specificity, and positive predictive values of PSA (86%, 33%, and 41%, respectively) are not sufficient for widespread acceptance of this marker as a screening tool for prostate cancer. However, the FDA has approved the use of the PSA test along with a digital rectal exam.
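
    One reason a marker with these figures performs poorly in screening is that predictive values depend on how common the disease is in the tested population. The sketch below applies Bayes' rule to the sensitivity and specificity quoted above; the prevalence values are assumptions chosen for illustration and do not come from the cited studies.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value from Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


PSA_SENSITIVITY = 0.86  # quoted in the text
PSA_SPECIFICITY = 0.33  # quoted in the text

# Assumed prevalences, for illustration only.
for prevalence in (0.01, 0.10, 0.35):
    p = ppv(PSA_SENSITIVITY, PSA_SPECIFICITY, prevalence)
    n = npv(PSA_SENSITIVITY, PSA_SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {p:.1%}, NPV {n:.1%}")
```

    With a specificity this low, most positive results in a low-prevalence population are false positives, which is consistent with the concern about unnecessary biopsies.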

    Does raising the PSA cutoff give higher sensitivity and specificity? Thompson et al.⁸ reported that for detecting any prostate cancer, PSA cutoff values of 1.1, 2.1, 3.1, and 4.1 ng/mL yielded sensitivities of 83.4%, 52.6%, 32.2%, and 20.5% and specificities of 38.9%, 72.5%, 86.7%, and 93.8%, respectively. The authors reported that there is no cut point of PSA level with simultaneous high sensitivity and high specificity for monitoring healthy men for prostate cancer. The majority of PSA elevations between 4 and 10 ng/mL are due to prostatic hyperplasia rather than malignancy, leading to many unnecessary biopsies.
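
    The trade-off in the reported cutoffs can be made explicit with a single summary number. The sketch below uses Youden's J statistic (sensitivity + specificity − 1), which is a choice made here for illustration and is not a method attributed to Thompson et al.; the cutoff, sensitivity, and specificity values are those quoted above.

```python
# (cutoff in ng/mL, sensitivity, specificity) as quoted in the text.
psa_cutoffs = [
    (1.1, 0.834, 0.389),
    (2.1, 0.526, 0.725),
    (3.1, 0.322, 0.867),
    (4.1, 0.205, 0.938),
]


def youden_j(sensitivity: float, specificity: float) -> float:
    """Youden's J: 0 means no better than chance, 1 means a perfect test."""
    return sensitivity + specificity - 1.0


for cutoff, sens, spec in psa_cutoffs:
    print(f"cutoff {cutoff} ng/mL: sensitivity {sens:.1%}, "
          f"specificity {spec:.1%}, J = {youden_j(sens, spec):.3f}")

# The cutoff with the largest J is the best compromise under this criterion.
best_cutoff = max(psa_cutoffs, key=lambda row: youden_j(row[1], row[2]))[0]
print(f"largest J at cutoff {best_cutoff} ng/mL")
```

    Every cutoff yields a J far below 1, which restates numerically the authors' conclusion that no cut point gives high sensitivity and high specificity at the same time.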

    Cancer antigen (CA) 15-3 and cancer antigen CA 27.29 are two well-known biomarkers for monitoring breast cancer. CA 15-3 is a blood test given during or after treatment for breast cancer. It is most useful for monitoring advanced breast cancer and response to treatment. CA 15-3 and CA 27.29 are not screening tests; they are tumor marker tests that assist in tracking cancers that overproduce CA 15-3 and CA 27.29. Only about 30% of patients with localized breast cancer will have increased levels of CA 15-3⁹; many patients with liver and breast diseases show elevated levels. Another malignancy of the reproductive system, ovarian cancer, is detected using a combination of pelvic examination, transvaginal ultrasonography, and laparoscopy.¹⁰ There is no specific and sensitive diagnostic test for ovarian cancer; although CA 125 is used to distinguish between benign and malignant diseases,¹⁰ it is not a reliable biomarker because it is affected by other factors and 20% of ovarian cancer patients do not express CA 125.¹¹

    The above-mentioned examples show that to date there are no 100% sensitive and specific biomarkers for different types of cancer. Does a combination of biomarkers give better sensitivity and specificity? The answer is yes. For example, Horstmann et al.¹² studied the effect of using a combination of bladder cancer biomarkers on sensitivity and specificity. Although none of the combinations resulted in 100% sensitivity and specificity, the sensitivity improved over using a single biomarker.
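
    Why a combination can raise sensitivity may be seen in a simplified model. The sketch below assumes a panel is called positive if either of two markers is positive, and that the two markers err independently of each other given disease status; both are assumptions made here, not the method of the cited study. The sensitivities are the NMP22 and BTAstat figures quoted earlier; the specificities are hypothetical placeholders.

```python
def combine_either_positive(sens_a, spec_a, sens_b, spec_b):
    """Sensitivity and specificity of an 'either marker positive' rule.

    Assumes the two markers are conditionally independent given disease
    status, which real markers often are not.
    """
    # A diseased patient is missed only if both markers miss.
    sensitivity = 1.0 - (1.0 - sens_a) * (1.0 - sens_b)
    # A healthy subject is cleared only if both markers are negative.
    specificity = spec_a * spec_b
    return sensitivity, specificity


NMP22_SENS = 0.56       # quoted in the text
BTASTAT_SENS = 0.58     # quoted in the text
ASSUMED_SPEC_A = 0.85   # hypothetical, for illustration only
ASSUMED_SPEC_B = 0.85   # hypothetical, for illustration only

panel_sens, panel_spec = combine_either_positive(
    NMP22_SENS, ASSUMED_SPEC_A, BTASTAT_SENS, ASSUMED_SPEC_B
)
print(f"panel sensitivity: {panel_sens:.1%}")
print(f"panel specificity: {panel_spec:.1%}")
```

    Under this rule the gain in sensitivity is paid for with a loss of specificity, so a panel still has to be evaluated on both measures.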

    The question that needs to be addressed is why these and other potential biomarkers have failed to achieve adequate sensitivity and specificity and are not accepted as clinical tests. The answer is not an easy one because we are dealing with detecting cancer at an early stage in humans with different characteristics such as age, sex, and ethnicity. Another important fact should also be considered: in biomarker studies, the aim is to find a protein or a metabolite (that is probably at an extremely low concentration level) among thousands of proteins and metabolites. This goal is extremely challenging. One of the major reasons that proteomic and metabolomic studies over the past decade have failed to discover molecules to replace existing clinical tests is errors in study design, experimental execution, or both.

    Study Design and Execution

    The search for biomarkers for any disease and especially cancer requires careful consideration of different aspects of a study prior to its initiation. These include study design, experimental execution, personnel, and instrumentation.

    Study Design

    The design of a biomarker discovery project should consider the following factors: the disease of interest; the number of patients and matched controls; the sex, age, and ethnicity of the selected patients; the type of samples (tissue, blood, serum, plasma); the class of molecule(s) to measure (proteins, metabolites, or nucleotides); and whether the goal of the search is a profile or a single discriminating molecule. If the study is focused on cancer, the type and stage of cancer must be specified.

    Study Execution

    Study execution deals with experimental parameters that can affect the results. These parameters include sample collection, handling and storage conditions, sample preparation, method of analysis, number of replicates, and data analysis.

    Personnel and Instrumentation

    A biomarker discovery study requires a budget, an adequate number of patients and healthy subjects (controls), clinicians (physicians, surgeons, pathologists, and technicians), modern instrumentation, competent analytical chemists, biochemists, and bioinformaticists.

    Errors in Study Design

    The current procedure for proteomic or metabolomic studies in search of biomarkers is depicted in Figure 2. As the figure indicates, a specimen (urine, blood, or tissue) is taken from two groups: diseased patients and healthy subjects. The specimens are analyzed, the results are compared, and the discriminating factors are determined.

    FIGURE 2 General procedure for biomarker discovery using HPLC/MS and statistical data analysis.

    The Sample

    Selection and preparation of the sample in biomarker discovery is a crucial step in the success of finding a disease biomarker. There are multiple decisions that should be considered prior to initiating the search because they can affect the integrity of the results. These include:

    1. Cancer type and stage

    2. Sample type

    3. Selection of patients and controls

    4. Number of patient and control samples

    5. Ethnicity, sex, and age of patients and controls

    6. Sample collection, handling, and storage

    7. Method of sample analysis

    Each of these steps should be given careful consideration prior to the initiation of any study.

    Cancer Type and Stage

    The first step in any search for a biomarker is to decide which disease condition to study. In this chapter, the discussion is limited to cancer because it is a very complicated and devastating disease that affects millions of people of all ages, genders, and ethnicities. Also, early detection of cancer means a higher survival rate and less suffering. The decision is therefore which cancer type to study and whether to analyze all stages together as one experiment or each stage separately. It is preferable to conduct the experiment on each stage separately in order to find out at what stage the biomarker (protein or metabolite) can be detected.

    Sample Type

    After the decision has been made as to which cancer and stage to study, the next decision is related to the type of sample: tissue, blood, urine, cells, or other fluid. An important objective of biomarker research is to find a biomarker using a noninvasive (e.g., urine, tears, saliva, etc.) or minimally invasive (e.g., serum, plasma, etc.) sample and to avoid, if at all possible, using invasive procedures (e.g., tissue, cerebrospinal fluid, etc.). A search of the biomedical literature indicates that the most commonly used samples for cancer biomarker discovery are urine, blood (serum and plasma), and tissues.¹³ Blood is preferable to urine because blood flows throughout the body and its composition is stable and reflects the state of the patient at the time of collection. Although urine is easily accessible, its composition is subject to variation and dilution. Urine is preferable, however, when studying bladder cancer, especially transitional cell carcinoma, because whatever is shed, leaked, or secreted from the tumor will be found in the urine. Also, the amount of urine produced in 24 hours is less than the amount of blood circulating in the body, so the biomarker molecules are more diluted in blood than in urine. Blood contains larger amounts of proteins than urine, making it a much more complicated sample to characterize. Also, blood contains albumin, which makes the analysis of the blood proteome difficult and can mask low-abundance proteins, and its removal may cause the loss of interacting proteins. The best sample for a successful search for a biomarker of a solid tumor, although invasive and not easily accessible, is tumor tissue and its adjacent normal tissue.

    Careful consideration should be given to blood specimen analysis (i.e., whether the blood sample should be converted to serum or plasma prior to analysis). Serum and plasma are not directly comparable in a proteomic analysis.¹⁴,¹⁵

    As mentioned, serum and plasma cannot be directly compared in a proteomic study because their protein profiles differ.¹⁶ Plasma is the liquid portion of unclotted blood that is left behind after all the various cell types are removed. To prepare plasma, blood is withdrawn from the patient into a vial in the presence of an anticoagulant and the sample is centrifuged to remove cellular elements. The most commonly used anticoagulants include heparin, ethylenediaminetetraacetic acid (EDTA), and sodium citrate. Serum is blood plasma without fibrinogen or the other clotting factors. It is prepared by collecting blood in the absence of any anticoagulant. Under these conditions, a fibrin clot forms. This clot is then removed using centrifugation, leaving behind serum.¹⁷ Removal of the clot results in lower protein content in serum than in plasma.

    Selection of Patients and Controls

    Subjects selected for a study should be checked by a physician to ensure the presence or absence of the disease. Tissue samples should be examined by a pathologist prior to analysis. Blood can be analyzed as serum or plasma. Is there a difference between analyzing serum and plasma at the metabolite level? A recent metabolomics study showed obvious differences in the GC/MS chromatograms of plasma and serum taken from the same healthy human subjects.¹⁶ Of the 72 compounds identified across the samples, only 36 were found in both serum and plasma. Also, the results indicated that some of these common metabolites had different concentrations in serum and plasma. These results highlight the difficulty in comparing interlaboratory results obtained using different sample types. Generally, the number of patients and control subjects in published studies is very small. For cancer biomarker discovery, biofluids and tissues are collected from a group of patients of different cancer stages and compared to a group of healthy persons. The effect of cancer stage on sensitivity of a single biomarker should be taken into consideration.¹²

    Number of Samples

    The number of samples analyzed in a biomarker discovery study should be sufficient to give statistically significant results between the sets of samples being compared. This number may vary from 25 to 100 samples in a set; the larger the number of samples, the more reliable the statistical results. For an epidemiological or validation study, however, the number of diseased samples and controls can be in the hundreds or thousands. Unfortunately, most published biomarker discovery studies tested only a limited number of clinical samples.
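
    A rough feel for how many samples per set are needed can be obtained from a standard power calculation. The sketch below uses the normal-approximation formula for comparing the means of two equally sized groups; the formula, the function name, the standardized effect sizes, and the default significance level and power are all assumptions introduced here and are not prescribed by the chapter.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size: float, alpha: float = 0.05,
                      power: float = 0.80) -> int:
    """Approximate samples per group for a two-sided, two-group comparison.

    effect_size is the difference between group means divided by the
    common standard deviation (a standardized effect size).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = standard_normal.inv_cdf(power)
    n = 2.0 * (z_alpha + z_power) ** 2 / effect_size ** 2
    return math.ceil(n)


# Assumed effect sizes, for illustration only.
for d in (0.8, 0.4):
    print(f"effect size {d}: {samples_per_group(d)} samples per group")
```

    Under these assumptions a large difference (0.8 standard deviations) needs about 25 samples per group and a moderate one (0.4) about 99, which is in line with the range quoted above. Studies that test thousands of features at once need stricter significance thresholds and therefore more samples.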

    Ethnicity, Sex, and Age

    Today, a study is normally carried out using biofluids or tissues collected from patients and healthy subjects of different ages, sexes, and races. Using samples from patients and controls of different ages and sexes can influence the results. A recent study of 269 subjects (131 males and 138 females) evaluated the effects of age, sex, and race on plasma metabolites.¹⁸ The subjects were of Caucasian, African American, and Hispanic descent and ranged in age from 20 to 65 years. They were divided into three age groups: 20–35, 36–50, and 51–65. Using gas chromatography–mass spectrometry (GC/MS) and high performance liquid chromatography–mass spectrometry (HPLC/MS), more than 300 metabolites were detected, of which more than 100 were associated with age, many fewer with sex, and fewer still with race.¹⁸ Attention should therefore be paid to the selection of patients and controls for a biomarker study, which should not include (a) widely different ages, (b) a mix of men and women, or (c) different ethnicities.
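
    One way to check a planned cohort against this advice is to tabulate patients and controls by stratum before any sample is analyzed. The sketch below reuses the three age groups from the study cited above; the subject records, field names, and function names are hypothetical.

```python
from collections import Counter

# Age groups used in the cited plasma metabolite study (inclusive bounds).
AGE_GROUPS = [(20, 35), (36, 50), (51, 65)]


def age_group(age: int):
    """Return the label of the age group containing `age`, or None."""
    for low, high in AGE_GROUPS:
        if low <= age <= high:
            return f"{low}-{high}"
    return None


def stratum_counts(subjects) -> Counter:
    """Count subjects per (status, sex, age group) stratum."""
    return Counter(
        (s["status"], s["sex"], age_group(s["age"])) for s in subjects
    )


# Hypothetical cohort, for illustration only.
subjects = [
    {"status": "patient", "sex": "F", "age": 58},
    {"status": "patient", "sex": "F", "age": 61},
    {"status": "control", "sex": "F", "age": 55},
    {"status": "control", "sex": "M", "age": 29},
]

counts = stratum_counts(subjects)
for stratum, n in sorted(counts.items()):
    print(stratum, n)

# Strata occupied by only patients or only controls indicate poor matching.
patient_strata = {key[1:] for key in counts if key[0] == "patient"}
control_strata = {key[1:] for key in counts if key[0] == "control"}
unmatched = patient_strata ^ control_strata
print("unmatched strata:", sorted(unmatched))
```

    In this made-up cohort the young male control has no counterpart among the patients, which is the kind of imbalance that can introduce age- or sex-related metabolite differences unrelated to the disease.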

    Sample Collection, Handling, and Storage

    Samples are collected from persons who have had a physical examination by a physician who determines that the person of interest has the disease or is healthy. Samples should be collected in clean freezer-type tubes and stored in a freezer immediately until time of analysis. Hsieh et al. showed that using different blood collection tubes affects the observable proteome of serum and plasma.¹⁴ At the time of analysis, samples should be thawed on ice or at room temperature and prepared according to the selected method of analysis. The history of the sample is very important; blood and tissue samples used in search of biomarkers may have been obtained from sample storage banks without proper collection and storage, and without information about the age and condition of the patient and, if there is cancer present, the stage of the disease. Also, the storage periods may be different. A lack of consistency in sample selection, collection, handling, and storage can doom any study to failure before data collection.

    One issue that is of constant concern in the analysis of serum or plasma samples is the method of collection, preparation, and storage. Sample collection, handling, and storage have a great impact on the sensitivity, selectivity, and reproducibility of any given analysis. Detailed information on clinical and pathological parameters should be secured before samples are collected. Specimens should be collected by trained personnel. Blood samples should immediately be converted to serum or plasma and stored in the freezer at –80°C until time of analysis to prevent any enzymatic activity that can result in inconsistent protein degradation or metabolite interconversion. Two studies have shown a significant effect of freeze/thaw cycles on the proteome profile of serum/plasma.¹⁹,²⁰ Also, factors in the preparation of serum or plasma, such as the anticoagulant used, the clotting time allowed, and the length of the time period before centrifugation, had a significant effect on the proteome. A few studies have shown that sampling procedures (i.e., fasting, time sample acquired from patient) had the greatest effects on proteome profiling, while handling procedures and storage conditions had relatively minor effects.²¹ However, everyone agrees that standardized protocols for sample collection, handling, storage, and analysis are required, as the issue is not about which procedure is better but rather about using standardized procedures to obtain comparable and reproducible results between different laboratories.²²–²⁴
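
    Consistency of sample history can be checked mechanically if it is recorded for every specimen. The following is a minimal sketch of such a check; the record fields, the function name, and the limit of one freeze/thaw cycle are hypothetical choices made here, while the –80°C storage temperature is the one recommended above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleRecord:
    """Handling history of one specimen (hypothetical fields)."""
    sample_id: str
    sample_type: str        # e.g., "serum" or "plasma"
    collection_tube: str    # tube type used at collection
    storage_temp_c: float   # storage temperature in degrees Celsius
    freeze_thaw_cycles: int


def handling_issues(records, required_temp_c=-80.0, max_freeze_thaw=1):
    """Return a list of human-readable inconsistencies in a sample set."""
    issues = []
    sample_types = {r.sample_type for r in records}
    if len(sample_types) > 1:
        issues.append(f"mixed sample types: {sorted(sample_types)}")
    tubes = {r.collection_tube for r in records}
    if len(tubes) > 1:
        issues.append(f"mixed collection tubes: {sorted(tubes)}")
    for r in records:
        if r.storage_temp_c > required_temp_c:
            issues.append(f"{r.sample_id}: stored at {r.storage_temp_c} C")
        if r.freeze_thaw_cycles > max_freeze_thaw:
            issues.append(
                f"{r.sample_id}: {r.freeze_thaw_cycles} freeze/thaw cycles"
            )
    return issues


# Hypothetical records, for illustration only.
records = [
    SampleRecord("S1", "serum", "tube-A", -80.0, 1),
    SampleRecord("S2", "serum", "tube-A", -20.0, 1),
    SampleRecord("S3", "plasma", "tube-B", -80.0, 3),
]
issues = handling_issues(records)
for issue in issues:
    print(issue)
```

    Running such a check before analysis does not make archived samples comparable, but it shows which specimens differ in history so they can be excluded or analyzed separately.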

    Method of Sample Analysis

    Selection of the method of sample preparation and analysis plays an important role in determining the accuracy of the results, as discussed later in this chapter.

    Errors in Study Execution

    Study execution deals with many experimental parameters that should be carefully considered for a successful experiment with meaningful and reproducible results.

    Sample Preparation

    Preparation of the sample for proteomic and metabolomic analysis can introduce errors that will affect the quality of the final results. The search for biomarkers in biological samples involves different steps depending on the sample type and whether the analysis is for metabolites or proteins, targeted or global (profiling). Extraction of metabolites from blood, urine, or tissue for a global study is not an easy task. It may require multiple extraction procedures using different solvent systems. It is not always possible to extract all the metabolites from a sample with a single solvent because metabolites have different chemical and physical properties and are present in a wide dynamic concentration range. For details, see Chapters 3 and 4 on sample preparation for proteomics and metabolomics within this book.

    Preparation of a blood sample for a proteomic study is more complicated than preparation of urine, as urine contains fewer proteins and cells. The high-abundance proteins must be depleted from blood prior to HPLC/MS/MS analysis. Approximately 99% of the protein content of blood (both serum and plasma) is made up of only about 20 proteins.²⁵ Although depletion of these proteins allows for the detection of low-abundance proteins, it also removes proteins that are bound to these 20 proteins, resulting in the loss of potentially important information.²⁶ Tissues are initially homogenized prior to extraction of metabolites and proteins. Incomplete homogenization can lead to losses that can affect the accuracy and precision of the results. For detailed discussion, see Chapters 3 and 4 on sample preparation for metabolomics and proteomics.

    Methods of Analysis

    Choosing the optimal analysis method is critical in proteomic and metabolomic studies. For example, analyzing the plasma proteome involves protein precipitation and solubilization; therefore, the downstream fractionation method must be either electrophoresis or a liquid-phase method.

    Three different approaches for the global analysis of serum proteins have been used: global serum proteome analysis using two- and three-dimensional HPLC/MS²⁷,²⁸; analysis of low molecular weight proteins/peptides²⁹; and investigation of proteins and peptides that are bound to high-abundance serum proteins.³⁰ Unfortunately, studies have shown that analysis of the plasma proteome by groups using different methods resulted not only in different numbers of protein identifications but also in poor overlap between the results.³¹,³²

    Common methods for analysis of a metabolome include GC/MS, HPLC/MS, and CE/MS. Which technique to use depends on the compounds of interest. Each technique has its advantages and limitations. Buscher et al.³¹ compared the three techniques using a mixture of metabolites covering the pentose phosphate pathway, the tricarboxylic acid cycle, redox metabolism, amino acids, glycolysis, and nucleotides. Of 75 intermediate standard metabolites, 33 were detected by all three methods; CE detected 64, GC detected 42, and LC detected 65. A combination of LC and GC detected 70 metabolites. All 75 metabolites were detected when the three methods were combined. These results show that the method of analysis is an important part of biomarker discovery.
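
    The reported counts also determine how much the LC and GC results overlap. The short calculation below applies the inclusion-exclusion identity |LC ∪ GC| = |LC| + |GC| − |LC ∩ GC| to the figures quoted above; the derived overlap is arithmetic done here, not a number reported in the cited comparison.

```python
# Counts quoted in the text for the 75-metabolite standard mixture.
TOTAL_STANDARDS = 75
DETECTED_BY_LC = 65
DETECTED_BY_GC = 42
DETECTED_BY_LC_OR_GC = 70
DETECTED_BY_ALL_THREE = 33

# Inclusion-exclusion: |LC and GC| = |LC| + |GC| - |LC or GC|.
lc_gc_overlap = DETECTED_BY_LC + DETECTED_BY_GC - DETECTED_BY_LC_OR_GC

# Metabolites that LC and GC together miss, so another method must supply them.
missed_by_lc_and_gc = TOTAL_STANDARDS - DETECTED_BY_LC_OR_GC

print(f"detected by both LC and GC: {lc_gc_overlap}")
print(f"missed by LC and GC combined: {missed_by_lc_and_gc}")

# Sanity check: everything seen by all three methods is also seen by LC and GC.
assert lc_gc_overlap >= DETECTED_BY_ALL_THREE
```

    Even the two most complementary techniques share only part of their coverage, and neither alone sees the full mixture, which is why the choice of platform shapes which candidate biomarkers can be found at all.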

    Number of Replicates

    The conventional teaching in analytical chemistry is that each sample should be analyzed in triplicate and that the mean and standard deviation should be reported. Unfortunately, most published proteomic and metabolomic studies analyze each sample only once, which does not permit the error in the measurement to be calculated. Proteomic analysis of a biological sample involves depletion of high molecular weight proteins, digestion, fractionation, and HPLC/MS/MS analysis. Each of these steps can introduce error, and the greatest error is introduced by the final step, HPLC/MS/MS. It has been pointed out³³,³⁴ that to extract the largest number of protein identifications, the sample should be analyzed at least in triplicate, because the complexity of a digest of an entire proteome is such that the analysis, even with a high-resolution HPLC/MS system, exceeds the system's peak capacity.³⁵ This observation was illustrated by Dr. Sam Hanash and his coworkers in the analysis of a plasma sample using HPLC/MS/MS. Repeat runs resulted in the identification of 32% and 36% more peptides and proteins, respectively.
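    The triplicate rule described above can be illustrated with a minimal sketch. The function name and the peak-area values are hypothetical and chosen only for illustration; the sketch reports the mean, the sample standard deviation, and the percent coefficient of variation, and it refuses a single measurement because no error estimate is possible from one run.

```python
from statistics import mean, stdev


def summarize_replicates(values):
    """Return (mean, sample standard deviation, percent CV) for replicate runs."""
    if len(values) < 2:
        # A single injection gives no information about measurement error.
        raise ValueError("at least two replicates are needed to estimate the error")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    cv = 100.0 * s / m if m != 0 else float("nan")
    return m, s, cv


# Hypothetical triplicate peak areas for one analyte (illustrative values only).
triplicate = [10.0, 12.0, 14.0]
m, s, cv = summarize_replicates(triplicate)
print(f"mean={m:.2f}  sd={s:.2f}  cv={cv:.1f}%")
```

    Reporting the coefficient of variation alongside the mean makes it easier to compare the spread of analytes measured at very different intensities.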

    Effect of Mass Spectrometer Type on the Results

    In proteomic and metabolomic studies, the mass spectrometer plays a central role and the selection of the instrument can affect the results. Gika et al.³⁵ coupled a single ultra-high-pressure liquid chromatography (UPLC) instrument to a triple quadrupole linear ion trap (Q-TRAP) and a hybrid quadrupole time-of-flight (Qq-TOF) mass spectrometer using both positive and negative electrospray ionization (ESI) to study the metabolic profile of rat urine. The flow from the UPLC column was split equally and the eluent streams were simultaneously directed to the inlets of the two mass spectrometers. Data from both mass spectrometers were subjected to multivariate statistical analysis. After applying the same data extraction software, a number of ions were found to be unique to either data set.

    The study clearly indicates that not all ions were detected by both the Qq-TOF and Q-TRAP mass spectrometers. The authors concluded that, given the design differences between the instruments, this is perhaps not a surprising finding, but it nevertheless raises important questions about how to evaluate data from different laboratories produced on different mass spectrometers, even when (nominally) the same sample processing and chromatography have been used.³⁶

    In another study, Elias et al.³³ compared the results of triplicate measurements of the yeast proteome by LC-MS/MS using linear ion trap (LTQ) and Qq-TOF mass spectrometers. The data were searched using both Mascot and SEQUEST. The results from the two instruments were different, and each search engine provided a different number of identifications. From the LTQ data, 666 and 644 identifications were exclusive to Mascot and SEQUEST, respectively, and 4,056 proteins were identified using both algorithms. For the Qq-TOF data, 1,012 and 510 identifications were exclusive to Mascot and SEQUEST, respectively, and 1,955 proteins were identified using both algorithms.³⁴
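    The counts quoted above can be turned into a simple agreement figure. The following sketch is not part of the cited study; it assumes that the exclusive and shared counts can be added to give the total number of distinct identifications per instrument, and it expresses the shared identifications as a fraction of that total. The function name is hypothetical.

```python
def overlap_fraction(only_a, only_b, shared):
    """Fraction of all distinct identifications that both search engines agree on."""
    total = only_a + only_b + shared
    if total == 0:
        raise ValueError("no identifications to compare")
    return shared / total


# Counts as quoted in the text (Mascot-only, SEQUEST-only, found by both).
ltq = overlap_fraction(only_a=666, only_b=644, shared=4056)
qq_tof = overlap_fraction(only_a=1012, only_b=510, shared=1955)

print(f"LTQ:    {ltq:.1%} of identifications shared by both engines")
print(f"Qq-TOF: {qq_tof:.1%} of identifications shared by both engines")
```

    Under this assumption, roughly three quarters of the LTQ identifications and a little over half of the Qq-TOF identifications were reported by both algorithms, which shows that the search engine as well as the instrument shapes the final protein list.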

    Effect of Separation Instrumentation on the Results

    The most commonly used analytical methods for finding potential biomarkers are SDS-PAGE, HPLC/MS, and GC/MS. SDS-PAGE is used only for the fractionation and separation of proteins. GC is an excellent technique for the separation of volatile compounds; however, it is not suitable for the separation of proteins. It is a simple, relatively economical, and fast technique that possesses high resolving power and reproducibility. Although GC using a single column can achieve high resolution separations, 2D-GC is the preferred procedure for the comprehensive separation of metabolomic mixtures.³⁷

    HPLC has been used in both metabolomic and proteomic studies in search of biomarkers. Increased resolution in HPLC is achieved by using smaller packing particles (i.e., ≤2 μm) at the higher operating pressures of UPLC.³⁷,³⁸ Wilson et al.³⁷ reported that UPLC offered significant advantages over conventional reversed-phase HPLC (which operates at up to 4,000 psi). UPLC more than doubled the peak capacity and gave approximately a 10-fold increase in speed and a three- to fivefold increase in sensitivity compared with a conventional 3.5 μm stationary phase. Although UPLC/MS/MS using a single column possesses high resolving power, two-dimensional HPLC (2D-HPLC) is the preferred procedure for the comprehensive separation of the proteome.³⁷,³⁸

    Errors in Measurements

    One cannot ignore the experimental and human errors in the measurement of proteins and metabolites in complex mixtures. In a recent metabolic study using GC/MS to search for amino acid markers in the urine of 11 bladder cancer patients and 8 controls, the error of the reported results was extremely high, ranging from 4% to 93% for the patients with bladder cancer and from 6% to 94% for the controls.³⁹ With such high errors and such overlap between cancer patients and controls, the method is neither specific nor sensitive, and it cannot be used for population studies or to replace a clinical test. Therefore, attention should be paid to eliminating human and experimental errors, which arise from sample collection, sample preparation, and analysis.

    Personnel and Experimental Validation

    Research can be done correctly only by trained and competent personnel using validated and proven methods. Therefore, to minimize errors, trained personnel should be involved in every aspect of the research, from sample collection, handling, and storage to sample analysis and data processing. Having personnel trained in every aspect of the biomarker research project creates a stronger team of scientists capable of troubleshooting any deficiencies throughout the course of the study.

    Specificity of Proteins as Biomarkers

    The search for a protein biomarker in a biofluid or tissue is like searching for a needle in a haystack; however, the search may result in multiple proteins that are each involved in more than one pathological condition. Deciphering the critical molecules is not an easy task. A single biomarker protein may be associated with multiple cancers and diseases. For example, a urine proteomic study revealed 26 proteins that were overexpressed in bladder cancer.⁴⁰ A search using Ingenuity Pathway Systems indicated that each of these proteins is involved in multiple cancers and diseases, suggesting that any of these proteins would result in a biomarker with low sensitivity and specificity. As an example, annexin A1 is involved in cardiovascular disease, endocrine system disorders, gastrointestinal disease, hematological disease, immunological disease, metabolic disease, organismal injury and abnormalities, reproductive system disease, and respiratory disease, in addition to cancer. Annexin A1 is reported to be downregulated in ductal⁴¹ and squamous cell carcinoma⁴² and upregulated in bladder cancer.⁴⁰ In human laryngeal tumors, annexin A1 was upregulated in the nuclei and cytoplasmic granule matrix from larynx mast cells and downregulated in larynx epithelial cells.⁴³

    Another example is carcinoembryonic antigen (CEA), which is used mainly to monitor the treatment of cancer patients, especially those with colon cancer. A PubMed search using CEA and cancer indicates that CEA is used as a marker for cancers of the lung, breast, rectum, liver, pancreas, stomach, and ovary. Also, not all cancers produce CEA. Increased CEA levels can indicate some non-cancer-related conditions such as inflammation, cirrhosis, rectal polyps, emphysema, ulcerative colitis, peptic ulcer, and benign breast disease. CEA is not recommended for screening a general population. These results indicate that selecting a protein that functions as a biomarker for a unique pathological condition is not an easy task.

    Published Results Comparison

    As mentioned earlier, comparison of results from different sources is challenging because of differing sample preparation and experimental procedures. Another aspect is how the data are examined, as the following example illustrates. Sreekumar et al.,⁴³ in a study published in the journal Nature, identified the metabolite sarcosine as a potential biomarker for prostate cancer. In a subsequent study, Jentzmik et al.⁴⁴ stated that "Our study diminish[es] the hope that the ratio of sarcosine to creatinine will become a successful indicator for prostate cancer management." That conclusion would be justified only if the two studies had compared the same quantity, and they did not: Sreekumar et al.⁴³ compared the ratio of sarcosine to alanine, whereas Jentzmik et al.⁴⁴ compared the ratio of sarcosine to creatinine.

    Statistical Data Analysis

    Multivariate statistical analysis is generally employed to analyze NMR or MS data to discriminate between different data sets.⁴⁵ Metabolomic and proteomic analysis of biological systems using NMR, GC/MS, CE/MS, and HPLC/MS, as with genomics and transcriptomics, results in a wealth of information that can be overwhelming. The sizes of these data sets make them virtually impossible to analyze manually. For any meaningful interpretation of the data, appropriate statistical tools must be employed to reduce the large raw data sets to a useful, understandable, and workable format. Different multidimensional and multivariate statistical analyses and pattern-recognition programs have been developed to distill the large amounts of data in an effort to interpret the complex metabolic pathway information from the measurements and to search for the discriminating features between two data sets.⁴⁶ The most popular multivariate statistical methods are principal component analysis (PCA),⁴⁷ partial least squares discriminant analysis (PLS-DA),⁴⁸ and support vector machines (SVM).⁴⁹ Mahadevan et al.⁴⁹ compared PLS-DA multivariate analysis with SVM for the analysis of NMR data. Their results indicated that SVM were superior to PLS-DA in terms of predictive accuracy with the least number of features. Van et al.⁵⁰ used 2D total correlation NMR spectroscopy and statistical analysis to compare the global metabolic profiles of urines obtained from wild-type and ABCC6-knockout mice. Three statistical methods were used to analyze the NMR spectra: PCA, PLS-DA, and orthogonal PLS-DA (OPLS-DA). PLS-DA and OPLS-DA gave almost identical results, and PCA gave slightly different results. However, all three methods could successfully discriminate between the two groups.

    Issaq et al.⁵¹ used PCA and OPLS-DA to analyze HPLC/MS data obtained from the urines of 41 bladder cancer patients and 48 healthy volunteers. The PCA resulted in two separate groups corresponding to normal and cancer urines and correctly predicted 40 of 41 bladder cancer patients and 46 of 48 healthy volunteers. The OPLS-DA confirmed the PCA predictions in terms of sensitivity and specificity and performed better still, correctly predicting 48 of 48 healthy and 41 of 41 bladder cancer urines.⁵¹
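    The classification counts quoted above translate directly into sensitivity and specificity. The sketch below is illustrative only; it assumes that bladder cancer is the positive class and that every misclassified sample counts as a single false negative or false positive. The function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true cases that were detected
    specificity = tn / (tn + fp)  # fraction of controls correctly called negative
    return sensitivity, specificity


# PCA: 40 of 41 cancer urines and 46 of 48 healthy urines correctly predicted.
pca_sens, pca_spec = sensitivity_specificity(tp=40, fn=1, tn=46, fp=2)

# OPLS-DA: 41 of 41 cancer urines and 48 of 48 healthy urines correctly predicted.
opls_sens, opls_spec = sensitivity_specificity(tp=41, fn=0, tn=48, fp=0)

print(f"PCA:     sensitivity={pca_sens:.1%}, specificity={pca_spec:.1%}")
print(f"OPLS-DA: sensitivity={opls_sens:.1%}, specificity={opls_spec:.1%}")
```

    With only 41 cases and 48 controls, each misclassified sample shifts these percentages by roughly two points, which is one reason the small sample sizes discussed in this chapter matter.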

    Recommendations

    Caution: Biological fluids and tissues should be handled carefully using safe procedures.

    When taken out of the freezer, sample vials should be checked for breakage prior to defrosting. Samples should be thawed at room temperature and not by heating or by placing them in a hot water bath. Standard operating procedures should be followed in the same manner for all samples in a study. Urine specimens may contain different amounts of analytes; therefore, peak intensities should be normalized and aligned. To prevent loss of sample and information, the minimum number of sample handling steps should be used. In the case of global metabolic studies, different solvents should be used for maximum analyte extraction. To assess parameters such as extraction efficiency, the use of internal standards is advised.
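    As an illustration of the normalization step recommended above, the sketch below scales each peak to the total signal of its sample. This is only one possible choice and is an assumption here, not a procedure prescribed by the text; normalization to creatinine or to osmolality are other commonly used options for urine. The function name, peak names, and intensities are hypothetical.

```python
def normalize_to_total(intensities):
    """Scale each peak intensity by the summed intensity of the sample."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return {peak: value / total for peak, value in intensities.items()}


# Two hypothetical urines with the same composition but a fourfold
# difference in overall concentration (illustrative values only).
dilute = {"peak_a": 100.0, "peak_b": 300.0}
concentrated = {"peak_a": 400.0, "peak_b": 1200.0}

norm_dilute = normalize_to_total(dilute)
norm_concentrated = normalize_to_total(concentrated)

print(norm_dilute)
print(norm_concentrated)
```

    After scaling, the two samples give identical relative profiles, so differences that remain between specimens are more likely to reflect composition than dilution.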

    Concluding Remarks and Recommendations

    Can metabolomic and proteomic studies lead to a cancer biomarker? In short, yes. The ultimate diagnostic biomarker for any disease is one that provides 100% sensitivity and specificity. It seems that this level of accuracy is more an ideal than an attainable goal for discovering biomarkers using metabolomics and proteomics. That concession does not mean that the search for biomarkers should be stopped; on the contrary, the search should be intensified because of the benefits of detecting cancer or any disease at an early stage. The failure to find sensitive and specific metabolic and proteomic biomarkers for cancer may be attributed to several factors: the small number of samples analyzed; lack of information on the history of the samples; case and control specimens that are not age and sex matched; limited metabolomic and proteomic coverage; and the lack of clear standard operating procedures for sample selection, collection, storage, handling, analysis, and data interpretation. Also, most studies to date used serum, plasma, urine, or tissue from cancer patients and controls. A sounder approach is to search for proteins in the cancer tissue first and then look for the discriminating proteins in the blood or urine, as was suggested by Zhang and Chan.⁵² Johann et al.⁵³ studied renal cell carcinoma tissue, adjacent normal tissue, and preoperative blood taken from the same patient. The proteomes extracted from the tissues and preoperative plasma were analyzed using 2D LC-MS. They identified proteins that were present in the tumor but not in the normal tissue. Also, discriminating proteins found in the tumor tissue were found in the preoperative plasma. In a recent study of kidney cancer, Ganti et al.⁵⁴ performed a simultaneous multiple matrix (tissue, blood, and urine) metabolomic analysis. The HPLC/MS and GC/MS analysis resulted in the identification of 267 metabolites in tissue, 246 in serum, and 267 in urine, of which 89 were common to the three matrices. The results also indicated that serum analysis is a more accurate proxy for tissue changes than urine.

    When all of the above-mentioned factors are resolved, we firmly believe that continued research will lead to sensitive and specific biomarkers for various cancers. It is only a matter of time and effort. Although a single discovered biomarker may not have 100% sensitivity and specificity, it is possible that a combination of biomarkers will minimize the number of false positives and false negatives in population screening. We believe that with further advancements in MS, separation technologies, NMR (specifically for metabolomics), and the use of reproducible and accurate analytical procedures, more clinically useful biomarkers will be discovered.

    References

    1. Bock C, Coleman M, Collins B, et al. Photoaptamer arrays applied to multiplexed proteomic analysis. Proteomics. 2004;4:609–618.

    2. MacNeil JS. Better biomarkers for the diagnostics labyrinth. Genome Technol. 2004;24–33.

    3. Apolo AB, Milowsky M, Bajorin DF. Clinical states model for biomarkers in bladder cancer. Future Oncol. 2009;5:977–992.

    4. Lintula S, Hotakainen K. Developing biomarkers for improved diagnosis and treatment outcome monitoring of bladder cancer. Expert Opin Biol Ther. 2010;10:1169–1180.

    5. Glas AS, Roos D, Deutekom M, et al. Tumor markers in the diagnosis of primary bladder cancer: A systematic review. J Urol. 2003;169:1975–1982.

    6. Villicana P, Whiting B, Goodison S, Rosser CJ. Urine-based assays for the detection of bladder cancer. Biomark Med. 2009;3:265.

    7. National Cancer Institute, NCI. Prostate-Specific Antigen (PSA) Test 2009; In: http://www.cancer.gov/cancertopics/factsheet/Detection/PSA; 2009.

    8. Thompson IM, Ankerst DP, Chi CA, et al. Operating characteristics of prostate-specific antigen in men with an initial PSA level of 3.0 ng/ml or lower. J Am Med Assoc. 2005;294:66–70.

    9. Ebeling FG, Stieber P, Untch M, et al. Serum CEA and CA 15-3 as prognostic factors in primary breast cancer. Br J Cancer. 2002;86:1217–1222.

    10. van Nagell Jr JR, DePriest PD, Reedy MB, et al. The efficacy of transvaginal sonographic screening in asymptomatic women at risk for ovarian cancer. Gynecol Oncol. 2000;77:350–356.

    11. Niloff JM, Knapp RC, Schaetzl E, et al. CA125 antigen levels in obstetric and gynecologic patients. Obstet Gynecol. 1984;64:703–707.

    12. Horstmann M, Patschan O, Hennenlotter J, et al. Combinations of urine-based tumor markers in bladder cancer surveillance. Scand J Urol Nephrol. 2009;43:461–466.

    13. PubMed search, April 2012.

    14. Hsieh SY, Chen RK, Pan YH, et al. Systematical evaluation of the effects of sample collection procedures on low-molecular-weight serum/plasma proteome profiling. Proteomics. 2006;6:3189–3198.

    15. Tammen H, Schulte I, Hess R, et al. Peptidomic analysis of human blood specimens: comparison between plasma specimens and serum by differential peptide display. Proteomics. 2005;5:3414–3422.

    16. Liu L, Aa J, Wang G, et al. Differences in metabolite profile between blood plasma and serum. Anal Biochem. 2010;406:105–112.

    17. Issaq HJ, Xiao Z, Veenstra TD. Serum and plasma proteomics. Chem Rev. 2007;107:3601–3620.

    18. Lawton KA, Berger A, Mitchell M, et al. Analysis of the adult human plasma metabolome. Pharmacogenomics. 2008;9:383–397.

    19. West-Nielsen M, Hogdall EV, Marchiori E, et al. Sample handling for mass spectrometric proteomic investigations of human sera. Anal Chem. 2005;77:5114–5123.

    20. Baumann S, Ceglarek U, Fiedler GM, et al. Standardized approach to proteome profiling of human serum based on magnetic bead separation and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Clin Chem. 2005;51:973–980.

    21. Banks RE, Stanley AJ, Cairns DA, et al. Influences of blood sample processing on low-molecular-weight proteome identified by surface-enhanced laser desorption/ionization mass spectrometry. Clin Chem. 2005;51:1637–1649.

    22. Zolg W. The proteomic search for diagnostic biomarkers: lost in translation? Mol Cell Proteomics. 2006;5:1720–1726.

    23. Tuck MK, Chan DW, Chia D, et al. Standard operating procedures for serum and plasma collection: early detection research network consensus statement standard operating procedure integration working group. J Proteome Res. 2009;8:113–117.

    24. http://www.fda.gov/cdrh/clia.

    25. The Plasma Proteome Institute. http://www.plasmaproteome.org.

    26. Zhou M, Lucas A, Chan KC, et al. An investigation into the human serum interactome. Electrophoresis. 2004;25:1289–1298.

    27. Xiao Z, Conrads TP, Lucas DA, et al. Direct ampholyte-free liquid-phase isoelectric peptide focusing: application to the human serum proteome. Electrophoresis. 2004;25:128–133.

    28. Chan KC, Lucas DA, Hise D, et al. Analysis of the human serum proteome. Clinical Proteomics. 2004;1:101–112.

    29. Tirumalai RS, Chan KC, Prieto DA, et al. Characterization of the low molecular weight human serum proteome. Mol Cell Proteomics. 2003;2:1096–1103.

    30. Anderson NL, Polanski M, Pieper R, et al. The human plasma proteome: a nonredundant list developed by combination of four separate sources. Mol Cell Proteomics. 2004;3:311–326.

    31. Buscher JM, Czernik D, Ewald JC, et al. Cross-platform comparison of methods for quantitative metabolomics of primary metabolism. Anal Chem. 2009;81:2135–2143.

    32. Liu H, Sadygov RG, Yates JR. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal Chem. 2004;76:4193–4201.

    33. Elias J, Haas W, Faherty BK, Gygi SP. Comparative evaluation of mass spectrometry platforms used in large-scale proteomics investigations. Nat Methods. 2005;2:667–675.

    34. Faca V, Pitteri J, Newcomb L, et al. Contribution of protein fractionation to depth of analysis of the serum and plasma proteomes. J Proteome Res. 2007;6:3558–3565.

    35. Gika HG, Theodoridis GA, Earll M, et al. Does the mass spectrometer define the marker? A comparison of global metabolite profiling data generated simultaneously via UPLC-MS on two different mass spectrometers. Anal Chem. 2010;82:8226–8234.

    36. Liu ZY, Phillips JB. Comprehensive two-dimensional gas chromatography using an on-column thermal modulator interface. J Chromatogr Sci. 1991;29:227–231.

    37. Wilson ID, Nicholson JK, Castro-Perez J, et al. High resolution ultra performance liquid chromatography coupled to oa-TOF mass spectrometry as a tool for differential metabolic pathway profiling in functional genomic studies. J Proteome Res. 2005;4:591–598.

    38. Kim JW, Lee G, Moon SM, et al. Metabolomic screening and star pattern recognition by urinary amino acid profile analysis from bladder cancer patients. Metabolomics. 2010;6:202–206.

    39. Issaq HJ. Role of Separation science in biomarker discovery: opportunities and pitfalls. Pittsburgh Conference on Analytical Chemistry and Applied Spectroscopy March 2011.

    40. Kim K, Aronov P, Zakharkin SO, et al. Urine metabolomics analysis for kidney cancer detection and biomarker discovery. Mol Cell Proteomics. 2009;8:558–570.

    41. Kind T, Tolstikov V, Fiehn O, Weiss RH. A comprehensive urinary metabolomic approach for identifying kidney cancer. Anal Biochem. 2007;363:185–195.

    42. Perroud B, Lee J, Valkova N, Dhirapong A, et al. Pathway analysis of kidney cancer using proteomics and metabolic profiling. Mol Cancer. 2006;5:64.

    43. Sreekumar A, Poisson LM, Rajendiran TM, et al. Metabolomic profiles delineate potential role for sarcosine in prostate cancer progression. Nature. 2009;457:910–914.

    44. Jentzmik F, Stephan C, Miller K, et al. Sarcosine in urine after digital rectal examination fails as a marker in prostate cancer detection and identification of aggressive tumours. European Urol. 2010;58:12–18.

    45. Nicholson JK, Lindon JC, Holmes E. Metabonomics: understanding the metabolic responses of living systems to pathophysiological stimuli via multivariate statistical analysis of biological NMR spectroscopic data. Xenobiotica. 1999;29:1181–1189.

    46. Holmes E, Antti H. Chemometric contributions to the evolution of metabonomics: mathematical solutions to characterizing and interpreting complex biological NMR spectra. Analyst. 2002;127:1549–1557.

    47. Keun H, Ebbels T, Antti H, et al. Improved analysis of multivariate data by variable stability scaling: application to NMR-based metabolic profiling. Anal Chim Acta. 2003;490:265–276.

    48. Vapnik V. Estimation of dependences based on empirical data. New York: Springer Verlag; 1982.

    49. Mahadevan S, Shah SL, Marrie TJ, Slupsky CM. Analysis of metabolomic data using support vector machines. Anal Chem. 2008;80:7562–7570.

    50. Van QN, Issaq HJ, Jiang Q, et al. J Proteome Res. 2008;7:630–639.

    51. Issaq HJ, Nativ O, Waybright T, et al. Detection of bladder cancer in human urine by metabolomic profiling using high performance liquid chromatography/mass spectrometry. J Urol. 2008;179:2422–2426.

    52. Zhang H, Chan DW. Cancer biomarker discovery in plasma using a tissue-targeted proteomic approach. Cancer Epidemiol Biomarkers Prev. 2007;16:1915–1917.

    53. Johann Jr DJ, Wei BR, Prieto DA, et al. Combined blood/tissue analysis for cancer biomarker discovery: application to renal cell carcinoma. Anal Chem. 2010;82:1584–1588.

    54. Ganti S, Taylor SL, Aboud OA, et al. Kidney tumor biomarkers revealed by simultaneous multiple matrix metabolomics analysis. Cancer Res.
