A Guide to Forensic DNA Profiling
Ebook, 1,345 pages, 13 hours

About this ebook

The increasingly arcane world of DNA profiling demands that those needing to understand at least some of it must find a source of reliable and understandable information.  Combining material from the successful Wiley Encyclopedia of Forensic Science with newly commissioned and updated material, the Editors have used their own extensive experience in criminal casework across the world to compile an informative guide that will provide knowledge and thought-provoking articles of interest to anyone involved or interested in the use of DNA in the forensic context.

Following extensive introductory chapters covering forensic DNA profiling and forensic genetics, this comprehensive volume presents a substantial breadth of material covering:

  • Fundamental material – including sources of DNA, validation, and accreditation
  • Analysis and interpretation – including extraction, quantification, amplification, and interpretation of electropherograms (epgs)
  • Evaluation – including mixtures, low template, and transfer
  • Applications – databases, paternity and kinship, mitochondrial DNA, wildlife DNA, single-nucleotide polymorphism, phenotyping, and familial searching
  • Court – report writing, discovery, cross-examination, and current controversies

With contributions from leading experts across the whole gamut of forensic science, this volume is intended to be authoritative but not authoritarian, informative but comprehensible, and comprehensive but concise. It will prove to be a valuable addition, and useful resource, for scientists, lawyers, teachers, criminologists, and judges.

Language: English
Publisher: Wiley
Release date: Mar 8, 2016
ISBN: 9781118751510

    Book preview

    A Guide to Forensic DNA Profiling - Allan Jamieson

    Contributors

    Foreword

    My contact with Professor Jamieson and Dr Bader (or Allan and Scott as I now know them and shall refer to them) began in the seminal trial of Sean Hoey (in relation to the Omagh Bombing) in Northern Ireland in 2007. This was the first serious challenge in the United Kingdom to the use of low copy number (LCN) DNA profiling, the first form of what has become more generally known as low template DNA profiling. Although Mr Hoey was primarily acquitted as a result of reservations surrounding the way in which key exhibits were seized, stored, and examined, the learned trial judge, the Honorable Mr Justice Weir, raised concerns in relation to the reliability of interpreting LCN DNA. Such concerns were no doubt born of the points we advanced on behalf of Mr Hoey, which were in turn born of the concerns of those from The Forensic Institute. From the outset, The Forensic Institute expressed the strongest of reservations as to the reliability of interpreting such minute amounts of DNA found on the relevant exhibits (none of which were from the Omagh incident as it happens).

    These concerns caused a seismic response in scientific and legal circles, which continues to this day. Toward the end of 2014, I had the pleasure of working with Allan and Scott in a murder trial at the Old Bailey. They were instructed on behalf of the defense to comment upon the reliability and interpretability of low template DNA recovered from a murder scene. In this recent case, part of the argument focused upon the reliability of the methods employed by the prosecution to quantify the probative value of such low amounts of DNA. One of the methods employed involved the use of software, which was said to overcome the complex nature of the results; another was the counting method. Following the cross-examination of one of the prosecution's lead forensic scientists the Crown withdrew the DNA evidence in the case.

    There is no doubt that for lawyers, DNA profiling can present a daunting challenge. This is not only in understanding the science involved, but also in knowing how best to present the results in a way that can be easily understood. I am indebted to Allan and Scott for guiding our legal teams through the morass of graphs, statistics, and terminology, enabling us to represent our clients properly on the most serious of allegations. I am optimistic that the clarity of their approach and the appreciation of the needs of the non-specialist will be reflected in the content of this book.

    I am also delighted to learn that this will be one of the few books that brings together the scientific and legal aspects of DNA profiling in such a comprehensive approach. That is not an easy task; but I know that the Editors had assistance from Professor Andre Moenssens of the Wiley Encyclopedia of Forensic Science where many of the articles in this book originate.

    Needless to say I write this having not read all of the articles in this work, but I am confident that if the skills I have taken advantage of in our casework are taken into the production of this work then it will provide a valuable resource for both lawyers and experts alike in the continuing quest to tackle the increasingly complex issues involved in forensic DNA profiling.

    A lawyer writing a preface for a book written by scientists? Progress indeed.

    Kieran Vaughan QC

    Garden Court Chambers

    Preface

    Forensic DNA profiling has revolutionized forensic science. However, from relatively simple beginnings using what would now be regarded as huge amounts of sample (e.g., bloodstain), not only has the underlying technology changed (i.e., RFLP to STRs and SNPs) but the complexity of the interpretation of the analytical results has increased in the quest to get more information from smaller, and more complex, samples.

    Most of these developments are published and debated in the scientific literature, although some are guarded for ostensibly commercial reasons, or sometimes, it seems, simply to avoid showing one's hand to the other side in an adversarial legal system. Much of the scientific and statistical debate remains active and there is no settled position. Indeed, it could be contended that in many of these arguments each side has a rational and reasoned position that simply differs from that of its opponents.

    This book does not seek to provide or claim to have the final answer on any of these, because for many issues there is none. In recognition of the state of flux within parts of the discipline we have not sought to provide only our view, or indeed the view of any author, as the final word and, therefore, no article can be taken to represent the view of anyone other than the authors of the article at the time of writing. Views in some articles may contradict views in others; that is a reflection of the state of the art and is common in science.

    Although some articles in this work were created specifically with this book in mind, the vast majority of articles are from the Wiley Encyclopedia of Forensic Science.¹ The consequence of this is that there is inevitably some duplication of information. However, because we intend that each article can stand alone, we consider that such duplication, as exists, simply adds to the utility of the book.

    Forensic science operates, by definition, within a legal context. This creates several problems in creating a volume like this one. Different jurisdictions may have different legal requirements of the expert, and even the experts may have local practices that differ from other localities nationally or internationally. Even within the United States and United Kingdom, depending on the level of court, there are widely differing expectations and standards for the admissibility of scientific evidence (e.g., Frye, Daubert, or none). We cannot expect to cover all of the variances and so the articles, other than where specifically addressing jurisdictional issues, should be taken as informing on the generality of practices.

    The dichotomy between legal and scientific standards is perhaps best illustrated in the NAS report of 2009:

    The bottom line is simple: In a number of forensic science disciplines, forensic science professionals have yet to establish either the validity of their approach or the accuracy of their conclusions, and the courts have been utterly ineffective in addressing this problem. For a variety of reasons—including the rules governing the admissibility of forensic evidence, the applicable standards governing appellate review of trial court decisions, the limitations of the adversary process, and the common lack of scientific expertise among judges and lawyers who must try to comprehend and evaluate forensic evidence—the legal system is ill-equipped to correct the problems of the forensic science community.

    For those reasons, and others (e.g., availability of other evidence), we would caution (as have others) against using any court decision as validation or invalidation of any scientific test. It is not unknown for different courts within the same jurisdiction to rule both ways on the same science; for example, the use of low template DNA in New York City.

    Thus, this volume sets out to provide a comprehensive introduction to the scientific, statistical, and legal issues within the context of forensic DNA profiling. The rate of development of the field is so great that almost any publication will be out of date within a very short time. However, the information provided here will provide a solid foundation from which future developments can be understood and evaluated.

    Allan Jamieson

    Scott Bader

    July 2015

    ¹ www.wileyonlinelibrary.com/ref/efs

    Glossary

    accreditation – recognition of procedural management at an institution
    allele – one of alternative forms of a genetic marker, component/DNA type
    amplification – increase in amount of sample DNA created by PCR process
    amylase – enzyme of saliva, and to lesser extent some other body fluids
    AP – Acid Phosphatase, detected by presumptive test for seminal fluid
    base pair – building block unit of DNA
    baseline – the experimental zero value on the x-axis of analytical results
    bin – part of the epg showing known allelic sizes
    body fluid – usually refers to any biological material from which DNA can be obtained
    buccal – derived from mouth cavity
    cell – smallest living structure of biological organism
    chromosome – structure containing DNA including many genes, inherited as a single unit from cell to cell and generation to generation
    coancestry coefficient – a measure of the relatedness of two people
    Daubert – legal standard for admissibility of expert evidence in some US states
    degraded DNA – partially destroyed DNA, usually indicated by lower or absent amounts of longer DNA components
    diploid – possessing two alleles at each locus
    drop-in – appearance of DNA component in a profile due to background contamination
    drop-out – disappearance of DNA component from a profile due to random sampling of low level quantity
    electrophoresis – movement of chemical through a matrix under the force of electrical field
    extraction – (in DNA casework) the removal of DNA from cells
    Frye – legal standard for admissibility of expert evidence in some US states
    genotype – genetic composition of an individual comprising both alleles at each/all loci
    haploid – possessing only one allele at each locus
    haplotype – genetic composition of an individual comprising one allele at each/all loci, linked together as an inherited group
    hemizygous – only one allele component present at a locus
    heterozygous – two alleles at one locus are different types
    homozygous – two alleles at one locus are the same type
    HWE – Hardy Weinberg Equilibrium, stable frequency of alleles
    ISO17025 – accreditation for the general requirements for the competence to carry out tests and/or calibrations, including sampling
    ladder (allelic) – quality control sample containing alleles of known size and run separately to other samples
    locus/loci (pl.) – specific location/entity of DNA (marker or gene) on a chromosome, area of DNA tested in profile
    low copy number (LCN) – very low amount (of DNA) in sample; specifically also the increased amplification cycle number used for PCR method
    low template – very low amount (of DNA) in sample
    micro – one millionth, 10⁻⁶
    milli – one thousandth, 10⁻³
    mitochondrion – intracellular structure containing mitochondrial DNA
    mixture – more than one contributor (DNA profiling)
    multiplex – chemistry analysing many loci
    mutation – alteration in genetic component
    nano – one thousand millionth, 10⁻⁹
    nucleus – intracellular structure containing nuclear DNA (used in standard DNA profiling)
    odds – number of favourable outcomes/number of unfavourable outcomes
    partial profile – one in which all of the components do not appear
    Phadebas – presumptive test for saliva, detects amylase activity
    phenotype – expressed/observed biological characteristic controlled by combination of alleles in genotype
    pico – one million millionth, 10⁻¹²
    polygenic – controlled by several genes
    polymerase – chemical that creates the amplification of DNA by PCR
    polymorphic – many forms
    population – in statistics, any set of items under study
    presumptive – suggestive, not definitive
    primer – chemical that binds to specific site (locus) of sample DNA to enable amplification in PCR
    probability – number of favourable outcomes/number of possible outcomes
    pull-up – artifact seen in another part of DNA profile due to presence of a DNA component in one part of the profile
    quantitation – measurement of the amount of a sample
    rfu – relative fluorescence unit, measurement of peak height in an electropherogram
    saliva – body fluid produced by salivary glands in mouth, containing salivary amylase
    semen – body fluid produced by male ejaculation, including seminal fluid and sperm cells
    seminal fluid – nutrient body fluid secreted by prostate gland of males for transmission of sperm cells in ejaculate
    sensitivity – (a) a measure of how small an amount of material a technique can detect (b) the effect on the signal or measurement of a change in an input; ability to detect and measure a sample
    specificity – ability to discriminate an individual component of a sample
    sperm – male sexual cell present in semen, produced by testes, carrying haplotype of individual
    stochastic – effect due to random variation caused by sampling of low level sample
    stochastic threshold – approximate level at which random sampling effects can be expected
    stutter – artifact seen in DNA profile as smaller peak adjacent to main peak of real DNA component
    validation – evidence of compliance/efficacy for a process being fit for purpose, with demonstration of capabilities and limits
    x-axis – the horizontal axis of a graph
    y-axis – the vertical axis of a graph

    Abbreviations and Acronyms

    Part A

    Background

    Chapter 1

    Introduction to Forensic Genetics

    Scott Bader

    The Forensic Institute, Glasgow, UK

    The Ideal Forensic Material—Individualization

    Forensic genetics has been touted as the gold standard of forensic analysis. This is because DNA fulfils many of the criteria that make the perfect forensic technology to establish a person's presence at a scene of crime.

    Most forensic disciplines concerned with offences against the person, and some other crimes, try to establish a link between items found at the scene and items found on or associated with a suspect. In other words, they try to establish whether the recovered items could have originated from the same source. This process can be summarized as

    1. Establishing a match

    2. Calculating the significance of the match

    The perfect conclusion of this exercise is to unequivocally establish that the material from the crime scene could only have come from exactly the same source as that found on or associated with the suspect and no other source. The goal of most forensic matching is to reduce the potential population from which an item could have come, to one individual within the population. This extreme is the definition of identification. The process that we are more interested in, because of its more common application, is that of individualization. Individualization is a population problem as it is necessary to be able to demonstrate how many people in a population may have the match characteristics discovered by the investigator. Therefore, modern scientific individualization techniques recognize that most, if not all, evidence is probabilistic, which is to say that we attempt to establish a probability or likelihood that two items had a common origin. The ideal forensic material must enable matching and probability calculations.

    There are other qualities that a forensically useful material should have. Ideally, the material should be

    1. Unique

    2. Not change over time (i.e., during normal use)

    3. Likely to be left at a scene in sufficient quantity to establish a match

    4. Not change after being left at the scene and during subsequent examination

    In this book, we shall see that DNA meets many, but not all, of these criteria and how the limitations are handled.

    So what makes DNA a good material forensically?

    DNA—The Molecule

    DNA is sometimes called the blueprint of life and has characteristics that are appropriate to its role. Many, if not all, of these characteristics are important in Forensic Genetics, which is simply genetics in a legal context. These characteristics include its simplicity and yet complexity, both of which are incorporated within the polymeric chemical structure of the backbone molecule and the varied sequence of sidechain bases (the so-called letters of its information content), arranged in a double helix (see DNA: An Overview). The molecule is made from a relatively small number of building blocks yet contains a vast amount and range of information that can define the nature of the biological cell, and ultimately the multicellular organism, within which the DNA is located. The double helix structure is relatively stable in time yet is adaptable enough to open up to allow a living cell to use the contained information to go about its life functions (transcription) or to make copies of itself (replication). DNA is stable so as to enable transfer of the genetic information from generation to generation after replication (with cell division and mating where relevant), yet it can also change to varying extents. Some of the changes are important to only an individual organism and may be deleterious (e.g., mutation giving rise to a cancer), or are the basis for individual variation (e.g., mutation giving rise to a new variant, and the haploid segregation of chromosomes in gametes with the return of diploid pairing at fertilization to produce a new individual). Some changes affect a subpopulation (e.g., lineages) and even eventually an entire population (e.g., natural selection of mutations and new diploid combinations leading to evolutionary change).

    The chapter on DNA describes some fundamental concepts about DNA and genetics. In summary, the genetic material of humans comprises about 3 billion nucleotides or building blocks, and is present in two copies per cell, so about 6 billion in total. This DNA is found within the nucleus of all cells other than red blood cells. In total it is called the genome, and it contains the genes that encode the proteins created by the cell to define the cell's type and characteristics and ultimately the entire organism of the human individual. It also contains other DNA sequences that are regulatory (i.e., affect the temporal or quantitative expression of the genes), structural (i.e., affect intracellular packaging and stability of DNA), or are as yet of unknown function or may even be foreign to a normal human cell (e.g., a viral infection). All of these elements are contained within 23 separate lengths of DNA, the chromosomes.

    The concept that DNA contains the information for biological life using a genetic code encoded within the sequence of bases along the double helix molecule means that if we as forensic scientists can read that code we can question and determine the source of a given sample of DNA. The general DNA structure and constituents are the same so that with the right analytical toolkits, we are able to answer that question. So, we could test not only whether the DNA is from a human, horse, cannabis plant, or soil microbe, but in theory identify the individual human. Scientists are able to take advantage of the adaptable stability of DNA and mimic the process of replication so as to make multiple copies of a DNA sample, using the polymerase chain reaction (PCR, see method). The amplified DNA is then processed and the data interpreted accordingly.

    DNA in Populations

    The first main concept to elaborate upon is that of Mendelian genetics (see Mendel mentioned in DNA). For a simple biological example, I will use the ABO blood group system. Here, there is a single gene involved that defines a person's blood group. The gene controls the production of a chemical on the surface of blood cells. The gene exists within the human population in one of three forms or variants: A, B, and O, and when referring to the gene, it is written italicized. The existence of variable forms within the population is called a polymorphism, and these genetic variants are known scientifically as alleles. They control the production of a protein that exists, respectively, as either protein variant A, variant B, or is not produced (i.e., absent) and thus called O (for null).

    In any individual, the genes that encode everything that eventually produces a human being are present in two copies (not including the X and Y chromosomes), one inherited from mother and one inherited from father. It is the combination of the two copies of all the genes that will determine the final characteristics of the individual. So, while there might be just the one gene for the blood cell protein described above, there will be two copies of the gene in each person. All of the possible genetic combinations seen in different individuals are therefore AA, BB, OO, AB, AO, and BO. Where the variants are the same, the person is called homozygous; where they are different, the person is called heterozygous. Going back to the description of the proteins that would be produced from the genetic variants, they are as follows in the table:

    Genotype    Protein(s) produced    Phenotype (blood group)
    AA          A                      A
    AO          A                      A
    BB          B                      B
    BO          B                      B
    AB          A and B                AB
    OO          none                   O

    The combinations of gene variants possible in any human are called the genotypes (first column) and the final observed biological characteristics (in this instance the blood group) are called the phenotypes. By way of illustrating the difference, the genotypes AO and AA both have the phenotype A because only A is actually observed in blood group testing; the O is silent. When both variants in an individual are the same, the genotype is called homozygous, and when they are different, heterozygous.

    In this example, the gene variants A and B are what is termed codominant, in that they are both observable in the phenotype. The O gene variant is termed recessive, in that when it is present with something else, the something else takes precedence and is the only characteristic observable. Here, the recessiveness is simply because O produces nothing, whereas A and B produce A and B proteins. Note that while we can determine the blood group phenotype of a person from his or her genotype, going in the other direction is not so simple. So, a person of blood group B, for example, may be either BB or BO genotype. Knowing the frequency of the different variants present in the population allows us to predict the percentage chance that a blood group B person has either BB or BO genotype.
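
    The genotype-to-phenotype rules and the frequency argument above can be sketched in Python. This is a minimal illustration only: it assumes Hardy-Weinberg proportions (see HWE in the Glossary), the function names are illustrative, and the allele frequencies are hypothetical round numbers chosen for the example rather than measured population data.

```python
def abo_phenotype(genotype):
    """Return the ABO blood group for a two-letter genotype such as 'AO'."""
    alleles = set(genotype.upper())
    if not alleles <= {"A", "B", "O"} or len(genotype) != 2:
        raise ValueError(f"not an ABO genotype: {genotype!r}")
    expressed = alleles - {"O"}        # O is silent (recessive)
    if not expressed:
        return "O"                     # OO: no protein produced
    return "".join(sorted(expressed))  # 'A', 'B', or 'AB' (codominant)


def prob_bb_given_group_b(freq_b, freq_o):
    """P(genotype BB | blood group B), assuming Hardy-Weinberg proportions."""
    p_bb = freq_b ** 2                 # homozygous BB
    p_bo = 2 * freq_b * freq_o         # heterozygous BO
    return p_bb / (p_bb + p_bo)


# Hypothetical allele frequencies, for illustration only.
FREQ_B, FREQ_O = 0.1, 0.6

for g in ("AA", "AO", "BB", "BO", "AB", "OO"):
    print(g, "->", abo_phenotype(g))
print(f"P(BB | group B) = {prob_bb_given_group_b(FREQ_B, FREQ_O):.3f}")
```

    With these assumed frequencies most group B people would be BO rather than BB, which is why the phenotype alone does not reveal the genotype.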

    As we now analyze at the DNA level, we do not need to study actual biological traits (although these are under development also, see Phenotype) like the ABO blood group, and can analyze highly polymorphic nongene sequences (see DNA; DNA: An Overview). This has the advantage of comprising a much larger proportion of the genome from which to select for analysis, and showing much greater variability. This greater variability leads to greater individualization such that we can identify a sample as coming from a very small handful of people if not an individual, as opposed to the large sections of the population using previous technology. So, modern DNA profiling kits use multiplex PCR methods to analyze several STR areas (called loci) at the same time. In the United Kingdom, in recent years until 2014, there was a standard kit called AmpFlSTR SGM Plus used to analyze 10 loci as well as the sex-determining region, but this has now been superseded by use of one of several kits (collectively called DNA17) covering the European standard 17 loci. The United States uses a set of 13 loci for criminal justice purposes, the Combined DNA Index System (CODIS), marketed and tested in two main commercial kits (AmpFlSTR Identifiler and PowerPlex 16). For paternity or kinship testing purposes (see Missing Persons and Paternity: DNA), data from these loci are supplemented by other kits (STR or single nucleotide polymorphisms (SNPs)) covering even more than the core sets used in criminal justice systems.

    The Scientific Expert

    This is a time of increasing stringency in the requirements of experts to establish the reliability of the techniques that they use as well as their authority in the use of those. Many systems have been and are being developed that aim to assist the court in assessing such claims. However, most of these have been by a group of experts in the same field forming themselves into some sort of group and deciding for themselves whether it is expertise, and who they will register or endorse. External validation should be a feature of any such system.

    To claim a scientific basis for an expertise, the expert should be able to demonstrate for the technique(s) that they use, as a minimum (see Laboratory Accreditation; Validation):

    1. Reliability studies of analytical technique (validation). The scientific approach to this problem is normally to put a sample through the system to see how often the sample produces the same result. This is the reproducibility of the technique; its ability to provide consistent results when applied to the same samples.

    2. Establishment of false positive and false negative rates.

    Even if the technique is perfect, in many systems, there is not a clear difference in the measurement when it is used to separate two or more groups. A false positive occurs when something that does not have the characteristic being sought is classified as having it, whereas a false negative occurs when something that should be placed in a particular class is not. Generally, systems are designed to minimize the number of false positives and false negatives. However, in many systems, the two are inextricably linked and as one attempts to minimize one error, the other increases.

    3. Defined match criteria.

    This requires a specification for the degree of similarity that two items must have before we declare a match. The specificity or discriminating power of a technique is the ability of an analytical procedure to distinguish between two items of different origin.

    4. Probability of any match being a true match.

    Given that all matches are probable matches, this leads to the further requirement to know the probability that the match is to that item or material and no other.
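
    The link between false positives and false negatives described in point 2 can be shown with a short Python sketch. The measurements below are hypothetical and the single decision threshold is an assumption made for illustration; real techniques may use quite different decision rules.

```python
def error_rates(negatives, positives, threshold):
    """Classify a measurement as positive when it is >= threshold.

    Returns (false_positive_rate, false_negative_rate).
    """
    false_pos = sum(score >= threshold for score in negatives)
    false_neg = sum(score < threshold for score in positives)
    return false_pos / len(negatives), false_neg / len(positives)


# Hypothetical measurements for items known NOT to have the characteristic
# and for items known to have it. The two groups overlap.
NEGATIVES = [1, 2, 3, 4, 5, 6]
POSITIVES = [4, 5, 6, 7, 8, 9]

for threshold in (3, 5, 7):
    fpr, fnr = error_rates(NEGATIVES, POSITIVES, threshold)
    print(f"threshold={threshold}: "
          f"false positive rate={fpr:.2f}, false negative rate={fnr:.2f}")
```

    Raising the threshold removes false positives only at the cost of more false negatives, because the two groups of measurements overlap.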

    Forensic DNA

    DNA is present in many types of biological substance (see Biological Stains; Sources of DNA) that can be analyzed by nuclear DNA profiling methods (see DNA: An Overview). These substances include body fluids such as blood or semen that can be seen by the naked eye if in sufficient quantity, as well as invisible amounts of the same fluids or of other substances such as skin cells in sweat or fingermarks (see Sources of DNA).

    The chemical stability of DNA is useful for forensic genetics because it means that the DNA of a biological sample may be analyzed long after it was deposited at a crime scene. This has been very useful, for example, in the re-examination of evidence in cold cases long after storage and with the advent of new analytical methods. If preserved under the right conditions and using the appropriate methods, DNA has even been studied from ancient samples such as Egyptian pharaohs and woolly mammoths. DNA is not stable forever, and again depending on circumstances after deposition at a crime scene and following collection and handling by forensic scientists, it can degrade and affect the ability to get useable results (see Degraded Samples).

    Wherever possible, the nature of a biological deposit is identified or at least suggested so as to assist the evaluation of the significance of a DNA profile once it has been produced. When a body fluid suspected to be blood, semen, or saliva is present in sufficient amount so as to be visible to the naked eye, the location of the substance to be tested is clear and sampling can proceed. After the stain is located, it is usually subjected to a chemical test to try to determine what sort of stain it may be. These tests usually only suggest, rather than confirm, the presence of a biological type, requiring at least one other test to be done to confirm the result. For example, microscopic examination of a sample suspected, from acid phosphatase analysis, to be semen confirms that the sample is or contains semen if it finds sperm cells. Often these confirmatory tests are not done, to save on time and expense, and the evidential weight of a sample is left as "probably an X stain using the relevant presumptive test, with a DNA profile matching Mr Y." When there is little or nothing visible of a biological stain, a search for the possible location of substances must be made.

    The recovered biological substance is then chemically processed so as to release whatever DNA is present (see Extraction). The procedure of extraction must be able to extract the DNA with minimal loss of sample due to the usually small sample collected, and can be done manually or automated. The extracted and purified DNA is then neither in sufficient quantity nor in a form so as to be studied directly, hence the succeeding stages of quantitation and amplification (see Quantitation; Amplification). The quantitation step determines how much DNA is present in the sample so that an appropriate amount can be used in the amplification step: sufficient to be analyzed and not so much as to overload the system. The amplification step involves a method called PCR, and makes multiple copies of the relevant areas of DNA being profiled while marking them chemically with markers or labels to enable detection by machine.
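
    The arithmetic behind the quantitation and amplification steps can be sketched in Python. This is an idealized illustration: real PCR does not double perfectly at every cycle, and the concentration, target amount, volume cap, and cycle number used here are hypothetical values, not those of any particular kit or laboratory.

```python
def ideal_copies(start_copies, cycles):
    """Copies after PCR if every cycle doubled every target (an upper bound)."""
    return start_copies * 2 ** cycles


def extract_volume_ul(concentration_ng_per_ul, target_ng, max_volume_ul):
    """Volume of extract needed to reach target_ng, capped at max_volume_ul."""
    if concentration_ng_per_ul <= 0:
        return max_volume_ul          # nothing measurable: use the maximum
    return min(target_ng / concentration_ng_per_ul, max_volume_ul)


# Hypothetical values, for illustration only.
print(ideal_copies(10, 28))                # 10 starting copies, 28 cycles
print(extract_volume_ul(0.25, 1.0, 10.0))  # 0.25 ng/uL extract, 1 ng target
print(extract_volume_ul(0.05, 1.0, 10.0))  # dilute extract: capped volume
```

    The cap in the last call shows why a very dilute extract may supply less DNA than the intended amount, which is the situation that low template methods are meant to address.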

    The amplified and labeled DNA fragments are at this point in a liquid mixture that must be separated to enable detection and measurement. It is therefore forced by an electrical voltage through a tube containing a molecular sieve or gel, a process called electrophoresis. This process separates the pieces of DNA according to size (length) such that smaller pieces pass more quickly through the gel than large ones. As the pieces pass a specific point in the tube, they are illuminated by light of a defined wavelength. This so-called incident or excitatory light causes the chemical labels added to the sample DNA during amplification to release light of a different wavelength, fluorescence. Different areas of DNA profiled have one of a small number of chemical labels so that depending on which fluorescent light is released at what time during electrophoresis, it is possible to know what area of DNA is being detected. The amount of fluorescence detected is used as a measure of the amount of DNA passing through and thus representative of the amount of DNA in the original forensic sample. The information is captured as electronic data and analyzed using software to produce a profile (see Introduction to Forensic DNA Profiling – The Electropherogram (epg)).

    Setting aside some of the difficulties in assessing whether a DNA component is truly present or not, a DNA profile can be matched to another DNA profile with 100% accuracy (interpretation), and with a precision dependent on the number of loci used in the match. The match criteria are defined; the numbers must match exactly.

    Another parameter that may be useful to know when assessing an identification system is the sensitivity. In this context, this means how little material from the individual needs to be available to enable an identification. The technology enables the profiling of the DNA from a single cell, but in forensic genetics, the presence of mixtures and degraded samples can render that ability a vice rather than a virtue (see Interpretation of Mixtures; Graphical, DNA Mixture Interpretation and Degraded Samples).

    Many matches today are enabled by the creation of DNA databases that store the DNA profiles of people selected by a process depending on the jurisdictional rules (see Databases). However, there are a number of scientific and social issues that arise from the use of these databases (see DNA Databases – The Significance of Unique Hits and the Database Controversy; DNA Databases and Evidentiary Issues). Other emerging technologies have also caused debate (see Phenotype; Familial Searching).

    Having established the degree to which a match exists between the crime material and the suspect, the forensic scientist must now evaluate the significance of the match (see Identification and Individualization; Communicating Probabilistic Forensic Evidence in Court). The starting point for all such calculations includes the frequency of the particular components within the population. Those frequencies for the DNA components used in forensic DNA profiling have been measured.

    The use to which those frequencies are put varies with the type of profile obtained and the choice of method for calculating the significance of the evidence. This is the increasingly arcane topic of statistics, which has become so complex that some have introduced probabilistic genotyping software because the calculations are claimed to be too complex to be undertaken manually (see Issues in Forensic DNA).

    An important concept underlying the use of frequencies in forensic work is called the Hardy–Weinberg equilibrium (HWE). The HWE is a state in which the allelic frequencies do not change. Its forensic significance is that in the process of calculating the expected frequencies of genotypes in the population, this equilibrium, or steady state, is assumed as a starting basis.

    Given that any person will normally have two alleles at each locus (although both may be the same), the allelic frequencies can be used to calculate genotype frequencies (paired combinations of alleles) by multiplication.

    In a simple example, if there are only two alleles (A or B) possible at a locus, then there are only three types of individuals: AA, BB, and AB. If the frequency of allele A is p, and the frequency of allele B is q, it is possible to calculate the frequency of each type of person in the population assuming the Hardy–Weinberg rules.

    These rules are

    1. No selection

    2. No mutation

    3. No migration

    4. Random mating

    5. An infinitely large population

    If these rules are not met, the population is not in equilibrium and the allele frequencies will be changing. There is some debate on how to accommodate this, and other possibilities, into calculating the statistics of the random match probability (RMP) for a profile (see DNA Mixture Interpretation). Most commonly a correction factor is incorporated into the calculation for genotype frequencies to allow for population substructures.
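
    The text does not say which correction factor is meant. As a sketch only, the following uses one widely cited form, usually attributed to Balding and Nichols, in which a substructure parameter theta adjusts the Hardy–Weinberg expectations (p² for a homozygote and 2pq for a heterozygote). The function name and the example values are assumptions made for illustration.

```python
def corrected_genotype_frequency(p, q=None, theta=0.0):
    """Genotype frequency with a population-substructure correction.

    p, q  : allele frequencies; q=None means a homozygote for the allele
            with frequency p.
    theta : substructure parameter (assumed value; 0 gives plain HWE).
    This is the Balding-Nichols form, chosen here as an assumption because
    the source text does not name a specific correction.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if q is None:
        return (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p) / denom
    return 2 * (theta + (1 - theta) * p) * (theta + (1 - theta) * q) / denom


# Hypothetical allele frequencies, with and without the correction.
print(corrected_genotype_frequency(0.1))              # plain HWE homozygote, p squared
print(corrected_genotype_frequency(0.1, theta=0.03))  # corrected homozygote
print(corrected_genotype_frequency(0.1, 0.2, theta=0.03))  # corrected heterozygote
```

    With theta set to zero the function returns the uncorrected Hardy–Weinberg values; a positive theta raises the frequency assigned to these genotypes, which makes the reported match probability more conservative.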

    One of the HWE rules is random mating. The alleles exist in pairs within men and women, but the pairs separate in the formation of sperm and egg so that, in effect, the next generation is formed by drawing two alleles together at random, one from the male and one from the female. If p and q are the frequencies of alleles A and B, respectively (and A and B are the only two alleles at that locus, so that p + q = 1), then simple algebra gives the result of one round of random mating in a population where the parents have the frequencies p and q for A and B:

    (p + q)² = p² + 2pq + q²

    This gives the frequency of AA children as p², AB children as 2pq, and BB children as q².

    A table can show the same thing perhaps more easily:

                        Sperm A (p)    Sperm B (q)
        Egg A (p)       AA (p²)        AB (pq)
        Egg B (q)       AB (pq)        BB (q²)

    Thus, an allelic frequency database can be used to calculate an RMP for an individual: the probability that a person picked at random from a population of unrelated individuals would have the same genotype (profile).
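
    The calculation can be sketched in a few lines. The locus names, allele labels, and frequencies below are hypothetical, and the sketch assumes both Hardy–Weinberg equilibrium within each locus and independence between loci, so that the per-locus genotype frequencies can simply be multiplied together. No substructure correction is applied.

```python
def hwe_genotype_frequency(freqs, allele_1, allele_2):
    """Expected genotype frequency at one locus under Hardy-Weinberg."""
    p, q = freqs[allele_1], freqs[allele_2]
    # Homozygote: p squared; heterozygote: 2pq.
    return p * p if allele_1 == allele_2 else 2 * p * q


def random_match_probability(profile, allele_freqs):
    """Multiply per-locus genotype frequencies (assumes independent loci)."""
    rmp = 1.0
    for locus, (allele_1, allele_2) in profile.items():
        rmp *= hwe_genotype_frequency(allele_freqs[locus], allele_1, allele_2)
    return rmp


# Hypothetical allele frequency database and profile, for illustration only.
allele_freqs = {
    "LOCUS_1": {"A": 0.1, "B": 0.2},
    "LOCUS_2": {"C": 0.3},
}
profile = {"LOCUS_1": ("A", "B"), "LOCUS_2": ("C", "C")}

print(random_match_probability(profile, allele_freqs))
```

    For this made-up two-locus profile the result is about 0.0036 (0.04 for the heterozygous locus multiplied by 0.09 for the homozygous one); each further locus multiplies the figure down again.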

    Most professional codes of practice for forensic scientists demand that the scientist is an impartial participant in the legal process. Unfortunately, while science may be impartial in a very restricted sense, it would appear from a considerable body of research that scientists are not. There is an increasing awareness within the forensic community that biases are not only possible but also probable. We merely introduce the topic here (see Observer Effects).

    Similarly, although routine DNA profiling of human DNA dominates the perception of forensic DNA, there are other techniques using different sources of DNA (see Mitochondrial DNA: Profiling; Wildlife Crime; Microbial Forensics) and different technologies (see Single Nucleotide Polymorphism) in addition to the more extreme application of the routine methods (see Issues in Forensic DNA).

    The legal process may be terminated before a case gets to court, but the forensic scientist is frequently required to provide evidence for use in court. The penultimate chapters are devoted to the use of DNA profiling in court.

    Finally, we consider the current debates arising from DNA profiling and consider where the future may lie for this technology that has revolutionized crime investigation and arguably the entire field of forensic science.

    Related Articles in EFS Online

    Short Tandem Repeats: Interpretation

    Chapter 2

    DNA: An Overview

    Eleanor Alison May Graham

    Northumbria University, Newcastle upon Tyne, UK

    History of DNA Profiling

    DNA profiling, as we know it today, was developed thanks to two independent breakthroughs in molecular biology that occurred at the same time on different sides of the Atlantic. In the United States, the polymerase chain reaction (PCR) was developed by Kary Mullis of Cetus Corporation [1–3]. Almost simultaneously, the individual-specific banding patterns observed after restriction fragment-length polymorphism (RFLP) analysis of repeated DNA sequences were discovered by Professor Sir Alec Jeffreys at the University of Leicester [4–6]. In its earliest incarnation, this technique, termed DNA fingerprinting by its creators, was performed by restriction of 0.5–10 µg of extracted DNA using the restriction enzyme HinfI, followed by Southern blot hybridization with probes termed 33.5, 33.6, and 33.15, designed to bind to multiple minisatellites present in the restricted DNA [6]. This multilocus probing (MLP) technique would result in the binding of probes to multiple independent DNA fragments at the same time, giving rise to the traditional bar code pattern that is often pictured when discussing DNA profiling, even today. Differences in the number of times the probe sequence is repeated in each DNA fragment form the basis of the individual patterns observed on the autoradiogram image.

    The Mendelian inheritance of these markers was established by the pedigree analysis of 54 related persons [4], and the individual nature of the banding pattern was further established by the examination of 20 unrelated persons [6]. The probability of two unrelated individuals carrying the same fingerprint was calculated from these data and was estimated as 3 × 10⁻¹¹ for probe 33.15 alone, provided 15 bands could be resolved in the 4–20-kb size range on the autoradiogram. The potential application of this technique to maternity/paternity disputes and to forensic investigation was recognized immediately by Jeffreys et al., and was demonstrated by DNA fingerprinting of forensic-type samples, such as bloodstains and semen, in the same year [5]. Difficulties in interpretation of MLP images quickly resulted in single locus probes (SLP) with variable number of tandem repeat (VNTR) loci becoming the markers of choice for DNA profiling.
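
    The order of magnitude of that estimate can be reproduced with a simple sketch. It assumes that each band has the same chance of also appearing in an unrelated person and that bands are shared independently; the per-band value of 0.2 used below is an assumption made for illustration and is not stated in this chapter.

```python
def chance_identical_fingerprint(band_sharing, n_bands):
    """Probability that all n_bands of one pattern appear in an unrelated person.

    band_sharing is the assumed per-band probability of a shared band;
    bands are treated as independent, which is a simplification.
    """
    return band_sharing ** n_bands


# Assumed per-band sharing probability of 0.2 over 15 resolved bands.
print(chance_identical_fingerprint(0.2, 15))
```

    Under these assumptions the result is a little over 3 × 10⁻¹¹, in line with the figure quoted for 15 resolved bands.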

    The first report concerning the use of DNA profiling in a criminal investigation was published in 1987 [7]. This investigation used two unpublished SLPs to link semen stain samples collected from two rape and murder cases that had occurred 3 years apart in 1983 and 1986 in Leicestershire, United Kingdom. The probability of this match occurring by chance was calculated as 5.8 × 10⁻⁸. This result not only linked the two crimes but also exonerated an innocent man implicated in the murders and led to the first mass screening project undertaken for DNA profiling in the world [8].

    The potential of DNA analysis for forensic science had now been demonstrated, but before it could progress the technology required statistical validation by analysis of population frequencies and application to casework samples. Early evaluation studies on MLP 33.15 provided optimistic support for the use of DNA for personal identification and for the identification of male rapists from a mixed male/female sample [9]. These studies did, however, also begin to uncover the limitations of the method. A mean success rate of only 62% was observed for the DNA fingerprinting of donated vaginal swabs, no typing was possible for blood or semen stains that had been stored for 4 years at room temperature, and difficulty in directly comparing related samples run on different gels was cited as a potential problem [9]. Similar studies and European collaborations were undertaken on SLPs such as YNH24 and MS43a [10, 11]. Difficulties were again observed when interpreting gel images, with only 77.9% of 70 samples distributed between nine laboratories producing matching results when a 2.8% window for size differences between gel runs and laboratories was used [10]. It was recognized that subtle differences between laboratory protocols were responsible for some of the observed discrepancies, leading to a requirement for the standardization of laboratory methodology [10] and DNA profile interpretation [12, 13].

    Such standardization could improve the reproducibility of DNA typing results for MLP and SLP marker systems, but to be applicable to forensic investigation, DNA systems must be robust and must work on samples that are less than pristine or that consist of only a few cells. PCR was first applied to forensic DNA profiling for the investigation of the HLA-DQ-α1 gene, a polymorphic gene that encodes a human leukocyte antigen cell surface protein located in the major histocompatibility complex (MHC) class II region on chromosome 6 [14].

    Two major breakthroughs occurred between the late 1980s and early 1990s that would form the basis of the DNA profiling techniques recognized today. An alternative class of DNA marker, the microsatellite or short tandem repeat (STR) marker, was described by Weber et al. [15], and an alternative method for DNA visualization, based on PCR amplification and fluorescent labeling of VNTR markers, was also introduced [16, 17].

    STR Analysis

    STR markers are similar to the VNTR markers that were originally identified and utilized in DNA fingerprinting and SLP profiling. The difference between the classes of DNA marker lies in the length of the tandemly repeated DNA sequence. VNTRs contain 10–33-bp hypervariable repeat motifs [4] that must be observed over a size range of 4–20 kb for MLPs [9] and 1–14 kb for SLP markers [10]. An STR marker is composed of 1–6-bp repeat motifs [18], making the region of DNA that must be scrutinized very short (<1 kb). This length reduction immediately addresses one of the problems encountered in SLP profiling: difficulty in analyzing degraded DNA [5]. The use of multiplex PCR to amplify target sequences before visualization significantly reduces the amount of DNA required for analysis from microgram to nanogram amounts [18, 19]. The detection and visualization method of polyacrylamide gel electrophoresis and fluorescent detection using an automated DNA sequencer (model 370, Applied Biosystems, Foster City, CA, USA), in combination with an internal size standard (GS2500, Applied Biosystems) and GENESCAN 672 software (Applied Biosystems), also allowed for precise band sizing, resolving the intralaboratory and interlaboratory allele designation discrepancies that had been observed in SLP analysis [12, 20, 21].
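
    The idea behind sizing against an internal standard can be sketched as follows. The sketch uses plain linear interpolation between the two standard fragments that flank the unknown; this is a deliberate simplification, since sizing software typically fits the standard with other methods (the Local Southern method, for example). The function name and the standard values are hypothetical.

```python
from bisect import bisect_right


def estimate_size_bp(migration, standard):
    """Estimate fragment size by interpolating within an internal size standard.

    standard is a list of (migration_time, size_bp) pairs for the standard
    fragments run in the same lane, sorted by strictly increasing
    migration time. Linear interpolation is an assumed simplification.
    """
    times = [t for t, _ in standard]
    if not times[0] <= migration <= times[-1]:
        raise ValueError("fragment lies outside the size standard range")
    # Index of the first standard fragment that migrates later than the unknown.
    i = min(bisect_right(times, migration), len(standard) - 1)
    (t0, s0), (t1, s1) = standard[i - 1], standard[i]
    return s0 + (s1 - s0) * (migration - t0) / (t1 - t0)


# Hypothetical standard: migration time (arbitrary units) and known size in bp.
size_standard = [(100.0, 100), (150.0, 200), (200.0, 300)]
print(estimate_size_bp(125.0, size_standard))
```

    Because the standard runs in the same lane as the sample, lane-to-lane and gel-to-gel differences in migration affect both equally, which is what makes the size estimates comparable between runs and laboratories.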

    One of the earliest multiplexed STR systems developed for forensic DNA analysis was a quadruplex reaction that amplified the STR markers HUMVWFA31/A, HUMTHO1, HUMF13A1, and HUMFES/FPS [22]. These particular STR markers were selected from the hundreds of STRs identified throughout the human genome [23] based on a number of important parameters. Each STR must have a high level of allelic variability to maximize the discriminating power of each marker. Markers should have short PCR product length (<500 bp) to aid the analysis of degraded DNA. The chromosomal location of any potential marker should be checked to avoid closely linked loci, and tetranucleotide repeat motifs are preferred due to low artifact production during PCR amplification [24]. The overall match probability using this system was calculated as 1.3 × 10⁻⁴ for white Caucasian populations [18]. Validation studies carried out on this quadruplex system determined that 1 ng template DNA was optimal for amplification and analysis, that stutter bands of up to 11% were observed at some loci, especially HUMVWFA31/A, and that DNA mixtures in a ratio of 1 : 2 or 2 : 1 could be distinguished using this system [22]. Further validation studies were carried out on extremely compromised casework samples collected from the victims of the Waco siege, resulting in the identification of several individuals who could not be identified by any other means [25, 26].

    The quadruplex system described above was next developed into a septaplex as the application of DNA profiling to forensic casework increased. The new septaplex system coamplified the tetranucleotide STR loci HUMVWFA31/A (vWA), HUMTHO1 (THO1), D8S1179, HUMFIBRA (FGA), D21S11, and D18S51 [27]. The nomenclature used for naming these STR markers had by now been standardized to allow for easy comparison between different working groups [28]. This system also included primers for the amplification of a region of the amelogenin gene, which could be used to deduce the sex of the donor of the DNA sample being analyzed [29]. Optimization and validation studies on this septaplex system reduced the amount of template required for the generation of a full DNA profile to just 500 pg, with partial profiles being generated from as little as 50 pg, the equivalent of just 10 diploid cells [27]. This septaplex became known as the second-generation multiplex (SGM) system [30] and was used to populate the first national criminal intelligence DNA database, which became operational in the United Kingdom in April 1995 [27]. The SGM system was human specific and highly discriminating, with a probability of chance association calculated as 1 × 10⁻⁸ [31]. It could also be applied to degraded DNA samples and was capable of detecting and resolving mixed DNA profiles at ratios between 1 : 10 and 10 : 1 [30]. One final evolution would take place: the inclusion of a further four STR markers (D3S1358, D19S433, D16S539, and D2S1338) to produce the STR profiling kit that is currently used in the United Kingdom for forensic DNA profiling, population of the national DNA database (NDNAD), and paternity testing.

    The AmpFlSTR® SGM Plus™ System

    The AmpFlSTR® SGM Plus™ system, commercially produced by Applied Biosystems (ABI), a division of Perkin Elmer, Foster City, California, USA, was introduced in June 1999, and was validated for use in forensic casework in 2000 [32]. It was designed to replace the SGM system for forensic casework in the United Kingdom and to decrease the probability of a chance match from 1 × 10⁻⁸ to one in trillions for unrelated individuals [32]. This greatly increased the statistical power of DNA evidence presented for scrutiny in the courtroom, while remaining back-compatible with DNA profiles already stored in the NDNAD. To maintain back-compatibility, partial DNA profiles must contain data at a minimum of four of the original SGM loci to be considered for uploading to the NDNAD. The statistical power of this new system was deemed so great that instead of calculating the exact match probability for a full DNA profile, it was recommended that an arbitrary conservative estimate of one in a billion be reported for the match probability between unrelated individuals, one in a million for parent/child relations, and 1 in 10 000 for siblings [33]. The characteristics of each STR marker are detailed in Table 1.

    Table 1 Characteristics of SGM Plus STR lociᵃ

    a Adapted from AmpFlSTR SGM Plus PCR Amplification kit manual and [43]

    b Loci included in the original SGM kit

    The AmpFlSTR® SGM Plus™ PCR amplification kit can be analyzed by two DNA separation methods: capillary electrophoresis or polyacrylamide gel electrophoresis. In this article, the analysis was performed by polyacrylamide gel electrophoresis using an ABI Prism™ 377XL DNA Sequencer (ABI). The ABI Prism™ 377XL DNA Sequencer was introduced by Applied Biosystems in 1995 and was discontinued in 2001 [44]. Automated fragment sizing of fluorescently labeled DNA fragments was achieved by the use of a scanning argon ion laser, which tracks back and forth across a read-region at the lower end of a vertical polyacrylamide gel. As each labeled DNA fragment passes the laser, the fluorescent dye is excited, resulting in emission of light. This light is then collected and separated by wavelength onto a charge-coupled device (CCD) camera. The camera used on the 377 model is capable of detecting four different wavelengths simultaneously, allowing for the detection of three similarly sized PCR products in a single gel run, with inclusion of a separately colored size standard in each lane. The 377XL model was validated for forensic STR analysis in 1996, using the original SGM septaplex system [45]. The validation experiments determined that complete resolution of 1 bp differences between fragments could be achieved up to 350 bp, that sizing precision was increased twofold, and that sensitivity was increased by one-third compared to the preceding 373A DNA sequencer [45].

    Gel electrophoresis has now largely, if not completely, been superseded by the use of capillary electrophoresis in commercial forensic laboratories and many research institutes around the world. The market leader for provision of capillary electrophoresis equipment, Applied Biosystems, now part of Life Technologies, has produced several instruments that have been adopted for forensic use [46]. The current instrument of choice for many laboratories is the 16-capillary 3130XL Genetic Analyser, but this may itself soon be replaced by the 3500/3500XL with 8/24-capillary capacity. The 3500 series of instruments has been designed specifically for the forensic market and includes the addition of radio frequency identification (RFID) tags for more efficient consumable monitoring, among other features. Although not currently operational in commercial forensic laboratories, lab-on-a-chip technologies are being rapidly developed to allow for miniaturization of capillary electrophoresis devices [47], which, in the near future, may allow rapid at-scene DNA profiling to become a reality.

    European Standard Set (ESS) of STRs

    Although STRs are used as the DNA marker of choice for forensic DNA analysis and population of NDNADs throughout the world, these databases are not all populated with equivalent data due to the adoption of different STR marker sets in different countries. To increase the discriminatory power of each STR system and decrease the probability of adventitious matches on DNA database searches, several new kits have been produced that allow for the analysis of up to 16 STR markers in a single reaction [48, 49]. Although the new multiplex kits have not been introduced to routine casework in the United Kingdom at the time of writing, it is anticipated that they will be adopted in the near future. An additional motivation for the transition comes from the desire to exchange DNA data with member states of the European Union for the resolution of cross-border crime. The Council of the European Union resolution on the exchange of DNA analysis results, also known as the Prüm Treaty, identifies a core set of 12 STR markers that should be used when DNA data are exchanged between EU member states (see Official journal). Details of the STR loci included in the new STR multiplex kits are provided in Table 2.

    Table 2 STR loci included in newly developed commercial STR profiling kits

    Alternative DNA Markers

    Autosomal STR markers have become the most utilized ones in both forensic and paternity DNA profiling. However, there are numerous alternative markers that can be interrogated when required. Another class of autosomal marker, the single nucleotide polymorphism (SNP), has been investigated for application to forensic casework and identification projects. SNPs, as the name suggests, are alterations of a single base pair and occur on average every few hundred base pairs throughout the human genome [50, 51]. The major advantage of SNP markers over STRs is the small size of the DNA target, making them very useful for degraded DNA analysis and disaster victim identification projects [52]. Reduced size STR amplicons have, however, been developed to analyze degraded DNA while remaining compatible with current DNA databases to allow easy searching in identification projects [53].

    There are a number of disadvantages associated with the use of SNP markers instead of STRs. The comparatively low discrimination power of each locus means that approximately 50 SNPs must be investigated to give match probabilities equal to 10 STR loci [54]. Mixture analysis is also complicated, because SNPs are usually bi-allelic, although tri- and even tetra-allelic SNPs have also been observed. The use of a predominantly bi-allelic marker makes the distinction between mixtures and homozygotes difficult, especially for minor contributors, which may also display allelic dropout [54]. Owing to these problems and the financial implications of repopulating NDNADs with SNP profiles, it is unlikely that SNP markers play a major role at the forefront
