Forensic Practitioner's Guide to the Interpretation of Complex DNA Profiles
Ebook · 1,047 pages · 9 hours


About this ebook

Over the past twenty years, there has been a gradual shift in the way forensic scientists approach the evaluation of DNA profiling evidence that is taken to court. Many laboratories now adopt 'probabilistic genotyping' to interpret complex DNA mixtures. However, current practice is very diverse: a whole range of technologies is used to interpret DNA profiles, and a variety of software approaches are advocated throughout the world.

Forensic Practitioner’s Guide to the Interpretation of Complex DNA Profiles places the main concepts of DNA profiling into context and fills a niche that is unoccupied in current literature. The book begins with an introduction to basic forensic genetics, covering a brief historical description of the development and harmonization of STR markers and national DNA databases. The laws of statistics are described, along with the likelihood ratio based on Hardy-Weinberg equilibrium and alternative models considering sub-structuring and relatedness. The historical development of low template mixture analysis, theory and practice, is also described, so the reader has a full understanding of rationale and progression. Evaluation of evidence and statement writing is described in detail, along with common pitfalls and their avoidance.

The authors have been at the forefront of the revolution, having made substantial contributions to theory and practice over the past two decades. All methods described are open-source and freely available, supported by sets of test data and links to websites with further information. This book is written primarily for the biologist with little or no statistical training. However, sufficient information is also provided for the experienced statistician. Consequently, the book appeals to a diverse audience:

  • Covers short tandem repeat (STR) analysis, including database searching and massive parallel sequencing (both STRs and SNPs)
  • Encourages dissemination and understanding of probabilistic genotyping by including practical examples of varying complexity
  • Written by authors intimately involved with software development, training at international workshops and reporting cases worldwide using the methods described in this book
Language: English
Release date: Jun 10, 2020
ISBN: 9780128205686
Author

Peter Gill

Dr. Peter Gill joined the Forensic Science Service (FSS) in 1982. He began his research into DNA in 1985, collaborating with Sir Alec Jeffreys of Leicester University. In the same year they published the first demonstration of the forensic application of DNA profiling. In 1987, Dr. Gill was given an award under the civil service inventor's scheme for discovery of the preferential sperm DNA extraction technique and the development of associated forensic tests. He was employed as Senior Principal Research Scientist at the FSS. Currently, he holds concurrent positions at Oslo University Hospital and the University of Oslo, where he is Professor of Forensic Genetics.

Romanovs: In 1993-4, Dr. Gill led the team which confirmed the identity of the remains of the Romanov family, murdered in 1918, and also the subsequent investigation which disproved the claim of Anna Anderson to be the Duchess Anastasia (using tissue preserved in a paraffin wax block for several decades). This was an early example of an historical mystery solved by the analysis of very degraded and aged material, and one of the first demonstrations of low-template DNA analysis.

Low-template DNA: In relation to the above, Dr. Gill was responsible for developing a routine casework-based 'super-sensitive' method of DNA profiling capable of analysing DNA profiles from a handful of cells. This method was originally known as low-copy-number (LCN) DNA profiling; it is now known as low-template DNA profiling. New statistical methods and thinking were also developed to facilitate the new methods.

National DNA database: Dr. Gill led the team that developed the first multiplex DNA systems to be used in a national DNA database anywhere in the world, and designed interpretation methods that are in current use (c. 1995).

Court reporting: Dr. Gill has given evidence in several high-profile (controversial) cases, including the Doheny/Adams appeals and the Omagh bombing trial in the UK.

Membership of scientific societies: Dr. Gill is a member of the European Network of Forensic Science Institutes (ENFSI) and ex-chair of its 'methods, analysis and interpretation sub-section'. He is chair of the International Society for Forensic Genetics (ISFG) DNA commission on mixtures and has written a number of highly cited ISFG recommendations on low-template DNA, mixture interpretation, and evaluation of evidence. Dr. Gill is a member of the European DNA Profiling Group (EDNAP). He has published more than 200 papers in the international scientific literature, cited more than 20,000 times; many of these are collaborative papers under the auspices of the ISFG, EDNAP, and ENFSI. He is the recipient of the 2013 Scientific Prize of the International Society for Forensic Genetics.

Affiliations and Expertise: Forensic Genetics Research Group, Oslo University Hospital; Institute of Clinical Medicine, University of Oslo, Norway

    Book preview

    Forensic Practitioner's Guide to the Interpretation of Complex DNA Profiles - Peter Gill


    Chapter 1

    Forensic genetics: the basics

    Abstract

    The interpretation of evidence is rooted in population genetics theory. The fundamental principle that underpins this is the Hardy–Weinberg equilibrium. The inheritance of genes follows laws of probability that enable probability estimates to describe the rarity of a DNA profile. Such calculations rely upon the Hardy–Weinberg equilibrium assumption of independence, i.e., that populations are very large, randomly interspersed and randomly mating. However, such assumptions are rarely justified—populations are structured into sub-populations; the frequencies of their alleles differ. To compensate, the FST statistic is used as a measure of population differentiation. There are initiatives to collate data into community population databases that can be accessed; for example, STRidER acts as a repository whose data are subject to rigorous quality control before release.

    The likelihood ratio is fundamental to all aspects of interpretation of forensic evidence. Likelihood ratios are very flexible. In particular, they can be used to evaluate DNA mixtures, which are the subject of much of the material to be found in subsequent chapters. The early genetics theory applied to mixtures is described, limited to two contributors. The calculations are very complex. The need for computer algorithms to take over the burden of calculation is described, and there is a demonstration of the principle of extension to multiple contributors. A list of software available to carry out analysis of mixtures is provided. Finally, extension of the theory to relatedness (kinship) tests is introduced.

    Keywords

    Hardy–Weinberg equilibrium; FST; STRidER; likelihood ratio

    The interpretation of forensic genetic evidence is based upon probability. Probability is expressed as a number that is somewhere between zero and one, representing two extremes: a probability of zero means that something is impossible, whereas a probability of one means that something is certain. In practice, a probability is never exactly zero or one—it is usually somewhere between the two extremes.

    In forensic genetics, a probability is usually equated to the frequency of observation of a particular type. Before the DNA era, probabilities were applied to blood groups. One of the first systems used for forensic typing was ABO grouping, credited to the Austrian scientist Karl Landsteiner, who identified the O, A, and B blood types in 1900.

    The inheritance of the ABO blood groups follows the laws of Mendelian genetics. Chromosomes are inherited in pairs. There are two genes, one is inherited from the mother and the other from the father. The genes are inherited via gametes, i.e., sperm of the father and the eggs (ova) of the mother. To ensure that the offspring only has a single pair of genes per cell, the gametes only contain a single copy.

    Some basic definitions follow:

    Gene: The gene is a stretch of DNA positioned on a chromosome. The gene may have a function, such as producing proteins that determine eye colour. However, genes that are currently analyzed in forensic science are sometimes described as junk DNA since they have no known function, although this idea has been challenged.

    Locus: Describes the position of a gene on a chromosome, commonly expressed by a universal identifier, such as D21S11.

    Allele: Genes of forensic interest are variable; this means that there are different versions of genes, where the DNA code differs.

    For example, the blood group ABO gene is positioned on chromosome 9 at the band/locus 9q34.2, meaning the long (q) arm of chromosome 9 at position 34.2. Over recent years, the human genetics community has compiled two human genome assemblies called GRCh37 and GRCh38. Both are references in human genome databases, such as the NCBI Genome Browser http://www.ncbi.nlm.nih.gov. Because it is the most up-to-date, GRCh38 is recommended by the DNA Commission of the ISFG [1]. The molecular location of the ABO gene is between 133,250,401 and 133,275,201 base pairs on chromosome 9: https://ghr.nlm.nih.gov/gene/ABO#location. The gene comprises three common alleles, namely A, B, and O. In diploid cells, there are six possible combinations, called genotypes, that are observed in the population. These are AA, AB, AO, BO, OO, BB. They are described as mutually exclusive to the individual, meaning that a person can only have one genotype, but collectively exhaustive with respect to the population, meaning that any given individual selected from the population must have one of these genotypes.¹

    Alleles A and B are both dominant to allele O. If a person has an AO or BO genotype, the O allele is masked, which results in the person being typed as A or B, respectively. The masked O allele is called recessive. Dominant alleles therefore mask part of the underlying genotype. People are classed as A, B, AB, or O phenotypes, where a phenotype can be expressed by more than one genotype.
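The dominance rules above can be sketched as a small mapping. This is a minimal illustration; the function name is ours, not from the text:

```python
def abo_phenotype(genotype: str) -> str:
    """Map an ABO genotype to the observed phenotype (blood group).

    A and B are dominant over the recessive O allele; A and B together
    are co-dominant, giving the AB phenotype.
    """
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"          # co-dominant
    if "A" in alleles:
        return "A"           # AA or AO: the O allele is masked
    if "B" in alleles:
        return "B"           # BB or BO: the O allele is masked
    return "O"               # only the OO genotype types as O
```

Both AO and AA map to the A phenotype, illustrating that one phenotype can be expressed by more than one genotype.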

    With conventional DNA profiling, we do not need to worry about phenotypes, because we deal with well defined genetic sequences.

    The variability of the gene forms the basis of its usefulness to discriminate between individuals. Usually, the aim is to associate crime-stain evidence to some specific individual, typically a suspect who may have been apprehended. He/she may be described as the questioned individual, or the person of interest (POI). Note that the POI is not always the suspect, sometimes he/she may be a victim, for example, where a body fluid stain is found on the clothing of the suspect.

    1.1 Short tandem repeat (STR) analysis

    Short tandem repeats (STRs) are blocks of tandemly repeating DNA sequences found throughout the human genome. Forensic laboratories usually use four-base-pair repeats, because shorter repeat units are much more prone to artefacts known as stutters. For a comprehensive review of STRs currently used in casework, the reader is referred to [2] and the NIST STRBase website https://strbase.nist.gov/index.htm, which lists sequences of common and rare alleles. Refer to Parson et al. [1], supplemental materials, for full details of sequences using up-to-date recommendations of the ISFG DNA commission.

    There are three kinds of repeat sequences defined by Urquhart et al. [3]. Simple repeats contain units of identical length and sequence; compound repeats comprise two or more adjacent simple repeats; complex repeats may contain several repeat blocks of variable unit length, along with variable intervening sequences.

    Simple repeat example: The STR HUMTH01 locus is an example of a simple AATG sequence, ranging from three to 14 repeats, written in shorthand as [AATG]a, where a = the number of repeats. A common microvariant allele consists of a single base deletion of the seventh repeat in the 10 allele, which results in a partial repeat of just three bases. It is signified as [AATG]6 ATG [AATG]3, and the nomenclature follows as HUMTH01 9.3.

    Compound repeat example: HUMVWA is an example of a compound repeat locus² [TCTA]a [TCTG]b [TCTA]0−1, where the final sequence is either not observed or observed once.

    Complex repeat example: D21S11 is a highly polymorphic locus with a complex structure of several intervening sequences: [TCTA]a, [TCTG]b, [TCTA]c, TA [TCTA]d, TCA [TCTA]e, TCCATA [TCTA]f.

    Repeat unit nomenclature is standardized for capillary gel electrophoresis (CE) applications. This enables universal comparisons between laboratories and national DNA databases to be achieved. The nomenclature used is based upon the number of repeat sequences [4].

    Existing designation systems that are universally applied to national DNA databases are based upon the repeating structure of typical reference alleles that were discovered and characterized in the early to mid 1990s. All new allelic variant designations must fit within the scheme, regardless of sequence, and are strictly conditioned upon the number of bases that are counted in the fragment length. Comparisons are made against an allelic ladder. This means that the length of the STR repeat, and its correspondence to the reference sequenced repeat does not necessarily hold. Let us suppose that there is a deletion of a single base in the flanking region of an amplicon in an allele 27 variant; though the repeating structure may be identical to that listed, the allelic designation must change to 26.3. Consequently, this allele designation no longer reflects the repeat structure of the reference sequence.³
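The length-based counting behind these designations can be sketched in code. The helper below is hypothetical (not from the book): it totals the bases in a bracketed repeat structure and expresses the length in the conventional <full repeats>.<extra bases> form, matching the HUMTH01 9.3 example above. It does not model flanking-region variation:

```python
import re

def allele_designation(structure: str, repeat_len: int = 4) -> str:
    """Derive a length-based (CE-style) allele designation from a
    bracketed repeat structure by counting the bases it contains."""
    total = 0
    # Bracketed blocks such as [AATG]6 contribute unit length x count.
    for unit, count in re.findall(r"\[([ACGT]+)\](\d+)", structure):
        total += len(unit) * int(count)
    # Leftover partial repeats (e.g. a lone ATG) contribute their length.
    remainder = re.sub(r"\[[ACGT]+\]\d+", " ", structure)
    for frag in re.findall(r"[ACGT]+", remainder):
        total += len(frag)
    full, extra = divmod(total, repeat_len)
    return f"{full}.{extra}" if extra else str(full)

# The HUMTH01 microvariant [AATG]6 ATG [AATG]3 totals 39 bases,
# i.e. nine full repeats plus three bases: designation 9.3.
```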

    With the introduction of massively parallel sequencing (MPS), the issue of nomenclature has achieved new prominence [1]. Whereas the sequence information is generally hidden from view with conventional CE applications, with MPS, this information is available. This results in the observation of polymorphisms, where there are sequence differences between alleles, though the STR fragment sizes are identical. With CE, all of these polymorphisms would be classed together, whereas with MPS they can be separated, with the resulting increase in discriminating power. There is continued discussion on nomenclature in relation to MPS in Chapter 13.9.1. The main aim is to devise a new nomenclature to maximise the benefits of MPS, whilst at the same time retaining back-compatibility with existing CE repeat unit nomenclature.

    1.1.1 Historical development of multiplexed systems

    Short tandem repeat (STR) analysis was introduced into forensic casework about 25 years ago. The ability to combine several markers to form multiplexes and to subsequently visualize the results by automated fluorescent sequencing made national DNA databases feasible. The first example was launched in 1995 by the UK Forensic Science Service (FSS). In total there have been three iterations of multiplexes.

    Early multiplexes consisted of relatively few loci based on simple STRs. The four-locus quadruplex was the first multiplex to be used in casework, and was developed by the Forensic Science Service (FSS) [5]. Because it consisted of just four STRs, there was a high probability of a random match, approximately 1 in 10,000. In 1995 the FSS re-engineered the multiplex, producing a 6-locus STR system combined with the amelogenin sex test [6]. This acquired the name second generation multiplex (SGM). The addition of complex STRs, D21S11 and HUMFIBRA/FGA [7], which have greater variability than simple STRs, decreased the probability of a random match to about 1 in 50 million. In the UK, the introduction of SGM in 1995 facilitated the implementation of the UK national DNA database (NDNAD) [8]. As databases became much larger, the number of pairwise comparisons increased dramatically, so it became necessary to ensure that the match probability of the system was sufficient to minimize the chance of two unrelated individuals matching by chance (otherwise known as an adventitious match). Consequently, as the UK NDNAD grew in its first four years of operation, a new system, known as AmpFlSTR SGM Plus, was introduced in 1999. This system comprised 10 STR loci with amelogenin, replacing the previous SGM system. To ensure continuity of the DNA database, and to enable the new system to match samples that had been collated in previous years, all six loci of the older SGM system were retained in the new AmpFlSTR SGM Plus system.

    1.2 Development and harmonization of European National DNA databases

    Harmonization of STR loci was achieved by collaboration at the international level. Notably, the European DNA profiling group (EDNAP) carried out a series of successful studies to identify and to recommend STR loci for the forensic community to use. This work began with an evaluation of the simple STRs HUMTH01 and HUMVWFA31 [10]. Subsequently, the group evaluated D21S11 and HUMFIBRA/FGA [11]. Recommendations on the use of STRs were published by the ISFG [4].

    Most, if not all, European countries have legislated to implement national DNA databases that are based upon STRs [12]. In Europe, there has been a drive to standardize loci across countries to meet the challenge of increasing cross-border crime. In particular, a European Community (EC) funded initiative led by the ENFSI group was responsible for co-ordinating collaborative exercises to validate commercially available multiplexes for general use [13]. National DNA databases were introduced in 1997 in Holland and Austria; 1998 in Germany, France, Slovenia, and Cyprus; 1999 in Finland, Norway, and Belgium; 2000 in Sweden, Denmark, Switzerland, Spain, Italy, and Czech Republic; 2002 in Greece and Lithuania; 2003 in Hungary; 2004 in Estonia and Slovakia [14].

    A parallel process has occurred in Canada [15,16] and in the US [17], where standardization was based on thirteen STR loci, known as the Combined DNA Index System (CODIS) core loci.

    An FBI-sponsored CODIS core loci working group recommended an expanded set of loci from the thirteen in use in 2011 [18,19]. There followed an extensive validation study, which resulted in the recommendation that seven new loci were to be adopted [20], resulting in 20 CODIS core loci to be implemented by 2017. The additional seven loci included the five new European ESS markers, plus D2S1338 and D19S433 (see next section). This resulted in comparability of the CODIS core and expanded ESS to have 15 DNA loci in common [21].

    1.2.1 Development of the European set of standard (ESS) markers

    Based on the initial EDNAP exercises and recommendations by ENFSI and the Interpol working party [22], four loci were originally defined as the European standard set (ESS) of loci—HUMTH01, HUMVWFA31, D21S11, and HUMFIBRA/FGA. The identity of these loci was dictated by their universal incorporation into different commercial multiplexes that were utilized by member states. By the same rationale, three additional loci were added to this set—D3S1358, D8S1179, and D18S51. These loci are the same as the standard set of loci identified by Interpol for the global exchange of DNA data.

    A subsequent expansion of ESS loci was motivated by the Prüm treaty of 2005. The new loci were officially adopted by the European Commission [28] and Interpol in 2010; this led to the development of a series of new multiplexes by the major companies (Promega, Life Technologies, and Qiagen). See Fig. 1.1. Practically speaking, there are sixteen loci, since D16S539, D19S433, D2S1338, and SE33 are all included in European multiplexes in addition to the ESS markers. A complete list of multiplex kits and their loci can be accessed from the NIST website https://strbase.nist.gov/multiplx.htm.

    Figure 1.1 Commonly used multiplex kits showing ESS and CODIS loci relative to molecular weights (bp) using different dye markers.

    New biochemistry has simultaneously increased the sensitivity of tests, to the extent that the once controversial low-level or low-template (LT-) DNA analysis is considered to be routine (Chapter 4). However, this is not without challenge. LT-DNA profiles tend to be complex mixtures, with problems of missing alleles, known as drop-out. This book will explain how statistical methods, based on likelihood ratio (LR) estimation, have been critical to improve the interpretation of evidence.

    1.3 Hardy–Weinberg equilibrium

    There is a fundamental principle that underpins all population genetics: the Hardy–Weinberg equilibrium, named after two scientists who independently discovered the formula in the early 1900s [29].

    To illustrate, consider a simple example comprising two alleles, a and b. These two alleles are found in three alternative diploid combinations (genotypes): aa, ab, and bb. Two alleles the same (aa, bb) are called homozygotes, whereas two different alleles (ab) are heterozygotes.

    Suppose that the genotypes of a sample of n individuals are observed. It is relatively straightforward to express the genotype observations in terms of frequencies: the count of each genotype is simply divided by the sample size n.

    It is a law of probability that the sum of all possible outcomes is one. It also follows that the larger the sample size, the better the frequency estimate will be. International Society for Forensic Genetics (ISFG) DNA commission recommendations [30] suggest that a sample size of at least 500 is desirable, although for small discrete populations that are difficult to access, a smaller sample size will suffice.

    Next, the number of alleles in the population is calculated. This is achieved by counting the number of alleles in the observed data. In the example, there are two alleles (and three genotypes). For a homozygote aa, there are two a alleles, whereas for a heterozygote ab, there is one a allele.

    The total number of a and b alleles is twice the number of individuals (n).

    The number of a alleles is therefore twice the number of aa homozygotes, plus the number of ab heterozygotes. To find the proportion (allele frequency), we divide by 2n:

    p_a = (2 n_aa + n_ab) / (2n)

    The same calculation is carried out for allele b:

    p_b = (2 n_bb + n_ab) / (2n)

    and p_a + p_b = 1.

    Once the allele frequencies are estimated, this information can be used to calculate the expectation that an individual chosen at random will be a particular genotype. The expected genotype proportions are calculated by applying the Hardy–Weinberg equilibrium formula, which describes the relative probabilistic expectations of the genotype proportion in genotypes aa, ab, bb.

    The Hardy–Weinberg formula is important, because it is the basis of the product rule. This relies upon a law of probability, which states that the probability of two independent events occurring together can be estimated by multiplying their individual probabilities. The probability of genotype aa is p_a × p_a = p_a², and the probability of genotype bb is p_b².

    Heterozygote genotype ab can occur in two different ways: there are two chromosome strands, which can be labeled c1 and c2, so there are two alternative arrangements whereby an individual can be ab. Hence the probability of genotype ab is 2 p_a p_b. As before, the sum of the probabilities always equals one: p_a² + 2 p_a p_b + p_b² = (p_a + p_b)² = 1.
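These calculations can be sketched end to end; the genotype counts below are illustrative, not taken from the text:

```python
def hwe_expectations(n_aa: int, n_ab: int, n_bb: int):
    """Estimate allele frequencies from genotype counts, then return the
    Hardy-Weinberg expected genotype proportions (aa, ab, bb)."""
    n = n_aa + n_ab + n_bb                # number of individuals
    p_a = (2 * n_aa + n_ab) / (2 * n)     # each aa homozygote carries two a alleles
    p_b = (2 * n_bb + n_ab) / (2 * n)
    return p_a ** 2, 2 * p_a * p_b, p_b ** 2

# Illustrative sample of 100 individuals (30 aa, 50 ab, 20 bb):
# p_a = 0.55 and p_b = 0.45, so the expectations are
# 0.3025 (aa), 0.495 (ab), 0.2025 (bb), which sum to one.
exp_aa, exp_ab, exp_bb = hwe_expectations(30, 50, 20)
```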

    The Hardy–Weinberg formula holds true if

    1.  There is no migration in or out of the population.

    2.  There is no natural selection that favours the survival of individuals with certain alleles.

    3.  The population is assumed to be randomly mating, without inbreeding and is very large.

    4.  There is no mutation of alleles.

    5.  Generations are non-overlapping.

    In practice, it is difficult to fulfill the Hardy–Weinberg assumptions, because populations are not discrete or static; there is often much migration, immigration, and interbreeding between different populations. Furthermore, because populations are finite, there is always some inbreeding. The effect of inbreeding is to increase the level of homozygosity (the Wahlund effect [31]), and this means that the multiplication rule used to estimate genotype frequencies is not strictly valid. The implications and solutions are examined in greater detail in Section 1.12.

    Given the allele frequency estimates p_a and p_b, the Hardy–Weinberg expectations are p_a² for genotype aa, 2 p_a p_b for ab, and p_b² for bb.

    The number of expected aa, ab, bb genotypes is recovered by multiplying their expected frequencies by n (Table 1.1). Note that the observed and expected genotype frequencies are close, but not exactly the same. It is usual to see small deviations of this kind.

    Table 1.1

    Chi-square statistic to test for deviation from Hardy–Weinberg equilibrium.

    1.3.1 Measuring deviation from Hardy–Weinberg equilibrium

    Chi-square test

    The chi-square statistic tests the null hypothesis, which basically states that there is no difference between the observed and expected results. It is calculated as follows (Table 1.1):

    χ² = Σ (o − e)² / e   (1.1)

    where o= observed data and e= expected data

    (1.2)

    The result is compared to a chi-square distribution table with the appropriate degrees of freedom.

    If a result is significant at a chosen level, then the null hypothesis is rejected. Traditionally, the significance level is set at α = 0.05. When m loci are tested simultaneously, a Bonferroni correction is applied, carrying out each test at level α/m (where α is the desired overall significance level and m is the number of tests), making it less likely that any individual test will falsely reject the null hypothesis.
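As a sketch, the chi-square statistic for a two-allele locus (one degree of freedom) can be computed directly; the genotype counts and expectations below are illustrative, and the p-value helper is ours:

```python
import math

def chisq_stat(observed, expected):
    """Chi-square statistic: sum of (o - e)^2 / e over the genotype classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(x2: float) -> float:
    """Survival function of the chi-square distribution with 1 degree of
    freedom: P(X > x2) = erfc(sqrt(x2 / 2))."""
    return math.erfc(math.sqrt(x2 / 2))

# Illustrative counts (aa, ab, bb) against their HWE expectations:
x2 = chisq_stat([28, 54, 18], [30.25, 49.5, 20.25])
p = p_value_df1(x2)
# Compare p against alpha (or alpha/m under a Bonferroni correction
# over m loci); the null hypothesis is rejected if p < alpha.
```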

    Fisher's exact test

    Fisher's exact test calculates the probability of the observed genotype counts, conditional on the observed allele counts,⁴ under HW equilibrium:

    Pr(n_ab | n, n_a) = [n! / (n_aa! n_ab! n_bb!)] × 2^(n_ab) × [n_a! n_b! / (2n)!]   (1.3)

    Naive programming of these formulae leads to very large factorials, which cause numerical errors. An R package, HardyWeinberg, is available: https://cran.r-project.org/web/packages/HardyWeinberg/HardyWeinberg.pdf. This package can easily be used to carry out the necessary calculations using the HWExact function. The chi-square test is also accommodated by the HWChisq function.
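The conditional probability in Eq. (1.3) is the standard Levene/Haldane formula for the exact HWE test; a sketch using exact integer arithmetic, which sidesteps the overflow problem mentioned above:

```python
from math import factorial

def exact_hwe_prob(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Probability of the observed genotype counts, conditional on the
    allele counts, under Hardy-Weinberg equilibrium (Levene/Haldane)."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab        # total a alleles
    n_b = 2 * n_bb + n_ab        # total b alleles
    # Python integers are arbitrary precision, so the huge factorials
    # are computed exactly before the final division.
    num = factorial(n) * 2 ** n_ab * factorial(n_a) * factorial(n_b)
    den = factorial(n_aa) * factorial(n_ab) * factorial(n_bb) * factorial(2 * n)
    return num / den

# Sanity check: the probabilities of all genotype configurations sharing
# the same allele counts sum to one (here n = 4, n_a = n_b = 4).
total = exact_hwe_prob(2, 0, 2) + exact_hwe_prob(1, 2, 1) + exact_hwe_prob(0, 4, 0)
```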

    Exact tests are important in the quality assurance of frequency databases, and this is described in more detail in Section 1.15.

    1.3.2 Extension to multiple alleles in STRs

    Whereas the single nucleotide polymorphisms described in Section 1.3 have just two alleles, all of the autosomal STR systems currently in use have many more alleles per locus. A compilation of allele frequency databases can be directly accessed from the STRs for identity ENFSI reference database (STRidER): https://strider.online/ [32]. For example, HUMTH01 has eight alleles listed; FGA has 24 alleles listed.

    The extension to multiple alleles and loci is straightforward. A list of all possible genotypes can be obtained by listing the alleles sequentially in the first row and column of Table 1.2. The genotypes, comprised of two alleles, are designated by their intersections in the table. This process is also known as pairwise comparison. The number of genotypes G for a locus with k alleles can be calculated as

    G = C(k + 1, 2) = k(k + 1) / 2   (1.4)

    where C(x, y) is the binomial coefficient giving the number of outcomes to select y elements out of x elements; the formula counts unordered pairs of alleles chosen with replacement. There are eight alleles in HUMTH01, hence there are a total of 36 genotypes, which is the number of elements in the lower triangular matrix, plus the number of elements of the diagonal of Table 1.2.

    Table 1.2

    Depiction of all possible 36 genotypes for HUMTH01 using pairwise comparisons of eight alleles.

    Pairwise comparison is an important part of computer programming, which will be discussed in detail in Section 1.11. All possible genotype combinations are easily listed, hence the probabilities of the genotypes are also easily generated by multiplying their allele frequencies together.
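The pairwise listing can be generated directly; the allele names and frequencies below are illustrative:

```python
from itertools import combinations_with_replacement

def genotype_probs(freqs):
    """Enumerate all unordered genotypes for a locus and their HWE
    probabilities: p^2 for homozygotes, 2pq for heterozygotes."""
    out = {}
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        p = freqs[a] * freqs[b]
        out[(a, b)] = p if a == b else 2 * p
    return out

# Eight equifrequent alleles (illustrative) yield 36 genotypes, and the
# genotype probabilities sum to one, as expected under HWE.
eight = {str(k): 0.125 for k in range(1, 9)}
genotypes = genotype_probs(eight)
```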

    1.4 Quality assurance of data

    It is desirable to carry out statistical tests to show whether HW expectations are satisfied (Section 1.3), as a step towards demonstrating that the data are independent, so that they can be properly utilized in strength-of-evidence calculations using the product rule.

    On behalf of the European Network of Forensic Science Institutes (ENFSI) group, Welch et al. collated STR population data from European laboratories, providing estimates of the population differentiation measure FST, otherwise known as theta (θ) (discussed in detail in Section 1.12).

    This collection of European population data was made accessible by the European Network of Forensic Science Institutes (ENFSI) on STRbASE, now superseded by STRidER (STRs for identity ENFSI Reference database) [32] https://strider.online/. It has an integrated approach to assuring the quality of submitted data before acceptance.

    In support of STRidER, the International Society for Forensic Genetics (ISFG) has published guidelines [30]. The recommendations are summarised by the following:

    1.  The minimum requirements are 15 autosomal STR loci, typed from 500 samples.

    2.  The geographical origin of the database is stated.

    3.  The methods of analysis are stated, including the STR typing kit used.

    4.  Information on data analysis and handling is provided.

    5.  Datasets must pass STRidER QC tests before they can be published in FSI: Genetics.

    When databases are submitted, the data are examined for duplicates, close relatives, and transcription errors. Once data have been verified, statistical tests are carried out to show if the data conform to Hardy–Weinberg expectations. STRidER actively accepts new databases (Fig. 1.2) and is working towards collections of data generated by next generation sequencing (NGS).

    Figure 1.2 The STRidER work flow, showing the integration of the QC platform and the STR database, resulting in high quality data in FSI: Genetics publications. Reproduced from [32] with permission from Elsevier.

    1.5 Recap: the laws of statistics

    Before progressing further to explain mixture interpretation, it is necessary to recap two fundamental laws of statistics:

    The product rule: The probability that two independent alleles both occur together is defined by the multiplication rule, so the probability (Pr) of observing a genotype ab, where a is on chromosome strand c1 and b is on chromosome strand c2, is Pr(ab) = p_a × p_b.

    We do not know the chromosomal arrangement of the alleles, but one or the other must be true; they cannot both be true at the same time. The probability of either genotype ab or genotype ba (events A or B) is subject to the addition rule.

    Addition rule: When two events, A and B, are mutually exclusive, the probability that A or B will occur is the sum of the probability of each event.

    We continue with the example. Genotype ab is observed and the chromosomal arrangement is unknown, hence the probability that the order is ab or ba is p_a p_b + p_b p_a = 2 p_a p_b.

    Independent/dependent: If the occurrence of event A changes the probability of event B, then events A and B are dependent. On the other hand, if the occurrence of event A does not change the probability of event B, then events A and B are independent.
