Reproducibility in Biomedical Research: Epistemological and Statistical Problems and the Future

About this ebook

Reproducibility in Biomedical Research: Epistemological and Statistical Problems, 2nd Ed. explores the ideas and conundrums inherent in scientific research.

Reproducibility is one of the biggest challenges in biomedical research. It affects not only the ability to replicate results but the very trust in the findings. Since its publication in 2019, Reproducibility in Biomedical Research: Epistemological and Statistical Problems has established itself as a solid ethical reference in the area, leading to significant reflection on biomedical research. The second edition addresses new challenges to reproducibility in the biosciences, namely the reproducibility of machine learning artificial intelligence (AI), the reproducibility of translation from research to medical care, and the fundamental challenges to reproducibility. All existing chapters have been expanded to cover advances in the topics previously addressed.

Reproducibility in Biomedical Research: Epistemological and Statistical Problems, 2nd Ed. provides biomedical researchers with a framework to better understand the reproducibility challenges in the area. Newly introduced interactive exercises and updated case studies help students understand the fundamental concepts involved in the area.

  • Includes four new chapters and updates across the book, covering recent developments of issues affecting reproducibility in biomedical research
  • Covers reproducibility of results from machine learning AI algorithms
  • Presents new case studies to illustrate challenges in related fields
  • Includes a companion website with interactive exercises and summary tables
Language: English
Release date: Apr 29, 2024
ISBN: 9780443138300
Author

Erwin B. Montgomery Jr.

Dr. Montgomery has been an academic neurologist for over 40 years, pursuing teaching, clinical, and basic research at major academic medical centers. He has authored over 120 peer-reviewed journal articles (available on PubMed) and 8 books on medicine (4 on the subject of Deep Brain Stimulation). The last two have been “Reproducibility in Biomedical Research” (Academic Press, 2019) and “The Ethics of Everyday Medicine” (Academic Press, 2019).

    Book preview

    Reproducibility in Biomedical Research

    Epistemological and Statistical Problems and the Future

    Second Edition

    Erwin B. Montgomery Jr.

    Department of Medicine (Neurology), Michael G. DeGroote School of Medicine at McMaster University, Hamilton, ON, Canada

    Table of Contents

    Cover image

    Title page

    Copyright

    Dedication

    Quotes

    Preface to the second edition

    Looking just over the horizon

    Machine learning artificial intelligence

    Translational research

    Biological realism and Chaos and Complexity

    Probability and statistical epistemology

    Randomness as fundamental and foundational

    Finally…

    Preface to the first edition

    Chapter 1. Introduction

    Abstract

    Productive irreproducibility

    The multifaceted notion of reproducibility and irreproducibility

    Turning from the past with an eye to the future

    The fundamental causes of unproductive irreproducibility

    Proceeding from what is certain but not useful to what is uncertain but useful

    Precision versus accuracy

    Dynamics

    Machine learning artificial intelligence and the emperor’s new clothes

    Knowledge is prediction, prediction is reproducibility or productive irreproducibility

    Challenges to prediction and thus biomedical research

    When traditional experimental design and statistics breed unproductive irreproducibility

    Data do not and cannot speak for themselves

    Reductionism and the fundamental problem

    Summary

    Chapter 2. The problem of irreproducibility

    Abstract

    Getting a handle on the scope of unproductive irreproducibility

    The inescapable risk of irreproducibility

    Institutional responses

    Who speaks for reproducibility and irreproducibility?

    Fundamental limits to reproducibility as traditionally defined

    Variability, central tendency, Chaos, and Complexity

    Summary

    Chapter 3. Validity of biomedical science, reproducibility, and irreproducibility

    Abstract

    Science must be doing something right and therein lies reproducibility and productive irreproducibility

    Legacy of injudicious use of scientific logical fallacies

    Science versus human knowledge of it

    The necessity of enabling assumptions

    Special cases of irreproducible reproducibility

    Science as inference to the best explanation

    Summary

    Chapter 4. The logic of certainty versus the logic of discovery

    Abstract

    Certainty, reproducibility, and logic

    Deductive logic—certainty and limitations

    Propositional logic

    Syllogistic deduction

    Centrality of syllogistic deduction and the Fallacy of Four Terms in biomedical research

    Judicious use of the Fallacy of Four Terms

    Partial, probability, practical, and causal syllogisms

    Induction

    The Duhem–Quine thesis

    Summary

    Chapter 5. The logic of probability and statistics

    Abstract

    Probability has always been central, statistics only relatively recently

    Precision versus accuracy, epistemology versus ontology

    The purpose of the chapter

    Continuing legacy of notions of probability

    The value of the logical perspective in probability and statistics

    Metaphysics: ontology versus epistemology and biomedical reproducibility

    Probability

    Statistics

    Key general assumptions whose violation risks unproductive irreproducibility

    Summary

    Chapter 6. Causation, process metaphor, and reductionism

    Abstract

    Renewed need for causation

    Practical syllogism and beyond

    Centrality of hypothesis to experimentation and centrality of causation to hypothesis generation

    Ontological sense of cause

    Reductionism and the Fallacies of Composition and Division

    Other fallacies as applied to cause

    Discipline in the Principles of Causational and Informational Synonymy

    Process metaphor

    Summary

    Chapter 7. Case studies in clinical biomedical research

    Abstract

    Forbearance of repetition

    Purpose of clinical research as the standard

    Clinical importance

    Establishing clinical importance

    Specific features to look for in case studies

    Case study—two conflicting studies of hormone use in postmenopausal women, which is irreproducible?

    Why the dominance of the Women’s Health Initiative Study over the Nurses’ Health Study?

    Aftermath

    Summary

    Chapter 8. Case studies in basic biomedical research

    Abstract

    Forbearance of repetition

    Purpose

    Setting the stage

    The value of a tool from its intended use

    What is basic biomedical research?

    Scientific importance versus statistical significance

    Reproducibility and the willingness to ignore irreproducibility

    Specific features to look for in case studies

    Case study—pathophysiology of parkinsonism and physiology of the basal ganglia

    Summary

    Chapter 9. Case studies in computational biomedical research

    Abstract

    Theorizing versus computational modeling with simulation

    Summary

    Chapter 10. Case studies in translational research

    Abstract

    Translational research as the ultimate goal of basic and clinical research

    Contemporary perspective on translational research

    Summary

    Chapter 11. Case studies in machine learning artificial intelligence

    Abstract

    The current environment of machine learning artificial intelligence

    Machine learning AI in the context of biomedical research

    Example of a neural network machine learning AI

    The game of warmer/colder

    Difference between machine learning AI and other analytic methods

    Similarities between machine learning AI and other methods, such as regression analyses

    Quality assurance in multivariate regression and implications for machine learning AI

    Fallacy of Four Terms

    What is the purpose or goal?

    To whom or what is the machine learning AI algorithm to apply?

    How should one select the training set?

    What is the learning methodology?

    The notion of error

    Only as good as the gold standard

    Error analyses as quality control

    Case study

    The psychology of machine learning AI

    Summary

    Chapter 12. Chaotic and Complex systems, statistics, and far-from-equilibrium thermodynamics

    Abstract

    Chaos and Complexity and the game of pool

    Linearization of complex nonlinear systems

    The Large Number and Central Limit theorems

    Incompleteness

    Self-organization

    Discovering Chaos and Complexity

    Equilibrium and steady-state conditions

    Chaos, Complexity, and the basis for statistics

    Self-organization

    Summary

    Chapter 13. The fundamental problem

    Abstract

    A word at the beginning with an eye to the future

    Not for the faint of heart

    Implications of the epistemic choice of variety as variation

    The necessary transcendental nature of fundamentals

    The argument for nonidentity

    Simplification as means to willfully ignore or hide

    The importance of differences

    Physics of the fundamental ontological problem

    How to know when no two experiences are exactly alike?

    Summary

    Chapter 14. Epilogue

    Abstract

    Implementation science

    De-implementation

    Metacognition and metaphysics

    Reclaiming philosophy

    Scientism

    Some suggestions—Good Manufacturing Practices

    Ethical obligations as applied to biomedical research

    Summary

    Appendix A. An introduction to the logic of logic

    Misperception of what is logic

    Logic is a discipline used to help understand reality

    Proceeding from what is most certain

    Proceeding to what is not certain but useful and dangerous

    Extension to syllogistic deduction

    Where it gets more uncertain, from state-of-being linking verbs to causation

    Appendix B. Introduction to the logic of probability and statistics

    Purpose

    The notion of Introduction

    Probability by enumeration of past experiences

    Combinatorics to avoid uniform probability distributions and assure utility

    The arithmetic mean as probability calculus

    Defined statistical distributions and their models

    Accuracy, precision, population, mean, and variance

    The shaky ground upon which traditional statistics rests

    Measures of randomness as alternatives

    Experimentation is a technical matter; science is altogether different and much more difficult

    Appendix C. Moving away from sample-based analyses for translational research

    Importance of normal distributions in the data

    Mill’s Method of Differences

    Possible only when enough is ignored by what appears to be reasonable presuppositions

    Preserving information of the particular individual subject

    Multidimensional Shannon’s entropy

    Glossary

    Bibliography

    Index

    Copyright

    Academic Press is an imprint of Elsevier

    125 London Wall, London EC2Y 5AS, United Kingdom

    525 B Street, Suite 1650, San Diego, CA 92101, United States

    50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

    Copyright © 2024 Elsevier Inc. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

    Publisher’s note: Elsevier takes a neutral position with respect to territorial disputes or jurisdictional claims in its published content, including in maps and institutional affiliations.

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

    This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

    Notices

    Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

    Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

    To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

    ISBN: 978-0-443-13829-4

    For Information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

    Publisher: Mica Haley

    Acquisitions Editor: Andre Wolff

    Editorial Project Manager: Timothy Bennett

    Production Project Manager: Neena S. Maheen

    Cover Designer: Miles Hitchen

    Typeset by MPS Limited, Chennai, India

    Dedication

    First edition

    To Lyn Turkstra for everything…

    And to the Saints Thomas, Hobbes and Kuhn, our scientific consciences

    Second edition

    To Lyn, for whom my appreciation continues…

    Quotes

    An experiment is never a failure solely because it fails to achieve predicted results. An experiment is a failure only when it also fails adequately to test the hypothesis in question, when the data it produces don’t prove anything one way or another.

    —Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance: An Inquiry into Values (1974)

    This day may possibly be my last: but the laws of probability, so true in general, so fallacious in particular, still allow about fifteen years.

    —Edward Gibbon, The Autobiography and Correspondence of Edward Gibbon the Historian (1869)

    Preface to the second edition

    Erwin B. Montgomery, Jr.

    Looking just over the horizon

    Concern regarding reproducibility and irreproducibility in biomedical research continues to increase (Fig. 1), but there have been striking additions to the range of concerns. Many of the challenges to reproducibility in biomedical research are now clearer, yet new challenges have appeared on the horizon since the publication of the first edition. While past concerns still prevail, so that the efforts of the first edition remain relevant, these new challenges offer a unique opportunity to be proactive in the second edition. These challenges include machine learning artificial intelligence (AI), translational research from a new perspective, and opportunities for increasingly realistic approaches to biological phenomena such that traditional approaches are no longer acceptable.

    Figure 1 Number of citations per year in PubMed using the keyword "irreproducibility" or "reproducibility" in the title; accessed June 19, 2023.

    In some ways, this second edition is like the two-faced god Janus (Fig. 2). One face looks to the past, which was the primary effort of the first edition. Looking again at the past remains critical, as there are important lessons to be learned and, at least as importantly, past lessons to be unlearned. That effort continues in the second edition. This second edition is also an opportunity to look to the future through the second face.

    Figure 2 Statue representing Janus Bifrons in the Vatican Museums (https://commons.wikimedia.org/wiki/File:Double_herm_Chiaramonti_Inv1395.jpg#/media/File:Double_herm_Chiaramonti_Inv1395.jpg, accessed June 19, 2023).

    But note, the two faces belong to the same head and presumably the same brain and mind. What drove the past perspectives, attitudes, and approaches likewise will drive the future. This is true simply because both what was done in the past and what will be done in the future respond to the same fundamental problem of all knowledge, as discussed in Chapter 13. The fundamental conundrum is the fact that every sense experience, note perception, is different. Thus, each sense experience is unique, de novo, and therefore, in the sense experience in itself, uninformative of any other. Then how is one to predict future experiences and to understand past experiences? One option is to hold that the effectively infinite variety of sense experiences is diversity, with each experience unique, de novo, and uninformative of the others. To be sure, taking this position puts one at risk of the Solipsism of the Present Moment.

    The alternative is that variety is a variation over an economical set of fundamentals such that each experience is some combination of them. Foundationally, this is the choice made by biomedical scientists. But that choice is problematic, and the consequent conundrums drive experimental design and analysis. In many ways, the problematic choice of viewing variety as a variation over an economical set of transcendental fundamentals places biomedical research at risk for unproductive irreproducibility. One need look no further than the arithmetic mean as the Central Tendency in a research study. In one way, the arithmetic mean must bear some relation to each and every observation from which it was extracted, yet the arithmetic mean cannot be the same as each of the observations. One seems forced to conclude that the observations are real, as that is what is experienced, but then what of the arithmetic mean? In a critical way, the arithmetic mean becomes the really real. For example, if a doctor were to ask a scientist what the effect of agent $X$ is on their patients, the scientist likely would point to the arithmetic mean of the sample in the experiment, not to any particular observation from which the arithmetic mean was generated. How can epistemically realist scientists avoid thinking that the real is real and the really real is a useful artifact, that the really real is a fudge factor? Biomedical researchers, and scientists in general, have been very adept at correcting the real to find the really real, or at creating fudge factors, depending on one's perspective. But each maneuver opens the risk of unproductive irreproducibility.

    Machine learning artificial intelligence

    Machine learning AI has generated great excitement, perhaps exceeded only by fearful concern, and both the excitement and the fear relate to its potential. Interestingly, much of the concern is driven by the potential for consequential adverse effects. As discussed in detail in Chapter 11, the issue is not whether machine learning AI will produce answers but how humans will know whether the answers are reproducible in relevant and important ways. In computer programming, a runtime error occurs when a program comports with the syntax of the programming language and produces an answer, yet the answer is wrong, perhaps to be discovered at the time or in retrospect. For a runtime error to be detected, there must be some expectation of what the answer should look like. What it should look like is not derived from the experiment alone, and thus the risk is that machine learning AI cannot self-correct. Human responsibility for correcting machine learning AI is inescapable even if ignored. But humans must first know what the answer should look like before the computer gives its answer.
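
    To make the notion concrete, here is a minimal sketch, in Python, of a "runtime error" in the sense used above: a syntactically valid program that runs to completion and returns an answer that is simply wrong, detectable only because an expectation of the answer exists beforehand. The function and values are invented for illustration and are not from the book.

```python
# Hypothetical sketch: a syntactically valid program that runs without complaint
# yet yields a wrong answer. The bug is exposed only because we already know,
# independently of the program, roughly what the answer should look like.

def mean(values):
    # Bug: the divisor should be len(values), not len(values) - 1.
    return sum(values) / (len(values) - 1)

data = [2.0, 4.0, 6.0, 8.0]
result = mean(data)                    # runs fine; no syntax error, no exception
expected = 5.0                         # prior expectation of the answer

print(result)                          # 6.666..., wrong despite "working"
print(abs(result - expected) < 1e-9)   # False: only the expectation reveals the error
```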

    But how does one know what the answer should look like? In comparison, knowing what the answer should look like in traditional experimental design and statistical analysis is relatively straightforward. In fact, in many ways the structure of the experiment and analyses predetermines what the answer will be. This is an example of the logical fallacy Petitio Principii, or begging the question. Indeed, what the answer should look like is consequent to the process rather than to any definite empirical finding. This is an example of the Process metaphor. In machine learning AI, the empirical outcome cannot be predicted and the process is unknowable. Thus, machine learning AI is a mystery, which conveys an almost mystical quality to it. In important ways, the machine learning AI revolution will force a great reconsideration of what counts as reproducibility. A close examination of the epistemology of machine learning AI will be illuminating in itself, but the contrast with traditional approaches likely will require a reconsideration of what scientists think they know about traditional experimental design and analysis. First, it is necessary to demystify machine learning AI.

    Translational research

    Evidence-Based Medicine, in the particular definition where it derives exclusively from randomized controlled trials (Djulbegovic and Guyatt, 2017), has gained considerable force. Indeed, Evidence-Based Medicine often becomes the sole criterion for generating guidelines for medical practice. Thus, the issues regarding Evidence-Based Medicine are relevant to clinical research, as a critique of Evidence-Based Medicine is also a critique of clinical research. However, the implications extend to basic research. Despite widespread adoption of Evidence-Based Medicine by medical academics, Evidence-Based Medicine continues to encounter resistance (Broom et al., 2009; Goldman and Shih, 2011; Pope, 2003). Evidence of concern about resistance to, or failure to adopt, Evidence-Based Medicine is seen in the special and dedicated efforts to establish Evidence-Based Medicine in routine medical care (Dopson et al., 2003). Various divisions of the National Institutes of Health, such as the National Cancer Institute, have developed programs in Implementation science, which the National Cancer Institute defines as follows: "Implementation science (IS) is the study of methods to promote the adoption and integration of evidence-based practices, interventions, and policies into routine health care and public health settings to improve our impact on population health. This discipline is characterized by a variety of research designs and methodological approaches, partnerships with key stakeholder groups (e.g., patients, providers, organizations, systems, and/or communities), and the development and testing of ways to effectively and efficiently integrate evidence-based practices, interventions, and policies into routine health settings" (https://cancercontrol.cancer.gov/is/about).

    To be sure, the factors relating to resistance to Evidence-Based Medicine are diverse, ranging from the political to the sociological to the psychological, among others. But there also is an epistemic skepticism. As Evidence-Based Medicine has become synonymous with clinical research, particularly in the form of randomized controlled trials, the epistemic resistance raises the possibility that clinical research is epistemically unfit for the purposes of medicine. Yet, this would be quite shocking, as the National Institutes of Health, perhaps the greatest source of support and leadership in clinical research, holds its mission to "… seek fundamental knowledge about the nature and behavior of living systems and the application of that knowledge to enhance health, lengthen life, and reduce illness and disability [italics added]" (https://www.nih.gov/about-nih/what-we-do/mission-goals#:~:text=NIH’s%20mission%20is%20to%20seek,and%20reduce%20illness%20and%20disability).

    The question demanded by the epistemic skeptic is this: what is it about clinical research that has been so powerful in amassing biomedical knowledge yet proves to be its undoing in the application to prospective particular individual subjects, such as patients? The fact of the matter is that sample-centric studies and analyses can only make predictions about future samples, not individual subjects, which is a central theme in this second edition. It is the use of samples that allows some resolution of the conundrums of the epistemic choice to view the effectively infinite variety as variations over an economical set of transcendental fundamentals. As will be explained in greater detail in this second edition, the sample allows construction of a standard normal distribution to represent biological variability and noise. The arithmetic mean, as the Central Tendency, of the standard normal distribution equals 0, and thus the effect of biological variability and noise zeros out in the sample. However, the effects of whatever mechanisms underlie biological variability and noise do not zero out in the case of the individual observation. Alternatives to sample-based analyses will be discussed, for example, in Appendix C.
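
    As an illustrative sketch (the effect size, noise level, and sample size below are assumptions of the sketch, not values from the book), the following simulation shows zero-mean noise nearly canceling in a sample mean while remaining fully present in any single observation.

```python
import random

random.seed(0)

true_effect = 1.0      # assumed treatment effect (illustrative)
noise_sd = 2.0         # assumed biological variability and noise (illustrative)

# Each observation = true effect + zero-mean noise.
sample = [true_effect + random.gauss(0.0, noise_sd) for _ in range(10_000)]

sample_mean = sum(sample) / len(sample)
print(f"sample mean      : {sample_mean:.3f}")  # close to 1.0: the noise nearly zeros out
print(f"first observation: {sample[0]:.3f}")    # the noise does not zero out for an individual
```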

    Biological realism and Chaos and Complexity

    As will be discussed in this second edition, many of the enabling assumptions that allowed basic and clinical research to proceed by traditional methods are unrealistic. To be sure, the methods have utility, that is, they work until they don't. The result is a risk for unproductive irreproducibility. Many enabling assumptions and presumptions, such as simplifications, are instrumental. However, the remarkable advance in scientific technology has greatly expanded the range, resolution, diversity, and consilience of quantification of biological phenomena. In some ways, the advancing technology has taken away the excuse for continued simplification.

    Intellectual or conceptual tools likewise have contributed to methodologies that provide an answer, even if the answer is a runtime error. These include composite measures for individual observations and composites reflected in the group or sample. For example, in studying an experimental agent for stroke risk reduction, is it realistic to just lump all the confounds together in the experimental and control groups and ignore the variations in the incidence of each confound and to ignore potential interaction effects? To be sure, doing so has utility in getting an answer but the answer can only be unrealistic. Perhaps this was a necessary trade-off in the past. It is not clear that such trade-offs are still necessary. At the very least, the question should be asked.

    Many of the enabling assumptions and presumptions presume linearity and independence among the variables. For example, the conundrum of diabetes mellitus in a study of stroke risk is handled as though it were independent of the conundrum of hypertension. Rather, the incidences are simply added, presuming the Principle of Superposition, and as long as the sums of the confounds are the same in the experimental and control groups, the study is held valid.

    It is becoming increasingly clear that the presumptions of linearity and superposition are false. Biological systems are made up of an enormous number of entities that are interrelated in complex and nonlinear ways. While the relevance of highly complex nonlinear systems displaying Chaos and Complexity seems doubtful to many, who demand proof that the biological mechanisms under their examination are Chaotic and Complex, how can any realistic understanding of biological phenomena not be Chaotic and Complex? Thus, a great many assumptions in traditional experimental design and analysis, particularly those that presuppose linearity and superposition, are not only unrealistic but misleading, increasing the risk of unproductive irreproducibility. This is discussed in greater detail in Chapter 12.

    Probability and statistical epistemology

    The first edition of this book focused on logic, as the injudicious use of necessary logical fallacies is an important factor in the crisis of reproducibility in biomedical research. The actual full application of deductive logic, including the judicious use of logical fallacies, requires the deductive propositions and syllogisms to be translated into a probability calculus. Thus, many misadventures in the use of the probability calculus are derivative of the corresponding logical fallacies. Further, just as deductive logic can build complex logical arguments, called theorems, the probability calculus allows for complex theorems based on the fundamental axioms and rules of inference of the probability calculus. Bayes' theorem is critical to biomedical research and is an example. One application of Bayes' theorem relates the probability of a hypothesis, such as that biochemical assay $A$ indicates biological process $B$, by relating the assay to the truth or falsehood of process $B$ through the specificity and sensitivity of the assay and the prior probability of the biological process. No biological experimentation would be possible without Bayes' theorem. Unfortunately, the potential of Bayes' theorem for reducing the risk of unproductive irreproducibility has not been sufficiently leveraged, as will be discussed.
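
    As a sketch of the application just described (the sensitivity, specificity, and prior probability below are invented for illustration), Bayes' theorem gives the probability of biological process B given a positive result on assay A.

```python
def posterior_given_positive(sensitivity, specificity, prior):
    """Bayes' theorem: P(B | A positive) from the assay's sensitivity and
    specificity and the prior probability of the biological process B."""
    p_pos_given_b = sensitivity                # P(A+ | B)
    p_pos_given_not_b = 1.0 - specificity      # P(A+ | not B)
    p_pos = p_pos_given_b * prior + p_pos_given_not_b * (1.0 - prior)
    return p_pos_given_b * prior / p_pos

# Illustrative numbers only (not from the book).
print(posterior_given_positive(sensitivity=0.90, specificity=0.95, prior=0.10))
# About 0.67: even a good assay leaves substantial uncertainty when the prior is low.
```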

    Newly expanded in this second edition is a similar analysis of statistics. To be sure, statistics was described in the first edition as the study of the probabilities of a probability. But there is more, and that more is a major focus of this second edition. Traditional statistics presumes the following logical statement: if population implies sample is true and sample is true, then population is true. What is held true of the population is that which is held true of the sample. But this is an example of the Fallacy of Confirming the Consequence and is illustrated by the Venn diagram in Fig. 3. The set sample is considered a subset of the set population, and thus whatever is said about the population is true of the sample. But what is true of the sample may not be true of the population. In other words, there are members of the set population that are not the same as or equivalent to the members of the set sample.

    Figure 3 Venn diagram representation of the logical statement if population implies sample is true and sample is true, then population is true. As can be appreciated, all the members of the set sample are members of the set population and thus what can be said of the population can be said of the sample. But note, there are members in the set population that are not in the set sample so what can be said about the set sample cannot be said of all the members of the set population. Thus, the logical statement is invalid and is an example of the Fallacy of Confirming the Consequence.

    The logical statement if population implies sample is true and sample is true, then population is true is rather useless except in the rarest of cases. In the vast majority of biomedical research, it is not possible to know the population, and thus it is impossible to assess the truth or falsehood of the theorem. To proceed as though the theorem is valid is to place the experiment at risk of unproductive irreproducibility. What is the biomedical researcher to do?
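
    The set relation behind the fallacy can be made concrete with a toy example (the numbers are invented for illustration): a property can hold for every member of the sample without holding for every member of the population.

```python
population = set(range(10))            # toy population: {0, 1, ..., 9} (illustrative)
sample = {0, 1, 2, 3}                  # a sample drawn from that population

print(sample <= population)            # True: every member of the sample is in the population

def claim(x):
    # A property verified on the sample: every value is less than 4.
    return x < 4

print(all(claim(x) for x in sample))       # True: holds for every member of the sample
print(all(claim(x) for x in population))   # False: fails for members of the population outside the sample
```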

    Traditional statistics then uses another theorem: population implies sample1 and population implies sample2, therefore sample1 implies sample2 and sample2 implies sample1. Note, this theorem is the basis for reproducibility in biomedical research. But this theorem is the Fallacy of Pseudotransitivity, as shown in Fig. 4. Sample1 is used in the first experiment and sample2 is used in an experiment to assess the reproducibility of the first experiment. If sample2 does not imply sample1, then what can be said about sample2 from the second experiment cannot be said of sample1 in the first experiment. The same holds if sample1 does not imply sample2, thus making both experiments irreproducible.

    Figure 4 Venn diagram representation of the logical statement if population implies sample1 is true and population implies sample2 is true, then sample1 implies sample2 and sample2 implies sample1 is true. As can be appreciated, all members of set sample1 are members of the set population and thus what can be said of the population can be said of the sample1. But note, there are members in the set population that are not in set sample1 so what can be said about set sample1 cannot be said of all the members of the set population. The same holds for sample2. Thus, there may be members of sample1 that are not members of set sample2 and likewise there may be members of sample2 that are not members of set sample1. Consequently, the conclusion that sample1 implies sample2 and sample2 implies sample1 is true is invalid. This is an example of the Fallacy of Pseudotransitivity.

    One response is to propose a tautology of the form population is sample1 and population is sample2, therefore sample1 is sample2. This is certain by the Principle of Transitivity. For this to be the case, sample1 would have to be exhaustive of the entire population and thus would be the population and therefore equal to the population. Similarly, sample2 would have to be exhaustive of the population and therefore equal to the population. Yet, if the population is unknowable, then the theorem fails.

    For the population to be known exactly, the population would have to be finite. However, the Fallacy of Induction argues that it is impossible, in principle, to know whether the population is finite. This generates the need for the Large Number theorem. First, the theorem holds that the population is adequately represented by the Central Tendency, such as the arithmetic mean. Yet, there is no empirical proof, and thus the presumption is a metaphysical faith even as it has utility. Second, the Large Number theorem holds that the arithmetic mean will become constant or stable as long as the sample size is sufficiently large. Indeed, the stability of the arithmetic mean at sufficient sample sizes is taken as justification even though it is without empirical foundation. The justification is in the form of a Process metaphor. But the notion of sufficiently large is a relative term, as the numerator of the ratio is the sample size, $n$, while the denominator is the size of the population, which is unknown.

    The response is to use Limit theory and hold that the sample size, $n$, increases toward infinity. Thus, accounting for a population of unknown size, the mean of the sample, $\bar{x}$, becomes a constant equal to the mean of the population, $\mu$. The Large Number theorem can be expressed as $\lim_{n \to N} \bar{x} = \mu$, where $N$ is the size of the population. But how can one be certain of the limit if $N$ is unknowable? In actual practice, the following limit is used, $\lim_{n \to \infty} \bar{x} = \mu$, and importantly, $\infty$ is just taken in place of $N$, as $N$ is unknowable. But how is this possible, as no sample has an infinite size? A response is that the sample size, $n$, gets close enough, whatever the latter means. The important point is that close enough provides ample opportunity for the risk of unproductive irreproducibility.
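
    A brief simulation (illustrative; the distribution and sample sizes are assumptions of the sketch) of the practical claim of the Large Number theorem: the sample mean settles down as the sample size grows, even though no finite sample ever reaches the unknowable population.

```python
import random

random.seed(0)
population_mean = 0.0    # assumed for the simulation; unknowable in actual research

for n in (10, 100, 1_000, 10_000, 100_000):
    draws = [random.gauss(population_mean, 1.0) for _ in range(n)]
    xbar = sum(draws) / n
    print(f"n = {n:>7}: sample mean = {xbar:+.4f}")

# The running means stabilize toward the population mean, but "close enough"
# is never made precise, and no finite n is infinity.
```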

    But note, these theorems relate to the Central Tendency, such as the arithmetic mean, not to the actual observations. The question demanded is what justifies taking the Central Tendency, such as the arithmetic mean, as a truer representation of the phenomenon than the actual observations, the real? Clearly, the arithmetic mean is transcendental as it is presumed to be informative of all the actual observations, yet not identical to each and every observation. The latter is not possible according to the fundamental problem of ontology as discussed in Chapter 13.

    One response to the question of justification is that the actual observations are contaminated by biological variability, noise, and confounds. The Central Tendency is held to be the true, or really real, representation of the phenomenon, and the claim is made clear by the statistical methods used to get rid of biological variability, noise, and confounds. The biological variability and noise are disposed of because it is presumed, that is, taken on faith, that the populations of both biological variability and noise follow standard normal distributions in which the mean effects of the biological variability and noise are 0. Thus, in the mean of the actual observations, the contributions made by biological variability and noise have effects of 0; in other words, they cancel out. Further, experiments are constructed such that the confounds are counterbalanced between the experimental and the control groups so that the mean effect of the confounds becomes 0. Yet, these approaches are based on presumptions and assumptions with questionable justification and, consequently, their use places the research at risk for unproductive irreproducibility. These issues are discussed in greater detail throughout this second edition.

    Another response to the question of why the Central Tendency, such as the arithmetic mean, is a truer representation of the ontology or reality of the phenomenon is that actual observations sufficiently close (close enough) to the Central Tendency are far more likely than the other observations. It is almost as if reality or truth were a majority vote. Just how close is close enough is given by confidence intervals. But this is a common misconception, in that confidence intervals only provide information as to the precision of calculating the Central Tendency over repeated samples. They are not about the proximity of the actual observations to the Central Tendency.
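
    A small simulation (illustrative, not from the book) makes the distinction explicit: the spread of repeated sample means is far narrower than the spread of individual observations, and the confidence interval's half-width tracks the former, not the latter.

```python
import random, statistics

random.seed(0)
n, repeats = 30, 1000

# Individual observations versus means of repeated samples of size n.
observations = [random.gauss(0, 1) for _ in range(n * repeats)]
sample_means = [statistics.mean(observations[i * n:(i + 1) * n]) for i in range(repeats)]

print("SD of individual observations :", round(statistics.stdev(observations), 3))  # ~1.0
print("SD of repeated sample means   :", round(statistics.stdev(sample_means), 3))  # ~1/sqrt(30), ~0.18

# The 95% CI half-width, 1.96 * s / sqrt(n), matches the narrow spread of the
# sample means, so it cannot be expected to cover 95% of the observations.
print("CI half-width, 1.96*s/sqrt(n) :", round(1.96 * statistics.stdev(observations) / n ** 0.5, 3))
```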

    These conundrums, resulting from asking inconvenient questions, do not disappear if the questions are not asked. It is said that a good carpenter knows their tools. Note, this is not to say that the carpenter's tools are without flaws, deficits, and limitations. Rather, a good carpenter recognizes the flaws, deficits, and limitations in order to mitigate their effects. The good carpenter does not just say that their tools are good enough or close enough.

    Consider a research study where agent $X$ is applied to a sample and an outcome measure is observed for each of the 30 subjects. The series of 30 outcome measures from the study is as follows: 1.213065843, -1.408232038, -0.781683411, -0.719130639, 0.420620836, -1.308030733, 0.432603429, 0.90035428, 1.082294148, 0.013502586, 2.24195901, 0.089507921, 0.442114469, -1.093630999, 2.409105946, 0.151316044, -0.598211045, 1.40125394, 2.060951374, -0.734472678, -0.189817229, 0.714876478, -0.763639036, -0.713001782, -1.565367711, -0.416446255, 0.65168706, 1.221249022, 0.268369149, and 0.81556891. In this hypothetical case, the 30 measures were drawn by a random number generator using a standard normal distribution with an arithmetic mean of 0 and a standard deviation of 1; they are shown in Fig. 5. Typically, the next step is to determine the descriptive statistics of this sample, as shown in Table 1.

    Figure 5 Equal interval histogram of the distribution of the 30 observations. Note, the 95% confidence interval, 95% CI, does not cover 95% of the observations. Rather, it predicts the range that would be occupied by 95% of the arithmetic means on repeated sampling.

    Table 1 Descriptive statistics of the sample.
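
    The descriptive statistics can be recomputed directly from the 30 values listed above; a minimal sketch follows (the particular statistics printed are my choice and need not match the exact contents of Table 1).

```python
import statistics

# The 30 outcome measures listed in the text.
values = [
    1.213065843, -1.408232038, -0.781683411, -0.719130639, 0.420620836,
    -1.308030733, 0.432603429, 0.90035428, 1.082294148, 0.013502586,
    2.24195901, 0.089507921, 0.442114469, -1.093630999, 2.409105946,
    0.151316044, -0.598211045, 1.40125394, 2.060951374, -0.734472678,
    -0.189817229, 0.714876478, -0.763639036, -0.713001782, -1.565367711,
    -0.416446255, 0.65168706, 1.221249022, 0.268369149, 0.81556891,
]

n = len(values)
mean = statistics.mean(values)     # arithmetic mean, ~0.207957896 as cited in the text
sd = statistics.stdev(values)      # sample standard deviation
sem = sd / n ** 0.5                # standard error of the mean

print(n, round(mean, 9), round(sd, 4), round(sem, 4))
```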

    What can be said about the results? A typical response is that agent $X$ produced an arithmetic mean outcome of 0.207957896. But note, none of the observations was equal to 0.207957896. If the observations are real, that is, what actually was observed, then what is one to believe about the arithmetic mean of 0.207957896? Now, most scientists would be loath to say that the arithmetic mean is nonsensical, artificial, or not meaningful. Indeed, the history of statistics has been that the Central Tendency, in this case the arithmetic mean, is real, perhaps more real than the actual observations. In other words, the arithmetic mean becomes the really real.

    But how is the scientist to have confidence that the arithmetic mean of 0.207957896 is really real? The key is reproducibility. The experiment is repeated on a second sample. In this case, the arithmetic mean of sample2 is not likely to be exactly 0.207957896, the arithmetic mean of sample1. Left at a single repeat, it would be hard to have confidence that either mean is really real. But what if the experiment were repeated 100 times? Each sample mean would likely fall within a specified range 95% of the time. That range provides the 95% confidence interval. The specific percentage for the confidence interval is user defined; it is not a direct consequence of the data. In other words, the data are not screaming out use 95%!

    The question arises: is it feasible, or a wise use of resources, to repeat the exact same experiment 100 times in order to directly determine the range that will contain 95% of the sample means? One might remark that it is not necessary to repeat the experiment 100 times. One can calculate the 95% confidence interval just from the sample mean, $\bar{x}$, the standard deviation, $s$, and the sample size, $n$, after choosing a percentage that translates to a critical value, $t^{*}$, for the confidence interval according to the equation $\bar{x} \pm t^{*}\frac{s}{\sqrt{n}}$.
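
    A sketch of that calculation follows (the critical value 2.045 is the conventional 97.5th percentile of the t-distribution with 29 degrees of freedom; the standard deviation shown is a placeholder to be replaced by the value computed from the data, as in the previous sketch).

```python
def ci95(xbar, s, n, t_crit=2.045):
    """95% confidence interval for the mean: xbar +/- t* * s / sqrt(n).
    t_crit = 2.045 applies to n - 1 = 29 degrees of freedom."""
    half_width = t_crit * s / n ** 0.5
    return xbar - half_width, xbar + half_width

# xbar is the arithmetic mean cited in the text; s = 1.0 is a placeholder, not a
# value from the book, and should be replaced by the computed standard deviation.
print(ci95(xbar=0.207957896, s=1.0, n=30))
```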

    Then the question is asked: where did $t^{*}$ come from? It does not appear to be in the actual data or derivable from the actual data. It turns out that $t^{*}$ is the critical value from an appropriated, idealized statistical distribution that is a variation on the standard normal distribution determined by the degrees of freedom, such as $n-1$. As an aside, there is the Pearson distribution, a family of distributions obtained by modification of descriptive parameters, particularly skewness and kurtosis, that includes the continuous distributions of concern here, such as the normal, standard normal, $t$-, $F$-, or χ² distributions, among others. Note, the $t$-distribution and the binomial distribution approximate a standard normal distribution as the sample size increases, suggesting a deeper relation among the statistical distributions. But the question is demanded: what justifies assuming something like a $t$-distribution or a normal distribution? Note, it cannot just be assumed from the actual distribution of the observations that it follows a normal distribution. For example, the distribution in Fig. 5 would take a lot of eye squinting in order to say yes, this is normally distributed.

    An answer comes from history and from utility. It turns out that a lot of measurements, of planetary motion for example, follow a normal distribution, suggesting that the normal distribution is a natural phenomenon, representing both reality and an epistemological tool. But how does the fact that measurements cluster around a central value, subsequently taken as the Central Tendency, such as the arithmetic mean, justify taking that Central Tendency as any truer than any of the other measurements? How does one know whether the clustering around the Central Tendency is an artifact of the methods of measurement? The fact that one can ask such questions, and the fact that answers are not given by the actual experiences, place the science at an epistemic risk, which creates risks for unproductive irreproducibility. This second edition attempts to help resolve, or at least shed some light on, the conundrum. The hope is that, in doing so, biomedical researchers will be able to continue the important work of knowledge building more productively.

    Also, as will be discussed in Chapter 5 and Appendix B, the statistical distributions turn out to be quite useful in getting rid of things like confounds, biological variability, and noise. An example is presuming or constructing biological variability, noise, and confounds to follow a standard normal distribution. In a sense, resorting to a normal distribution, standard or incomplete, such as a t-distribution with relatively few degrees of freedom, is like a universal statistical fudge factor. But how is that acceptable? If one commits a crime that has utility in gaining ill-gotten goods, and one does not get caught, is that acceptable? There is a high probability that the thief will be caught someday and stand trial for unproductive irreproducibility.

    Resorting to the normal distribution, standard or incomplete, can be seen as a fudge factor, as the factor is extraneous to and not justified by any of the actual observations. The fact that it works, meaning it makes humans more comfortable, is hardly reassuring, either epistemologically or ontologically. But sometimes a fudge factor may be a lucky guess or at least inspire further insights. Consider the Cosmological Constant that Albert Einstein introduced into his theory of general relativity in order to keep his model of the universe static (O'Raifeartaigh, 2017). The presumption was that the universe was static. It was not until Edwin Hubble demonstrated that the universe is expanding (and, as shown much later, at an accelerating pace) that the Cosmological Constant was called into question (although, to be fair, it was suspect even to Einstein). Einstein was reported to have said that the Cosmological Constant was his biggest blunder. But note, one wonders whether this was the scientific analog of a deathbed confession once Hubble empirically proved Einstein wrong. Nonetheless, the jarring contrast between Einstein's intellect and Hubble's empirical observations seemed to spur further investigations into what may be Dark Energy and quantum vacuum energy. Maybe the same applies to critiques of traditional experimental design and statistical analyses, which may spur future and more realistic approaches.

    Appealing to the normal distribution, or any other member of the Pearson family of distributions, seems to be the modus operandi of traditional statistical analyses. If the observations are not normally distributed, then normalize the observations by some transformation, although doing so further distances the statistical results from reality. Reducing the risk of stroke would seem more straightforward and informative than reducing the log of the stroke risk. If the actual observations are not normal and there does not appear to be any means to normalize the distribution, such as when the distribution is uniform, then just study a derivative statistic, such as the arithmetic mean, standard deviation, $t$-statistic, $F$-statistic, or $z$-statistic, among others. One can always appeal to the Central Limit Theorem.
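
    As a brief illustration of that appeal (a sketch with invented parameters), means of samples drawn from a decidedly non-normal, uniform distribution are themselves approximately normally distributed.

```python
import random, statistics

random.seed(0)

# Means of repeated samples from a uniform (non-normal) distribution.
sample_means = [statistics.mean(random.uniform(0, 1) for _ in range(30))
                for _ in range(2000)]

print("mean of sample means:", round(statistics.mean(sample_means), 3))   # ~0.5
print("SD of sample means  :", round(statistics.stdev(sample_means), 3))  # ~sqrt(1/12)/sqrt(30), ~0.05

# The distribution of these means is roughly bell-shaped even though the
# underlying observations are uniform; that is what the Central Limit Theorem licenses.
```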

    These issues are examined throughout this second edition. One challenge that is still very much at the thinking-out-loud stage examines the emergence of the normal distribution in terms of complex nonlinear systems that display Chaos and Complexity, in the context of Information-theoretic Incompleteness, of which the Heisenberg Uncertainty Principle, the Halting Problem in computer science, and Gödel's Number-theoretic Incompleteness theorem are examples. The concept of Brownian motion serves as a bridge between statistical mechanics, Chaos and Complexity, and Incompleteness. Note, the claim to thinking out loud is not the same as, or anywhere near, saying this is the case. However, if one cannot show the thinking out loud to be productively irreproducible, as in the case of modus tollens or Reductio ad Absurdum argumentation, perhaps the thinking out loud should continue, considering what is at stake: continued unproductive irreproducibility in biomedical research.

    Randomness as fundamental and foundational

    There is another way to think about statistics, where statistical significance comes first from demonstrating that the phenomenon is not random. In other words, the demonstration of nonrandomness demonstrates that structure and Information exist simultaneously in the phenomenon. Methods and approaches can be borrowed from statistical mechanics, particularly ways to study randomness and departures from randomness.

    The physical notion of randomness is inherent in the notion of entropy, such as in thermodynamics, and in methods for quantitating entropy, such as the Boltzmann−Gibbs−Shannon entropy, which can be compared among various experimental conditions. One version of Information theory is Shannon's entropy, $H$, given by Eq. (1),

    $$H(X) = -\sum_{i=1}^{n} p(x_i)\,\log_2 p(x_i) \tag{1}$$

    where there is an array of data points, such as a vector, $X$, comprising a series of bits of information, $x_i$, and each bit of information can take one of $n$ states, corresponding to $x_1, x_2, \ldots, x_n$. The value of $p(x_i)$ is the probability of that unique state among the set of bits that constitute the observations. Consider a string of bits, such as 10011100101001, where 14 bits are represented by two states, 1 or 0. In this case, there are seven 1's and seven 0's, and thus the probability of each state value is 0.5. Shannon's entropy, $H$, is 1 when randomness is complete, which is the case when the probabilities of each state are equal, as in the example. Potential uses of Shannon's entropy are discussed in Appendix C.
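
    A small sketch (mine, not code from the book) computes Eq. (1) for the bit string used in the example; with seven 1's and seven 0's, the entropy is 1, the value associated with complete randomness for two equiprobable states.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy H = -sum p(x_i) * log2 p(x_i), in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(shannon_entropy("10011100101001"))   # 1.0: two equiprobable states, maximal entropy
```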

    Any statistical distribution that is not uniform will not be random but will have structure and thus Information. The normal distribution of some metric, for example, of the population, is not random, as the probability of all possible states is not equal. Thus, selecting a sample from the nonrandom population presents challenges. The goal of biomedical experimentation is to have a sample that is representative of the population, which can best be obtained by selecting from the population randomly. But note, the selection is not being made from a random distribution. Only the process of selecting data points from the population is held random. This is the notion of stochasticity. But what if there is no such thing as a random selection process and therefore no stochasticity? This issue is addressed in Appendix C.

    There is another possible notion of stochasticity that should not be conflated with randomness but rather reflects unpredictability. This notion sounds counterintuitive: what is unpredictable but not random? One answer is systems that are Chaotic and Complex and that self-organize. The self-organization assures that the system is not random. The system will have structure, but the structure is not predictable, certainly not by the methods of traditional experimental design and analysis. For example, the structure of a snowflake is not random, as the formation of a snowflake follows deterministic physics. The physical laws do not contain random variables whose values are held to be determined stochastically, in the sense that a value is selected randomly from a nonrandom distribution. Yet, the exact structure, or nonrandomness, cannot be predicted, other than that there will be six points in each snowflake.
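
    The point that a system can be fully deterministic yet unpredictable in practice can be illustrated with the logistic map, a standard example of my choosing rather than one from the book: two trajectories started almost identically diverge within a few dozen steps.

```python
def logistic(x, r=4.0):
    # Fully deterministic update rule; no random variables anywhere.
    return r * x * (1.0 - x)

x, y = 0.2, 0.2 + 1e-10                 # nearly identical initial conditions
for step in range(1, 51):
    x, y = logistic(x), logistic(y)
    if step % 10 == 0:
        print(f"step {step:2d}: |difference| = {abs(x - y):.6f}")

# The difference grows from about 1e-10 toward order 1: deterministic and
# structured, yet unpredictable in practice beyond a short horizon.
```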

    As can be appreciated, Chaos and Complexity present extraordinary challenges to traditional experimental designs and analyses in biomedical research. Yet, these challenges cannot be ignored, as the principles underlying Chaos and Complexity are far more biologically realistic than the presumptions and assumptions underlying traditional experimental design and analysis. As fearful as stepping away from comfortable and useful past modes of thinking may be, it is hard to see how the true scientist can avoid it in the future.

    Finally…

    Let whatever modicum of success I have achieved in the works I have written be a hope, perhaps faint, for those who struggle with the written word as I have. As Albert Einstein was reported to have said, "I very rarely think in words at all. A thought comes, and I may try to express it in words afterwards." I doubt there are words that can adequately see in $n$-dimensional space or see a chiliagon (a regular polygon with 1000 sides). It is unlikely any artist can draw a chiliagon in spatial dimensions resolvable to the human eye, but there it is in my mind's eye. Indeed, calculus allowed me to see an infinitely many-sided polygon just prior to and at the moment of becoming a circle. Logic allows me to see the common denominator for all knowledge.

    It should not be hard to imagine (I do not need to plead my case) the hardship a child, and then an adult, faces when their ability to write fails the promise of their intellect. Expectations are low and, correspondingly, so is respect. Beautiful and wondrous ideas in one's mind are lost in translation to the written word, yet they are discoverable with effort, patience, and a willingness to discuss. Perhaps the many readers not willing to work past the words to see the beauty and wonder are not to be blamed, but just think of what may have been lost.

    I was fortunate to find refuge in science, mathematics, and logic and then to find a profession that demanded relatively little of my otherwise poor writing, which was beyond rehabilitation. For me, science, mathematics, and logic greatly expanded what I was able to see in my mind's eye. It was some solace that I finally was found to have dyslexia and a developmental language disorder, although many years after the fact. At least in my mind, the stigma was reduced, even though working with others was no easier. Strikingly, when applying for grants, I once bucked up the courage to admit my language problems. The response was to give me a few more weeks to submit the written grant application, perhaps with the hope that, in the interim, my dyslexia and developmental language disorder would heal spontaneously. I have not asked for any allowances since.

    Late in my career, I gained the means to get help on my own terms. I was able to work with a copy editor who translated my dyslexic efforts into relatively understandable English prose. Melissa Revell has helped me over many years, including with this effort, for which I am ever grateful. The great dread of red ink has been relieved by her kindness, generosity, and now friendship.

    I thank Andre Wolf, of Academic Press, as it was his suggestion and encouragement to undertake this second edition. I am not sure that Andre appreciated, prior to his suggestion, the very different and challenging direction this second edition would take, but his encouragement not only persisted but expanded.

    Finally, I would like to recognize and acknowledge that I live and work on the traditional territories of the Mississauga and Haudenosaunee nations for which I am truly grateful. As a new Canadian citizen, I acknowledge my obligations to the aboriginal and treaty rights of First Nations, Inuit and Métis peoples affirmed by section 35 of the Constitution Act, 1982, and I advocate for the Truth and Reconciliation Calls to Action. As I try to be a good physician, teacher, and citizen, I could do no less.

    Preface to the first edition

    There is a problem in biomedical research. Whether it is a crisis and whether it is a new problem or an old endemic problem newly recognized are open questions. Whether it represents an ominous turn of events or merely a hiccup in the self-correcting process of biomedical research also is an open question. At the very minimum, it may be just a crisis of confidence. But it seems to have captured the imagination and concern of journal editors and administrators of research-granting institutions and, consequently, should be of concern to everyone involved in biomedical research.

    The problem is the failure of reproducibility of biomedical research. Indeed, as will be discussed in this book, there is an appreciation of local or within-experiment reproducibility in that virtually every experiment involves more than one observation or trial. Just as virtually every experiment, at least implicitly, appreciates the importance of replication within itself, the same concerns apply to the larger issue of reproducibility across different experiments by different researchers.

    Even when the failure of reproducibility is defined narrowly, such as a failure to achieve the same results when other researchers independently replicate the experiment—the narrow sense of irreproducibility—there are concerns. (Note that narrow reproducibility is different from local reproducibility, described previously, and broad reproducibility, described later.) This narrow sense often focuses on issues of fraud, transparency, reagents, materials, methods, and statistical analyses for which better policing would solve the problem. Perhaps these may be the major factors, but they may not be the only factors. None of this is to deny the importance of fraud, transparency, reagents, materials, methods, and statistical analyses, but at the same time it is possible that some causes of irreproducibility may be in the logic inherent in the research studies. Indeed, numerous examples will be presented and, thus, there is an obligation to carefully consider the logical basis of any experiment. Also, absent from the discussion is the fact that irreproducibility in a productive sense is fundamental to scientific progress. Such consideration is the central theme of this book.

    Generally, results of studies are dichotomized into positive and negative studies. Irreproducibility affects positive and negative studies differently. Most of the debate centers on positive claims later demonstrated to be false, examples of a type I error—claiming as true what is false. In statistics, a type I error is occasioned by inappropriately rejecting the null hypothesis, which posits no difference in some measured phenomenon between the experimental and the control samples. Understandably, type I errors shake confidence and risk misdirecting subsequent research.
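    The arithmetic of a type I error can be made concrete with a brief simulation. What follows is only a minimal sketch, assuming Python with NumPy and SciPy and purely illustrative sample sizes; it is not drawn from the chapters that follow. When the null hypothesis is in fact true, testing at the conventional threshold of 0.05 will nonetheless “discover” an effect in roughly 5% of experiments.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 10_000  # repeated experiments in which the null hypothesis is true
n_per_group = 20        # illustrative number of subjects per group
alpha = 0.05

false_positives = 0
for _ in range(n_experiments):
    # Both groups are drawn from the same distribution, so no true effect exists.
    control = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    treated = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    _, p_value = stats.ttest_ind(control, treated)
    if p_value < alpha:
        false_positives += 1  # a type I error: a true null hypothesis was rejected

print(f"Observed type I error rate: {false_positives / n_experiments:.3f}")  # close to 0.05

    Each “significant” result in such a simulation claims a difference where none exists; scaled to a literature of thousands of experiments, some such false positives are inevitable even with impeccable conduct.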

    Perhaps even more of a problem are type II errors, in which a study falsely claims that there is no change in a phenomenon as a result of an experimental manipulation, or no difference between phenomena, merely because the null hypothesis could not be rejected. It is quite possible that the experiment, by design or statistical circumstance, had only a low probability of being able to reject the null hypothesis. A better term for such studies is null studies, as the term negative implies an inference made with some degree of confidence, even if undefined. The null study contrasts with the situation in which the null hypothesis is rejected in a setting of sufficient statistical power and thus a positive claim of no difference can be made with confidence (equivalence, noninferiority, or nonsuperiority studies). The latter is properly termed a negative study and provides confidence, just as the modus tollens form of the Scientific Method provides certainty, as will be discussed in the text. The problem of null studies is magnified by the bias against publishing them and by the fact that, when published, they are typically mistaken for negative studies. Type II errors in null studies may result in lost opportunities by discouraging further investigations.
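    The role of statistical power can likewise be made concrete. In the sketch below, again merely illustrative and assuming Python with NumPy and SciPy, a real but modest effect is studied with a small sample; most such experiments fail to reject the null hypothesis, yet it would be a mistake to read each failure as a demonstration that no effect exists.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_experiments = 10_000
n_per_group = 20    # a small, illustrative sample
true_effect = 0.3   # a real but modest shift in means, in standard-deviation units
alpha = 0.05

rejections = 0
for _ in range(n_experiments):
    control = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    treated = rng.normal(loc=true_effect, scale=1.0, size=n_per_group)
    _, p_value = stats.ttest_ind(control, treated)
    if p_value < alpha:
        rejections += 1

power = rejections / n_experiments
print(f"Statistical power: {power:.2f}; type II error rate: {1 - power:.2f}")
# With these assumed numbers, the power is only about 0.15, so roughly 85% of
# such studies end as null studies despite a genuine underlying effect.

    Under these assumed numbers, a failure to reject says more about the design than about the phenomenon, which is precisely why a null study should not be promoted to a negative study.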

    Replicability, the narrow sense of reproducibility, is of primary importance, as it is the first requirement. The author agrees with those who are concerned about irreproducibility in the narrow sense and indeed supports proposals for the policing of fraud; transparency; reporting of reagents, materials, and methods; and robust statistical analyses—but it does not and should not stop there. There is the temptation to dismiss concerns about irreproducibility in the narrow sense, believing it to be acceptably low and little more than the cost of doing scientific business. Perhaps so, if these irreproducible studies were merely outliers or flukes and the vetting process for awarding grants and accepting papers for publication were otherwise effective. However, one must remind oneself that many of the papers describing research subsequently found to be irreproducible were vetted by experts in the field. What does this say about such expertise or the process?

    There may be another view, the broader sense of irreproducibility, that focuses not on the exact replication of a specific experiment but rather on the failure to generalize or translate. Such studies may be called conceptually irreproducible. Of particular concern is the failure to generalize or translate from nonhuman studies to the human condition; after all, is that not the raison d’être of the National Institutes of Health? But even further, is it not a founding pillar of modern science’s Reductionism, in which there is faith in the ability to generalize and translate from the reduced and simplified? When considered in the broader sense, there is far more evidence for and concern about these forms of irreproducibility. One need look no further than the postmarket withdrawal of drugs, biologics, and devices by organizations such as the US Food and Drug Administration (FDA), whose preapproval positive clinical trials were supposedly vetted by experts and found valid. Only later were the efficacy and safety conclusions found irreproducible. This is not without consequences for patients.

    The lives of biomedical researchers would be made much easier if irreproducibility were merely the result of fraud; improper use of statistics; lack of transparency; or failure to report reagents, materials, and methods. But what if there are other causes of irreproducibility fundamental to the paradigms of biomedical research, such as the Scientific Method (hypothesis-driven research) and statistics? As will be argued in this text, much of scientific progress requires the use of logical fallacies in order to gain new knowledge. Strict deduction, while providing the greatest certainty in its conclusions, does not in an important sense create new knowledge, and induction to new knowledge is problematic. Traditional valid logical deduction is the logic of certainty; it is not the logic of discovery. Indeed, the Scientific Method is an example of the logical Fallacy of Confirming the Consequence (or Consequent), also known as the Fallacy of Affirming the Consequence (or Consequent). The logic of the Scientific Method is also referred to as abduction. This claim itself is not controversial, as both scientists and philosophers have pointed it out for decades. What is novel here is the demonstration that this fallacy is a cause of irreproducibility in biomedical research. Fortunately, if the fallacy is recognized, there are methods to blunt its effect, thereby reducing the risk of unproductive irreproducibility. What is needed, it will be argued, is the judicious use of logical fallacies.
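    The contrast can be shown schematically. The display below is offered only as a standard textbook rendering, not as the book’s own notation; H stands for a hypothesis and O for an observation predicted by H.

\[
\begin{aligned}
\textit{modus ponens (valid):}\quad & H \rightarrow O,\ H, & \therefore\ O\\
\textit{modus tollens (valid):}\quad & H \rightarrow O,\ \neg O, & \therefore\ \neg H\\
\textit{affirming the consequent (fallacy):}\quad & H \rightarrow O,\ O, & \therefore\ H
\end{aligned}
\]

    Confirming a hypothesis because its predicted observation occurred takes the third form; the observation O may have arisen from causes other than H, and that possibility is one route by which the fallacy opens the door to irreproducibility.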

    Perhaps novel and counterintuitive, at least from the perspective of some scientists, is the idea that the discipline of logic, typically in the domain of philosophy, could be relevant to empirical biomedical research. Scientists from the beginning of modern science, as seen in the founders of the Royal Society, rejected philosophy by taking aim at the scholastic metaphysician natural philosophers. While scientists as late as the early 1900s invoked past philosophers in scientific discussions, as Sir Charles Sherrington did with René Descartes in his text The Integrative Action of the Nervous System in 1906, such discussions are virtually absent from the literature typically read by today’s biomedical researcher. Thus, it is understandable that biomedical researchers would be skeptical of any argument that proceeds from anything that smacks of philosophy, such as logic. But lack of experience in logic and, more generally, in epistemology (which concerns how knowledge is gained rather than the content of knowledge) is a poor basis for skepticism. All this author can do is ask for forbearance, as the author is confident that such patience will be rewarded.

    This author’s ambitions for any role of logic in this particular text are circumscribed. It is critical to appreciate that logic alone does not create biomedical scientific knowledge. Science is fundamentally empiric and its success or failure ultimately relies on observation, data, and demonstration. All logic can do is provide some degree of certainty to the experimental design and analytical methods that drive claims of new scientific knowledge. Yet, the issues of reproducibility fundamentally involve issues of certainty, as reproducibility is at its core a testament to certainty. On this basis alone, logic has a role to play in concerns about scientific reproducibility. The discussions in this text are intended to strengthen, perhaps by just a bit, the already strong and important position of empirical biomedical research.

    It may well be that an experiment is so obvious that no concerns about the underlying logic are raised. However, the success of such an experiment is not evidence that logic is not operating. For example, consider the Human Genome Project, which has been called a descriptive research program in contrast to a hypothesis-driven program. Perhaps it could be argued that the domain of the research was clearly defined and marked, that is, the human genome, thereby obviating any inductive ambiguity. Essentially, the Human Genome Project consisted of turning the crank. Interestingly, the project set the stage for subsequent hypothesis-driven research (Verma, 2002), for example, research positing that gene A causes disease B such that affecting gene A cures disease B. In that regard, hypothesis-driven research has not fared well, given only two FDA-approved gene therapies (as opposed to genetic testing) despite the estimated $3 billion spent on the Human Genome Project. It is important to note that this is not a criticism of the Human Genome Project and that there is every reason to believe the results will change medical therapies dramatically, but it will take time, because it is difficult to go from data collection to the cause-and-effect claims required of the hypothetico-deductive approach critical to biomedical research.

    Further attesting to the potential contributions of logic is the fundamental fallacy inherent in statistics as used in biomedical research, which is the Fallacy of Four Terms. Experimental designs typically involve hypothesis testing on a sample thought representative of the population of concern, with inferences from the findings on the sample transferred to the population through a syllogistic deduction. The sample comprises the entities studied directly, for example, a group of patients, in an effort to understand all patients, which is the population. There are many reasons why all patients cannot be studied, and thus scientists have little choice but to study a sample. Consider the example: disease A in a sample of patients is cured by treatment B; the population of those with disease A is the same as the sample of patients; therefore, the population of patients with disease A will be cured by treatment B. The syllogism is extended to my patient is the same as the population of patients with disease A, and therefore my patient will be cured by treatment B. However, it is clear that there are many examples where my patient was not cured by treatment B. Indeed, this would be an example of irreproducibility in the broad sense, and one need only look to the drugs, biologics, and devices recalled by the FDA to see this is true. Something must be amiss. The majority of pivotal phase 3 trials that initially garnered FDA approval and were later abandoned were not likely to have been type I errors based on fraud; lack of transparency; failure to report reagents, materials, or methods; or statistical flaws within the studies.
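    Laid out as a syllogism (rendered here only as an illustrative schema, not the author’s formal notation), the example just given makes the hidden fourth term easier to see:

    Premise 1: Patients in the sample with disease A are cured by treatment B.
    Premise 2: The population of patients with disease A is the same as the sample.
    Conclusion: Patients in the population with disease A will be cured by treatment B.

    A valid categorical syllogism admits exactly three terms, yet “the sample” and “the population” remain distinct terms bridged only by the second premise’s claim of sameness, which is exactly what the statistics must warrant; when the sample is not truly representative, the bridge fails, the conclusion does not transfer, and the Fallacy of Four Terms has done its work.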

    There are a number of other fallacies to which the scientific enterprise is heir. These include the Fallacy of Pseudotransitivity, which affects the formulation of hypotheses critical to
