Trolley Crash: Approaching Key Metrics for Ethical AI Practitioners, Researchers, and Policy Makers
About this ebook
The prolific deployment of Artificial Intelligence (AI) across different fields has introduced novel challenges for AI developers and researchers. AI is permeating decision making for the masses, and its applications range from self-driving automobiles to financial loan approvals. With AI making decisions that have ethical implications, responsibilities are now being pushed to AI designers who may be far-removed from how, where, and when these ethical decisions occur. Trolley Crash: Approaching Key Metrics for Ethical AI Practitioners, Researchers, and Policy Makers provides audiences with a catalogue of perspectives and methodologies from the latest research in ethical computing. This work integrates philosophical and computational approaches into a unified framework for ethical reasoning in the current AI landscape, specifically focusing on approaches for developing metrics. Written for AI researchers, ethicists, computer scientists, software engineers, operations researchers, and autonomous systems designers and developers, Trolley Crash will be a welcome reference for those who wish to better understand metrics for ethical reasoning in autonomous systems and related computational applications.
- Presents a comparison between human oversight and ethical simulation in robots
- Introduces approaches for measuring, evaluating, and auditing ethical AI
- Investigates how AI and technology are changing human behavior
Trolley Crash - Peggy Wu
Chapter One: Introduction
Michael R. Salpukas (a); Peggy Wu (b); Shannon Ellsworth (a); Hsin-Fu ‘Sinker’ Wu (c)
(a) Raytheon | an RTX Business, Woburn, MA, United States
(b) RTX Technology Research Center, East Hartford, CT, United States
(c) Raytheon | an RTX Business, Tucson, AZ, United States
Keywords
Artificial intelligence; assurance; ethical; metrics; strategy; regulation; requirements; robustness; government; safety; fairness
1.1 Ethical AI introduction
This compendium contains the proceedings and selected articles from the Association for the Advancement of Artificial Intelligence (AAAI) Spring Symposium 2022, "Ethical Computing: Metrics for Measuring AI's Proficiency and Competency for Ethical Reasoning." The Symposium attracted a diverse assemblage of academic disciplines, ranging from computer science, engineering, and mathematics to sociology, law, and moral philosophy, to address the challenge of how to evaluate the ethical performance of artificial intelligence (AI). This comprehensive collection of expertise is required to extract the nuances of how human, societal, institutional, and artificial intelligence ethics are coevolving and how to create metrics to track their interaction.
1.2 Why ethical AI metrics?
The public reaction to "AI gone rogue" has highlighted the dangers of data-driven modeling with imbalanced reward functions. Examples of adverse events across a swath of major companies (Google Image Recognition, Facebook/Cambridge Analytica, Microsoft Tay [1,2], COMPAS, etc.) indicate that the problem is both widespread and difficult to solve. Even absent malicious intent, the perception and liability of damaging impact have increased pressure on commercial and government entities to address the ethics of AI. AI ethics problem bounties, similar to software bug bounties, are one proposed solution, but whereas software faults are easy to evaluate once detected, ethics faults are much less so. Ethical AI metrics would enable the development of an accurate ethics bounty system [3], though the bounty model has its own ethical traps.
A more formal need for ethical AI metrics arises from the US government, which declared that ethical AI would be a strategic pillar via the National AI R&D Strategic Plan (2019 Update) [4]:
Strategy 3: Understand and address the ethical, legal, and societal implications of AI. We expect AI technologies to behave according to the formal and informal norms to which we hold our fellow humans. Research is needed to understand the ethical, legal, and social implications of AI, and to develop methods for designing AI systems that align with ethical, legal, and societal goals.
Discussions with government contracting agencies suggested that ethical AI could become a formal requirement for future programs. This implies that there will be a formal testing procedure to confirm that a developed AI system conforms to ethical standards prior to acceptance and fielding. Tests must be developed and agreed to far in advance of the test events, which means that developers and government test reviewers need ethical AI metrics on a schedule that supports new contract development. These were just two of the increasingly urgent demands for ethical AI metrics that prompted the original discussions culminating in this Symposium.
1.3 Ethical AI metric development
Beyond the already daunting technical complexity of measuring AI against static accuracy and robustness criteria, ethical AI is evaluated against "the formal and informal norms to which we hold our fellow humans" [4], which naturally drift over time. In addition, there exists an ongoing feedback loop of accelerating social acceptance of more invasive AI as exposure, interaction, and optimized reward hacking increase [5–7]. Measuring and mitigating this effect was the focus of one of the papers from the 2022 AI Ethics Symposium: "Boiling the Frog: Ethical Leniency due to Prior Exposure to Technology."
The experimental subjects were surprisingly willing to use AI-based emotion detectors to their advantage. This drift in ethical attitudes towards the use of AI imposes a need to measure changes in social norms over time and possibly model the feedback between AI and the humans it (currently) serves.
The scope of this ethical drift and feedback is easier to detect when viewed as discrete changes across longer gaps in time. If we reach back to the 2009 AAAI Asilomar Study on Long-Term AI Futures [8], the participants were themselves looking back on the order of decades to prior milestones to spot long-term trends, and we can expect that future AAAI Ethics Symposia will continue this temporal induction as they look back at these proceedings. The Asilomar authors quote a 1994 AAAI ethics paper: "Society will reject autonomous agents unless we have some credible means of making them safe!" [9]. They seem to accept this statement without irony as they introduce their safety discussion. Viewed with the advantage of hindsight, by 2009 the first iPhone had been released two years earlier, Facebook had launched five years earlier, and society was already accepting greater invasions of data privacy. Predator drones had been in production since 1997, and drone pilots were discovering the discordance of flying combat missions remotely and returning home at night. The difference between the level of oversight expected by the conference goers and the public is addressed by three sections in our proceedings: "Meaningful Metrics for Demonstrating Ethical Supervision of Unmanned Systems: Initial Thoughts on How to Define, Model, and Measure Them," "Risk-Based Continuous Audit Approach to AI Systems Ethical Compliance," and "Meaningful Human Control and Ethical Neglect Tolerance."
Society was already accepting these nascent autonomous agents, even though we certainly did not have the safeguards demanded by Weld and Etzioni. As experts in their field, the 1994 and 2009 authors were likely projecting their own level of concern onto the general masses, who were evolving to become much more permissive than expected. The social reward hacking had already begun in social media, and the increased physical safety of remote pilots over live pilots offset initial autonomy concerns. The opportunities afforded by technical innovation, especially following the landmark success of AlexNet in 2012 [10], likely also increased the rate of change through fear of missing out (FOMO).
Both the 2009 Asilomar authors and the 1994 "The First Law of Robotics (a call to arms)" paper quote Isaac Asimov's Three (later Four) Laws of Robotics heavily.¹ The AI ethical challenges in Asimov's Robot corpus were generally simpler than those analyzed by AAAI, but Asimov places far more focus on the impact of robots on humans and society and on the struggle of autonomous robots to judge the ethics of their own actions. The robots' ethical debates that serve as a literary device are echoed and extended in the following studies from this year's program: "Automated Ethical Reasoners Must Be Interpretation-Capable," "Towards Unifying the Descriptive and Prescriptive for Machine Ethics," and "Building Competent Ethical Reasoning in Robot Applications: Inner Dialog as a Step Towards Artificial Phronesis."
This ability to evaluate ethical dilemmas and explain outcomes is particularly important as AI becomes more involved in engineering design and can eventually self-replicate and self-evolve: "…[A]n ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion', and the intelligence of man would be left far behind" [11].
Against this historical backdrop, the rate of current real-world, real-time autonomous AI tests and evaluations might shock the more conservative participants from prior decades. Autonomous vehicle testing in the real world accelerated much faster than most expected, given the legal and liability concerns. Using simple metrics, Tesla claims its cars using its Autopilot features are safer than others, "reporting drivers using Autopilot got in an accident once every 4.31 million miles in the fourth quarter of 2021, far outperforming the NHTSA's national average of an accident every 484,000 miles" [12]. The miles driven by Autopilot are likely less complex than general mileage, so the comparison is questionable, but the performance is still quite an achievement. Walter Huang, who died in a Tesla Autopilot crash while distracted by a video game, provides a cautionary tale of small AI ethics failures compounding [13]. The scenario, in which the reward hacking of the video game was compelling enough to convince Mr. Huang to place complete trust in Autopilot, and in which even minor driver oversight might have prevented his premature demise, is clearly one to be recounted for future testing. The evaluation of this level of trust is explored in "Autonomy Compliance with Doctrine and Ethics Ontological Frameworks" and "A Tiered Approach for Ethical AI Evaluation Metrics."
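As a rough check on the figures quoted above, the implied rate ratio can be computed directly. This is a minimal sketch using only the reported numbers; it deliberately ignores the confounds already noted in the text, such as Autopilot miles skewing toward simpler highway driving.

```python
# Reported accident intervals (miles between accidents), taken from the
# figures quoted in the text; these are claims, not independently
# verified data.
AUTOPILOT_MILES_PER_ACCIDENT = 4_310_000  # Tesla, Q4 2021
NATIONAL_MILES_PER_ACCIDENT = 484_000     # NHTSA national average

def accidents_per_million_miles(miles_per_accident: float) -> float:
    """Convert a miles-between-accidents interval into an accident rate."""
    return 1_000_000 / miles_per_accident

autopilot_rate = accidents_per_million_miles(AUTOPILOT_MILES_PER_ACCIDENT)
national_rate = accidents_per_million_miles(NATIONAL_MILES_PER_ACCIDENT)

# The headline comparison: how many times lower is the Autopilot rate?
safety_factor = national_rate / autopilot_rate
print(f"Autopilot: {autopilot_rate:.2f} accidents per million miles")
print(f"National:  {national_rate:.2f} accidents per million miles")
print(f"Implied factor: {safety_factor:.1f}x")
```

The roughly 8.9x factor is only as meaningful as the underlying comparison; because the two mileage populations are not directly comparable, the number should be read as Tesla's framing rather than a controlled safety measurement.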
This complex situation invites a more nuanced debate about balancing the life of someone who opts in to using AI against the lives of others affected by that decision. The fact that the user who opts in is likely also a customer of the company that produced the AI introduces a further, compounding conflict of interest. Convincing a company to devalue the life of a paying customer relative to others may take some work, even when that is judged to be the right thing to do.
Even in the social and human–machine interaction space, the potential for ethical harm is great. COMPAS, an attempt to build an AI tool to remove bias from trial sentencing guidelines, likely did the opposite [14]. Recalling the earlier mistrust of AI, one might wonder at the willingness of the COMPAS purchasers to back such a project, were it not for their concerns about the quality of existing sentencing. Challenges of evaluating the outcomes of AI decisions are explored in "Evidence of Fairness: On the Uses and Limitations of Statistical Fairness Criteria" and "Obtaining Hints to Understand Language Model-based Moral Decision Making by Generating Consequences of Acts."
The complexity and evolution of AI solutions make testing for performance and safety difficult enough. The challenge of evaluating ethical AI seems at times beyond the state of the art, except that the lessons learned from performance, safety, and field tests each provide reach-back from which to learn. As the impact of these field tests is analyzed, new metrics, boundaries, and response frameworks will be developed. Measuring social ethics drift and advances in artificial ethical modeling will help, but constant monitoring and adjustment will likely be necessary. As human–machine teaming and autonomy grow in adoption, there will be a resulting tension as failures and their perception drive development. The ability of AI to assist in measuring both itself and its effects on society will be hard-tested, as will the skill of the engineers who build it.
References
[1] D. Murphy, Microsoft apologizes (again) for Tay chatbot's offensive tweets, PC Magazine 25 March 2016.
[2] L. Eliot, AI ethics cautiously assessing whether offering AI biases hunting bounties to catch and nab ethically wicked fully autonomous systems is prudent or futile, Forbes 16 July 2022.
[3] J. Cohn, An ethics bounty system could help clean up the web, Wired 3 November 2021.
[4] UGSC AI, The National Artificial Intelligence Research and Development Strategic Plan: 2019 Update. Washington, DC: National Science and Technology Council; 2019.
[5] UNESCO, Recommendation on the Ethics of Artificial Intelligence. UNESCO; 2021.
[6] B. Mittelstadt, Principles alone cannot guarantee ethical AI, Nature Machine Intelligence 2019;1:501–507.
[7] E. Kazim, A.S. Koshiyama, A high-level overview of AI ethics, Patterns 2021;2(9), 100314.
[8] E. Horvitz, Highlights of the 2008–2009 AAAI study: Presidential panel on long-term AI futures, Asilomar Study on Long-Term AI Futures. Pacific Grove, CA. 2009.
[9] D. Weld, O. Etzioni, The first law of robotics (a call to arms), Proceedings of the 12th National Conference on Artificial Intelligence (AAAI-94). 1994:1042–1047.
[10] A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems. Curran Associates, Inc.; 2012;vol. 25:1097–1105.
[11] I.J. Good, Speculations concerning the first ultraintelligent machine, Advances in Computers 1966;6:31–88.
[12] D. Saul, Nearly 400 crashes in past year involved driver-assistance technology—most from Tesla, Forbes 15 June 2022.
[13] Tesla Autopilot crash driver ‘was playing video game’, BBC News 26 Feb 2020.
[14] J. Angwin, J. Larson, S. Mattu, L. Kirchner, Machine bias, ProPublica 23 May 2016.
¹ It is interesting to note that Asimov's works of fiction are cited often in scholarly articles: I, Robot alone has 1928 citations according to Google Scholar at the time of this writing, not including the separate robot novels and individual short story references which would boost this number significantly. Many of the citing articles are on AI ethics and human–machine interaction. In comparison, Google Scholar indicates that Asimov's scientific papers from his original academic chemistry career appear to have no more than seven citations.
Chapter Two: Terms and references
Working definitions of terms and references
Shannon Ellsworth, Raytheon | an RTX Business, Woburn, MA, United States
Abstract
The words used to describe ethical computing concepts can be as controversial as the concepts themselves. To standardize language for shared understanding between readers and authors, key terms and references used throughout this book are defined in this chapter. These definitions reflect descriptions and characterizations as seen by the authors and editors and are not meant to be exhaustive.
Keywords
Answer set programming; artificial ethical agent; artificial intelligence (AI); artificial intelligence (AI) ethical risks; artificial intelligence (AI) ethics; artificial moral agent (AMA); artificial phronesis (AP); auditing; audit trails; autonomous system certification; autonomous underwater vehicle (AUV); autonomous vehicle command language (AVCL); bidirectional encoder representations from transformers (BERT); bottom-up artificial intelligence (AI); checkpoint tasks; competency; compliance assurance; consequentialism; curse of dimensionality; defining issues test; Delphi; deontic cognitive event calculus (DCEC); deontological ethics; descriptive ethics; dimensions of autonomous decision making (DADMs or DADs); elder care robotics; emotion detection; epistemic luck; ethical connotations; ethical hazard analysis; ethical impact agent (EIA); ethical leniency; ethical mores; ethical reasoners; ethical reference distribution; ethics; ethics-based auditing; event calculus framework; explainable artificial intelligence (AI); finite state machine (FSM); first principles; generalized outcome assessment (GOA); GPT-2 language model; governance; habituation effects; human ethos; Hume's guillotine; inner speech; interpretation-capable reasoners; Kohlberg's moral stage theory; Likert scale survey; machine ethics; machine learning; machine morality; machine wisdom; Markov decision process (MDP); meaningful human control (MHC); meaningful human involvement; minimally defeasible argument; mission execution ontology; Monte Carlo simulation; moral-conventional transgression (MCT); moral foundations questionnaire; morality; natural language inference (NLI); neglect tolerance (NT); non-RT control; norm; norm grounding problem; normative belief; normative knowledge; ontology; open-textured terms; paired t-test; Pearson chi-square test; Piaget's observations and theories of constructivist moral development; Plato's cave; prescriptive ethics; prisoner's dilemma game; real-time (RT) 
control; reinforcement learning; repeated measures ANOVA; responsible use of artificial intelligence; reward hacking; risk; robot consciousness; robot inner dialog; robot trust; rule of engagement (ROE); sentiment analysis; split–steal game; statutory interpretive reasoning; technology acceptance model (TAM); technology adoption propensity; thematic analysis; three waves of AI according to DARPA; top-down artificial intelligence (AI); trolley problem; trust; trust in artificial intelligence (AI); trust in robotics; value iteration; verification and validation; veritic epistemic luck; virtue ethics; wisdom of crowds (WoC); Wizard of Oz design; zero-shot learning