Data Democracy: At the Nexus of Artificial Intelligence, Software Development, and Knowledge Engineering
By Feras A. Batarseh and Ruixin Yang
About this ebook
Data Democracy: At the Nexus of Artificial Intelligence, Software Development, and Knowledge Engineering provides a manifesto for data democracy. After reading the chapters of this book, you are informed and suitably warned! You are already part of the data republic, and you (and all of us) need to ensure that our data fall into the right hands. Everything you click, buy, swipe, try, sell, drive, or fly is a data point. But who owns the data? At this point, not you! You do not even have access to most of it. The next best empire of our planet is one that owns and controls the world’s best dataset. If you consume or create data, if you are a citizen of the data republic (willingly or grudgingly), and if you are interested in making a decision or finding the truth through data-driven analysis, this book is for you. A group of experts, academics, data science researchers, and industry practitioners gathered to write this manifesto about data democracy.
- The future of the data republic, life within a data democracy, and our digital freedoms
- An in-depth analysis of open science, open data, open source software, and their future challenges
- A comprehensive review of data democracy's implications within domains such as healthcare, space exploration, earth sciences, business, and psychology
- The democratization of Artificial Intelligence (AI), and data issues such as bias, imbalance, context, and knowledge extraction
- A systematic review of AI methods applied to software engineering problems
Feras A. Batarseh
Feras A. Batarseh is an Associate Professor with the Department of Biological Systems Engineering at Virginia Tech (VT) and the Director of A3 (AI Assurance and Applications) Lab. His research spans the areas of AI Assurance, Cyberbiosecurity, AI for Agriculture and Water, and Data-Driven Public Policy. His work has been published in various prestigious journals and at international conferences. Additionally, Dr. Batarseh has published multiple chapters and books; his two most recent books are "Federal Data Science" and "Data Democracy", both from Elsevier’s Academic Press. Dr. Batarseh is a senior member of the Institute of Electrical and Electronics Engineers (IEEE), the Agricultural and Applied Economics Association (AAEA), and the Association for the Advancement of Artificial Intelligence (AAAI). He has taught AI and Data Science courses at multiple universities including George Mason University (GMU), University of Maryland - Baltimore County (UMBC), Georgetown University, and George Washington University (GWU). Dr. Batarseh obtained his Ph.D. and M.Sc. in Computer Engineering from the University of Central Florida (UCF) (2007, 2011), a Juris Masters of Law from GMU (2022), and a Graduate Certificate in Project Leadership from Cornell University (2016). He currently holds courtesy appointments with the Center for Advanced Innovation in Agriculture (CAIA), National Security Institute (NSI), and the Department of Electrical and Computer Engineering at VT.
Data Democracy
At the Nexus of Artificial Intelligence, Software Development, and Knowledge Engineering
Editors
Feras A. Batarseh
Ruixin Yang
Table of Contents
Cover image
Title page
Copyright
Dedication
Contributors
A note from the editors
Foreword
Preface
Section I. The data republic
1. Data democracy for you and me (bias, truth, and context)
1. What is data democracy?
2. Incompleteness and winning an election
3. The story and the alternative story
4. Nothing else matters
2. Data citizens: rights and responsibilities in a data republic
1. Introduction
2. A paradigm for discussing the cyclical nature of data–technology evolution
3. Use cases explaining the black–red–white paradigm of data–technology evolution
4. Preparing for a future data democratization
5. Practical actions toward good data citizenry
6. Conclusion
3. The history and future prospects of open data and open source software
1. Introduction to the history of open source
2. Open source software's relationship to corporations
3. Open source data science tools
4. Open source and AI
5. Revolutionizing business: avoiding data silos through open data
6. Future prospects of open data and open source in the United States
4. Mind mapping in artificial intelligence for data democracy
1. Information overload
2. Mind mapping and other types of visualization
3. Conclusions
5. Foundations of data imbalance and solutions for a data democracy
1. Motivation and introduction
2. Imbalanced data basics
3. Statistical assessment metrics
4. How to deal with imbalanced data
5. Other methods
6. Conclusion
Section II. Implications of a data democracy
6. Data openness and democratization in healthcare: an evaluation of hospital ranking methods
1. Introduction
2. Healthcare within a data democracy—thesis
3. Motivation
4. Related works
5. Hospitals' quality of service through open data
6. Hospital ranking—existing systems
7. Top ranked hospitals
8. Proposed hospital ranking: experiment and results
9. Conclusions and future work
7. Knowledge formulation in the health domain: a semiotics-powered approach to data analytics and democratization
1. Introduction
2. Conceptual foundations
3. A semiotics-centered conceptual framework for data democratization
4. Conclusion
8. Landsat's past paves the way for data democratization in earth science
1. Introduction
2. Landsat overview
3. Machine learning for satellite data
4. Satellite images on the cloud
5. Landsat data policy
6. Conclusion
9. Data democracy for psychology: how do people use contextual data to solve problems and why is that important for AI systems?
1. Introduction and motivation
2. Understanding context
3. Cognitive psychology and context
4. The importance of understanding linguistic acquisitions in intelligence
5. Context and data, how important?
6. Neuroscience and contextual understanding
7. Context and artificial intelligence
8. Conclusion
10. The application of artificial intelligence in software engineering: a review challenging conventional wisdom
1. Introduction and motivation
2. Applying AI to SE lifecycle phases
3. Summary of the review
4. Insights, dilemmas, and the path forward
Index
Copyright
Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
Copyright © 2020 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
ISBN: 978-0-12-818366-3
For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals
Publisher: Mara Conner
Acquisition Editor: Chris Katsaropoulos
Editorial Project Manager: Gabriela Capille
Production Project Manager: Punithavathy Govindaradjane
Cover Designer: Matthew Limbert
Typeset by TNQ Technologies
Dedication
To the sons and daughters of the digital prison; as we may give you freedom through a data democracy, you must not inherit our thoughts or our ways.
To: Aaron Swartz—the creator of the Open Access Manifesto.
I remember one day visiting the John Crerar Library in Chicago. My father (Aaron's grandfather) had spoken to me about it many times, about how he had done research there when he was a young man. The library is now on the campus of the University of Chicago. It is an interesting place, as it is a science library whose mission is that it be open to the public. Although the University of Chicago does not encourage public use any more, if you are assertive, they will let you in.
Aaron's grandfather taught me how to do library research when I was very young. We had many reference books at home and used our local public library often. To him, the ability to do research was a fundamental skill to be passed on from father to son.
So I took Aaron to the Crerar Library and showed him around and showed him the stacks and all the books that were there. I remember clearly taking a random book off the shelf and discovering that it was from the 19th century and explaining to him how important it was having access to the world's knowledge. Aaron understood the importance of written knowledge and, just as Crerar wanted his library open to the public, how it was vital that everyone should be able to easily access the world's research and knowledge. As Wikipedia points out: "Because the library was incorporated under the 1891 special law, court approval was required for the merger, a condition of the merger was that the combined library would also remain free to the public."
We forget that in the last century, all the world's knowledge was available in the libraries—there books and journals were accessible and open to everyone. Aaron fought so that in this world of bits and bytes we could once again return to a place where everyone could have access to the world's knowledge and research.
Robert Swartz (Aaron's father)
2019
Contributors
Feras A. Batarseh, Graduate School of Arts & Sciences, Data Analytics Program, Georgetown University, Washington, D.C., United States
Justin Bui, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, United States
Deri Chong, Volgenau School of Engineering, George Mason University, Fairfax, VA, United States
Sam Eisenberg, Department of Mathematics, University of Virginia, Charlottesville, VA, United States
Jay Gendron, United Services Automobile Association (USAA), Chesapeake, VA, United States
José M. Guerrero, Infoseg, Barcelona, Spain
Debra Hollister, Valencia College – Lake Nona Campus, Orlando, FL, United States
Dan Killian, Massachusetts Institute of Technology Operations Research Center, Cambridge, MA, United States
Erik W. Kuiler, George Mason University, Arlington, VA, United States
Ajay Kulkarni, The Department of Computational and Data Sciences, College of Science, George Mason University, Fairfax, VA, United States
Abhinav Kumar, Volgenau School of Engineering, George Mason University, Fairfax, VA, United States
Kelly Lewis, College of Science, George Mason University, Fairfax, VA, United States
Connie L. McNeely, George Mason University, Arlington, VA, United States
Rasika Mohod, Volgenau School of Engineering, George Mason University, Fairfax, VA, United States
Patrick O'Neil
College of Science, George Mason University, Fairfax, VA, United States
BlackSky Inc., Herndon, VA, United States
Chau Pham, College of Science, George Mason University, Fairfax, VA, United States
Diego Torrejon
College of Science, George Mason University, Fairfax, VA, United States
BlackSky Inc., Herndon, VA, United States
Ruixin Yang, Geography and GeoInformation Science, College of Science, George Mason University, Fairfax, VA, United States
Karen Yuan, College of Science, George Mason University, Fairfax, VA, United States
A note from the editors
If you consume or create data, if you are a citizen of the data republic (willingly or grudgingly), and if you are interested in making a decision or finding the truth through data-driven analysis, this book is for you. A group of experts, academics, data science researchers, and industry practitioners gathered to write this book about data democracy.
Multiple books have been published in the areas of data science, open data, artificial intelligence, machine learning, and knowledge engineering. This book, however, is at the nexus of these topics. We invite you to explore it and join us in our efforts to advance a major cause that we ought to debate.
The chapters of this book provide a manifesto to data democracy. After reading this book, you are informed and suitably warned! You are already part of the data republic, and you (and all of us) need to ensure that our data fall in the right hands. Everything you click, buy, swipe, try, sell, drive, or fly is a data point. But who owns/should own that data? At this point, not you! You do not even have access to most of it. The next best empire of our planet is one that owns and controls the world's best dataset.
This book presents the data republic (in Section 1), introduces methods for democratizing data (in Section 2), provides examples of the benefits of open data (for healthcare, earth science, and psychology), and describes the path forward. Data democracy is an inevitable pursuit; let us begin now.
Feras A. Batarseh, Assistant Teaching Professor, Graduate School of Arts & Sciences, Data Analytics Program, Georgetown University, Washington, D.C., United States; Research Assistant Professor, College of Science, George Mason University, Fairfax, VA, United States
Ruixin Yang, Geography and GeoInformation Science, College of Science, George Mason University, Fairfax, VA, United States
2019
Foreword
The data crisis is an operational and ethical litmus test which the monolithic technology giants have badly failed. The corporations whose profit models depend on data (Facebook, Google, Amazon, and others) have proven inept at safeguarding consumers' personal data and have so outraged the public by sharing and selling personal information that politicians can credibly advocate that they should be torn apart like Ma Bell, a tech monopoly from an earlier era. At the same time, most of the artificial intelligence (AI)–centered digital corporations that dominate the tech industry regard the data they collect as protected intellectual property. They go to great pains to collect the data and consider their aggregation a more than fair trade for "free" services such as Internet search, social networking, and shopping. They will neither acknowledge that consumers have a legitimate claim to their own data nor give up their proprietary stake and make the data "open" and available to everyone; but they have proven again and again that they are not able, or fit, to manage it.
The short-term penalty for their data mishandling is lost customers. Generation Z-ers, the first truly native digital citizens, are abandoning social media; 34% say they will leave it entirely, while 64% say they are taking a break. Privacy concerns are high on their list of reasons why [1]. The longer-term penalties are much worse and threaten us all; if the tech giants cannot ethically manage or control services using machine learning, how will they safeguard the world's most sensitive technology as machine intelligence ineluctably grows? And how will they do so when we share the planet with computers that can outsmart us all?
How did we get to this dangerous state of affairs? To understand, we have to go back a decade, when big data was all the rage. Then, companies with a lot of transactional data could analyze it using data-mining tools and extract useful information like inefficiencies and fraud. Wall Street used big data algorithms to seek out investment opportunities and make trading decisions. A big data technique called affinity analysis let companies discover relationships among consumers and products and offer suggestions for movies, shoes, and other goods. Big data still serves up these enterprise-friendly insights. But around 2009, something really big happened to big data.
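To make the idea of affinity analysis concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the shopping baskets are invented, the function name `affinity_rules` and the `min_support` threshold are hypothetical, and real recommendation systems are far more elaborate. The sketch counts how often pairs of items appear together and reports three common association measures (support, confidence, and lift).

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented purely for illustration.
transactions = [
    {"running shoes", "socks", "water bottle"},
    {"running shoes", "socks"},
    {"socks", "water bottle"},
    {"running shoes", "water bottle"},
    {"running shoes", "socks", "headphones"},
]


def affinity_rules(transactions, min_support=0.4):
    """Return (lhs, rhs, support, confidence, lift) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        item_counts.update(basket)
        # Sort so that each unordered pair is always counted under one key.
        pair_counts.update(combinations(sorted(basket), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # vs. baseline P(rhs)
            rules.append((lhs, rhs, support, confidence, lift))
    # Strongest suggestions first.
    return sorted(rules, key=lambda rule: rule[3], reverse=True)


for lhs, rhs, support, confidence, lift in affinity_rules(transactions):
    print(f"{lhs} -> {rhs}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A rule such as "running shoes -> socks" with high confidence is the kind of relationship a retailer could turn into a product suggestion; a lift near or below 1 signals that the pairing is no more common than chance would predict.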
Three scientists (Hinton, LeCun, and Bengio, all of whom would later win the prestigious A. M. Turing Award) revealed that training learning algorithms on big data yields predictive abilities that exceed hand-coded programs [2]. Soon this technique, called deep learning, fueled amazing breakthroughs in speech recognition, computer vision, and self-driving cars. Corporations everywhere caught on. Since 2009, thanks in large part to deep learning, investment in AI has doubled each year, and now stands at about $30 billion. AI implementation in enterprise grew 270% over the last four years, mostly, again, thanks to deep learning applications. By 2030, AI will add an estimated $15 trillion to global GDP [2].
Just the way that electricity powered the 20th century, this century's economic opportunities are driven by AI.
To get the latest AI applications to work, high quality datasets are mandatory. While hackneyed, the aphorism "Data is the new oil" gets truer all the time. As data gain value, their acquisition, ownership, and use grow more controversial. To understand why, we must consider what data are, and where data come from.
Data are discrete pieces of information, such as numbers, words, photographs, measurements, and descriptions. Big data refers to a collection of data so large that it cannot be stored or processed with traditional database or software techniques. For example, Snapchat users share 527,760 photos every minute. Also every minute, 456,000 tweets are sent on Twitter. All these data require hundreds of thousands of terabytes of storage (1 terabyte equals 1024 gigabytes; 1 gigabyte equals 1024 megabytes; 1 megabyte equals 1024 kilobytes; 1 kilobyte equals 1024 bytes). It's estimated that Google, Amazon, Microsoft, and Facebook together store 1.2 million terabytes among them [3].
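The scale of these figures is easier to grasp with a little arithmetic. The following Python sketch is only an illustration: it assumes the binary convention in which each storage unit is 1024 times the one below it, the helper name `to_bytes` is hypothetical, and the input numbers are simply the ones quoted above.

```python
# Binary storage units, smallest first; each step up multiplies by 1024.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes"]
STEP = 1024


def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes (binary convention)."""
    return value * STEP ** UNITS.index(unit)


# Figures quoted in the text.
photos_per_minute = 527_760
tweets_per_minute = 456_000
stored_terabytes = 1_200_000

MINUTES_PER_DAY = 60 * 24

photos_per_day = photos_per_minute * MINUTES_PER_DAY
tweets_per_day = tweets_per_minute * MINUTES_PER_DAY
stored_bytes = to_bytes(stored_terabytes, "terabytes")

print(f"Photos shared per day: {photos_per_day:,}")
print(f"Tweets sent per day:   {tweets_per_day:,}")
print(f"1 terabyte in bytes:   {to_bytes(1, 'terabytes'):,}")
print(f"1.2 million TB in bytes: {stored_bytes:,}")
```

At these per-minute rates, the daily totals run to hundreds of millions of photos and tweets, and the combined storage estimate is on the order of 10^18 bytes.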
Who produces all these data? You! Or rather, your use of the Internet, social media, digital photos, communications like phone calls and texts, and the IoT, or Internet of Things. Your digital activity generates mountains of data, more than half a gigabyte per day for an average user [4]. AI's recent boom can be partly explained by the fact that for the first time enough large-scale data are available for high functioning machine learning systems (the other two drivers of the AI revolution are GPU and AI-specific processor chips, and key insights, i.e., deep learning).
How do the tech giants profit from data? In two ways. First, companies including Facebook, Amazon, and Google make money by offering their clients curated ad positioning. Based on your digital profile, they target you for their clients' ads. Second, whenever you buy their product or use their service, the tech giants gather data about you and your web activity. These data feed the development of profitable, data-hungry applications and products. Whenever you comment on an Amazon product, tweet on Twitter, or "like" a notice on your tennis league's Facebook page, you are helping mint cash for the world's richest corporations.
Because we, the users, generate the data that are the lifeblood of these companies, we should be paid for its use, right? Don't you own your data?
On the Internet, you do not own your data in a traditional sense, the way, say, a photographer owns her photograph. If you publish a photograph on Facebook, it's still yours for personal use, of course. But by electronically signing Facebook's Terms of Use, you give Facebook permission to use your photograph as they see fit and to share your photograph with their business partners and other entities. And there is a lot to share. For each user, on average, Facebook has as much as 400,000 MS Word documents' worth of data. Google has much more, about 3 million MS Word docs per user [5].
Google makes the case that they divorce your identity from your data, and so your privacy is safe with them. Their advertising clients target your "digital identity" with ads, without ever knowing your name or other personal information. In essence, who you are does not matter to Google. What matters are your photos, texts, and browsing history, and they highly prize their access to them. They do not want you to have data rights that will restrict their unimpeded use, and they certainly do not want to make your data open and free to anyone.
And despite the tech corporations' promises, they do shamefully little to secure your data or to honor their own Terms of Use. The greatest example so far of how low companies can stoop is a tale of big data, foreign intrigue, and the most important election in a decade: Facebook's Cambridge Analytica scandal.
Briefly, in 2011, due to past failures in keeping user data private, Facebook made an agreement with the Federal Trade Commission (FTC). It required that, among other things, Facebook receive "prior affirmative consent" from users before it shared their data with third parties. This "consent decree" was to last 20 years. Three years later, in 2014, Facebook allowed an app developer to access the personal data of some 87 million users and their Facebook friends.
The developer worked closely with the election consulting firm Cambridge Analytica, which acquired these data and then created an algorithm that could determine personality traits connected to voting behavior for the affected users. In conjunction with a Russian firm, Cambridge Analytica targeted users with ad and news campaigns meant to impact their vote in the United States' 2016 Presidential Election.
For breaches of its 2011 agreement with the FTC, Facebook may be fined up to $5 billion [6]. The previous record for fines related to privacy violations belongs to Google; in 2012, it paid $22.5 million to settle FTC charges that it misrepresented privacy assurances to users of Apple's Safari Internet browser. More recently, in early 2019, France fined Google $57 million for failing to tell users how their data were being collected and failing to get users' consent to target them with personalized ads.
For Facebook and Google, which in 2018 earned $55 billion and $136 billion, respectively, these fines are little more than slaps on the wrist [7]. The tech giants' own history tells us that modifying their behavior is difficult indeed. Google, now Alphabet, would seemingly rather be sued than change its business practices or protect user privacy. Alphabet employs some 400 lawyers because, among other things, it has been sued in 20 countries for everything from privacy and copyright violations to predatory business practices. In the United States, 38 states sued then-Google when it was discovered that the cars working in its Street View mapping project did more than take pictures. Without permission, they hoovered up emails, passwords, and other personal information from computers in houses they passed [8].
Facebook is of course no better. Just weeks after April 2018, when founder Mark Zuckerberg answered questions on Capitol Hill about the Cambridge Analytica scandal and promised to impose harsh new restrictions on third-party use of user data, Facebook shared more user data with at least 50 device manufacturers, including four Chinese companies. The manufacturers were able to access personal data even if the Facebook user denied permission to share their data with third parties [9].
No business entities in world history have possessed wealth comparable to that of the tech giants. The profits they earn put them in a unique category of human enterprise somewhere between corporations and nations. If they were nations, Alphabet and Facebook would rank in the top richest 30% and 41%, respectively. Consequently, they behave with nation-like arrogance, flying above normal corporate constraints of ethics and law, and paying taxes in low-cost tax havens instead of where they make their wealth. As sole providers of their respective services, they act monopolistically. In fact, they are de facto utilities and should be subject to stringent regulations aimed at preserving competition and innovation, or broken up, as the Bell System of telephone companies was in 1984.
The tech giants profit from personal data and bulldoze the competition, but their greatest transgression still lies ahead. They are setting up the human race for an AI disaster many have seen coming for years: the intelligence explosion.
The formula for the intelligence explosion was laid out in 1963 by English statistician I.J. Good. He wrote:
Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,' and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control [10].
I like to put Good's theorem in a contemporary context. We have already created machines that are better than humans at chess, Go, Jeopardy!, and many tasks such as navigation, search, theorem proving, and more. Scientists are rapidly developing resources that fuel AI design, including AI-specific processors, large datasets, and key insights such as, but not limited to, deep learning and evolutionary algorithms. Eventually, scientists will create machines that are better at AI research and development than humans are. At that point, they will be able to improve their own capabilities very quickly. These machines will match human-level intelligence, then become superintelligent (smarter in a rational, mathematical sense than any human) in a matter of days or weeks, in a recursive loop of self-improvement [11].
Many experts who consider the future of AI, including myself, have argued that the intelligence explosion is not merely possible, but probable, and will occur in this century [12]. A great number of factors have gone into that conclusion, including the durability of Moore's Law, potential defeaters of AI development, the limitations of existing AI techniques, and much more. It seems inescapable that barring a cataclysmic disaster or war, scientists will create the basic ingredient of the intelligence explosion (a smarter-than-human machine) in the normal course of developing AI. Its cost will limit the competitors for this dangerous distinction. OpenAI, a nonprofit founded to create beneficial general intelligence free of market pressures, recently revealed their best estimate of the price of this endeavor, and how long it will take: at least $2 billion, and more than 10 years [13].
The intelligence explosion will be the most sensitive event in human history for the simple reason that we have no experience with machines that can outwit us; we cannot be sure their development would not be disastrous. Computer scientists and philosophers refer to this with the masterfully understated term "the control problem."
I think there is ample evidence to conclude the tech giants are not fit to guide the development of superintelligence to a safe