From Big Data to Intelligent Data: An Applied Perspective
About this ebook
This book addresses many of the gaps in how industry and academia are currently tackling problems associated with big data. It introduces novel concepts, describes the end-to-end process, and connects the various pieces of the puzzle to offer a holistic view. In addition, it explains important concepts for a wide audience, using accessible language, diagrams, examples and analogies to do so. The book is intended for readers working in industry who want to expand their knowledge or pursue a related degree, and employs an industry-centered perspective.
Book preview
From Big Data to Intelligent Data - Fady A. Harfoush
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
F. A. Harfoush, From Big Data to Intelligent Data, Management for Professionals, https://doi.org/10.1007/978-3-030-76990-1_1
1. Introduction
Fady A. Harfoush¹
(1) CME Business Analytics Lab, Loyola University Chicago, Chicago, IL, USA
Keywords
Business intelligence, Business analytics, Big data, Paradigm shift, Intelligent information, Singularity
1.1 The Business Value Proposition
Twitter generates an average of 500 million tweets per day, and this number is increasing by the day. Access to Twitter’s full daily stream of tweets (the firehose) is costly and limited to those who really need it and can afford it. It is estimated that access to the firehose costs a few hundred thousand dollars per year. Is this a reasonable price to pay? What do we get in return? Is this a good business value proposition for Twitter? Do the benefits or business rewards outweigh the costs? What value does a business gain from accessing the full firehose if it lacks the infrastructure and the know-how to capture and analyze the tweets, extract the valuable insights, and do so in almost real time? This is not a small undertaking. Before jumping on the big data bandwagon, every business should address these questions first, from both the technological and the business perspectives.
It is safe to assume that only 1% of all daily tweets contain valuable or intelligent information that can be translated into actionable insights.
What is the added business value of accessing more than 500 million tweets per day, and what is a good price? If 99% of big data is dirty data, which 1% is the good data?
Finding the 1% is like searching for a needle in a haystack. How is a business supposed to assess the price of a subscription to the tweets, and the return on the investment, if it cannot easily distinguish between a valuable or true tweet and a junk or fake tweet? Should a consumer be charged for the 1% only, and which 1%? Who decides the right percentage? We should be charged for the quality of service we receive, but this is not the case with data. In almost all cases the data provider or vendor will have a disclaimer of the sort "XYZ assumes no responsibility for errors or omissions. The user assumes the entire risk associated with its use of these data." A good comparison, albeit in a different context, is purchasing fruit from a grocery store. The price is set by the weight, and not by how much juice can be extracted from the fruit. In principle the price should be set by the juice content. This may sound like a far-fetched scenario, but thanks to evolving sensing technologies, it is possible that in the foreseeable future we could be charged by the juice content and not by the weight.
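The pricing question can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the 500 million tweets per day and the 1% useful fraction come from the text, while the annual price of 300,000 dollars is a hypothetical figure picked from within the "few hundred thousand" range, not an actual Twitter price.

```python
# Back-of-the-envelope cost per tweet. The dollar figure is illustrative only.
TWEETS_PER_DAY = 500_000_000   # average daily volume quoted in the text
USEFUL_FRACTION = 0.01         # the text's working assumption of 1%
ANNUAL_COST_USD = 300_000      # hypothetical value, not an actual price
DAYS_PER_YEAR = 365


def cost_per_tweet(annual_cost: float, tweets_per_day: float, fraction: float = 1.0) -> float:
    """Annual cost divided by the number of tweets per year that count."""
    tweets_per_year = tweets_per_day * DAYS_PER_YEAR * fraction
    return annual_cost / tweets_per_year


# Cost spread over every delivered tweet versus over the useful ones only.
cost_raw = cost_per_tweet(ANNUAL_COST_USD, TWEETS_PER_DAY)
cost_useful = cost_per_tweet(ANNUAL_COST_USD, TWEETS_PER_DAY, USEFUL_FRACTION)

print(f"Useful tweets per day:    {TWEETS_PER_DAY * USEFUL_FRACTION:,.0f}")
print(f"Cost per delivered tweet: ${cost_raw:.8f}")
print(f"Cost per useful tweet:    ${cost_useful:.8f}")
print(f"Ratio useful/delivered:   {cost_useful / cost_raw:.0f}x")
```

Whatever price is assumed, the ratio between the two costs depends only on the useful fraction: if 1% of the tweets carry value, each useful tweet effectively costs about 100 times the per-tweet list price.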
Clearly, a business model based on providing access to the tweets’ firehose is not a good business value proposition for everyone. The real value proposition resides in the analytics and actionable items that can be extracted from the tweets. For these reasons, Twitter has partnered with a few companies in the industry, among them SMA (www.socialmarketanalytics.com), which I co-founded, to help businesses in different sectors access the analytics derived from the tweets.
1.2 The Enhanced Value
A good introduction to the topic is a scene from the movie "The Circle" (2017). In the scene, one of the lead characters, played by Tom Hanks, is presenting the company’s new product (a cheap, tiny camera with real-time broadcasting capabilities) to the employees and the new recruits. During the presentation, he makes two statements that are critical to creating the enhanced value.
The first quote is about linking the information from different sources to create a unified view by which an enhanced level of intelligent information is obtained. This ties in very well, as we will see in later chapters, with the topic of IoT (Internet of Things).
"Knowing is good, but knowing everything is better"
The second quote emphasizes the need to process the data and run the analytics in real time.
"Real Time Analytics Process"
Both quotes have major implications we will discuss throughout this book. They represent the biggest challenges and the most compelling competitive edge: the ability to link the data from the different sources to create a unified view, and to run the analytics in almost real time.
The world has evolved from having limited and controlled information to having unlimited and open information. Interestingly, both outcomes are equivalent, considering that most of big data (99%) is dirty data. In many applications (e.g., engineering), the signal-to-noise ratio (SNR) is a good metric to measure performance and assess quality. Simply put, we want to enhance the signal (the numerator) and reduce the noise (the denominator) to achieve a high SNR. The SNR can be viewed as a representative measure of good-to-bad data, with the signal representing the good data and the noise the bad data. The two extreme cases, very limited good data (signal close to zero) and practically unlimited bad data (noise close to infinity), both lead to an SNR close to zero, with little or no real business value.
Without the proper tools and the know-how to separate the good data from the bad data, having very limited (close to zero) data or unlimited (close to infinity) data both provide little or no intelligent information.
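The good-to-bad data ratio can be illustrated with a minimal sketch. The function name `good_to_bad_ratio` and the record counts below are hypothetical, chosen only to mimic the two extremes described above; they are not measurements, and the ratio is a plain count ratio rather than an engineering SNR in decibels.

```python
import math


def good_to_bad_ratio(good: float, bad: float) -> float:
    """SNR-style ratio: good records (signal) over bad records (noise)."""
    if good < 0 or bad < 0:
        raise ValueError("counts must be non-negative")
    if bad == 0:
        # No noise at all: the ratio is unbounded if any signal exists.
        return math.inf if good > 0 else 0.0
    return good / bad


# Hypothetical record counts, for illustration only.
scenarios = {
    "limited data (almost no signal)": (1, 1_000),
    "99% dirty data, as in the text": (1, 99),
    "unlimited data (noise grows unchecked)": (1_000, 10**12),
}

for name, (good, bad) in scenarios.items():
    print(f"{name}: ratio = {good_to_bad_ratio(good, bad):.2e}")
```

In both extreme scenarios the ratio stays far below 1. Collecting more data does not raise it unless the noise is filtered out faster than it accumulates.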
What industry needs are the tools and the know-how to mine the unlimited information, connect the dots, and do it almost instantaneously. Those able to do so will gain the competitive edge, maintain industry superiority, and eventually create a monopoly. A familiar example is Google. A less publicly known company called Palantir (www.palantir.com) has for years specialized in creating a unified view by leveraging data from different sources. Its products are used mainly by government agencies. The company went public in September 2020.
Throughout this book we will address the fundamental question described in Fig. 1.1: how to turn big data into intelligent data, to extract the actionable insights, and to do it fast. Information here is used in general terms to convey data, to share an opinion, describe an event or an observation. It is not a statement of intelligence. We will later explain the distinction.
Fig. 1.1 The business value proposition
Information does not necessarily imply intelligent information!
Questions raised earlier about Twitter’s business value proposition apply to other social media channels such as Facebook, Instagram, and LinkedIn. What is their business value proposition? Is it the platform or is it the data? The data represent a major part of these businesses and their valuations.
Data is a dirty business, and content is king. But how do we monetize data? What data and what content are we competing and paying for?
1.3 The Age of Big Data
Welcome to the age of big data. Big data has been described as the new oil and the new gold rush, and at times compared to the advent of the Internet revolution. While these analogies are correct to some extent, there is a level of exaggeration, coupled with a lack of historical context and hype associated with the rush to capture the market opportunities. To begin, we need to set the record straight on what big data is and is not, and put matters in the right perspective.
Many private and national research labs have for years been working with what we now call big data. Drawing from my own experience, examples can be cited from research work in high energy physics at places such as Fermi National Accelerator Lab (FNAL) in Batavia, Illinois, and the European Organization for Nuclear Research (CERN), located on the border between Switzerland and France. Experiments conducted at these particle accelerator labs have the mission to search for new particles and to confirm (or reject) the integrity of established theories (e.g., the Standard Model of particle physics). The amount of data collected from experiments conducted at these labs easily ranges in the hundreds of petabytes per year (one petabyte is roughly 2⁵⁰ bytes). It can sometimes take more than a year to analyze the data using high performance computing servers to detect any traces of a new particle. It is like looking for a needle in a haystack. Many other national labs in the USA (Sandia, Lawrence Livermore, Argonne, Jet Propulsion Lab, etc.) and labs elsewhere in the world have also for years been working with big data. Similar experiments, with large-scale data collection and analysis, are conducted in astronomy, in the study of the cosmos, and in the search for signs of extraterrestrial life. In private industry, Walmart, for example, has been in the business of collecting and analyzing big data for years, looking at transactional purchases. The financial industry has been working with big data by analyzing years of historical stock tick data, sometimes collected at intervals in the microsecond range. One can quickly appreciate the large amount of data collected. In summary, it is safe to say that big data is not a new phenomenon and has been around for many years.
So, what is new? As with many knowledge and technology transfers between research labs and industry, what makes big data new is its democratization and its commercialization. Its wide adoption has been facilitated by rapid advances in technology that make it cheaper and easier to generate, collect, and analyze data.
Big data is not a new phenomenon. What is new is the democratization, the commercialization, and the wide adoption of big data.
As depicted in Fig. 1.2 data is now generated by many sources like social networks, mobile devices, and smart sensor technologies used in IoT (Internet of Things). Quite often the data collected is made available freely for everyone to view and to analyze. The real value resides not in the data itself, but in the intelligent information extracted from the