Making Big Data Work for Your Business: A guide to effective Big Data analytics
By Sudhi Sinha
About this ebook
Bridge the gap between data and decision.
Big Data has brought about a revolution in the way we do business. Essential business decisions can today be informed by the wealth of data now at our disposal. However, while Big Data may appear to be the answer to every business problem, for many, gaining real value and business insight from data is a difficult task. Big Data, for many, is a big problem in itself, with many struggling to reap the rewards that it promises. In this accessible and stimulating guide, Sudhi Sinha, management, technology and sustainability expert, provides a unique perspective on Big Data and how to derive maximum value from it with sharp and careful analytics.
“this is a perfect starter for any manager who wants to understand and explore Big Data… The Big Data field is evolving quickly, and this book serves as a quick and practical introduction to the field.”
AnHai Doan, Professor, University of Wisconsin; Chief Scientist at WalmartLabs USA
This insightful and engaging book demonstrates that Big Data, to be most effective, needs to be woven into the fabric of organization strategy. Without this, you are simply left with numbers and statistics, lacking purpose and potency. Beginning with the essential stage of building a strategy framework for your Big Data analytics project, and integrating it within your wider business strategy, Sudhi provides you with the knowledge and insight to help you build a Big Data strategy that gets results.
Big data is one of the biggest buzzwords in the world of business today. And while it is true that it has opened up huge opportunities for businesses of all sizes, it is nevertheless difficult for many businesses to turn the reserves of numbers and statistics at their disposal into clear insights that can inform important business decisions. Beginning with the creation of a Big Data strategy and the identification of the key opportunities that it has the potential to unlock, the book will then demonstrate how to implement and manage your project with the best team and the right technology for your needs. Once this is in place you will then find out how to get the most from your Big Data insights, with effective organizational alignment and change management.
Table of Contents
Making Big Data Work for Your Business
Credits
Foreword
About the Author
Acknowledgments
About the Reviewers
Preface
What this book covers
Who this book is for
Conventions
Reader feedback
Piracy
1. Building Your Strategy Framework
Using Big Data analytics to identify where to play and how to win, to grow your business
Understanding the changing landscape
Identifying the strategic implications
Spotting and simulating growing influences
Simulating your organization's use of data
Understanding competitive actions
Establishing correlations
Integrating new possibilities into planning
Developing strategies
The Balanced Score Card approach
Force Field Analysis
Aligning existing initiatives
Cascading your strategy
Summary
2. Creating an Opportunity Landscape and Collecting Your Gold Coins
Building your Data Catalog
The Gold Coin approach
Identifying your Gold Coins
Qualification
Benefit assessment
Strategic advantage assessment
Gold Coin examples
Problem 1 – service parts historical analysis
Problem 2 – customer records aggregation
Problem 3 – mining corporate social media to understand employee engagement
Assessing your Gold Coin project
Prioritizing your Gold Coins
Developing the prioritization framework
Building your Gold Mine
Summary
3. Managing Your Big Data Projects Effectively
Recognizing how Big Data Analytics projects are different
Scope fluidity
Business case certainty
Focus specificity
Initiation and progression
Learning tolerance
Data complexity
Functional transaction processing
Defining unique success criteria
Creating an Explore, Validate, Amplify framework for Big Data Analytics projects
Explore
Building use cases
Identifying data sources
Ingressing data
Deciding your analytics models
Applying analytical models on your selected data sets
Testing your hypothesis
Validate
Identifying more data sets
Retesting your use case
Identifying adjacent data types, sources, and data sets
Refining and extending your use case
Validating your modified use case
Identifying output data needs
Amplify
Building an enterprise data model
Refining the ingress process
Developing a functional user prototype
Developing repeatable analytical algorithms
Developing an application package
Hosting your data and application
Developing a user guide
Developing a communication package
Establishing a governance model
Treating customer-facing applications differently
Intellectual property protection
User experience excellence
A comprehensive approach for building an organizational Big Data Analytics infrastructure
The enterprise data map
The enterprise data ingestion infrastructure
Scalable data storage
The analysis engine
Benefits map
The platform framework
Summary
4. Building the Right Technology Landscape
Designing Big Data storage
Evolution of storage technology
Big Data storage architecture
Big Data storage calculations
The hardware and operating system needs for Big Data
Identifying the different technology layers
Quality check
Cleansing
Correlation
Enrichment
Data cataloging
Modeling for analysis
Classification and clustering
Statistical summary for preliminary insights
Human explorations
Selecting from your technology choices
An overview of key Big Data technology components
Hadoop
MapReduce
Programming languages
Java
Python
Pig Latin
R
Hive
Mahout
ZooKeeper
NoSQL
Other Hadoop components
Technology choices by layers
Making the right technology choices
Creating a visualization of your Big Data
Understanding the difference between Enterprise Data Warehouse and Big Data
Summary
5. Building a Winning Team
Understanding the distinctive skills you need
Data scientist
Skills of a data scientist
Sourcing data scientists
Experimental analyst
Skills of an experimental analyst
Sourcing experimental analysts
Application developer
Skills of an application developer
Sourcing application developers
Infrastructure specialist
Skills of an infrastructure specialist
Sourcing infrastructure specialists
Change leader
Skills of a change leader
Sourcing the change leader
The project manager
Skills of a project manager
Sourcing the project manager
Defining the team and structure
Building an extended ecosystem
Educational institutes
The engagement framework
Best practices
Consulting organizations
The engagement framework
Best practices
Improving team alignment and performance
Training and orientation
Quick wins
Rotational leadership
Motivating your team towards progress
Summary
6. Managing Investments and Monetization of Data
Understanding how data creates value
Insights and influence
Immediate and future value
Value creation for data
Capturing the value of data
Identifying the value
Building a value catalog
Understanding and capturing your Big Data costs
Data collection costs
Storage and processing costs
Software licensing costs
People costs
Infrastructure and administrative costs
Maintenance costs
Capturing the costs
Cost Context Framework
Monetizing your Big Data
Direct business impact
Selling data externally
Valuation of Big Data
Managing your Big Data investments
Summary
7. Driving Change Effectively
Understanding changes caused by Big Data
The significance of changes
Applying the IMMERSE framework to manage change
Identify
Modulate
Mitigate
Educate
Role play
Show
Effect
Creating stakeholder groups to drive change
Project group
Work group
Review group
Summary
8. Driving Communication Effectively
Identifying your communication needs
Communicating with the internal audience
Providing a general overview of Big Data
Sharing the strategic cascade
Communicating about Gold Coin projects
Communicating with the external audience
Engaging with customers
Swaying the business influencers
Roping in your ecosystem partners
Communicating with the shareholders
Selecting your communication channels
Communication channels
Channel effectiveness
Building your communication strategy
Building your communication plan
Engaging executives effectively
Monitoring and modulating your communication program
Effectiveness metrics
Measuring effectiveness
Summary
Making Big Data Work for Your Business
Copyright © 2014 Impackt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author nor Impackt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Impackt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Impackt Publishing cannot guarantee the accuracy of this information.
First published: October 2014
Production reference: 1221014
Published by Impackt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78300-098-2
www.impacktpub.com
Credits
Author
Sudhi Sinha
Reviewers
Vikash Gaur
Richard Heimann
Acquisition Editor
Richard Gall
Content Development Editor
Sweny Sukumaran
Copy Editors
Tanvi Bhatt
Simran Bhogal
Karuna Narayanan
Alfida Paiva
Faisal Siddiqui
Project Coordinator
Venitha Cutinho
Proofreaders
Simran Bhogal
Maria Gould
Ameesha Green
Paul Hindle
Graphics
Ronak Dhruv
Abhinash Sahu
Production Coordinator
Melwyn D'sa
Cover Work
Simon Cardew
Foreword
Big Data is now ubiquitous. At its core, the Big Data phenomenon includes a realization, a vision, and a resulting implementation effort. There is now a widespread realization (indeed, one may even say that an inflection point has been reached) that data can hugely benefit applications in virtually all corners of society. This has given rise to the vision of a data-driven world, one in which organizations, governments, disciplines, communities, and individuals alike try to capture as much data as possible and then use knowledge gleaned from the data to make decisions. The race is therefore on to implement this vision, with needs for ever more tools to capture data, for distributed and parallel platforms to process huge volumes of data, and for analytics software to glean the available insights.
And there is more to come. Looking forward, just over the horizon, the Internet of Things is emerging. On this Internet, ordinary appliances such as refrigerators, thermostats, furnaces, and cars have sensors that allow them to generate torrents of data and to communicate with one another. Companies are eager to build this Internet, as it will allow them to offer high-margin, value-added services, projected to be worth billions of dollars. When this Internet takes hold, it will make the data-driven world even bigger and the data deluge even more ubiquitous.
Thus, it is no wonder that companies are racing to prepare themselves for the Big Data era and the coming Internet of Things. It is difficult, however, to know exactly where to start and what to do. This is where this book comes in. Written by Sudhi Sinha at Johnson Controls, with decades of experience in data management, this is a perfect starter book for any manager who wants to understand and explore Big Data. The book covers important challenges, ranging from building an overall strategy, to creating new opportunities, to managing projects, and to driving Big Data communications effectively throughout the entire organization. The book is peppered with concrete examples and practical tips, in an engaging presentation style. The Big Data field is evolving quickly, and this book serves as a quick and practical introduction to the field. I have found it very informative and interesting, and I believe that you will too.
AnHai Doan
Professor, University of Wisconsin-Madison, and Chief Scientist, WalmartLabs, USA
About the Author
Sudhi Sinha is a business leader with over 17 years of global experience in technology and general management. He started his career designing and developing database management systems and business intelligence systems. Currently, he is the Vice President for product development and engineering for Building Technology and Services at Johnson Controls. He is also responsible for several Big Data initiatives. He has worked in technology consulting, engineering, sales, strategy, operations, and P&L roles across the US, Asia, and Europe. He has written extensively on various technical and management topics, including applying Big Data to different aspects of business. Sudhi holds a degree in Production Engineering from Jadavpur University, Kolkata, India. He resides in Mumbai with his wife, Sohini, who is an entrepreneur and a fashion designer.
Acknowledgments
I have been incredibly lucky to have always worked with exceptional people and be given the right opportunities. First, I want to thank Mr. N. Chandrasekaran, CEO of Tata Consultancy Services, who helped me crystallize my thinking on various aspects of technology and management for the past 7 years. Chandra has always encouraged me to think carefully about any subject at hand and do extensive background research. I tried to follow this advice while developing this book.
Next, I want to express my sincerest thanks to Mr. Soren Bjerg, VP and Managing Director, Building Efficiency (Asia), Johnson Controls, and Mr. Swarup Biswas, VP of Asia Lines of Business, Building Efficiency, Johnson Controls, for letting me cut my teeth on Big Data initiatives. Without the education and coaching of Mr. Howard Hayes, VP of Data and Analytics, and Dr. Youngchoon Park, Director of Data and Analytics, both at Johnson Controls, I would not know what I know about Big Data. They helped me navigate the complex and evolving world of Big Data and develop many of the frameworks and topics discussed in this book. They introduced me to leading thinkers such as Prof. AnHai Doan and Prof. Jignesh Patel of the University of Wisconsin at Madison, who have guided me on many topics, technical as well as managerial.
I would also like to thank the person I consider my guru in Big Data, Prof. Viktor Mayer-Schönberger of the University of Oxford; his book on Big Data opened my eyes for the first time and has had the deepest influence on my thinking around Big Data. I thank Joel Makower of GreenBiz for the many interactions we had on the scope of Big Data in energy and sustainability, and for publishing many of my early articles.
Finally, this book would not have been possible without the untiring efforts and patience of my editors Richard Gall and Sweny Sukumaran, project coordinator Venitha Cutinho, and the other people at Packt Publishing who worked through the various stages of this book.
About the Reviewers
Vikash Gaur is Assistant Vice President with Cognizant's Manufacturing & Logistics Practice, heading delivery for its North America customers. He drives growth for his unit and ensures flawless delivery of leading-edge technology projects for global customers, while building future-proof solutions that ensure market leadership a few years into the future.
Vikash is a management professional with almost 20 years of business and technology experience in leadership positions. He has played a variety of roles in the manufacturing industry and in IT, beginning with the automotive industry and moving on to IT services in the manufacturing industry.
He has experience in business consulting, business process re-engineering, program management, solution architecting, business-IT alignment, and leadership development. With his unerring ability to understand the real challenges that customers face, he helps make customers' businesses stronger, often leveraging emerging technologies.
Richard Heimann is Chief Data Scientist at L-3 National Security Solutions (NSS) (NYSE:LLL) and is an EMC Certified Data Scientist with concentrations in spatial statistics, data mining, Big Data, and pattern discovery and recognition. Richard also leads the Data Science team at the L-3 Data Tactics Business Unit. L-3 NSS and L-3 Data Tactics are both premier Big Data and Analytics service providers based in Washington D.C. and serve customers globally.
Richard is an adjunct professor at the University of Maryland, Baltimore County, where he teaches Spatial Analysis and Statistical Reasoning. Additionally, he is an instructor at George Mason University, teaching Human Terrain Analysis. He is also a selection committee member for the 2014-2015 AAAS Big Data and Analytics Fellowship Program and a member of the Washington Exec Big Data Council.
Richard has also recently published a book titled Social Media Mining with R, Packt Publishing. He has recently provided analytical support to DARPA, DHS, the US Army, and the Pentagon.
To my parents, Mr. Sukumar Ranjan Sinha and Mrs. Keya Sinha, who not only brought me into this world and taught me a lot, but also loved me and encouraged me in all the good things that I have been associated with, including this book.
And to my lovely wife Sohini, who showered me with her love, encouragement, and patience while I was working on this book.
Data is the only perpetual entity. Actions and human interventions have definitive outcomes and a definitive life however long. Data lives forever and takes a new evolving meaning.
Preface
In the historic 2008 presidential election in the United States, President Barack Obama captured the imagination of the nation with his "Yes we can" slogan and his different but definitive ideas. He employed social media very well to get his message out to millions of new voters. Fast forward to 2012: he was facing a very difficult re-election campaign. This time he employed data and analytics to win the election. In his November 16, 2012 article in The Atlantic, Alexis Madrigal captures this transformative experimentation in detail. For the first time in history, a political campaign had a Chief Technology Officer, Harper Reed. Mr. Reed assembled an eclectic team from Google, Facebook, Twitter, and many other new age technology companies and initiated Project Narwhal. They mined huge volumes of data (demographics, past voting patterns, economic data, social media interactions, and more) to predict how the campaign would perform in each seat and how it could persuade individual voters. Big Data is here and now.
In the past 50 years, the world has moved into the information age; in the last 5 years, we have seen ourselves gravitating towards the Big Data age. Big Data touches every aspect of our lives. When talking about Big Data, we often refer to the huge volume, variety, and velocity of data: lots of data, different types of data, and data being created, captured, and processed at breakneck speed. We often use examples such as the 1 billion plus Facebook users, the 3 billion plus likes they click on every day, the 100 billion credit card transactions that happen across the world, the millions of transactions that occur in individual retail chains every hour, the 200 million plus e-mails sent every minute, and so on. Sometimes it is difficult for us to fathom the quantum of data.
Let me share an excellent representation that I saw in one of the infographics published by EMC (http://www.emc.com/campaign/global/big-data/hfbd-infographic-4web-1500.jpg). Data is measured in bytes or multiples of that. In the following table, we compare these multiples with more physical equivalents of words/pictures and sand:
Traditional technologies have demonstrated limitations when the volume goes beyond a few terabytes. The new reality is that terabytes of data are not enough to capture every transaction that happens in an organization or business in a year. The research group SINTEF reported in May 2013 that 90 percent of the world's data had been created in the previous 2 years. In its 2011 Digital Universe Study, IDC projected that the amount of information generated would grow to 50 times the current rate by 2020. Last year, global mobile data traffic was expected to be close to 1 exabyte per month. So we see massive proliferation of Big Data everywhere, every day.
Big Data relates not only to the new age technology companies, large financial institutions, and mega retail chains; it also relates to traditional manufacturing companies and brick-and-mortar industries like construction. Today, a medium-sized building of 10,000 square meters generates over 10 gigabytes of data every year! Everybody is in awe of the size of the data and the possibilities it brings. In the last few years, companies have been scrambling to make sense of this type of Big Data and use it effectively to create new products and services, differentiate themselves, and transform their businesses. Billions of dollars are being invested worldwide in Big Data, and trillions of dollars' worth of benefits are expected to be generated from it.
New disciplines of technology like Near Field Communications, Augmented Reality, and many others are getting enabled and impacted by Big Data. New economic