
Applied Analytics through Case Studies Using SAS and R: Implementing Predictive Models and Machine Learning Techniques
Ebook, 594 pages, 4-hour read


About this ebook

Examine business problems and use a practical analytical approach to solve them by implementing predictive models and machine learning techniques using SAS and the R analytical language.  
This book is ideal for those who are well-versed in writing code and have a basic understanding of statistics, but have limited experience in implementing predictive models and machine learning techniques for analyzing real-world data. The most challenging part of solving industrial business problems is gaining the practical, hands-on knowledge needed to build and deploy advanced predictive models and machine learning algorithms.
Applied Analytics through Case Studies Using SAS and R is your answer to solving these business problems by sharpening your analytical skills. 
What You'll Learn  
  • Understand analytics and basic data concepts 
  • Use an analytical approach to solve industrial business problems
  • Build predictive models with machine learning techniques
  • Create and apply analytical strategies

Who This Book Is For

Data scientists, developers, statisticians, engineers, and research students with a strong theoretical understanding of data and statistics who would like to enhance their skills through practical exposure to data modeling.
Language: English
Publisher: Apress
Release date: Aug 3, 2018
ISBN: 9781484235256
Author

Deepti Gupta

Dr. Deepti Gupta is Professor in the Department of Textile Technology at IIT Delhi, India. She completed her PhD from the same department in 1995 and joined as a faculty member in 1997. She has 18 years of teaching and research experience and has published more than 40 papers in national and international journals of repute. She has guided two PhD and several M.Tech projects at IIT. Dr. Gupta has conducted research on the problem of body size chart development for the Indian ready-made garment industry for the last 6 years. She has published several papers in international journals and spoken at national and international conferences on the subject. Her team has generated a huge database of accurate anthropometric data of various segments of the Indian population. This has been analysed extensively to propose a unique, computer aided solution to the extremely complex problem of garment sizing.


    Book preview

    Applied Analytics through Case Studies Using SAS and R - Deepti Gupta

    © Deepti Gupta 2018

    Deepti Gupta, Applied Analytics through Case Studies Using SAS and R, https://doi.org/10.1007/978-1-4842-3525-6_1

    1. Data Analytics and Its Application in Various Industries

    Deepti Gupta¹
    (1) Boston, Massachusetts, USA

    Data analytics has become part and parcel of any business in today’s world. In fact, it has evolved into an industry in itself. A vast number of software platforms are available for data extraction, scrubbing, analysis, and visualization. Some of these platforms specialize in one of the above-listed aspects of data analytics, while others offer a generalist tool for almost all tasks, ranging from data scrubbing to visualization. Of these platforms, SAS® and R are the most popular for data analytics, with a large global clientele.

    In 1967, the Statistical Analysis System (SAS) started as a federally funded project for graduate students to track agriculture data at North Carolina State University. ¹ Today it has become a global leader in the data analysis software market, with customers spanning over 148 countries. ² Ninety-six of the top 100 Fortune Global 500® companies use SAS. R, which originally was a statistical computing language, has advanced significantly over the years. RStudio is an Integrated Development Environment (IDE) for R ³ and offers a free, user-friendly platform for data analytics. Both SAS® and R offer vast capabilities but have certain contrasting advantages that are discussed in more detail later.

    A broad array of companies, ranging from the largest global banks to regional transport firms, are using data analytics to solve diverse sets of problems. These diverse applications have one commonality: using data and statistics as the basis for decision making.

    In this chapter, certain key aspects related to data analytics will be introduced.

    What Is Data Analytics?

    Analytics is defined as the process of developing actionable insights from data through the application of statistical models and analysis. ⁴ Applying data analytics for decision making is a systematic process. It starts with understanding the nature of the industry, its general functionality, and the bottlenecks and challenges specific to it. It is also helpful to know the key companies, the size of the industry, and, in some cases, the general vocabulary and terms associated with operations. After that, we take a deeper dive into the area specific to the application or business case to which data analytics is to be applied. A thorough understanding of the application, the associated variables, the sources of data, and the reliability of different data sources is very important.

    Data analytics firms pay a lot of attention to these aspects and often employ a vast number of subject-matter experts specific to industries, and at times even specific to certain key applications. Business research consultants are also employed in certain cases to gain understanding and insights. During the preliminary phase of a project, data analytics firms perform elaborate surveys and conduct a series of interviews to gather more information about the company and the business problem. ⁵ A good understanding of the industry and the application can result in significant cost savings and can improve the accuracy, performance, and practicality of the model.

    Once the application or the problem statement is well understood, then the implementation process starts. The core methodology of implementing data analytics for solving a business problem is shown in Figure 1-1. ⁶

    Figure 1-1. Data Analytics Methodology

    Data Collection

    The first step in the process is data collection. Data relevant to the application is collected. The quality, quantity, validity, and nature of the data directly impact the analytical outcome. A thorough understanding of the data on hand is extremely critical.

    It is also useful to have an idea about other variables that may not be sourced directly from the industry or the specific application itself but may have a significant impact if included in the model. For example, when developing a model to predict flight delays, weather can be a very important variable, but it might have to be obtained from a different source than the rest of the data set. Data analytics firms also have ready access to certain key global databases covering weather, financial indices, etc. In recent years, data mining of digital social media like Twitter and Facebook has also become very popular. ⁷ This is particularly helpful in understanding trends in customer satisfaction with various services and products, and it helps reduce the reliance on surveys and feedback. Figure 1-2 shows a Venn diagram of the various sources of data that can be tapped for a given application.

    Figure 1-2. Venn diagram of data sources

    Data Preparation

    The next step is data preparation. Usually raw data is not in a format that can be used directly for data analysis. In very simple terms, most platforms require data in matrix form, with variables in columns and observations in rows. Figure 1-3 shows an example of structured data.

    Figure 1-3. Format of structured data

    Data may be available in structured, semi-structured, or unstructured form. Significant effort is needed to align semi-structured and unstructured data into a usable form, as shown in Figure 1-3. Once the data is brought together in a structured form, the next stage in data preparation is data cleansing, or scrubbing. Data scrubbing encompasses the processes that remove inconsistencies, errors, missing values, or any other issues that can pose challenges during data analysis or model building with a given data set. ⁸ Work at this stage can be as simple as changing the format of a variable or as involved as running advanced algorithms to compute suitable estimates for missing values. The task is significantly more involved when it comes to big data.
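    The scrubbing steps described above can be sketched in a few lines of Python. The records, names, and values below are invented for illustration; the snippet shows the two simplest fixes mentioned in the text: changing a variable's format and imputing a missing value.

```python
# Hypothetical raw records: ages arrive as strings and one income is
# missing. Fix the variable's format, then impute with the column mean.
raw = [
    {"age": "34", "income": 52000},
    {"age": "29", "income": None},   # missing value to be imputed
    {"age": "41", "income": 61000},
]

# Step 1: change the format of a variable (string -> integer).
for row in raw:
    row["age"] = int(row["age"])

# Step 2: simple mean imputation for the missing income.
observed = [r["income"] for r in raw if r["income"] is not None]
mean_income = sum(observed) / len(observed)
for row in raw:
    if row["income"] is None:
        row["income"] = mean_income

print(raw[1])  # {'age': 29, 'income': 56500.0}
```

    Real projects would use more sophisticated imputation (and a data-frame library), but the shape of the work is the same.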

    Data Analysis

    Once data is converted into a structured format, the next stage is to perform data analysis. At this stage, underlying trends in the data are identified. This step can include fitting a linear or nonlinear regression model, performing principal component analysis or cluster analysis, or checking whether the data is normally distributed. The goal is to identify what kind of information can be extracted from the data and whether there are underlying trends useful for a given application. This phase is also very useful for scoping out the models most likely to capture the trends in the data and for checking whether the data satisfies the underlying assumptions of those models. One example is checking whether the data is normally distributed to determine whether parametric models can be used or a non-parametric model is required.
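    As a minimal sketch of this exploratory step, the following Python snippet fits a least-squares line to hypothetical monthly sales figures (the numbers are invented) to get a quick read on whether there is an upward trend worth modeling:

```python
# Invented monthly sales figures; a least-squares slope is a quick
# check for an underlying linear trend before formal model building.
months = [1, 2, 3, 4, 5, 6]
sales = [10, 12, 15, 16, 19, 21]

n = len(months)
mean_x = sum(months) / n
mean_y = sum(sales) / n

# Ordinary least-squares slope and intercept.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales))
         / sum((x - mean_x) ** 2 for x in months))
intercept = mean_y - slope * mean_x

print(round(slope, 2), round(intercept, 2))  # 2.2 7.8
```

    A clearly positive slope here would suggest a trend model is worth pursuing; in practice one would also inspect residuals and distributional assumptions.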

    Model Building

    Once the trends in the data are identified, the next step is to put the data to work and build a model that will help with the given application or solve the business problem. A vast number of statistical models are available, and new models are being developed every day. Models vary significantly in complexity, ranging from simple univariate linear regression models to complex machine learning algorithms. The quality of a model is governed not by its complexity but by its ability to account for real trends and variations in the data and to sift information from noise.

    Results

    Results obtained from the models are validated to ensure accuracy and model robustness. This can be done in two ways. The first is splitting the original data set into training and validation data sets: part of the data is used for model building, and the remaining part is used for validation. The other approach is to validate the model against real-time data once it is deployed. In some cases, the same data is used to build several different types of models to confirm that the model outputs are real and not statistical artifacts.
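    The first validation approach can be sketched in Python as follows. The data set is an invented, perfectly linear toy example, so the held-out error comes out essentially zero; real data would of course show nonzero validation error.

```python
import random

# Sketch of an 80/20 train/validation split on toy (x, y) pairs.
random.seed(7)
data = [(x, 2.0 * x + 1.0) for x in range(50)]  # invented linear data
random.shuffle(data)

split = int(0.8 * len(data))
train, valid = data[:split], data[split:]

# Fit a least-squares line on the training portion only.
mx = sum(x for x, _ in train) / len(train)
my = sum(y for _, y in train) / len(train)
slope = (sum((x - mx) * (y - my) for x, y in train)
         / sum((x - mx) ** 2 for x, _ in train))
intercept = my - slope * mx

# Measure mean absolute error on the held-out 20 percent.
mae = sum(abs(y - (slope * x + intercept)) for x, y in valid) / len(valid)
print(len(train), len(valid), round(mae, 6))  # 40 10 0.0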

    Put into Use

    Once the model is developed it is deployed in a real-time setting for a given application. As shown in the Figure 1-1, the overall process is somewhat iterative in nature. Many times, the models have to be corrected and new variables added or some variables removed to enhance model performance. Additionally, models need to be constantly recalibrated with fresh data to keep them current and functional.

    Types of Analytics

    Analytics can be broadly classified under three categories: descriptive analytics, predictive analytics, and prescriptive analytics. ⁹ Figure 1-4 shows the types and descriptions of types of analytics.

    ../images/456228_1_En_1_Chapter/456228_1_En_1_Fig4_HTML.png

    Figure 1-4

    Types of Analytics

    Different types of information can be obtained by applying the different categories of analytics. This will be explained in the following section.

    1.

    Descriptive Analytics: Most of the organizations use descriptive analytics in order to know about their company performance. Example, management at a retail firm can use descriptive analytics to know the trends of sales in past years, or inferring trends of operation cost, product, or service performance.

    2.

    PredictiveAnalytics: In case of predictive analytics, historical trends coupled with other variables are used to see what could happen in the future to the firm. Example, Management at the same retail firm can use the sales trends from previous years to forecast sales for the coming year.

    3.

    Prescriptive Analytics: In prescriptive analytics, the objective is to identify factors or variables that are impacting trends. Once the responsible variables are identified, strategies and recommendations are made to improve the outcome. For example, Management at the same retail firm identifies that the operation cost is significantly high due to overstocking at certain stores. Based on this insight, an improved inventory management would be recommended to the given locations.

    Understanding Data and Its Types

    Data is a collection of variables, facts, and figures that serves as raw material to create information and generate insights. The data needs to be manipulated, processed, and aligned in order to withdraw useful insights. Data is divided into two broad forms: qualitative and quantitative data. ¹⁰

    1.

    Qualitative data: The data that is expressed in words and descriptions like text, images, etc. is considered as qualitative data. Qualitative data collection uses unstructured and semi-structured techniques. There are various common methods to collect qualitative data like conducting interviews, diary studies, open-ended questionnaires, etc. Examples of qualitative data are gender, demographic details, colors, etc. There are three main types of qualitative data:

    Nominal: Nominal data can have two or more categories but there is no intrinsic rank or order to the categories. For example, gender and marital status (single, married) are categorical variables having two categories and there is no intrinsic rank or order to the categories.

    Ordinal: In ordinal data, the items are assigned to categories and there is an intrinsic rank or order to the categories. For example, age group: Infant, Young, Adult, and Senior Citizen.

    Binary: Binary data can take only two possible values. For example, Yes/No, True/False.

    2.

    Quantitative data: The data that is in numerical format is considered as quantitative data. Such a type of data is used in conducting quantitative analysis. Quantitative data collection uses much more structured techniques. There are various common methods to collect quantitative data like surveys, online polls, telephone interviews, etc. Examples of quantitative data are height, weight, temperature, etc. There are two types of quantitative data:

    Discrete Data: Discrete data is based on count and it can only take a finite number of values. Typically it involves integers. For example, the number of students in data science class is discrete data because you are counting a whole and it cannot be subdivided. It is not possible to have 8.3 students.

    Continuous Data: Continuous data can be measured, take any numeric values, and be subdivided meaningfully into finer and finer levels. For example, the weights of the data science students can be measured at a more precise scale – kilograms, grams, milligrams, etc.

    While on the topic of data, it is a good time to get a basic understanding of Big Data.

    Big Data is not just a buzzword but is fast becoming a critical aspect of data analytics. It is discussed in more detail in the following section.

    What Is Big Data Analytics?

    The term big data is defined as the huge volume of both structured and unstructured data that is so large that it is not possible to process such data using traditional databases and software. As a result, many organizations that collect, process, and conduct big data analysis turn to specialized big data tools like NoSQL databases, Hadoop, Kafka, Mapreduce, Spark, etc. Big data is a huge cluster of numbers and words. Big data analytics is the process of finding the hidden patterns, trends, correlations, and other effective insights from those large stores of data. Big data analytics helps organizations harness their data to use it for finding new opportunities, faster and better decision making, increased security, and competitive advantages over rivals, such as higher profits and better customer service. Characteristics of Big data are often described using 5 Vs, which are velocity, volume, value, variety, and veracity. ¹¹ Figure 1-5 illustrates 5 Vs related to the big data.

    ../images/456228_1_En_1_Chapter/456228_1_En_1_Fig5_HTML.png

    Figure 1-5

    5 Vs of Big Data

    Big Data analytics applications assist data miners, data scientists, statistical modelers, and other professionals to analyze the growing volumes of structured and mostly unstructured data such as data from social media, emails, web servers, sensors, etc. Big data analytics helps companies to get accessibility to nontraditional variables or sources of information, which helps organizations to make quicker and smarter business decisions.

    Big Data Analytics Challenges

    Most of the organizations are experiencing effective benefits by using big data analytics, but there are some different obstacles that is making it difficult to achieve the benefits promised by big data analytics. ¹² Some of the key challenges are listed below:

    Lack of internal skills: The most important challenge that organizations face in implementing big data initiatives is lack of internal skills, and there is a high cost of hiring data scientists and data miners for filling the gaps.

    Increasing growth of the data: Another important challenge of big data analytics is the growth of the data at a tremendous pace. It creates issues in managing the quality, security, and governance of the data.

    Unstructured Data: As most of the organizations are trying to leverage new and emerging data sources, it is leading to the more unstructured and semi-structured data. These new unstructured and semi-structured data sources are largely streaming data coming from social medial platforms like Twitter, Facebook, web server logs, Internet of Things (IOT), mobile applications, surveys, and many more. The data can be in the form of images, email messages, audio and video files, etc. Such unstructured data is not easy to analyze without having advanced big data analytical tools.

    Data Siloes: In organizations there are several types of applications for creating the data like customer relationship management (CRM), supply chain management (SCM), enterprise resource planning (ERP), and many more. Integrating the data from all these wide sources is not an easy task for the organization and is one of the biggest challenges faced by big data analytics.

    Data Analytics and Big Data Tools

    Data science and analytics tools are evolving and can be broadly classified into two classes: tools for those techies with high levels of expertise in programming and profound knowledge of statistics and computer science like R, SAS, SPSS, etc.; and tools for common audiences that can automate the general analysis and daily reports like Rapid Miner, DataRPM, Weka, etc. Figure 1-6 displays the currently prevalent languages, tools, and software that are used for various data analytics applications.

    ../images/456228_1_En_1_Chapter/456228_1_En_1_Fig6_HTML.png

    Figure 1-6

    Languages, Tools, and Software

    There is a long list of tools, and a few popular data science and analytical tools are discussed in the following section.

    1.

    R: The Most Popular ProgrammingLanguagefor statisticians and data scientists

    R is an open source tool widely used by statisticians and data miners for conducting statistical analysis and modeling.¹³ R has thousands of packages available easily that make the jobs of statisticians and data scientists easy for handling the tasks from text analytics to voice recognition, face recognition, and genomic science. The demand of R has increased dramatically across all the industries and is becoming popular because of its strong package ecosystem. R is used in industries for solving their big data issues and building statistical and predictive models for withdrawing the effective insights and hidden patterns from the data.

    2.

    SAS (Statistical Analysis System) Data Science and Predictive Analytics Software Suite

    SAS is a software suite that is popular for handling large and unstructured data sets and is used in advance analytics, multivariate analysis, data mining, and predictive analytics, etc. The SAS software suite has more than 200 components like BASE SAS, SAS/ STAT, SAS/ETS, SAS/GRAPH, etc. BASE SAS software, SAS Enterprise Guide, and SAS Enterprise Miner are licensed tools and are used for commercial purposes by all the industries. SAS University Edition is free SAS software and is used for noncommercial uses like teaching and learning statistics and modeling in an SAS environment. It includes the SAS components BASE SAS, SAS/STAT, SAS/IML, SAS/ACCESS, and SAS Studio. SAS can be expensive but it is a very popular tool in industries; it has an effective and quick support system and more than 65,000 customers.

    3.

    IBM SPSS Statistics and SPSS Modeler: Data Mining and Text Analytics Software

    SPSS Modeler and SPSS Statistics were acquired by IBM in 2009 and is considered as a data mining, statistical, and text analytics software. It is used to load, clean, prepare the data, and then build the predictive models and conduct other analytical and statistical tasks. It has the visual interface so users without good programming knowledge can easily build the predictive model and statistical analysis.¹⁴ It has been widely used in industries for fraud detection, risk management, forecasting, etc. IBM SPSS modeler (version 17) is present in two separate bundles as:

    1.

    SPSS Modeler Professional: it is used for structured data such as databases, flat files, etc.

    2.

    SPSS Modeler Premium: it is a high-performance analytical tool that helps in gaining effective insights from the data. It includes all the features from SPSS Modeler Professional and in addition it is used for conducting Text Analytics,¹⁵ Entity Analytics,¹⁶ and Social Network Analytics.

    4.

    Python: High-Level Programming Language Software

    Python is an object-oriented and high-level programming language.¹⁷ Python is easy to learn and its syntax is designed to be readable and straightforward. Python is used for data science and machine learning. Robust libraries used for data science and machine learning are using the interface of Python, which is making the language more popular for data analytics and machine learning algorithms.¹⁸ For example, there robust libraries for statistical modeling (Scipy and Numpy), data mining (Orange and Pattern), and supervised and unsupervised machine learning (Scikit-learn).¹⁹

    5.

    Rapid Miner: GUI Driven Data Science Software

    Rapid Miner is open source data mining software. It was started in 2006 and was originally called Rapid-I. In 2013 the name was changed from Rapid-I to Rapid Miner. The older version of Rapid Miner is open source but the latest version is licensed. Rapid miner is widely used in industries for data preparation in visualization, predictive modeling, model evaluation, and deployment.²⁰ Rapid Miner has a user-friendly graphic user interface and a block diagram approach. Predefined blocks act as a plug and play system. Connecting the blocks accurately helps in building a wide variety of machine learning algorithms and statistical models without writing a single line of code. R and Python can also be used to program Rapid Miner.

    Role of Analytics in Various Industries

    The onset of the digital era has made vast amounts of data accessible, analyzable, and usable. This, coupled with a highly competitive landscape, is driving industries to adopt data analytics. Industries ranging from banking and telecommunication to health care and education, everyone is applying various predictive analytics algorithms in order to gain critical information from data and generate effective insights that drive business decisions.

    There are vast numbers of applications within each industry where data analytics can be applied. Some applications are common across many industries. These include customer-centric applications like analyzing factors impacting customer churn, engagement, and customer satisfaction. Another big data analytics application is for predicting financial outcomes. These include forecasting of sales, revenues, operation costs, and profits. In addition to these, data analytics is also widely used for risk management and fraud detection and price optimization in various industries.

    There are also large numbers of industry-specific applications of data analytics. To list a few: flight delay prediction in the aviation industry, prediction of cancer remission in health care, forecasting wheat production in agriculture.

    An overview of some of the industries benefiting from predictive and big data analytics insights and, most importantly, how is discussed in this section.

    1.

    Insurance Industry:

    The insurance industry has always relied on statistic to determine the insurance rates. Risk-based models form the basis for calculators that are used to calculate insurance premiums. Here is a case specific to the automotive insurance. In the United States, some of the variables in these risk-based models are reasonable but others are debatable. For example, gender is a variable that determines the insurance rate. An average American male driver pays more compared to a female driver with equivalent credentials. Today, people look upon these factors as discriminatory and demand a fairer method with higher weightage to variables that are in control of the actual drivers. The European Court of Justice has passed a ruling stating that gender cannot be used as a basis for deciding insurance premiums.²¹ The current trend requires risk-based models to give consideration to individuals’ statistics rather than generalized population statistics. This seems fair but does require handling significantly more data on a daily basis and new models to replace the traditional ones. Big data tools and advanced data analytics might pave the way for a fairer insurance industry of the future. Predictive analytics is also widely used by the insurance industries for fraud detection, claims analytics, and compliance & risk management.

    2.

    Travel & TourismIndustry:

    The travel & tourism industry is also using big data analytics for enhancing customer experiences and offer customized recommendations. These firms use demographic statistics, average time spent by users on certain travel-related web pages, personal historic travel preferences, etc.

    In order to provide better customized service, data analytics also helps the travel industry to predict when people will travel, location of traveling, purpose of traveling, etc., which can be used to assist with logistics and planning so as to provide the best customer experience at the right price. Predictive analytics is also used by travel industries for personalized offers, passenger safety, fraud detection, and predicting travel delays.

    3.

    Finance Industry:

    There has been a drastic or unique change seen in the financial industry in the last few years. Success in the finance industry is all about having the right information at the right time. By using big data and predictive analytics, algorithms help the industry in collecting the data from a variety of data sources and support from trading decisions to predicting default rates and risk management.

    4.

    Health Industry:

    The health industry produces huge amounts of data on a daily basis. The data is generated at hospitals, pharmacies, diagnostics centers, clinics, research centers, etc. Health-care industry data can have diverse data types consisting of numbers, images, x-rays, cardiograms, and even sentiments. Data analytics in health care can be used for all kinds of applications; these can include prognosis and diagnosis of an ailment, identifying the risk of propagation of a pandemic, identifying the effectiveness of a new therapy, systemic health trends in a given population, and many more. Data analytics can also be used in health care for certain non-conventional applications like tracking fraud, tracking counterfeit medicines, and optimizing patient transport logistics.

    5.

    Telecom Industry:

    The telecom industry has access to large amounts of customer usage and network data. By applying data analytics, it has become easier for telecom companies to understand their customer needs and behaviors in better ways and to customize the offers and services accordingly. By proving customized or personalized offers, there is a higher probability of the conversion. The telecom sector relies heavily on advance analytics for a wide variety of applications that include network optimization, fraud identification, price optimization, predicting customer churn, and enhancing the customer experience.

    6.

    Retail Industry:

    The retail industry is a consumer data-driven industry where the bulk of consumer transactional data is generated on a daily basis. Data analytics is helping retailers not only in understanding customer behavior and their shopping patterns but also what they will purchase in the future. Predictive analytics is widely used by both conventional retail stores as well as e-commerce firms for analyzing their historical data and building models for customer engagement, supply chain optimization, price optimization, and space optimization and assortment planning.

    7.

    Agriculture Industry:

    The agriculture industry has seen many changes in the past years and application of analytics has redefined the industry. Insights from agriculture data will help farmers in having a broader picture of their expected cost, the losses year after year, and the expected profit. It helps the agriculture industry from predicting pesticides quantities to predicting crop prices, weather conditions, soil, air quality, crop yield, and reducing waste; and the livestock health can

    Enjoying the preview?
    Page 1 of 1