Practical Data Science Cookbook - Second Edition

Ebook, 745 pages, 5 hours

Name: Practical Data Science Cookbook - Second Edition
Author: Tony Ojeda
ISBN: 9781787123267

By Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort and

About this ebook

About This Book

Tackle every step in the data science pipeline and use it to acquire, clean, analyze, and visualize your data
Get beyond the theory and implement real-world projects in data science using R and Python
Easy-to-follow recipes will help you understand and implement the numerical computing concepts

Who This Book Is For

If you are an aspiring data scientist who wants to learn data science and numerical programming concepts through hands-on, real-world project examples, this is the book for you. Whether you are brand new to data science or you are a seasoned expert, you will benefit from learning about the structure of real-world data science projects and the programming examples in R and Python.

Language: English

Publisher: Packt Publishing

Release date: Jun 29, 2017

ISBN: 9781787123267

Author

Tony Ojeda

Tony is the founder of District Data Labs and focuses on applied analytics for business strategy. He has published a book on practical data science, and has experience with hands-on education and data science curricula.

Book preview

Practical Data Science Cookbook - Second Edition - Tony Ojeda

Practical Data Science Cookbook

Second Edition

Practical recipes on data pre-processing, analysis and visualization using R and Python

Prabhanjan Tattar

Tony Ojeda

Sean Patrick Murphy

Benjamin Bengfort

Abhijit Dasgupta

BIRMINGHAM - MUMBAI

Practical Data Science Cookbook

Second Edition

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: September 2014

Second Edition: June 2017

Production reference: 1270617

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-78712-962-7

www.packtpub.com

Credits

About the Authors

Prabhanjan Tattar has 9 years of experience as a statistical analyst. His main thrust has been to explain statistical and machine learning techniques through elegant programming that clarifies the nuances of the underlying mathematics. Survival analysis and statistical inference are his main areas of research interest, and he has published several research papers in peer-reviewed journals and authored two books on R: R Statistical Application Development by Example, Packt Publishing, and A Course in Statistics with R, Wiley. He also maintains the R packages gpk, RSADBE, and ACSWR.

I would like to thank the readers for their encouragement and feedback that led to the improvements in this edition, and I hope that they find the current edition useful. Thanks are due to Tushar Gupta for introducing me to this project, Cheryl Dsa for bearing with the delays, Karan Thakkar for the eagle-eyed editing, and the entire Packt team for all their support. I also thank the authors of the first edition, whose platform is largely carried forward. On the personal front, I continue to thank my family: Pranathi the kiddo, Chandrika the wifey, Lakshmi the goddess mother, and Narayanachar the beloved father.

Tony Ojeda is an accomplished data scientist and entrepreneur, with expertise in business process optimization and over a decade of experience creating and implementing innovative data products and solutions. He has a master's degree in finance from Florida International University and an MBA with a focus on strategy and entrepreneurship from DePaul University. He is the founder of District Data Labs, is a cofounder of Data Community DC, and is actively involved in promoting data science education through both organizations.

Sean Patrick Murphy spent 15 years as a senior scientist at The Johns Hopkins University, Applied Physics Laboratory, where he focused on machine learning, modeling and simulation, signal processing, and high performance computing in the cloud. Now, he acts as an advisor and data consultant for companies in San Francisco, New York, and Washington DC. He graduated from The Johns Hopkins University and earned his MBA from the University of Oxford. He currently co-organizes the Data Innovation DC meetup and co-founded the Data Science MD meetup. He is also a board member and co-founder of Data Community DC.

Benjamin Bengfort is an experienced data scientist and Python developer who has worked in the military, industry, and academia for the past 8 years. He is currently pursuing his PhD in Computer Science at the University of Maryland, College Park, doing research in Metacognition and Natural Language Processing. He holds a Master's degree in Computer Science from North Dakota State University, where he taught undergraduate Computer Science courses. He is also an adjunct faculty member at Georgetown University, where he teaches Data Science and Analytics. Benjamin has been involved in two data science start-ups in the DC region, leveraging large-scale machine learning and Big Data techniques across a variety of applications. He has a deep appreciation for the combination of models and data for entrepreneurial effect, and he is currently building one of these start-ups into a more mature organization.

Abhijit Dasgupta is a data consultant working in the greater DC-Maryland-Virginia area, with several years of experience in biomedical consulting, business analytics, bioinformatics, and bioengineering consulting. He has a PhD in biostatistics from the University of Washington and over 40 collaborative peer-reviewed manuscripts, with strong interests in bridging the statistics/machine-learning divide. He is always on the lookout for interesting and challenging projects, and is an enthusiastic speaker and discussant on new and better ways to look at and analyze data. He is a member of Data Community DC and a founding member and co-organizer of Statistical Programming DC (formerly R Users DC).

About the Reviewer

Alberto Boschetti is a data scientist with strong expertise in signal processing and statistics. He holds a PhD in telecommunication engineering and currently lives and works in London. In his work, he faces daily challenges spanning natural language processing (NLP), machine learning, and distributed processing. He is very passionate about his job and always tries to stay up to date on the latest developments in data science technologies by attending meetups, conferences, and other events. He is the author of Python Data Science Essentials, Regression Analysis with Python, and Large Scale Machine Learning with Python, all published by Packt.

I would like to thank my family, friends, and colleagues. Also, a big thanks to the open source community.

Abhinav Rai has been working as a Data Scientist for nearly a decade and currently works at Microsoft. He has experience working in telecom, retail marketing, and online advertisement. His areas of interest include the evolving techniques of Machine Learning and the associated technologies. He is especially interested in analyzing very large datasets and likes to generate deep insights in such scenarios. He holds two master's degrees, one in Mathematics from Deendayal Upadhyay Gorakhpur University (with an NBHM scholarship) and one in Computer Science from the Indian Statistical Institute, and he brings rigor and sophistication to his analytical work.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com. Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at service@packtpub.com for more details. At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1787129624.

If you'd like to join our team of regular reviewers, you can e-mail us at customerreviews@packtpub.com. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Preface

What this book covers

What you need for this book

Who this book is for

Sections

Getting ready

How to do it…

How it works…

There's more…

See also

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

Preparing Your Data Science Environment

Understanding the data science pipeline

How to do it...

How it works...

Installing R on Windows, Mac OS X, and Linux

How to do it...

How it works...

See also

Installing libraries in R and RStudio

Getting ready

How to do it...

How it works...

There's more...

See also

Installing Python on Linux and Mac OS X

Getting ready

How to do it...

How it works...

See also

Installing Python on Windows

How to do it...

How it works...

See also

Installing the Python data stack on Mac OS X and Linux

Getting ready

How to do it...

How it works...

There's more...

See also

Installing extra Python packages

Getting ready

How to do it...

How it works...

There's more...

See also

Installing and using virtualenv

Getting ready

How to do it...

How it works...

There's more...

See also

Driving Visual Analysis with Automobile Data with R

Introduction

Acquiring automobile fuel efficiency data

Getting ready

How to do it...

How it works...

Preparing R for your first project

Getting ready

How to do it...

There's more...

See also

Importing automobile fuel efficiency data into R

Getting ready

How to do it...

How it works...

There's more...

See also

Exploring and describing fuel efficiency data

Getting ready

How to do it...

How it works...

There's more...

Analyzing automobile fuel efficiency over time

Getting ready

How to do it...

How it works...

There's more...

See also

Investigating the makes and models of automobiles

Getting ready

How to do it...

How it works...

There's more...

See also

Creating Application-Oriented Analyses Using Tax Data and Python

Introduction

An introduction to application-oriented approaches

Preparing for the analysis of top incomes

Getting ready

How to do it...

How it works...

Importing and exploring the world's top incomes dataset

Getting ready

How to do it...

How it works...

There's more...

See also

Analyzing and visualizing the top income data of the US

Getting ready

How to do it...

How it works...

Furthering the analysis of the top income groups of the US

Getting ready

How to do it...

How it works...

Reporting with Jinja2

Getting ready

How to do it...

How it works...

There's more...

See also

Repeating the analysis in R

Getting ready

How to do it...

There's more...

Modeling Stock Market Data

Introduction

Requirements

Acquiring stock market data

How to do it...

Summarizing the data

Getting ready

How to do it...

How it works...

There's more...

Cleaning and exploring the data

Getting ready

How to do it...

How it works...

See also

Generating relative valuations

Getting ready

How to do it...

How it works...

Screening stocks and analyzing historical prices

Getting ready

How to do it...

How it works...

Visually Exploring Employment Data

Introduction

Preparing for analysis

Getting ready

How to do it...

How it works...

See also

Importing employment data into R

Getting ready

How to do it...

How it works...

There's more...

See also

Exploring the employment data

Getting ready

How to do it...

How it works...

See also

Obtaining and merging additional data

Getting ready

How to do it...

How it works...

Adding geographical information

Getting ready

How to do it...

How it works...

See also

Extracting state- and county-level wage and employment information

Getting ready

How to do it...

How it works...

See also

Visualizing geographical distributions of pay

Getting ready

How to do it...

How it works...

See also

Exploring where the jobs are, by industry

How to do it...

How it works...

There's more...

See also

Animating maps for a geospatial time series

Getting ready

How to do it...

How it works...

There's more...

Benchmarking performance for some common tasks

Getting ready

How to do it...

How it works...

There's more...

See also

Driving Visual Analyses with Automobile Data

Introduction

Getting started with IPython

Getting ready

How to do it...

How it works...

See also

Exploring Jupyter Notebook

Getting ready

How to do it...

How it works...

There's more...

See also

Preparing to analyze automobile fuel efficiencies

Getting ready

How to do it...

How it works...

There's more...

See also

Exploring and describing fuel efficiency data with Python

Getting ready

How to do it...

How it works...

There's more...

See also

Analyzing automobile fuel efficiency over time with Python

Getting ready

How to do it...

How it works...

There's more...

See also

Investigating the makes and models of automobiles with Python

Getting ready

How to do it...

How it works...

See also

Working with Social Graphs

Introduction

Understanding graphs and networks

Preparing to work with social networks in Python

Getting ready

How to do it...

How it works...

There's more...

Importing networks

Getting ready

How to do it...

How it works...

Exploring subgraphs within a heroic network

Getting ready

How to do it...

How it works...

There's more...

Finding strong ties

Getting ready

How to do it...

How it works...

There's more...

Finding key players

Getting ready

How to do it...

How it works...

There's more...

The betweenness centrality

The closeness centrality

The eigenvector centrality

Deciding on centrality algorithm

Exploring the characteristics of entire networks

Getting ready

How to do it...

How it works...

Clustering and community detection in social networks

Getting ready

How to do it...

How it works...

There's more...

Visualizing graphs

Getting ready

How to do it...

How it works...

Social networks in R

Getting ready

How to do it...

How it works...

Recommending Movies at Scale (Python)

Introduction

Modeling preference expressions

How to do it...

How it works...

Understanding the data

Getting ready

How to do it...

How it works...

There's more...

Ingesting the movie review data

Getting ready

How to do it...

How it works...

Finding the highest-scoring movies

Getting ready

How to do it...

How it works...

There's more...

See also

Improving the movie-rating system

Getting ready

How to do it...

How it works...

There's more...

See also

Measuring the distance between users in the preference space

Getting ready

How to do it...

How it works...

There's more...

See also

Computing the correlation between users

Getting ready

How to do it...

How it works...

There's more...

Finding the best critic for a user

Getting ready

How to do it...

How it works...

Predicting movie ratings for users

Getting ready

How to do it...

How it works...

Collaboratively filtering item by item

Getting ready

How to do it...

How it works...

Building a non-negative matrix factorization model

How to do it...

How it works...

See also

Loading the entire dataset into the memory

Getting ready

How to do it...

How it works...

There's more...

Dumping the SVD-based model to the disk

How to do it...

How it works...

Training the SVD-based model

How to do it...

How it works...

There's more...

Testing the SVD-based model

How to do it...

How it works...

There's more...

Harvesting and Geolocating Twitter Data (Python)

Introduction

Creating a Twitter application

Getting ready

How to do it...

How it works...

See also

Understanding the Twitter API v1.1

Getting ready

How to do it...

How it works...

There's more...

See also

Determining your Twitter followers and friends

Getting ready

How to do it...

How it works...

There's more...

See also

Pulling Twitter user profiles

Getting ready

How to do it...

How it works...

There's more...

See also

Making requests without running afoul of Twitter's rate limits

Getting ready

How to do it...

How it works...

Storing JSON data to disk

Getting ready

How to do it...

How it works...

Setting up MongoDB for storing Twitter data

Getting ready

How to do it...

How it works...

There's more...

See also

Storing user profiles in MongoDB using PyMongo

Getting ready

How to do it...

How it works...

Exploring the geographic information available in profiles

Getting ready

How to do it...

How it works...

There's more...

See also

Plotting geospatial data in Python

Getting ready

How to do it...

How it works...

There's more...

See also

Forecasting New Zealand Overseas Visitors

Introduction

The ts object

Getting ready

How to do it...

How it works...

Visualizing time series data

Getting ready

How to do it...

How it works...

Simple linear regression models

Getting ready

How to do it...

How it works...

See also

ACF and PACF

Getting ready

How to do it...

How it works...

ARIMA models

Getting ready

How to do it...

How it works...

Accuracy measurements

Getting ready

How to do it...

How it works...

Fitting seasonal ARIMA models

Getting ready

How to do it...

How it works...

There's more...

German Credit Data Analysis

Introduction

Simple data transformations

Getting ready

How to do it...

How it works...

There's more...

Visualizing categorical data

Getting ready

How to do it...

How it works...

Discriminant analysis

Getting ready

How to do it...

How it works...

See also

Dividing the data and the ROC

Getting ready

How to do it...

Fitting the logistic regression model

Getting ready

How to do it...

How it works...

See also

Decision trees and rules

Getting ready

How to do it...

How it works...

See also

Decision tree for German data

Getting ready

How to do it...

How it works...

Preface

Welcome to the second edition of Practical Data Science Cookbook. The positive feedback the first edition received, and the usefulness readers found in it, made a second edition possible. When Packt asked me to co-author the second edition, I looked through reviews of the book across the web and quickly saw both the reasons for its popularity and its few weaknesses. The current edition therefore keeps what readers valued and removes the pain points as far as possible. Two new chapters, Chapter 10, Forecasting New Zealand Overseas Visitors, and Chapter 11, German Credit Data Analysis, have been added to make the book more useful.

We live in the age of data. As increasing amounts are generated each year, the need to analyze and create value from this asset is more important than ever. Companies that know what to do with their data and how to do it well will have a competitive advantage over companies that don't. As a result, there will be an increasing demand for people who possess both the analytical and technical abilities to extract valuable insights from data and the business acumen to create valuable and pragmatic solutions that put these insights to use. This book provides multiple opportunities to learn how to create value from data through a variety of projects that span the spectrum of contemporary data science projects. Each chapter stands on its own, with step-by-step instructions that include screenshots, code snippets, and more detailed explanations where necessary, and with a focus on process and practical application. The goal of this book is to introduce the data science pipeline, show you how it applies to a variety of different data science projects, and get you comfortable enough to apply it to projects of your own in the future. Along the way, you'll learn different analytical and programming lessons, and the fact that you are working through an actual project while learning will help cement these concepts and facilitate your understanding of them.

What this book covers

Chapter 1, Preparing Your Data Science Environment, introduces the data science pipeline and helps you get your data science environment properly set up with instructions for the Mac, Windows, and Linux operating systems. This chapter is a guideline for setting up the environment for R and Python on the preceding platforms.

Chapter 2, Driving Visual Analysis with Automobile Data with R, takes you through the process of analyzing and visualizing automobile data to identify trends and patterns in fuel efficiency over time. The chapter will give you a taste of acquisition, exploration, munging, analysis, and communication. The concepts will be implemented in R.

Chapter 3, Creating Application-Oriented Analyses Using Tax Data and Python, shows you how to use Python to transition your analyses from one-off, custom efforts to reproducible and production-ready code using income distribution data as the base for the project.

Chapter 4, Modeling Stock Market Data, shows you how to build your own stock screener and use moving averages to analyze historical stock prices. You will learn how to acquire, summarize, and clean the data, and how to generate relative valuations.

Chapter 5, Visually Exploring Employment Data, shows you how to obtain employment and earnings data from the Bureau of Labor Statistics and conduct geospatial analysis at different levels with R. The same will be implemented using Python. The focus of this chapter is on the transformation, manipulation, and visualization of data.

Chapter 6, Driving Visual Analyses with Automobile Data, mirrors the automobile data analyses and visualizations in Chapter 2, Driving Visual Analysis with Automobile Data with R, but does so using the powerful programming language, Python. It focuses on the implementation of the analysis model using Python.

Chapter 7, Working with Social Graphs, shows you how to build, visualize, and analyze a social network that consists of comic book character relationships. You will also see the R and Python implementation.

Chapter 8, Recommending Movies at Scale (Python), walks you through building a movie recommender system with Python. You will also see R and Python code that uses collaborative filtering to implement a predictive model.

Chapter 9, Harvesting and Geolocating Twitter Data (Python), shows you how to connect to the Twitter API and plot the geographic information contained in profiles. You will also learn how RESTful APIs are used in text mining.

Chapter 10, Forecasting New Zealand Overseas Visitors, explains how to create time series objects and describes various methods to visualize time series data. You will also learn how to build an appropriate model for the data and identify if the data has any trends and seasonal components.

Chapter 11, German Credit Data Analysis, demonstrates Exploratory Data Analysis (EDA), along with a few basic tree methods and random forests. You will learn how to apply EDA, tree-based methods, and random forests to a specific dataset.

What you need for this book

For this book, you will need a computer with access to the Internet and the ability to install the open source software needed for the projects. The primary software we will be using consists of the R and Python programming languages, with a myriad of freely available packages and libraries. Installation instructions are in the first chapter.

Who this book is for

This book is intended for aspiring data scientists who want to learn data science and numerical programming concepts through hands-on, real-world projects. Whether you are brand new to data science or you are a seasoned expert, you will benefit from learning about the structure of real-world data science projects and the programming examples in R and Python.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it, How it works, There's more, and See also). To give clear instructions on how to complete a recipe, we use these sections as follows.

Getting ready

This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.

How to do it…

This section contains the steps required to follow the recipe.

How it works…

This section usually consists of a detailed explanation of what happened in the previous section.

There's more…

This section provides additional information about the recipe to deepen the reader's understanding of it.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning. Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: Create a new user for JIRA in the database and grant the user access to the jiradb database we just created using the following command:

A block of code is set as follows:

Any command-line input or output is written as follows:

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: Select System info from the Administration panel.

Warnings or important notes appear in a box like this.

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book, including what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail feedback@packtpub.com, and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. You can download the code files by following these steps:

Hover the mouse pointer on the SUPPORT tab at the top.

Click on Code Downloads & Errata.

Enter the name of the book in the Search box.

Select the book for which you're looking to download the code files.

Choose from the drop-down menu where you purchased this book.

Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account. Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Practical-Data-Science-Cookbook-Second-Edition. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/PracticalDataScienceCookbookSecondEditon_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy

Practical Data Science Cookbook

Second Edition

Practical recipes on data pre-processing, analysis and visualization using R and Python

Prabhanjan Tattar

Tony Ojeda

Sean Patrick Murphy

Benjamin Bengfort

Abhijit Dasgupta
