The Supervised Learning Workshop - Second Edition: A New, Interactive Approach to Understanding Supervised Learning Algorithms, 2nd Edition

Ebook, 849 pages, 4 hours

Name: The Supervised Learning Workshop - Second Edition: A New, Interactive Approach to Understanding Supervised Learning Algorithms, 2nd Edition
Author: Blaine Bateman
ISBN: 9781800208322

By Blaine Bateman, Ashish Ranjan Jha, Benjamin Johnston and Ishita Mathur

About this ebook

Cut through the noise and get real results with a step-by-step approach to understanding supervised learning algorithms

Key Features

Ideal for those getting started with machine learning for the first time
A step-by-step machine learning tutorial with exercises and activities that help build key skills
Structured to let you progress at your own pace, on your own terms
Use your physical print copy to redeem free access to the online interactive edition

Book Description

You already know you want to understand supervised learning, and a smarter way to do that is to learn by doing. The Supervised Learning Workshop focuses on building up your practical skills so that you can build and deploy solutions that leverage key supervised learning algorithms. You'll learn from real examples that lead to real results.

Throughout The Supervised Learning Workshop, you'll take an engaging step-by-step approach to understanding supervised learning. You won't have to sit through any unnecessary theory. If you're short on time, you can jump into a single exercise each day, or you can spend an entire weekend learning how to predict future values with autoregressors. It's your choice. Learning on your terms, you'll build up and reinforce key skills in a way that feels rewarding.

Every physical print copy of The Supervised Learning Workshop unlocks access to the interactive edition. With videos detailing all exercises and activities, you'll always have a guided solution. You can also benchmark yourself against assessments, track progress, and receive content updates. You'll even earn a secure credential that you can share and verify online upon completion. It's a premium learning experience that's included with your printed copy. To redeem, follow the instructions located at the start of your book.

Fast-paced and direct, The Supervised Learning Workshop is the ideal companion for those with some Python background who are getting started with machine learning. You'll learn how to apply key algorithms like a data scientist, learning along the way. This process means that your new skills stick, embedded as best practice, giving you a solid foundation for the years ahead.

What you will learn

Get to grips with the fundamentals of supervised learning algorithms
Discover how to use Python libraries for supervised learning
Learn how to load a dataset in pandas for testing
Use different types of plots to visually represent the data
Distinguish between regression and classification problems
Learn how to perform classification using K-NN and decision trees

Who this book is for

Our goal at Packt is to help you be successful, in whatever it is you choose to do. The Supervised Learning Workshop is ideal for those with a Python background, who are just starting out with machine learning. Pick up a Workshop today, and let Packt help you develop skills that stick with you for life.

Language: English

Publisher: Packt Publishing

Release date: Feb 28, 2020

ISBN: 9781800208322

Dec 15, 2020
3 min read
Luminar AI Photo Editor
MacLife
Article
Luminar AI Photo Editor
Jan 5, 2021
2 min read
Generative AI: What Leaders Need To Know
Rotman Management
Article
Generative AI: What Leaders Need To Know
Jan 1, 2024
12 min read
DJANGO Create A Database-driven Website
Linux Format
Article
DJANGO Create A Database-driven Website
Jun 4, 2019
The Django web framework was named after the famous guitarist Django Reinhardt and was first created by web developers at a small newspaper in Kansas. The main goals of Django is to enable fast development of complex websites with database needs. It
7 min read
Eye Spy With My Little Pi API…
Linux Format
Article
Eye Spy With My Little Pi API…
May 30, 2023
9 min read
Gauge Against The Machine
Cycling Weekly
Article
Gauge Against The Machine
Aug 17, 2023
6 min read

Related categories

Skip carousel

Reviews for The Supervised Learning Workshop - Second Edition

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

The Supervised Learning Workshop - Second Edition - Blaine Bateman

Preface

About the Book

Would you like to understand how and why machine learning techniques and data analytics are spearheading enterprises globally? From analyzing bioinformatics to predicting climate change, machine learning plays an increasingly pivotal role in our society.

Although the real-world applications may seem complex, this book simplifies supervised learning for beginners with a step-by-step interactive approach. Working with real-time datasets, you'll learn how supervised learning, when used with Python, can produce efficient predictive models.

Starting with the fundamentals of supervised learning, you'll quickly move to understand how to automate manual tasks and the process of assessing data using Jupyter and Python libraries like pandas. Next, you'll use data exploration and visualization techniques to develop powerful supervised learning models, before understanding how to distinguish variables and represent their relationships using scatter plots, heatmaps, and box plots. After using regression and classification models on real-time datasets to predict future outcomes, you'll grasp advanced ensemble techniques such as boosting and random forests. Finally, you'll learn the importance of model evaluation in supervised learning and study metrics to evaluate regression and classification tasks.

By the end of this book, you'll have the skills you need to work on your own real-life supervised learning Python projects.

Audience

If you are a beginner or a data scientist who is just getting started and looking to learn how to implement machine learning algorithms to build predictive models, then this book is for you. To expedite the learning process, a solid understanding of Python programming is recommended, as you'll be editing classes and functions rather than creating them from scratch.

About the Chapters

Chapter 1, Fundamentals, introduces you to supervised learning, Jupyter notebooks, and some of the most common pandas data methods.

Chapter 2, Exploratory Data Analysis and Visualization, teaches you how to perform exploration and analysis on a new dataset.

Chapter 3, Linear Regression, teaches you how to tackle regression problems and analysis, introducing you to linear regression as well as multiple linear regression and gradient descent.

Chapter 4, Autoregression, teaches you how to implement autoregression as a method to forecast values that depend on past values.

Chapter 5, Classification Techniques, introduces classification problems, classification using linear and logistic regression, k-nearest neighbors, and decision trees.

Chapter 6, Ensemble Modeling, teaches you how to examine the different ways of ensemble modeling, including their benefits and limitations.

Chapter 7, Model Evaluation, demonstrates how you can improve a model's performance by tuning hyperparameters and using model evaluation metrics.

Conventions

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: Use the pandas read_csv function to load the CSV file containing the synth_temp.csv dataset, and then display the first five lines of data.

Words that you see on screen, for example, in menus or dialog boxes, also appear in the text like this: Open the titanic.csv file by clicking on it on the Jupyter notebook home page.

A block of code is set as follows:

print(data[pd.isnull(data.damage_millions_dollars)].shape[0])
print(data[pd.isnull(data.damage_millions_dollars) &
           (data.damage_description != 'NA')].shape[0])

New terms and important words are shown like this: Supervised means that the labels for the data are provided within the training, allowing the model to learn from these labels.

Code Presentation

Lines of code that span multiple lines are split using a backslash ( \ ). When the code is executed, Python will ignore the backslash, and treat the code on the next line as a direct continuation of the current line.

For example:

history = model.fit(X, y, epochs=100, batch_size=5, verbose=1, \
                    validation_split=0.2, shuffle=False)
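As a small illustration of the continuation rule, the sketch below computes the same sum twice: once with a backslash and once using parentheses, which Python also treats as a single logical line. The variable names and values are made up for this example and do not come from the book's exercises.

```python
# Hypothetical values used only to illustrate line continuation.
first, second, third = 1, 2, 3

# Explicit continuation: the backslash must be the last character on the line.
total_backslash = first + second + \
                  third

# Implicit continuation: an expression inside parentheses, brackets, or braces
# may span several lines without a backslash.
total_parens = (first + second
                + third)

print(total_backslash, total_parens)
```

Both forms produce the same result; the code in this book uses the backslash form.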

Comments are added into code to help explain specific bits of logic. Single-line comments are denoted using the # symbol, as follows:

# Print the sizes of the dataset
print("Number of Examples in the Dataset = ", X.shape[0])
print("Number of Features for each example = ", X.shape[1])

Multi-line comments are enclosed by triple quotes, as shown below:

"""
Define a seed for the random number generator to ensure the
result will be reproducible
"""
seed = 1
np.random.seed(seed)
random.set_seed(seed)

Setting up Your Environment

Before we explore the book in detail, we need to set up specific software and tools. In the following section, we shall see how to do that.

Installation and Setup

All code in this book is executed using Jupyter Notebooks and Python 3.7. Both are available once you install Anaconda on your system. The following sections list the instructions for installing Anaconda on Windows, macOS, and Linux systems.

Installing Anaconda on Windows

Here are the steps that you need to follow to complete the installation:

Visit https://www.anaconda.com/products/individual and click on the Download button.

Under the Anaconda Installer/Windows section, select the Python 3.7 version of the installer.

Ensure that you install a version relevant to the architecture of your computer (either 32-bit or 64-bit). You can find out this information in the System Properties window of your OS.

Once the installer has been downloaded, double-click on the file, and follow the on-screen instructions to complete the installation.

By default, Anaconda is installed on the C: drive of your system. However, you can choose a different destination during installation.

Installing Anaconda on macOS

Visit https://www.anaconda.com/products/individual and click on the Download button.

Under the Anaconda Installer/MacOS section, select the (Python 3.7) 64-Bit Graphical Installer.

Once the installer has been downloaded, double-click on the file, and follow the on-screen instructions to complete the installation.

Installing Anaconda on Linux

Visit https://www.anaconda.com/products/individual and click on the Download button.

Under the Anaconda Installer/Linux section, select the (Python 3.7) 64-Bit (x86) installer.

Once the installer has been downloaded, run it with bash in your terminal, substituting the name of the file you downloaded. For example: bash ~/Downloads/Anaconda3-2020.02-Linux-x86_64.sh

Follow the instructions that appear on your terminal to complete the installation.

You can find more details regarding the installation for various systems by visiting this site: https://docs.anaconda.com/anaconda/install/.

Installing Libraries

pip comes pre-installed with Anaconda. Once Anaconda is installed on your machine, all the required libraries can be installed using pip, for example, pip install numpy. Alternatively, you can install all the required libraries using pip install -r requirements.txt. You can find the requirements.txt file at https://packt.live/3hSJgYy.

The exercises and activities will be executed in Jupyter Notebooks. Jupyter is a Python library and can be installed in the same way as the other Python libraries – that is, with pip install jupyter, but fortunately, it comes pre-installed with Anaconda. To open a notebook, simply run the command jupyter notebook in the Terminal or Command Prompt.

Accessing the Code Files

You can find the complete code files of this book at https://packt.live/2TlcKDf. You can also run many activities and exercises directly in your web browser by using the interactive lab environment at https://packt.live/37QVpsD.

We've tried to support interactive versions of all activities and exercises, but we recommend a local installation as well for instances where this support isn't available.

If you have any issues or questions about installation, please email us at workshops@packt.com.

1. Fundamentals

Overview

This chapter introduces you to supervised learning, using Anaconda to manage coding environments, and using Jupyter notebooks to create, manage, and run code. It also covers some of the most common Python packages used in supervised learning: pandas, NumPy, Matplotlib, and seaborn. By the end of this chapter, you will be able to install and load Python libraries into your development environment for use in analysis and machine learning problems. You will also be able to load an external data source using pandas, and use a variety of methods to search, filter, and compute descriptive statistics of the data. This chapter will enable you to gauge the potential impact of various issues such as missing data, class imbalance, and low sample size within the data source.

Introduction

The study and application of machine learning and artificial intelligence have recently been the source of much interest and research in the technology and business communities. Advanced data analytics and machine learning techniques have shown great promise in advancing many sectors, such as personalized healthcare and self-driving cars, as well as in solving some of the world's greatest challenges, such as combating climate change (see Tackling Climate Change with Machine Learning: https://arxiv.org/pdf/1906.05433.pdf).

This book has been designed to help you to take advantage of the unique confluence of events in the field of data science and machine learning today. Across the globe, private enterprises and governments are realizing the value and efficiency of data-driven products and services. At the same time, reduced hardware costs and open source software solutions are significantly reducing the barriers to entry of learning and applying machine learning techniques.

Here, we will focus on supervised machine learning (or supervised learning for short). We'll explain the different types of machine learning shortly, but let's begin with a quick example. The now-classic example of supervised learning is developing an algorithm to distinguish between pictures of cats and dogs. The supervised part arises from two aspects. First, we have a set of pictures for which we know the correct answers. We call such data labeled data. Second, we carry out a process in which we iteratively test our algorithm's ability to predict cat or dog for given pictures, and we make corrections to the algorithm when the predictions are incorrect. This process, at a high level, is similar to teaching children. However, it generally takes a lot more data to train an algorithm than to teach a child to recognize cats and dogs! Fortunately, there are rapidly growing sources of data at our disposal. Note the use of the words learning and train in the context of developing our algorithm. These might seem to give human qualities to our machines and computer programs, but they are already deeply ingrained in the machine learning (and artificial intelligence) literature, so let's use them and understand them. Training in our context always refers to the process of providing labeled data to an algorithm and making adjustments to the algorithm to best predict the labels given the data. Supervised means that the labels for the data are provided within the training, allowing the model to learn from these labels.
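The predict-then-correct loop described above can be sketched in a few lines of plain Python. This is only an illustrative toy, not a method taken from this book: it uses the classic perceptron update rule, and the two numeric features and the 0/1 labels are invented stand-ins for real labeled data such as images.

```python
# Toy labeled data: each example is (features, label).
# The two features and the labels are invented for illustration.
labeled_data = [
    ((1.0, 1.0), 1),
    ((2.0, 1.5), 1),
    ((1.5, 2.0), 1),
    ((-1.0, -1.0), 0),
    ((-2.0, -1.5), 0),
    ((-1.5, -2.0), 0),
]

def predict(weights, bias, features):
    """Return 1 if the weighted sum is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

def train(data, epochs=10, learning_rate=0.1):
    """Perceptron rule: adjust the weights only when a prediction is wrong."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in data:
            error = label - predict(weights, bias, features)
            if error != 0:
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
    return weights, bias

weights, bias = train(labeled_data)
predictions = [predict(weights, bias, f) for f, _ in labeled_data]
labels = [label for _, label in labeled_data]
print(predictions == labels)
```

The labels drive every correction: without them, the training loop would have no way of knowing that a prediction was wrong.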

Let's now understand the distinction between supervised learning and other forms of machine learning.

When to Use Supervised Learning

Generally, if you are trying to automate or replicate an existing process, the problem is a supervised learning problem. As an example, let's say you are the publisher of a magazine that reviews and ranks hairstyles from various time periods. Your readers frequently send you far more images of their favorite hairstyles for review than you can manually process. To save some time, you would like to automate the sorting of the hairstyle images you receive based on time periods, starting with hairstyles from the 1960s and 1980s, as you can see in the following figure:

Figure 1.1: Images of hairstyles from different time periods

To create your hairstyles-sorting algorithm, you start by collecting a large sample of hairstyle images and manually labeling each one with its corresponding time period. Such a dataset (known as a labeled dataset) is the input data (hairstyle images) for which the desired output information (time period) is known and recorded. This type of problem is a classic supervised learning problem; we are trying to develop an algorithm that takes a set of inputs and learns to return the answers that we have told it are correct.

Python Packages and Modules

Python is one of the most popular programming languages used for machine learning, and is the language used here.

Although Python's standard library is feature-rich, the true power of Python lies in additional libraries (also known as packages), which, thanks to open source licensing, can be downloaded and installed with a few simple commands. In this book, we generally assume your system has been configured using Anaconda, an open source distribution and environment manager for Python. With Anaconda, you can configure multiple virtual environments, each with its own packages and even its own version of Python. Using Anaconda takes care of many of the requirements for getting ready to perform machine learning, as many of the most common packages come pre-installed. Refer to the preface for Anaconda installation instructions.

In this book, we will be using the following additional Python packages:

NumPy (pronounced Num Pie and available at https://www.numpy.org/): NumPy (short for numerical Python) is one of the core components of scientific computing in Python. NumPy provides the foundational array data type on which a number of other data structures build, along with vectors and matrices, linear algebra routines, and key random number functionality.

SciPy (pronounced Sigh Pie and available at https://www.scipy.org): SciPy, along with NumPy, is a core scientific computing package. SciPy provides a number of statistical tools, signal processing tools, and other functionality, such as Fourier transforms.

pandas (available at https://pandas.pydata.org/): pandas is a high-performance library for loading, cleaning, analyzing, and manipulating data structures.

Matplotlib (available at https://matplotlib.org/): Matplotlib is the foundational Python library for creating graphs and plots of datasets and is also the base package on which many other Python plotting libraries build. The Matplotlib API has been designed in alignment with the MATLAB plotting library to facilitate an easy transition to Python.

Seaborn (available at https://seaborn.pydata.org/): Seaborn is a plotting library built on top of Matplotlib, providing attractive color and line styles as well as a number of common plotting templates.

Scikit-learn (available at https://scikit-learn.org/stable/): Scikit-learn is a Python machine learning library that provides a number of data mining, modeling, and analysis techniques in a simple API. Scikit-learn includes a number of machine learning algorithms out of the box, including classification, regression, and clustering techniques.

These packages form the foundation of a versatile machine learning development environment, with each package contributing a key set of functionalities. As discussed, by using Anaconda, you will already have all of the required packages installed and ready for use. If you require a package that is not included in the Anaconda installation, it can be installed by simply entering and executing the following code in a Jupyter notebook cell:

!conda install <package name>

As an example, if we wanted to install Seaborn, we'd run the following command:

!conda install seaborn

To use one of these packages in a notebook, all we need to do is import it:

import matplotlib
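Before starting the exercises, it can be handy to confirm that the packages listed above are visible to the notebook's Python environment. The following is a minimal sketch using only the standard library; the helper name find_missing is hypothetical, and the list uses each package's import name (scikit-learn is imported as sklearn).

```python
import importlib.util

# Import names of the packages listed above (scikit-learn is imported as "sklearn").
required = ["numpy", "scipy", "pandas", "matplotlib", "seaborn", "sklearn"]

def find_missing(module_names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]

missing = find_missing(required)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any name reported as not installed can then be installed with conda or pip as described earlier.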

Loading Data in Pandas

pandas has the ability to read and write a number of different file formats and data structures, including CSV, JSON, and HDF5 files, as well as SQL and Python Pickle formats. The pandas input/output documentation can be found at https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html. We will continue to look into the pandas functionality by loading data via a CSV file.
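To get a feel for these readers and writers without any files on disk, the sketch below parses a few lines of CSV text held in memory and writes the resulting table back out as CSV and JSON. The rows are made up for illustration and are not taken from the dataset used in this chapter.

```python
import io
import pandas as pd

# A tiny, made-up CSV: the first line holds column names, each later line is a row.
csv_text = """name,age,survived
Alice,29,1
Bob,,0
Carol,41,1
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))
print(sample.shape)   # (rows, columns)
print(sample.dtypes)

# Write the same table back out as CSV text and as JSON records.
csv_out = sample.to_csv(index=False)
json_out = sample.to_json(orient="records")
```

Note that the empty age field for the second row is loaded as a missing value (NaN) rather than raising an error.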

Note

The dataset used in this chapter is available on our GitHub repository via the following link: https://packt.live/2vjyPK9. Once you download the entire repository on your system, you can find the dataset in the Datasets folder. Furthermore, this dataset is the Titanic: Machine Learning from Disaster dataset, which was originally made available at https://www.kaggle.com/c/Titanic/data.

The dataset contains a list of the passengers on board the famous ship Titanic, along with their age, survival status, and number of siblings/parents aboard. Before we get started with loading the data into Python, it is critical that we spend some time looking over the information provided for the dataset so that we can have a thorough understanding of what it contains. Download the dataset and place it in the directory you're working in.

Looking at the description for the data, we can see that we have the following fields available:

survival: This tells us whether a given person survived (0 = No, 1 = Yes).

pclass: This is a proxy for socio-economic status, where first class is upper, second class is middle, and third class is lower status.

sex: This tells us whether a given person is male or female.

age: This is a fractional value if less than 1; for example, 0.25 is 3 months. If the age is estimated, it is in the form of xx.5.

sibsp: This is the number of siblings and spouses the passenger had aboard. A sibling is defined as a brother, sister, stepbrother, or stepsister, and a spouse is a husband or wife.

parch: This is the number of parents and children the passenger had aboard. A parent is a mother or father, while a child is a daughter, son, stepdaughter, or stepson. Children who traveled only with a nanny did not travel with a parent, so 0 was assigned for this field.

ticket: This gives the person's ticket number.

fare: This is the passenger's fare.

cabin: This tells us the passenger's cabin number.

embarked: The point of embarkation is the location where the passenger boarded the ship.

Note that the information provided with the dataset does not give any context as to how the data was collected. The survival, pclass, and embarked fields are known as categorical variables as they are assigned to one of a fixed number of labels or categories to indicate some other information. For example, in embarked, the C label indicates that the passenger boarded the ship at Cherbourg, and the value of 1 in survival indicates they survived the sinking.
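As a rough sketch of how categorical fields behave in pandas, the snippet below builds a handful of made-up rows using the field names from the description above and counts how often each category occurs. The values are invented, and the column names in the actual CSV file may be spelled differently.

```python
import pandas as pd

# Made-up rows using the field names described above; not the real dataset.
toy = pd.DataFrame({
    "survival": [1, 0, 1, 0, 0],
    "pclass":   [1, 3, 2, 3, 3],
    "embarked": ["C", "S", "S", "Q", "S"],
})

# Count how often each category occurs in a column.
embarked_counts = toy["embarked"].value_counts()
print(embarked_counts)

# Optionally store the column with pandas' categorical dtype.
toy["embarked"] = toy["embarked"].astype("category")
embarked_categories = toy["embarked"].cat.categories.tolist()
print(embarked_categories)
```

Counting categories in this way is also a quick first check for class imbalance, for example when far more rows carry one label than another.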

Exercise 1.01: Loading and Summarizing the Titanic Dataset

In this exercise, we will read our Titanic dataset into Python and perform a few basic summary operations on it:

Open a new Jupyter notebook.

Import the pandas and numpy packages using shorthand notation:

import pandas as pd

import numpy as np

Open the titanic.csv file by clicking on it in the Jupyter notebook home page as shown in the following figure:

Figure 1.2: Opening the CSV file

The file is a CSV file, which can be thought of as a table, where each line is a row in the table and each comma separates columns in the table. Thankfully, we don't need to work with these tables in raw text form and can load them using pandas:

Figure 1.3: Contents of the CSV file

Note

Take a moment to look up the pandas documentation for the read_csv function at https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html. Note the number of different options available for loading CSV data into a pandas DataFrame.

In an executable Jupyter notebook cell, execute the following code to load the data from the file:

df = pd.read_csv('../Datasets/titanic.csv')

The pandas DataFrame class provides a comprehensive set of attributes and methods that can be executed on its own contents, ranging from sorting, filtering, and grouping methods to descriptive statistics, as well as plotting and conversion.

Note

Open and read the documentation for pandas DataFrame objects at https://pandas.pydata.org/pandas-docs/stable/reference/frame.html.

Read the first ten rows of data using the head() method of the DataFrame:

Note

The # symbol in the code snippet below denotes a code comment. Comments are added into code to help explain specific bits of logic.

df.head(10) # Examine the first 10 samples

The output will be as follows:

Figure 1.4: Reading the first 10 rows

Note

To access the source code for this specific section, please refer to https://packt.live/2Ynb7sf.

You can also run this example online at https://packt.live/2BvTRrG. You must execute the entire Notebook in order to get the desired result.

In this sample, we have a visual representation of the information in the DataFrame. We can see that the data is organized in a tabular, almost spreadsheet-like structure. The different types of data are organized into columns, while each sample is organized into a row. Each row is assigned an index value, shown as the numbers 0 to 9 in bold on the left-hand side of the DataFrame. Each column is assigned a label or name, shown in bold at the top of the DataFrame.

The idea of a DataFrame as a kind of spreadsheet is a reasonable analogy. As we will see in this chapter, we can sort, filter, and perform computations on the data just as you would in a spreadsheet program. While it's not covered in this chapter, it is interesting to note that DataFrames also contain pivot table functionality, just like a spreadsheet (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot_table.html).
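The spreadsheet analogy can be sketched on a small, made-up table. The rows and column names below are assumptions chosen to resemble the fields described earlier; they are not read from titanic.csv. The snippet sorts, filters, computes a statistic, and builds a pivot table with pandas.pivot_table.

```python
import pandas as pd

# Made-up passenger rows for illustration; column names are assumptions.
toy = pd.DataFrame({
    "pclass":   [1, 1, 3, 3, 2, 3],
    "sex":      ["female", "male", "male", "female", "female", "male"],
    "age":      [38.0, 54.0, 22.0, 26.0, 14.0, 35.0],
    "survival": [1, 0, 0, 1, 1, 0],
})

# Sort: oldest passengers first.
oldest_first = toy.sort_values("age", ascending=False)

# Filter: keep only third-class passengers.
third_class = toy[toy["pclass"] == 3]

# Compute: mean age across all rows.
mean_age = toy["age"].mean()

# Pivot table: survival rate by class (rows) and sex (columns).
survival_rate = pd.pivot_table(toy, values="survival", index="pclass",
                               columns="sex", aggfunc="mean")
print(survival_rate)
```

Combinations with no rows, such as second-class male passengers in this toy table, appear as NaN in the pivot table.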

Exercise 1.02: Indexing and Selecting Data

Now that we have loaded some data, let's use the selection and indexing methods of the DataFrame to access some data of interest. This exercise is a continuation of Exercise 1.01, Loading and Summarizing the Titanic Dataset:

Select individual columns in a similar way to a regular dictionary by using the labels of the columns, as shown here:

df['Age']

The output will be as follows:

0       22.0
1       38.0
2       26.0
3       35.0
4       35.0
        ...
1304     NaN
1305    39.0
1306    38.5

1307
