Deep Learning with Structured Data
Ebook · 540 pages · 3 hours


About this ebook

Deep Learning with Structured Data teaches you powerful data analysis techniques for tabular data and relational databases.

Summary
Deep learning offers the potential to identify complex patterns and relationships hidden in data of all sorts. Deep Learning with Structured Data shows you how to apply powerful deep learning analysis techniques to the kind of structured, tabular data you'll find in the relational databases that real-world businesses depend on. Filled with practical, relevant applications, this book teaches you how deep learning can augment your existing machine learning and business intelligence systems.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the technology
Here’s a dirty secret: Half of the time in most data science projects is spent cleaning and preparing data. But there’s a better way: Deep learning techniques optimized for tabular data and relational databases deliver insights and analysis without requiring intense feature engineering. Learn the skills to unlock deep learning performance with much less data filtering, validating, and scrubbing.

About the book
Deep Learning with Structured Data teaches you powerful data analysis techniques for tabular data and relational databases. Get started using a dataset based on the Toronto transit system. As you work through the book, you’ll learn how easy it is to set up tabular data for deep learning, while solving crucial production concerns like deployment and performance monitoring.

What's inside

    When and where to use deep learning
    The architecture of a Keras deep learning model
    Training, deploying, and maintaining models
    Measuring performance

About the reader
For readers with intermediate Python and machine learning skills.

About the author
Mark Ryan is a Data Science Manager at Intact Insurance. He holds a Master's degree in Computer Science from the University of Toronto.

Table of Contents

1 Why deep learning with structured data?

2 Introduction to the example problem and Pandas dataframes

3 Preparing the data, part 1: Exploring and cleansing the data

4 Preparing the data, part 2: Transforming the data

5 Preparing and building the model

6 Training the model and running experiments

7 More experiments with the trained model

8 Deploying the model

9 Recommended next steps
Language: English
Publisher: Manning
Release date: Dec 8, 2020
ISBN: 9781638357179
Author

Mark Ryan

Mark Ryan is a Manager at Google in Kitchener, Canada. Mark has a passion for sharing the benefits of machine learning, including delivering machine learning bootcamps to give participants a hands-on introduction to the world of machine learning. In addition to deep learning and its potential to unlock additional value in structured, tabular data, Mark is interested in chatbots and the potential of autonomous vehicles. Mark has a Bachelor of Mathematics from the University of Waterloo and a Master's in Computer Science from the University of Toronto.




    Deep Learning with Structured Data

    Mark Ryan

    To comment go to liveBook

    Manning

    Shelter Island

    For more information on this and other Manning titles go to

    manning.com

    Copyright

    For online information and ordering of these and other Manning books, please visit manning.com. The publisher offers discounts on these books when ordered in quantity.

    For more information, please contact

    Special Sales Department

    Manning Publications Co.

    20 Baldwin Road

    PO Box 761

    Shelter Island, NY 11964

    Email: orders@manning.com

    ©2020 by Manning Publications Co. All rights reserved.

    No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

    Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

    ♾ Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

    ISBN: 9781617296727

    dedication

    To my daughter, Josephine, who always reminds me that God is the Author.

    contents

    preface

    acknowledgments

    about this book

    about the author

    about the cover illustration

    1 Why deep learning with structured data?

    Overview of deep learning

    Benefits and drawbacks of deep learning

    Overview of the deep learning stack

    Structured vs. unstructured data

    Objections to deep learning with structured data

    Why investigate deep learning with a structured data problem?

    An overview of the code accompanying this book

    What you need to know

    Summary

    2 Introduction to the example problem and Pandas dataframes

    Development environment options for deep learning

    Code for exploring Pandas

    Pandas dataframes in Python

    Ingesting CSV files into Pandas dataframes

    Using Pandas to do what you would do with SQL

    The major example: Predicting streetcar delays

    Why is a real-world dataset critical for learning about deep learning?

    Format and scope of the input dataset

    The destination: An end-to-end solution

    More details on the code that makes up the solutions

    Development environments: Vanilla vs. deep-learning-enabled

    A deeper look at the objections to deep learning

    How deep learning has become more accessible

    A first taste of training a deep learning model

    Summary

    3 Preparing the data, part 1: Exploring and cleansing the data

    Code for exploring and cleansing the data

    Using config files with Python

    Ingesting XLS files into a Pandas dataframe

    Using pickle to save your Pandas dataframe from one session to another

    Exploring the data

    Categorizing data into continuous, categorical, and text categories

    Cleaning up problems in the dataset: missing data, errors, and guesses

    Finding out how much data deep learning needs

    Summary

    4 Preparing the data, part 2: Transforming the data

    Code for preparing and transforming the data

    Dealing with incorrect values: Routes

    Why only one substitute for all bad values?

    Dealing with incorrect values: Vehicles

    Dealing with inconsistent values: Location

    Going the distance: Locations

    Fixing type mismatches

    Dealing with rows that still contain bad data

    Creating derived columns

    Preparing non-numeric data to train a deep learning model

    Overview of the end-to-end solution

    Summary

    5 Preparing and building the model

    Data leakage and features that are fair game for training the model

    Domain expertise and minimal scoring tests to prevent data leakage

    Preventing data leakage in the streetcar delay prediction problem

    Code for exploring Keras and building the model

    Deriving the dataframe to use to train the model

    Transforming the dataframe into the format expected by the Keras model

    A brief history of Keras and TensorFlow

    Migrating from TensorFlow 1.x to TensorFlow 2

    TensorFlow vs. PyTorch

    The structure of a deep learning model in Keras

    How the data structure defines the Keras model

    The power of embeddings

    Code to build a Keras model automatically based on the data structure

    Exploring your model

    Model parameters

    Summary

    6 Training the model and running experiments

    Code for training the deep learning model

    Reviewing the process of training a deep learning model

    Reviewing the overall goal of the streetcar delay prediction model

    Selecting the train, validation, and test datasets

    Initial training run

    Measuring the performance of your model

    Keras callbacks: Getting the best out of your training runs

    Getting identical results from multiple training runs

    Shortcuts to scoring

    Explicitly saving trained models

    Running a series of training experiments

    Summary

    7 More experiments with the trained model

    Code for more experiments with the model

    Validating whether removing bad values improves the model

    Validating whether embeddings for columns improve the performance of the model

    Comparing the deep learning model with XGBoost

    Possible next steps for improving the deep learning model

    Summary

    8 Deploying the model

    Overview of model deployment

    If deployment is so important, why is it so hard?

    Review of one-off scoring

    The user experience with web deployment

    Steps to deploy your model with web deployment

    Behind the scenes with web deployment

    The user experience with Facebook Messenger deployment

    Behind the scenes with Facebook Messenger deployment

    More background on Rasa

    Steps to deploy your model in Facebook Messenger with Rasa

    Introduction to pipelines

    Defining pipelines in the model training phase

    Applying pipelines in the scoring phase

    Maintaining a model after deployment

    Summary

    9 Recommended next steps

    Reviewing what we have covered so far

    What we could do next with the streetcar delay prediction project

    Adding location details to the streetcar delay prediction project

    Training our deep learning model with weather data

    Adding season or time of day to the streetcar delay prediction project

    Imputation: An alternative to removing records with bad values

    Making the web deployment of the streetcar delay prediction model generally available

    Adapting the streetcar delay prediction model to a new dataset

    Preparing the dataset and training the model

    Deploying the model with web deployment

    Deploying the model with Facebook Messenger

    Adapting the approach in this book to a different dataset

    Resources for additional learning

    Summary

    appendix A Using Google Colaboratory

    index

    front matter

    I believe that when people look back in 50 years and assess the first two decades of the century, deep learning will be at the top of the list of technical innovations. The theoretical foundations of deep learning were established in the 1950s, but it wasn’t until 2012 that the potential of deep learning became evident to nonspecialists. Now, almost a decade later, deep learning pervades our lives, from smart speakers that are able to seamlessly convert our speech into text to systems that can beat any human in an ever-expanding range of games. This book examines an overlooked corner of the deep learning world: applying deep learning to structured, tabular data (that is, data organized in rows and columns).

    If the conventional wisdom is to avoid using deep learning with structured data, and the marquee applications of deep learning (such as image recognition) deal with nonstructured data, why should you read a book about deep learning with structured data? First, as I argue in chapters 1 and 2, some of the objections to using deep learning to solve structured data problems (such as deep learning being too complex or structured datasets being too small) simply don’t hold water today. When we are assessing which machine learning approach to apply to a structured data problem, we need to keep an open mind and consider deep learning as a potential solution. Second, although nontabular data underpins many topical application areas of deep learning (such as image recognition, speech to text, and machine translation), our lives as consumers, employees, and citizens are still largely defined by data in tables. Every bank transaction, every tax payment, every insurance claim, and hundreds more aspects of our daily existence flow through structured, tabular data. Whether you are a newcomer to deep learning or an experienced practitioner, you owe it to yourself to have deep learning in your toolbox when you tackle a problem that involves structured data.

    By reading this book, you will learn what you need to know to apply deep learning to a wide variety of structured data problems. You will work through a full-blown application of deep learning to a real-world dataset, from preparing the data to training the deep learning model to deploying the trained model. The code examples that accompany the book are written in Python, the lingua franca of machine learning, and take advantage of the Keras/TensorFlow framework, the most common platform for deep learning in industry.

    acknowledgments

    I have many people to thank for their support and assistance over the year and a half that I wrote this book. First, I would like to thank the team at Manning Publications, particularly my editor, Christina Taylor, for their masterful direction. I would like to thank my former supervisors at IBM—in particular Jessica Rockwood, Michael Kwok, and Al Martin—for giving me the impetus to write this book. I would like to thank my current team at Intact for their support—in particular Simon Marchessault-Groleau, Dany Simard, and Nicolas Beaupré. My friends have given me consistent encouragement. I would like to particularly thank Dr. Laurence Mussio and Flavia Mussio, both of whom have been unalloyed and enthusiastic supporters of my writing. Jamie Roberts, Luc Chamberland, Alan Hall, Peter Moroney, Fred Gandolfi, and Alina Zhang have all provided encouragement. Finally, I would like to thank my family—Steve and Carol, John and Debby, and Nina—for their love. (We’re a literary family, thank God.)

    To all the reviewers: Aditya Kaushik, Atul Saurav, Gary Bake, Gregory Matuszek, Guy Langston, Hao Liu, Ike Okonkwo, Irfan Ullah, Ishan Khurana, Jared Wadsworth, Jason Rendel, Jeff Hajewski, Jesús Manuel López Becerra, Joe Justesen, Juan Rufes, Julien Pohie, Kostas Passadis, Kunal Ghosh, Malgorzata Rodacka, Matthias Busch, Michael Jensen, Monica Guimaraes, Nicole Koenigstein, Rajkumar Palani, Raushan Jha, Sayak Paul, Sean T Booker, Stefano Ongarello, Tony Holdroyd, and Vlad Navitski, your suggestions helped make this a better book.

    about this book

    This book takes you through the full journey of applying deep learning to a tabular, structured dataset. By working through an extended, real-world example, you will learn how to clean up a messy dataset and use it to train a deep learning model by using the popular Keras framework. Then you will learn how to make your trained deep learning model available to the world through a web page or a chatbot in Facebook Messenger. Finally, you will learn how to extend and improve your deep learning model, as well as how to apply the approach shown in this book to other problems involving structured data.

    Who should read this book

    To get the most out of this book, you should be familiar with Python coding in the context of Jupyter Notebooks. You should also be familiar with some non-deep-learning machine learning approaches, such as logistic regression and support vector machines, and be familiar with the standard vocabulary of machine learning. Finally, if you regularly work with data that is organized in tables as rows and columns, you will find it easiest to apply the concepts in this book to your work.

    How this book is organized: A roadmap

    This book is made up of nine chapters and one appendix:

    Chapter 1 includes a quick review of the high-level concepts of deep learning and a summary of why (and why not) you would want to apply deep learning to structured data. It also explains what I mean by structured data.

    Chapter 2 explains the development environments you can use for the code example in this book. It also introduces the Python library for tabular, structured data (Pandas) and describes the major example used throughout the rest of the book: predicting delays on a light-rail transit system. This example is the streetcar delay prediction problem. Finally, chapter 2 previews the details that are coming in later chapters with a quick run through a simple example of training a deep learning model.

    Chapter 3 explores the dataset for the major example and describes how to deal with a set of problems in the dataset. It also examines the question of how much data is required to train a deep learning model.

    Chapter 4 covers how to address additional problems in the dataset and what to do with bad values that remain in the data after all the cleanup. It also shows how to prepare non-numeric data to train a deep learning model. Chapter 4 wraps up with a summary of the end-to-end code example.

    Chapter 5 describes the process of preparing and building the deep learning model for the streetcar delay prediction problem. It explains the problem of data leakage (training the model with data that won’t be available when you want to make a prediction with the model) and how to avoid it. Then the chapter walks through the details of the code that makes up the deep learning model and shows you options for examining the structure of the model.
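In code, guarding against leakage usually comes down to dropping any column that would not exist at prediction time before training. The sketch below is a minimal illustration of that idea, not the book's own listing; the column names are invented for the example:

```python
import pandas as pd

# Toy training frame for a delay-prediction task. "resolution_time" is
# recorded only after a delay has already happened, so keeping it as a
# feature would leak the answer into training (column names are invented).
df = pd.DataFrame({
    "route": [501, 504, 501],
    "hour": [8, 17, 9],
    "resolution_time": [12.0, 30.0, 0.0],
    "min_delay": [10, 25, 0],
})

leaky = ["resolution_time"]   # known only after the fact
target = "min_delay"
features = df.drop(columns=leaky + [target])
print(list(features.columns))
```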

    Chapter 6 explains the end-to-end model training process, from selecting subsets of the input dataset to train and test the model, to conducting your first training run, to iterating through a set of experiments to improve the performance of the trained model.

    Chapter 7 expands on the model training techniques introduced in chapter 6 by conducting three more in-depth experiments. The first experiment proves that one of the cleanup steps from chapter 4 (removing records with invalid values) improves the performance of the model. The second experiment demonstrates the performance benefit of associating learned vectors (embeddings) with categorical columns. Finally, the third experiment compares the performance of the deep learning model with the performance of a popular non-deep learning approach, XGBoost.

    Chapter 8 provides details on how you can make your trained deep learning model useful to the outside world. First, it describes how to do a simple web deployment of a trained model. Then it describes how to deploy a trained model in Facebook Messenger by using the Rasa open source chatbot framework.

    Chapter 9 starts with a summary of what’s been covered in the book. Then it describes additional data sources that could improve the performance of the model, including location and weather data. Next, it describes how to adapt the code accompanying the book to tackle a completely new problem in tabular, structured data. The chapter wraps up with a list of additional books, courses, and online resources for learning more about deep learning with structured data.

    The appendix describes how you can use the free Colab environment to run the code examples that accompany the book.

    I suggest that you read this book sequentially, because each chapter builds on the content in the preceding chapters. You will get the most out of the book if you execute the code samples that accompany the book—in particular the code for the streetcar delay prediction problem. Finally, I strongly encourage you to exercise the experiments described in chapters 6 and 7 and to explore the additional enhancements described in chapter 9.

    About the code

    This book is accompanied by extensive code examples. In addition to the extended code example for the streetcar delay prediction problem in chapters 3-8, there are additional standalone code examples for chapter 2 (to demonstrate the Pandas library and the relationship between Pandas and SQL) and chapter 5 (to demonstrate the Keras sequential and functional APIs).
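To give a flavor of the Pandas/SQL relationship that the chapter 2 examples explore, here is a hedged sketch (not one of the book's own listings) showing an aggregation written as a SQL statement in a comment and as the equivalent Pandas call, over invented data:

```python
import pandas as pd

# A small stand-in for a database table (the data is invented)
transactions = pd.DataFrame({
    "customer_id": ["C1", "C1", "C2"],
    "amount": [20.0, 35.5, 12.0],
})

# SQL: SELECT customer_id, SUM(amount) AS total
#      FROM transactions GROUP BY customer_id;
totals = (transactions
          .groupby("customer_id", as_index=False)["amount"]
          .sum()
          .rename(columns={"amount": "total"}))
print(totals)
```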

    Chapter 2 describes the options you have for running the code examples, and the appendix has further details on one of the options, Google’s Colab. Whichever environment you choose, you need to have Python (at least version 3.7) and key libraries including the following:

    Pandas

    Scikit-learn

    Keras/TensorFlow 2.x

    As you run through the portions of the code, you may need to pip install additional libraries.
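One quick way to confirm that a chosen environment meets these requirements is to probe for the interpreter version and the key libraries before opening the notebooks. The snippet below is a generic sketch, not part of the book's code:

```python
import importlib.util
import sys

# Libraries the book's examples rely on
# (note: scikit-learn's import name is "sklearn")
required = ["pandas", "sklearn", "tensorflow"]
missing = [name for name in required if importlib.util.find_spec(name) is None]

print("Python >= 3.7:", sys.version_info >= (3, 7))
print("Missing libraries (pip install these):", missing or "none")
```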

    The deployment portion of the main streetcar delay prediction example has some additional requirements:

    Flask library for the web deployment

    Rasa chatbot framework and ngrok for the Facebook Messenger deployment

    The source code is formatted in a fixed-width font like this to separate it from ordinary text. Sometimes code is also in bold to highlight code that has changed from previous steps in the chapter, such as when a new feature adds to an existing line of code.

    In many cases, the original source code has been reformatted; we’ve added line breaks and reworked indentation to accommodate the available page space in the book. In rare cases, even this was not enough, and listings include line-continuation markers. Additionally, comments in the source code have often been removed from the listings when the code is described in the text. Code annotations accompany many of the listings, highlighting important concepts.

    You can find all the code examples for this book in the GitHub repo at http://mng.bz/v95x.

    liveBook discussion forum

    Purchase of Deep Learning with Structured Data includes free access to a private web forum run by Manning Publications where you can make comments about the book, ask technical questions, and receive help from the author and from other users. To access the forum, go to https://livebook.manning.com/#!/book/deep-learning-with-structured-data/discussion. You can also learn more about Manning’s forums and the rules of conduct at https://livebook.manning.com/#!/discussion.

    Manning’s commitment to our readers is to provide a venue where a meaningful dialogue between individual readers and between readers and the author can take place. It is not a commitment to any specific amount of participation on the part of the author, whose contribution to the forum remains voluntary (and unpaid). We suggest you try asking the author some challenging questions lest his interest stray! The forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print.

    about the author

    Mark Ryan is a data science manager at Intact Insurance in Toronto, Canada. Mark has a passion for sharing the benefits of machine learning, including delivering machine learning bootcamps to give participants a hands-on introduction to the world of machine learning. In addition to deep learning and its potential to unlock additional value in structured, tabular data, his interests include chatbots and the potential of autonomous vehicles. He has a bachelor of mathematics degree from the University of Waterloo and a master’s degree in computer science from the University of Toronto.

    about the cover illustration

    The figure on the cover of Deep Learning with Structured Data is captioned Homme de Navarre, or A man from Navarre, a diverse region of northern Spain. The illustration is taken from a collection of dress costumes from various countries by Jacques Grasset de Saint-Sauveur (1757-1810), titled Costumes de Différents Pays, published in France in 1797. Each illustration is finely drawn and colored by hand. The rich variety of Grasset de Saint-Sauveur’s collection reminds us vividly of how culturally apart the world’s towns and regions were just 200 years ago. Isolated from each other, people spoke different dialects and languages. In the streets or in the countryside, it was easy to identify where they lived and what their trade or station in life was just by their dress.

    The way we dress has changed since then and the diversity by region, so rich at the time, has faded away. It is now hard to tell apart the inhabitants of different continents, let alone different towns, regions, or countries. Perhaps we have traded cultural diversity for a more varied personal life—certainly for a more varied and fast-paced technological life.

    At a time when it is hard to tell one computer book from another, Manning celebrates the inventiveness and initiative of the computer business with book covers based on the rich diversity of regional life of two centuries ago, brought back to life by Grasset de Saint-Sauveur’s pictures.

    1 Why deep learning with structured data?

    This chapter covers

    A high-level overview of deep learning

    Benefits and drawbacks of deep learning

    Introduction to the deep learning software stack

    Structured versus unstructured data

    Objections to deep learning with structured data

    Advantages of deep learning with structured data

    Introduction to the code accompanying this book

    Since 2012, we have witnessed what can only be called a renaissance of artificial intelligence. A discipline that had lost its way in the late 1980s is important again. What happened?

    In October 2012, a team of students working with Geoffrey Hinton (a leading academic proponent of deep learning based at the University of Toronto) announced a result in the ImageNet computer vision contest that achieved an error rate in identifying objects that was close to half that of the nearest competitor. This result exploited deep learning and ushered in an explosion of interest in the topic. Since then, we have seen deep learning applications with world-class results in many domains, including image processing, audio to text, and machine translation. In the past couple of years, the tools and infrastructure for deep learning have reached a level of maturity and accessibility that make it possible for nonspecialists to take advantage of deep learning’s benefits. This book shows how you can use deep learning to get insights into and make predictions about structured data: data organized as tables with rows and columns, as in a relational database. You will see the capability of deep learning by going step by step through a complete, end-to-end example of deep learning, from ingesting the raw input structured data to making the deep learning model available to end users. By applying deep learning to a problem with a real-world structured dataset, you will see the challenges and opportunities of deep learning with structured data.

    1.1 Overview of deep learning

    Before reviewing the high-level concepts of deep learning, let’s introduce a simple example that we can use to explore these concepts: detection of credit card fraud. Chapter 2 introduces the real-world dataset and an extensive code example that prepares this dataset and uses it to train a deep learning model. For now, this basic fraud detection example is sufficient for a review of some of the concepts of deep learning.

    Why would you want to exploit deep learning for fraud detection? There are several reasons:

    Fraudsters can find ways to work around the traditional rules-based approaches to fraud detection (http://mng.bz/emQw).

    A deep learning approach that is part of an industrial-strength pipeline—in which the model performance is frequently assessed and the model is automatically retrained if its performance drops below a given threshold—can adapt to changes in fraud patterns.

    A deep learning approach has the potential to provide near-real-time assessment of new transactions.

    In summary, deep learning is worth considering for fraud detection because it can be the heart of a flexible, fast solution. Note that in addition to these advantages, there is a downside to using deep learning as a solution to the problem of fraud detection: compared with other approaches, deep learning is harder to explain. Other machine learning approaches allow you to determine which input characteristics most influence the outcome, but this relationship can be difficult or impossible to establish with deep learning.

    Assume that a credit card company maintains customer transactions as records in a table. Each record in this table contains information about the transaction, including an ID that uniquely identifies the customer, as well as details about the transaction, including the date and time of the transaction, the ID of the vendor, the location of the transaction, and the currency and amount of the transaction. In addition to this information, which is added to the table every time a transaction is reported, every record has a field to indicate whether the transaction was reported as a fraud.

    The credit card company plans to train a deep learning model on the historical data in this table and use this trained model to predict whether new incoming transactions are fraudulent. The goal is to identify potential fraud as quickly as possible (and take corrective action) rather than waiting days for the customer or vendor to report that a particular transaction is fraudulent.

    Let’s examine the customer transaction table. Figure 1.1 contains a snippet of what some records in this table would look like.

    [Figure 1.1: sample records from the customer transaction table]
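A table like the one described above can be sketched as a Pandas dataframe. The rows and column names below are invented for illustration, not the book's schema:

```python
import pandas as pd

# Invented sample transactions; "fraud" is the label a model would learn to predict
transactions = pd.DataFrame({
    "customer_id": ["C100", "C100", "C205"],
    "timestamp": pd.to_datetime(
        ["2020-01-05 09:12", "2020-01-05 09:14", "2020-01-06 22:03"]),
    "vendor_id": ["V17", "V17", "V42"],
    "location": ["Toronto", "Lagos", "Montreal"],
    "currency": ["CAD", "NGN", "CAD"],
    "amount": [42.50, 913.00, 18.75],
    "fraud": [False, True, False],
})

print(transactions["fraud"].sum(), "fraudulent of", len(transactions), "transactions")
```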