Hands-on Data Analysis and Visualization with Pandas: Engineer, Analyse and Visualize Data, Using Powerful Python Libraries

About this ebook

The book starts with quick introductions to Python and its ecosystem of data science libraries, such as JupyterLab, NumPy, Pandas, SciPy, Matplotlib, and Seaborn.

It will help you learn Python data structures and the essential concepts required for data engineering, such as functions, lambdas, list comprehensions, and datetime objects. It also builds an in-depth understanding of the Python data science packages: JupyterLab is used as an IDE for writing, documenting, and executing Python code; NumPy for numerical computation; and Pandas for cleaning and reorganizing data, handling large datasets, and merging DataFrames to get meaningful insights. You will then go through statistics to understand the relationships between variables using SciPy, and build visualization charts using the Matplotlib and Seaborn libraries.
Language: English
Release date: Aug 13, 2020
ISBN: 9789389845655

    Book preview

    Hands-on Data Analysis and Visualization with Pandas - Purna Chander Rao. Kathula

    CHAPTER 1

    Introduction to Data Analysis

    Data analysis is as much an art as it is a science of extracting insights from silos of data. This chapter introduces you to data and its ecosystem components, the different stages of the data analysis process, why Python is useful for data analysis, and the different data science libraries/modules along with their installation process.

    Structure

    Inspiration for data analysis

    What is data science?

    Domain expertise

    Maths and statistics

    Artificial intelligence

    Machine learning

    Data infrastructure

    Data analysis process

    Business requirements

    Data collection

    Data cleansing

    Data exploration and visualization

    Data modeling

    Model validation and testing

    Deployment

    Why Python for data analysis?

    Python libraries for data analysis

    Objective

    This chapter guides you through the different processes of data analysis and the concepts, such as maths and statistics, that make up this discipline. The concepts covered here are a heads-up for the coming chapters, where they will be applied in the form of Python code with different data-related libraries.

    Inspiration for data analysis

    In this chapter, we will cover various factors and trends that influence data analysis. In the current world of digitalization, a huge amount of data is produced by IoT devices such as sensors, by diagnosis reports from the healthcare and wellness industry, by social network portals such as Facebook, YouTube, LinkedIn, and Instagram, and by e-commerce sites like Alibaba, Amazon, or Flipkart. Every time you upload an audio or video, post a comment, add a like or an emoji, make a bank transaction online, withdraw money from an ATM kiosk, or buy something on an e-commerce site, you generate more of it.

    This raw data is not, by itself, useful information. Information is the result of processing: taking a certain set of data and extracting conclusions from it that can be used in different ways. This process of extracting information from raw data is data analysis, and it becomes the foundation for building predictive models and drawing data visualization charts around the data.

    Without Big Data and analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway.

    - Geoffrey Moore, author and consultant

    What is data science?

    Data science is the study of data. It is a multidisciplinary field that involves maths, statistics, algorithms, domain expertise, processes, and systems to extract insights from data. This data might be structured, semi-structured, or unstructured. Figure 1.1 shows the different structures of data:

    Figure 1.1: Different structures of data

    Structured data

    Tabular rows and columns (Databases)

    DWH (Teradata systems) and BI systems

    Text files such as comma-separated (.csv), tab-separated (.tsv).

    Semi-structured data

    Excel, XML, JSON, Logs.

    Unstructured data

    Audio, Video, Images.
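
    As a minimal illustration of the first two categories, the following sketch reads a structured CSV file and a semi-structured JSON file into Pandas DataFrames. The file names here are hypothetical placeholders, not files supplied with the book:

    import pandas as pd

    # Structured data: tabular rows and columns in delimited text files.
    sales = pd.read_csv("sales.csv")               # comma-separated
    visits = pd.read_csv("visits.tsv", sep="\t")   # tab-separated variant

    # Semi-structured data: nested records stored as JSON.
    orders = pd.read_json("orders.json")

    print(sales.head())    # first five rows of the structured table
    print(orders.dtypes)   # column types inferred from the JSON data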

    Domain expertise

    Domain expertise or domain knowledge is expertise in a particular field such as healthcare, insurance, or banking. A domain expert may or may not have a technology background, but has in-depth knowledge of a particular industry, its trends, and the practices that impact it. The process of data analysis requires not only good expertise in tools and computational techniques but also a good understanding of the data. In short, the data analyst must know how to search not only for data but also for information, and how to treat that information to get valid insights from it.

    For example, suppose you are asked to build an application for the e-commerce, banking, or insurance domain. The application has to complement the industry and its various dimensions. The technical team would not know the industry norms or the required application features; this is where the domain expert and domain knowledge come into the picture.

    Maths and statistics

    This is the study of statistics from a mathematical point of view. Data analysis requires a good amount of maths, and good knowledge of statistics is also required because statistical methods are applied to the analysis and interpretation of the data. Python provides a good number of libraries to solve these mathematical and statistical problems, but you should still have a good idea of how those libraries work.
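
    For instance, a quick statistical summary and a correlation test can be computed with NumPy and SciPy. The numbers below are made-up sample values, used only to show the calls:

    import numpy as np
    from scipy import stats

    # Hypothetical paired measurements, e.g. advertising spend vs. sales.
    spend = np.array([10, 12, 15, 18, 22, 25, 30])
    sales = np.array([95, 101, 110, 118, 130, 136, 150])

    print("mean spend:", np.mean(spend))
    print("std of sales:", np.std(sales, ddof=1))

    # Pearson correlation coefficient and its two-sided p-value.
    r, p_value = stats.pearsonr(spend, sales)
    print(f"correlation r={r:.3f}, p={p_value:.4f}")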

    Artificial intelligence

    Artificial intelligence is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans. Artificial intelligence is a superset of data science and one of the advanced areas built on data analysis. It is the study of training computers to do jobs that are currently done by humans. The term itself is made up of two words: artificial, meaning something that is not natural or is human-made, and intelligence, meaning the ability to think or understand.

    AI is already widespread in the market, and you interact with it on a daily basis. Here are a few examples of artificial intelligence:

    Search engines like Google internally use gigantic algorithms to deliver better search results.

    Self-driving cars where the vehicles can completely navigate their way from one point to another.

    Chatbots help as online messengers to assist customers immediately and effectively.

    Voice searches on smartphones use AI to determine the best result for those long-tail keywords and conversational queries.

    Online Ads use AI to target specific customers based on past behavior, interest, and search queries.

    Machine learning

    Machine learning is an algorithm-driven discipline that makes computers capable of learning from their own previous experience and improving their performance at a task. It is a subset of artificial intelligence: a study of machines that learn by themselves without being explicitly programmed. Suppose you are asked to write a program for speech recognition software that converts speech to text while handling accent, grammar, pronunciation, and vocabulary. Writing such rules by hand would be a gigantic task, but it can be handled far more easily with machine learning.

    Technically machine learning is divided into three parts, explained as follows:

    Supervised learning

    In this type of learning, we ask the machine questions, compare its answers with the actual answers, and instruct the machine to minimize the errors. Supervised machine learning can do things such as the following (a toy sketch follows this list):

    Weather forecasting.

    Detecting online frauds.

    Market forecasting.

    Image classification.
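
    As a minimal sketch of the supervised idea (labelled examples in, a model that minimizes its errors out), here is a toy classifier built with scikit-learn. The features and labels are entirely made up for illustration:

    from sklearn.tree import DecisionTreeClassifier

    # Made-up training data: [amount, hour_of_day] for card transactions,
    # labelled 1 for fraudulent and 0 for legitimate.
    X_train = [[20, 14], [35, 10], [900, 3], [15, 16], [1200, 2], [40, 12]]
    y_train = [0, 0, 1, 0, 1, 0]

    model = DecisionTreeClassifier(random_state=0)
    model.fit(X_train, y_train)       # learn from question/answer pairs

    # Predict labels for unseen transactions.
    print(model.predict([[25, 13], [1000, 4]]))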

    Unsupervised learning

    In this type of learning, you give the machine huge chunks of data and instruct it to find some sort of patterns; based on these patterns, the machine accomplishes certain tasks. Unsupervised machine learning can do things such as the following (a small sketch follows this list):

    Build recommendation engines

    Targeted marketing

    Customer segmentation
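
    A minimal sketch of the unsupervised idea, using scikit-learn's KMeans to segment made-up customers by annual spend and visit frequency (no labels are supplied; the algorithm finds the groups itself):

    from sklearn.cluster import KMeans

    # Made-up customer features: [annual_spend, visits_per_month].
    customers = [[200, 1], [220, 2], [5000, 15], [4800, 14],
                 [1500, 6], [1600, 7]]

    # Ask for three segments; KMeans discovers them without labelled examples.
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
    labels = kmeans.fit_predict(customers)

    print(labels)                    # cluster index assigned to each customer
    print(kmeans.cluster_centers_)   # centre of each discovered segment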

    Reinforcement learning

    In this type of learning, the machine is left in an environment where something is happening; there is a reward if the machine does what we want, and a penalty if it performs incorrectly. We instruct the machine to maximize its reward, and eventually it learns to do the things we want it to do (a toy sketch follows this list). Reinforcement learning works on:

    Games

    Bidding and advertising

    Training self-driven cars
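
    As a highly simplified sketch of the reward/penalty loop, here is tabular Q-learning on a toy five-cell corridor where the agent earns a reward only for reaching the rightmost cell. Everything here (the environment, rewards, and parameters) is illustrative, not taken from the book:

    import random

    n_states, n_actions = 5, 2          # cells 0..4; actions: 0 = left, 1 = right
    q_table = [[0.0, 0.0] for _ in range(n_states)]
    alpha, gamma, epsilon = 0.5, 0.9, 0.2

    for episode in range(200):
        state = 0
        while state != n_states - 1:    # run until the goal cell is reached
            # Explore occasionally, otherwise act greedily on current estimates.
            if random.random() < epsilon:
                action = random.randrange(n_actions)
            else:
                action = 0 if q_table[state][0] >= q_table[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == n_states - 1 else -0.01  # small step penalty
            # Q-learning update: nudge the estimate toward reward + discounted future value.
            q_table[state][action] += alpha * (
                reward + gamma * max(q_table[next_state]) - q_table[state][action]
            )
            state = next_state

    print(q_table)   # after training, "right" should score higher in every cell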

    Data infrastructure

    Generally, people refer to infrastructure as the things that support what they are doing. For example, the roads used for transportation, sewage systems, and bridges are all considered infrastructure. The role of data infrastructure is to protect, preserve, process, move, secure, and serve data and its applications for information service delivery. Data infrastructure includes software, hardware, cloud or managed services, servers, storage, and so on.

    Thanks to the Big Data world, a humongous amount of information is generated that needs to be processed. Sometimes normal desktop systems or servers do not have enough computational power to read, process, or analyze it, and we need systems with large amounts of RAM or a good amount of disk space to store the data. Cloud platforms such as Amazon Web Services (AWS), GCP, and Azure help us meet these challenges through resource allocation and virtualization.

    Data analysis process

    Data analysis is a series of steps in which raw data is transformed and processed in order to produce insights about the data and to make predictions. The processing includes mathematical and statistical approaches as well as charts or graphs for data visualization. Data analysis can therefore be schematized as a process chain consisting of the following sequence of stages, as shown in Figure 1.2:

    Figure 1.2: Stages of the data analysis process

    Let's discuss these processes in detail.

    Business requirements

    Data analysis starts with a problem to be solved, which needs to be clearly defined: predicting the stock price of a company, identifying fraudulent credit card transactions, detecting tumors based on health data, and so on.

    Data collection

    The data must be chosen with the basic purpose of building a predictive model in mind, and collecting it is often the most tedious task: to analyze anything, we first need data. Mostly, data will be shared by clients in the form of comma-separated, tab-delimited, or pipe-delimited files. Not all data is available in files or databases; it may exist as HTML pages, and the process of collecting such data is called web scraping. Python libraries such as Scrapy, Beautiful Soup, and Requests help in scraping data from web pages.
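
    A minimal scraping sketch with Requests and Beautiful Soup; the URL and the "price" class name are hypothetical placeholders, not a real endpoint from the book:

    import requests
    from bs4 import BeautifulSoup

    # Hypothetical page listing product prices.
    response = requests.get("https://example.com/products")
    response.raise_for_status()

    soup = BeautifulSoup(response.text, "html.parser")

    # Pull the text of every element marked with the (hypothetical) "price" class.
    prices = [tag.get_text(strip=True) for tag in soup.find_all(class_="price")]
    print(prices)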

    Data cleansing

    This stage seems less problematic than others but requires more resources and time to complete. The data collected may come from different sources such as Excel, CSV, JSON, or Parquet files, or be scraped from a web page, and each source will represent data differently: a date field might arrive as a string, or an integer might be read as a float. All of this data needs to be cleaned before analysis. Cleansing deals with invalid data, ambiguous or missing values, and outliers in the data.
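
    A short cleansing sketch in Pandas, assuming a hypothetical extract with a date column read in as strings and some missing or mistyped numeric values:

    import pandas as pd

    # Hypothetical raw extract with mixed-quality fields.
    df = pd.read_csv("raw_orders.csv")

    # Parse a date column that arrived as plain strings; bad values become NaT.
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")

    # Force a numeric column that may have been read as text or float.
    df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce")

    # Drop exact duplicates and fill missing quantities with the column median.
    df = df.drop_duplicates()
    df["quantity"] = df["quantity"].fillna(df["quantity"].median())

    print(df.dtypes)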

    Data exploration and visualization

    Exploration is the process of using graphical and statistical representations to find patterns, connections, and relationships between variables in the data. Python libraries such as Matplotlib and Seaborn help us visualize the data, and chart types like heatmaps, box plots, violin plots, and scatter plots help us understand patterns, outliers, and relationships better (a short plotting sketch follows this list). Exploration also includes one or more of the following activities:

    Grouping the data

    Summarizing the data

    Construction of regression models to find the deviation of data
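
    A short plotting sketch using Seaborn's sample "tips" dataset (fetched on first use); any DataFrame with a numeric and a categorical column would work the same way:

    import matplotlib.pyplot as plt
    import seaborn as sns

    # Small sample dataset bundled with Seaborn's examples.
    tips = sns.load_dataset("tips")

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))

    # Box plot: distribution of the bill per day, useful for spotting outliers.
    sns.boxplot(data=tips, x="day", y="total_bill", ax=axes[0])

    # Scatter plot: relationship between bill size and tip.
    sns.scatterplot(data=tips, x="total_bill", y="tip", ax=axes[1])

    plt.tight_layout()
    plt.show()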

    Data modeling

    Data modeling is the process of choosing a suitable statistical model to predict a result. After data exploration, we need to develop a mathematical model that encodes the relationships in the data. These models are divided according to the result they produce:

    Classification: If the result obtained by the model is categorical.

    Regression: If the result obtained by the model is numerical.

    Clustering: It involves grouping data points to gain valuable insights.

    Python's scikit-learn library provides methods such as linear regression, logistic regression, classification trees, SVM, AdaBoost, and k-nearest neighbors to generate these models.
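
    For example, a linear regression model can be fitted in a few lines with scikit-learn; the arrays here are made-up numbers used only to illustrate the calls:

    from sklearn.linear_model import LinearRegression

    # Made-up data: years of experience vs. salary in thousands.
    X = [[1], [2], [3], [4], [5], [6]]
    y = [35, 42, 50, 58, 64, 71]

    model = LinearRegression()
    model.fit(X, y)

    print("slope:", model.coef_[0])
    print("intercept:", model.intercept_)
    print("prediction for 8 years:", model.predict([[8]])[0])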

    Model validation and testing

    Validation of the model is divided into a training phase and a testing phase. The data is randomly split, typically 70 percent for training and 30 percent for testing. The model is trained on the 70 percent of the data, and its predictions are then compared against the remaining 30 percent of test data. There are several techniques to validate the effectiveness of a model; the most popular is k-fold cross-validation.
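
    A minimal sketch of the 70/30 split and a k-fold check with scikit-learn, reusing an extended version of the made-up regression data from the previous sketch:

    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score, train_test_split

    X = [[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]]
    y = [35, 42, 50, 58, 64, 71, 79, 85, 93, 100]

    # Random 70/30 split into a training set and a held-out test set.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )

    model = LinearRegression().fit(X_train, y_train)
    print("R^2 on the 30% test data:", model.score(X_test, y_test))

    # 5-fold cross-validation: train and validate on five different splits.
    scores = cross_val_score(LinearRegression(), X, y, cv=5)
    print("k-fold scores:", scores)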
