
Data Analytics: Principles, Tools, and Practices: A Complete Guide for Advanced Data Analytics Using the Latest Trends, Tools, and Technologies
Ebook · 814 pages · 15 hours


About this ebook

These days, critical problem solving related to data and data science is in high demand, and so are professionals who can solve real data science problems with the right tools. The book “Data Analytics: Principles, Tools, and Practices” can serve as a handbook and guide for professionals who want to start their journey in the field of data science.

The journey starts with an introduction to DBMS, RDBMS, NoSQL, and DocumentDB. The book introduces the essentials of data science and the modern ecosystem, including important steps such as data ingestion, data munging, and visualization. It covers the different types of analysis and Hadoop ecosystem tools such as Apache Spark, Apache Hive, R, MapReduce, and NoSQL databases. It also includes the different machine learning techniques that are useful for data analytics and shows how to visualize data with different graphs and charts. Throughout, the book discusses useful tools and approaches for data analytics, supported by concrete code examples.

After reading this book, you will be motivated to explore real data analytics and make use of the acquired knowledge on databases, BI/DW, data visualization, Big Data tools, and statistical science.
Language: English
Release date: Jan 24, 2022
ISBN: 9789389845914


    Book preview

    Data Analytics - Dr. Gaurav Aroraa

    CHAPTER 1

    Database Management System

    In today’s technology-driven world, especially in corporate settings, data is the biggest player. It is often described as the "new oil" or the "new currency". Data refers to facts about any object that can be stored, managed, and accessed through some form of organized electronic system. Examples of such data objects are students, employees, orders, and so on.

    To optimize the use of data and convert it into actionable business intelligence and insights, we need to store and arrange data in an organized manner. However, business intelligence and insights that lead to operational excellence are only achievable if the necessary components are in place. The two key components are a database (DB) and a database management system (DBMS). DB and DBMS facilitate easy maintenance of a huge amount of data. Collectively, these two components, along with the data, form the lifeline of businesses.

    Databases and database management systems are about building a digital backbone for businesses. After all, what use is a great repository of data and information if no one is able to access it? This is where databases and database management systems come into play.

    Structure

    This chapter will highlight the following topics:

    Database management system

    Database and database management system

    History of DBMS

    Characteristics of DBMS

    DB three-tier architecture

    Keys

    Entity-relationship model

    Functional dependencies

    Normalization

    Database objects

    Stored procedures and triggers

    SQL and PL/SQL

    Basic SQL query structure

    Joins

    Query optimization

    Objectives

    After studying this chapter, you will be able to:

    Understand the concepts of database, database management system, history of DBMS, characteristics of DBMS, and DB three-tier architecture.

    Discuss keys, entity-relationship model, functional dependencies, and normalization.

    Understand deeper dynamics like stored procedures and triggers, SQL and PL/SQL, basic SQL query structure, joins, and query optimization.

    Database and database management system

    Without any doubt, these days, most organizations have become data-driven. This is where the need for a database emerges. A database (also known as DB) is an electronic data structure that stores an organized collection of data. It is used by organizations as an effective way of storing, managing, and retrieving information. Databases in current times are managed through a database management system. The word "data" is derived from the Latin word datum, which means a piece of information.

    A database management system (also known as DBMS) is a group of software programs designed to define, store, manage, and retrieve data from a database. A database is a logical collection of inter-related information, which is stored in the form of records. The main aim of a DBMS is to centralize and store information into and retrieve information from a database in an organized and efficient manner. Essentially speaking, a DBMS is a set of programs used to access the data in the database.

    Examples of popular DBMSs are Sybase, Oracle, Microsoft SQL Server, and so on. A DBMS organizes, stores, and manages data in such a way that the data manipulation language (DML) operations of insertion, update, and deletion can be performed in an efficient and effective manner. A DML is usually a sublanguage of a parent language such as SQL, and it is used for inserting, modifying, and deleting data in a database.
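    As a minimal sketch of these DML operations, the following uses Python's built-in sqlite3 module; the Employees table and its columns are illustrative, not taken from any specific system discussed here.

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Employee_Id INTEGER PRIMARY KEY, Employee_Name TEXT)")

# The three core DML operations: insertion (INSERT), update (UPDATE), deletion (DELETE).
conn.execute("INSERT INTO Employees VALUES (1, 'Asha')")
conn.execute("INSERT INTO Employees VALUES (2, 'Ravi')")
conn.execute("UPDATE Employees SET Employee_Name = 'Asha K.' WHERE Employee_Id = 1")
conn.execute("DELETE FROM Employees WHERE Employee_Id = 2")

rows = conn.execute("SELECT Employee_Id, Employee_Name FROM Employees").fetchall()
print(rows)  # [(1, 'Asha K.')]
```

    Note that SQLite is an embedded (1-tier) engine, chosen here only because it ships with Python; the same DML statements run against server-based DBMSs such as Oracle or SQL Server.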

    Databases and database management systems form the backbone of all major business processes and applications today. Business growth and operational efficiency depend on robust DBMS implementations. For organizations and companies to survive and thrive in the competitive landscape, they need an effective DBMS in place to collect, manage, and analyze data and derive timely business insights that create a competitive edge. In other words, DB and DBMS provide a centralized repository for data and a highly efficient mechanism for handling multiple types of data.

    Relational databases (RDBMS) are more widely known and used than their NoSQL peers. NoSQL stands for "not only SQL" (or non-relational): a database that provides a mechanism for storage and retrieval of data modeled in a way different from relational databases. Apart from having a data manipulation language and a data definition language, most modern database systems have at least one way of interfacing or communicating with other systems. The type of storage and the purpose are some of the main criteria for classifying databases into cloud-based, container, software-in-silicon, and so on.

    A database management system is made up of software, data, hardware, users, and procedures. Each component plays an important role in the database management system environment.

    Software: The primary component; a set of programs that manages the database and acts as an interface between the users and the database.

    Data: User/operational data and metadata (data about data; a description of the structure of the database) that is stored, accessed, managed, and processed by the DBMS.

    Hardware: A collection of storage devices, computers, hard disks, and so on; these act as a bridge between the computer system and the real world.

    Users: Categorized on the basis of their access level and the operations they can perform; the main types are end-users, database administrators, and application programmers/developers (refer to figure 1.1).

    Procedures: Instructions that guide how to use the database management system, design the database, install the database management system, generate reports based on the data in the database, create backup copies of the database, and so on.

    Figure 1.1: Components of a Database Management System

    Database and database management system and their respective components form the backbone of most businesses today. They help organizations to become strategic users of the vast amount of data stored in their setups.

    History of DBMS

    In the initial period of database systems, they were built on top of a file system. But such systems had a large set of drawbacks like duplication of data leading to redundancy, inconsistency in data leading to integrity problems, difficulty in accessing data leading to writing more and more programs, and so on.

    Electronic or computerized database systems entered the digital landscape in the 1960s. Two main data models came into the picture from the late 1960s: a hierarchical model called IMS and a network model from the Conference on Data Systems Languages (CODASYL). Intensive research was done during the 1960s, as a result of which North American Aviation, now known as Rockwell International, developed software called the generalized update access method (GUAM). Later on, IBM joined hands with North American Aviation to further develop GUAM into the information management system (IMS).

    The information management system is a hierarchical system, and in this kind of system, data is arranged in a tree format. It was written for NASA’s space program, Apollo. This software was based on the concept that smaller components come together as parts of larger components, and so on, until the final product is assembled. Here, the data is organized into a tree-like structure, with records representing the nodes and fields representing the branches of the tree. It represents data using parent/child relationships: each parent can have many children, but each child has only one parent; a hierarchical model cannot represent many-to-many relationships. In this model, files are related in a parent–child manner. Although this model overcame the cons of the file-based system by giving advantages like less redundant data, data-program independence, better security, and efficient searching, it also had some disadvantages of its own: complex implementation, a lack of portability standards, and complex programming, as programmers needed to know the physical access paths to reach the data items.

    Example of a hierarchical model: The great grandparents are the root/base of the whole model. Parents can have many children, and this indicates a one-to-many relationship. The great grandparents record is termed the root of the tree. The grandparents and children are the dependents of the root. A root may have any number of dependents, each of these dependents may have any number of lower-level dependents, and this nesting can go on.

    The CODASYL model is a network model in which data is organized into records of different types, and records are organized into sets of different types. It became the basis and framework for new commercial database systems, such as the integrated database management system (IDMS) from Cullinane Corporation. The hierarchical type of system had many disadvantages; Charles W. Bachman, an American computer scientist and industrial researcher, designed the integrated database system (IDS), the first-ever network DBMS, to overcome them. In the network model, there is a three-schema approach: the structure or organization of the database, the data management language, and a sub-schema, which refers to the views of the database per user. Both IMS and CODASYL database systems are described as the forerunners of the DBMS revolution.

    In the 1970s, Edgar F. Codd, an English computer scientist (also known as the "Father of Database Management Systems") working at IBM, invented the relational model for database management; this type of system laid the roadmap for dealing with big data and database setups. Along with the rules explained in his paper titled "A Relational Model of Data for Large Shared Data Banks", E. F. Codd proposed replacing the then-prevalent systems with tables and rows. This concept later came to be known as the relational DBMS. Eugene Wong and Michael Stonebraker from UC Berkeley performed further research on relational database systems in a project named the interactive graphics and retrieval system (INGRES), and they successfully showed that a relational model is practical and efficient. INGRES used a query language known as QUEL, which, in turn, prompted IBM to develop SQL, which was more advanced.

    In 1976, Peter Chen, a Taiwanese American computer scientist, defined and explained the dynamics of the entity-relationship model in his paper "The Entity-Relationship Model: Toward a Unified View of Data". The ER model is based on the concept that the real world consists of entities and relationships. In the 1980s, more and more relational databases were developed and implemented.

    In the mid-1980s, the object-oriented database management system (OODBMS) came into the picture, and object-oriented features were also incorporated into relational DBMSs. Some popular names are Versant, Object Design, and so on. The need for this new model arose to handle complex objects and structures like those used in CASE, CAD, and so on. To accomplish such complex tasks, the database had to be able to deal with aspects like classes and objects; hence, the OODBMS came into play. An OODBMS represents and models data in the form of objects and classes. In the 1990s, the object-oriented model in relational databases led to the development of new areas like online analytical processing (OLAP), data warehousing, and so on.

    In 1995, Internet-based DBMSs were introduced, and NoSQL became popular with such systems as it provided faster processing of unstructured data. In 1997, the XML database design was introduced. An XML database stores data in XML format and is usually associated with document-oriented databases (refer to Figure 1.2).

    Modern day DBMS are flexible, scalable, dynamic, versatile, and secure systems that are capable of handling extremely large amounts of data.

    Some of the popular current relational databases include DB2, Oracle, and so on. Platforms like Caspio, Quickbase, TeamDesk, and so on are popular online database building platforms that help users build custom applications easily through a point-and-click interface.

    Figure 1.2: History of DBMS

    We end by summarizing modern-day DBMSs. Apart from having a data manipulation language and a data definition language, most modern database systems have at least one way of interfacing or communicating with other systems. The type of storage and the purpose are the criteria for classifying modern-day databases into cloud-based, container, software-in-silicon, and so on.

    Characteristics of DBMS

    The various characteristics of DBMS help organizations and businesses to become strategic users of the vast amount of data that they possess.

    Isolation: DBMS is completely independent of the data it stores. DBMS is said to be an active entity, whereas data is considered passive; hence, there should be a clear differentiation between them. There is also insulation or separation between the data and the programs accessing this data.

    Multiple views: DBMS offers multiple views for different users based on their specific requirements. For example, the finance department will have a different view of the database than the people working in the production department. It is an abstract view of the data as DBMS hides all the underlying details.

    A view is a subset of the data stored in a database; it is different for each user/group of users depending on their requirements. DBMS provides multiple views of data through various tools like reports. The stored data should be made available for access by different users simultaneously through different views. It provides a multi-user, concurrent access environment and allows users to access and manipulate data in parallel. Multiple users should be able to access and manipulate the same database without affecting the other users.
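    As a hedged sketch of how such a view can be defined (using SQLite via Python; the Employees table, departments, and salary figures are all hypothetical), the view below exposes only the columns and rows the production department needs, hiding salary data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Id INTEGER PRIMARY KEY, Name TEXT, Dept TEXT, Salary REAL)")
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?)",
                 [(1, 'Asha', 'Finance', 50000.0), (2, 'Ravi', 'Production', 45000.0)])

# A view restricted to what the production department needs: no salary column,
# no rows from other departments.
conn.execute("CREATE VIEW Production_View AS "
             "SELECT Id, Name FROM Employees WHERE Dept = 'Production'")

rows = conn.execute("SELECT * FROM Production_View").fetchall()
print(rows)  # [(2, 'Ravi')]
```

    The underlying table is unchanged; the view is simply a stored query, which is why different user groups can each be given a different window onto the same data.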

    Major elements: There are five major elements/components of DBMS: software, hardware, procedures, data, and users (which we have already discussed).

    Data integrity: It is a critical characteristic and fundamental tenet of data security. DBMS ensures the accuracy, quality, consistency, and reliability of the data in the database. It maintains three types of integrity: domain, referential, and entity. If an integrity violation is detected, DBMS takes appropriate actions like reporting the violation, rejecting the violation, and many more, and if needed, it can also return the database to a consistent state.

    ACID: A true DBMS must adhere to the ACID properties of atomicity, consistency, isolation, and durability. The DBMS must ensure that the real purpose of data is not compromised while performing transactions like insertion, deletion, and modification. For example, if an employee’s name is updated, then it should make sure that there is no duplicate data and no mismatch of employee data anywhere in the whole employee database system.

    A transaction represents a logical unit of a program made up of related operations. A transaction must maintain atomicity, consistency, isolation, and durability as these properties are a must to ensure data completeness, accuracy, and integrity. Atomicity indicates that a transaction must be treated as an atomic unit—either all of its operations are executed or none. Consistency means that if the database was in a consistent state before the execution of a transaction, it must remain in a consistent state even after the execution of the transaction is over. Isolation implies that a transaction that is in progress and not yet committed must remain isolated from any other transaction. Durability means once a transaction is completed successfully, the changes it has made should be permanent even if there is a system failure.
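    Atomicity in particular can be illustrated with a small transaction sketch in Python's sqlite3 (the Accounts table and amounts are hypothetical): a transfer that violates a constraint midway is rolled back as one unit, so no partial update survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (Id INTEGER PRIMARY KEY, "
             "Balance REAL NOT NULL CHECK (Balance >= 0))")
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()

# Transfer more than account 1 holds: the credit succeeds, then the debit
# violates the CHECK constraint, and the whole transaction is rolled back.
try:
    with conn:  # the context manager commits on success, rolls back on error
        conn.execute("UPDATE Accounts SET Balance = Balance + 200 WHERE Id = 2")  # applied...
        conn.execute("UPDATE Accounts SET Balance = Balance - 200 WHERE Id = 1")  # ...fails CHECK
except sqlite3.IntegrityError:
    pass  # both operations were undone as one atomic unit

balances = dict(conn.execute("SELECT Id, Balance FROM Accounts"))
print(balances)  # {1: 100.0, 2: 50.0}
```

    The credit to account 2 did execute, yet it does not persist: either every operation in the transaction commits or none does.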

    Data persistence: All the data is maintained as long as it is not deleted or removed explicitly.

    Reduces redundancy: As DBMS follows the normalization rules, it helps to minimize redundancy.

    Security and rights: DBMS will have a proper rights and permissions policy in place. Some users will be allowed to see the whole database and some will have only partial rights. It has a system of access controls, permissions, and privileges to create, maintain, and control various types of user accounts. For example, an instructor who teaches a particular subject will have access to view and update the marks of her/his subject. She/he will not have access to other subjects. But the HOD will have full access to all the subjects.

    Various utilities: These include a backup utility that takes regular backups/copies of the database to tackle crashes, a recovery utility to recover the DB from its history and backups, a data loading utility that facilitates easy loading of data from an external source and format without additional programming, and many more.

    Because of these characteristics of DBMS, the data in the database is therefore stored, structured, and integrated into the database in a meaningful way. And this makes it easier to retrieve, manipulate, and delete data.

    DB three-tier architecture

    The database architecture deals with and focuses on designing, developing, implementing, and maintaining software programs that store and organize data.

    Figure 1.3 is a pictorial depiction of the types of DBMS architecture.

    Figure 1.3: Types of DBMS Architecture

    The 1-tier architecture is the simplest form of DB architecture where the client, server, and database all reside on the same machine. Any request made by the client does not require a network connection to perform any action on the database. But such architecture is rarely used in the production environment.

    In 2-tier architecture, there is an application layer between the user and the DBMS, and its main responsibility is to communicate the user’s request to the database management system and then transmit the response from the DBMS to the user. There is direct communication between the client and server. The DB is on the server machine, and the DBMS application is on the client machine, and these two are connected through a network.

    The 3-tier architecture (refer to figure 1.4) is the most widely used architecture to design a DBMS.

    Figure 1.4: 3-Tier Architecture

    The 3-tier architecture has three inter-related but independent levels/modules/layers: the external level/view level, the conceptual level, and the physical level. It is most commonly used for web applications. At the external level, or presentation layer, sits the graphical user interface (GUI) part of the system or application, where data is presented to the user or input is received from the user. At the conceptual/logic level, or business layer, all the logic for insertion, retrieval, calculation, and so on is present; this layer acts as the interface/link between the presentation layer and the data layer. Here the design of the database, in the form of a schema of data, constraints, relationships, and so on, is defined. At the physical level, or data layer, the actual database comes into the picture: information about the location of database objects in the data store is maintained here, and the various users of the DBMS are unaware of these locations. This level is responsible not only for how the data is stored but also for allocating space to this data.
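    The separation of the three layers can be sketched as plain Python functions. This is a toy illustration only: the layer names, data, and pass/fail rule are hypothetical, and in a real deployment each tier typically runs on separate infrastructure.

```python
# Data layer: owns storage and raw access; nothing above it touches _store directly.
_store = {101: {"name": "Asha", "marks": 82}}

def fetch_student(student_id):
    return _store.get(student_id)

# Business/logic layer: applies rules, sitting between presentation and data.
def student_grade(student_id):
    row = fetch_student(student_id)
    if row is None:
        return None
    return "Pass" if row["marks"] >= 40 else "Fail"  # hypothetical pass mark

# Presentation layer: formats the result for the user; knows nothing about storage.
def render(student_id):
    grade = student_grade(student_id)
    return f"Student {student_id}: {grade}" if grade else "Not found"

print(render(101))  # Student 101: Pass
print(render(999))  # Not found
```

    Because each layer talks only to the one below it, the storage format or the grading rule can change without the presentation code ever noticing.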

    Keys

    A key is an attribute (a field or combination of fields; unique identifier for data) in a table that uniquely identifies each record or a row (tuple) of data in the table. Its main purpose is that of arranging, identifying, and sorting rows. It also uniquely identifies a row in a table by a combination of one or more columns in that table based on a certain requirement or condition. A key is required to establish or identify a relation between tables. It helps to bring about the integrity of data by eliminating duplicate data. It does not allow columns to have null or duplicate values. According to E. F. Codd’s third rule: "Every single data element (value) is guaranteed to be accessible logically with a combination of table-name, primary-key (row value), and attribute-name (column value)." In most relational databases, there are two main keys present: primary and foreign. A primary key enforces entity integrity, and a foreign key maintains referential integrity.

    Figure 1.5 is a pictorial depiction of the types of keys.

    Figure 1.5: Types of Keys

    There are several types of keys:

    Super: It is a group of single or multiple keys whose values can help to uniquely identify rows in a table. A super key may have additional attributes that are not needed for unique identification.

    For example, the customer master details table is used to store customer details; it will contain columns such as Customer_ID, Customer_Name, Customer_SSN, Customer_Birthdate, and Customer_Address. In this case, a certain set of columns can be considered unique to each customer. Examples of super keys here could be ID + Name + SSN, or Name + SSN + Birthdate, or even just ID (which is a candidate key); but to have absolute uniqueness, a composite key of ID + SSN can be used.

    Often, people tend to get confused between a super key and a candidate key. A candidate key is selected from a set of super keys. While doing so, it is important to ensure that a candidate key should not have any redundant attribute. Hence, a candidate key is also called a minimal super key.

    Primary: It is used to uniquely identify a record in a table. It has to have a unique value; it cannot be duplicated. This key is a set of one or more fields. The value in a primary key column can never be modified or updated if any foreign key refers to that primary key. The primary key field cannot be null. For example, in the table Employees, Employee_Id is the primary key as it is unique for each employee. In another example, the Citizens table, there could be multiple candidate keys like Citizen_ID, Citizen_PassportNo, or Citizen_SSN; in this case, selection of the primary key is based on requirements. A primary key with two or more attributes is referred to as a composite key.

    Alternate: All the keys that are not primary keys are called alternate keys. It is a candidate key that is currently not the primary key, and it is also called the secondary key.

    Candidate: A super key with no redundant attribute is known as a candidate key. The minimal set of an attribute that can uniquely identify a row is known as a candidate key. The candidate key is as strong as the primary key. A table can have multiple candidate keys, but each table can have only one primary key. Among the set of candidates, one candidate key is chosen as the primary key. When a key is composed of more than one column, it is known as a composite key.

    For example, in the Employees table, Employee_Id is best suited as the primary key, whereas the remaining attributes like Employee_SSN, Employee_PassportNo, and so on are considered as candidate keys.

    Foreign: A column that links with another table. It acts as a cross-reference between two tables; it typically links to the primary key of another table.

    Composite/compound: A combination of more than one attribute (each attribute being a simple key) that can be used to uniquely identify each row. A composite key may be a candidate or primary key.
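    A short sketch of how these key types appear in table definitions, using SQLite via Python (table and column names are illustrative): Employee_Id is the primary key, Employee_SSN is a candidate/alternate key declared UNIQUE, and Dept_Id is a foreign key whose referential integrity the engine enforces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Departments: Dept_Id is the primary key.
conn.execute("CREATE TABLE Departments (Dept_Id INTEGER PRIMARY KEY, Dept_Name TEXT)")

# Employees: primary key, a UNIQUE candidate/alternate key, and a foreign key.
conn.execute("""
    CREATE TABLE Employees (
        Employee_Id  INTEGER PRIMARY KEY,
        Employee_SSN TEXT UNIQUE NOT NULL,
        Dept_Id      INTEGER REFERENCES Departments(Dept_Id)
    )""")

conn.execute("INSERT INTO Departments VALUES (10, 'Finance')")
conn.execute("INSERT INTO Employees VALUES (1, '111-22-3333', 10)")

# Referential integrity: a row pointing at a non-existent department is rejected.
try:
    conn.execute("INSERT INTO Employees VALUES (2, '444-55-6666', 99)")
    violated = False
except sqlite3.IntegrityError:
    violated = True
print(violated)  # True
```

    The same pattern carries over to other relational engines, though most of them enforce foreign keys by default without a pragma.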

    ER model

    ER model stands for the entity-relationship model. It is a high-level data model that describes the logical structure and conceptual design of a database using a diagram known as ER diagram: entity-relationship diagram. This diagram is widely used in database design, and its main aim is to define the data entities/elements and relationships between them.

    The entity-relationship model is the blueprint or graphical representation of a database that can later be implemented as a database. Each entity has a set of properties, and properties can have values. For example, a particular student studying in a particular school is an entity. If "Asha" is a student at Daffodils School, she will have attributes (properties) like name, age, birthdate, and so on. In most cases, a single attribute will have one value, but it is also possible for attributes to have multiple values. For example, Asha’s age has a single value, but her phone number could have multiple values. The ER model provides a common way of viewing and communicating data and hence avoids confusion.

    There are five main types of attributes: First, the simple attribute is an atomic value, which cannot be divided further. For example, an employee’s phone number is an atomic value of 8 digits, and it cannot be broken down further. Second, the derived attribute does not exist in the physical database, but its value is derived from other attributes present in the database; it is calculated based on other attributes. For example, age can be derived from the attribute of a birthdate. Third, a composite attribute consists of more than one simple attribute. For example, an employee’s complete name has First_Name and Last_Name. Fourth, a single-value attribute contains a single value, for example, Employee_SSN. Last, a multi-value attribute may contain more than one value; for example, a person can have more than one Email_Address.

    Entities have relationships with each other. Let us consider an example: assume that each student is given an ID card. Hence, Asha has an ID card. Asha’s ID card is also an entity. Asha uses this ID card, and this ID card belongs to Asha; there is a mutual relationship between Asha and her ID card. A set of relationships of a similar type is called a relationship set. Just like an entity, a relationship too can have attributes, which are referred to as descriptive attributes. The degree of a relationship is defined by the number of entities that participate in it. There are four main types: unary = degree 1 (one entity is involved in the relationship), binary = degree 2 (two entities are involved), ternary = degree 3 (three entities are involved), and N-ary (‘n’ number of entities are involved).

    Enhanced entity-relationship (EER) model is an expanded, specialized, high-level data model which provides extensions to the original entity-relationship (ER) model. EER models provide precise, detailed designs that cater to complex databases.

    Entity-relationship model consists of many specialized symbols that have a unique meaning. The main components of the ER model are as follows:

    Entity: An entity can be any person, object, or place. In the ER diagram, an entity can be depicted as a rectangle (refer to figure 1.6). For example, in a company, the various entities could be employees, products, departments, and others.

    Figure 1.6: Entity Rectangle Symbol

    An entity that depends on another entity is called a weak entity as it does not contain any key attribute of its own, whereas an entity that has a primary key (refer to figure 1.7) is called a strong entity.

    Figure 1.7: Primary Key

    A weak entity is always dependent on strong/owner entities as they have sufficient attributes to form a primary key. The primary key of a strong entity set is represented by underlining it. The weak entity is represented by a double rectangle (refer to figure 1.8).

    Figure 1.8: Weak Entity

    A member of a strong entity set is called a dominant entity, and a member of a weak entity set is called a subordinate entity. The relationship between a weak entity and a strong entity is denoted with a double diamond.

    Attribute: It is shown as an ellipse that is directly connected to its entity (rectangle) (refer to figure 1.9).

    Figure 1.9: Attribute Ellipses Symbol

    In the case of a composite attribute (refer to figure 1.10), the attributes are depicted in a tree-like structure, i.e., a composite attribute is represented by an ellipse that is connected with a parent ellipse.

    Figure 1.10: Composite Attribute

    A multivalued attribute (refer to figure 1.11) is shown as a double ellipsis. For example, an employee can have more than one phone number.

    Figure 1.11: Multivalued Attribute

    An attribute that can be derived from another attribute is known as a derived attribute. It is depicted by a dashed ellipse (refer to figure 1.12). For example, an employee’s age can be derived from another attribute like the date of birth.

    Figure 1.12: Derived Attribute

    Cardinality: It denotes the relationship between two tables; it is about the maximum number of entities of one entity set that are linked with the maximum number of entities of the other entity set. There are four main types of cardinality ratios.

    One-to-one: when one instance of an entity is related to exactly one instance of the other entity (refer to figure 1.13), it is marked as 1:1; for example, a person can have only one passport, and a passport can be assigned to only one person (one row in table A relates to one row in table B).

    Figure 1.13: One-to-One Cardinality

    One-to-many: Here a single instance of an entity is associated with more than one instance of another entity (refer to figure 1.14), and it is marked as 1:M, for example, a single customer might place an order for multiple products (one row in table A maps to many rows in table B).

    Figure 1.14: One-to-Many Cardinality

    Many-to-one: When more than one entity from entity set A can be linked with at most one entity of entity set B (refer to figure 1.15), and it is marked as M:1, for example, an employee can work in at most one department, and a department can have several employees in it (many rows in table A relate to one row in table B).

    Figure 1.15: Many-to-One Cardinality

    Many-to-many: When one entity from A can be associated with more than one entity from B and vice versa (refer to figure 1.16), it is marked as M:N; for example, doctors have many patients, and a patient can have several doctors (many rows in table A relate to many rows in table B).

    Figure 1.16: Many-to-Many Cardinality
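    As a concrete sketch, a one-to-many relationship can be enforced in a relational database with a foreign key. The schema and data below are hypothetical (not taken from the book's figures), using Python's built-in sqlite3 module:

```python
import sqlite3

# Hypothetical one-to-many schema: each order references exactly one
# customer (the "1" side), while a customer may place many orders
# (the "M" side).
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    product     TEXT)""")

con.execute("INSERT INTO customer VALUES (1, 'Asha')")
# One customer, many orders:
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(101, 1, 'Laptop'), (102, 1, 'Mouse')])

count = con.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 1").fetchone()[0]
print(count)  # 2 orders belong to customer 1
```

    With `PRAGMA foreign_keys = ON`, SQLite rejects an order row whose customer_id has no matching customer, which is how the cardinality constraint is kept honest in practice.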

    Functional dependencies

    A functional dependency (FD) is a relationship between two attributes of a table, in which one attribute determines the value of another. Typically, it is the relationship between the primary key and the other non-key attributes of the table. It is a critical part of relational database design; it guides the process of normalization and prevents data redundancy. For example, in the Employees table, there are attributes: Employee_Id, Employee_Name, and Employee_Age. Here the Employee_Id attribute uniquely identifies the Employee_Name attribute of this table because if we know the employee id, we can easily identify the employee name associated with it. This is a functional dependency and can be written as follows:

    Employee_Id -> Employee_Name

    Or in words, we can say Employee_Name is functionally dependent on Employee_Id.

    In A -> B, the left side (A) is known as the determinant, and the right side (B) is called the dependent; B is functionally dependent on A. For example, we have a student table with attributes: Student_Id, Student_Name, Student_Address, and many more. Here the attribute Student_Id can uniquely identify the attribute Student_Name in the student table because once we know the Student_Id, we can identify the particular student name associated with it. This functional dependency can be written as follows:

    Student_Id -> Student_Name

    We know that Student_Id is unique for each student, so Student_Id -> Student_Name, Student_Id -> Student_Address, and Student_Id -> Student_Birthdate will all hold.
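    A functional dependency A -> B can be tested against actual data: it holds if no determinant value ever maps to two different dependent values. The following is a minimal sketch; the sample rows and city values are made up for illustration:

```python
# Check whether a functional dependency determinant -> dependent holds
# in a list of row dictionaries.
def fd_holds(rows, determinant, dependent):
    """Return True if every determinant value maps to one dependent value."""
    seen = {}
    for row in rows:
        a, b = row[determinant], row[dependent]
        if a in seen and seen[a] != b:
            return False  # same A value, two different B values: FD violated
        seen[a] = b
    return True

students = [
    {"Student_Id": 1, "Student_Name": "Asha",  "Student_City": "Pune"},
    {"Student_Id": 2, "Student_Name": "Citra", "Student_City": "Pune"},
]
print(fd_holds(students, "Student_Id", "Student_Name"))  # True
print(fd_holds(students, "Student_City", "Student_Id"))  # False: Pune maps to 1 and 2
```

    Note that such a check can only refute a dependency on the data at hand; whether the FD holds in general is a design decision about the domain, not a property of one sample.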

    There are three main types of functional dependencies: transitive, trivial, and multivalued.

    Transitive dependency: It is an indirect dependency that occurs when a functional dependency holds through an intermediate attribute. It normally follows this pattern:

    A→B and B→C.

    Therefore A→C.

    The rule is that C is transitively dependent on A (A→C) if B is functionally dependent on A (A→B) and C is functionally dependent on B (B→C), but A is not functionally dependent on B.

    Anomalies with regard to insertion, modification, and deletion can occur due to transitive dependency. To achieve the normalization standard of third normal form (3NF), all transitive dependencies need to be eliminated in order to maintain data integrity and minimize data duplication.

    Trivial functional dependency: if a functional dependency A → B holds, where B is a subset of A, then it is called a trivial functional dependency. Non-trivial: if a functional dependency A → B holds, where B is not a subset of A, then it is called a non-trivial functional dependency.

    Multivalued functional dependency: when two attributes in a table are independent of each other, but both depend on a third attribute. A multivalued dependency is shown as follows:

    Table 1.1: Multivalued Functional Dependency

    In table 1.1, we can see students Asha and Citra have an interest in more than one hobby. This is a multivalued dependency because the discipline of a student is independent of hobbies, but both are dependent on the student. Hence the multivalued dependencies are written as follows: Student_Name ->-> Student_Discipline and Student_Name ->-> Student_Hobbies. Such multivalued dependencies violate the normalization rules, so in order to correct such violations, we need to split the preceding table into two tables to dissolve these multivalued dependencies.
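    That split can be sketched as follows. The student names mirror table 1.1, but since the table's contents are not reproduced here, the discipline and hobby values are illustrative assumptions:

```python
# Dissolving a multivalued dependency: the wide (name, discipline, hobby)
# rows are split into two independent tables, one per multivalued fact.
rows = [
    ("Asha",  "Science", "Dancing"),
    ("Asha",  "Science", "Singing"),
    ("Citra", "Arts",    "Reading"),
    ("Citra", "Arts",    "Painting"),
]

# Table 1: student -> discipline (duplicates collapse away)
disciplines = sorted({(name, disc) for name, disc, _ in rows})
# Table 2: student -> hobby
hobbies = sorted({(name, hobby) for name, _, hobby in rows})

print(disciplines)  # [('Asha', 'Science'), ('Citra', 'Arts')]
print(hobbies)
```

    After the split, adding a third hobby for Asha touches only the hobbies table; her discipline is no longer repeated once per hobby.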

    Normalization

    Normalization is a database schema design technique that aims to reduce redundancy, data dependency, and anomalies by systematically organizing tables and their data. The process is about decomposing large tables into smaller, manageable tables and associating them with relationships. The main aim of normalization is to organize data in such a manner that it meets two main requirements: there is no data redundancy, and data dependencies are logical.

    Normalization is carried out by following a set of rules called "normal forms". E. F. Codd proposed the theory of normalization with the introduction of the first normal form, second normal form, and third normal form; he later joined R. F. Boyce in defining a stronger version of the third normal form known as the Boyce–Codd normal form (BCNF).

    First normal form (1NF): Here, every column holds atomic values. A table in this form cannot contain sub-columns; for example, if you are listing several cities, you cannot put them in one column separated by semicolons. For a table to be in the first normal form, it should comply with the following four rules:

    It should only have single-valued columns; it means they should not contain multiple values.

    Values stored in a column should be of the same domain.

    All the columns should have unique names.

    The order in which data is stored does not matter.

    For example, the following table of student contact details is not in 1NF, as the column Student_ContactNumber has multiple values for the student Asha, and as per the rules of this form, each column should hold atomic values.

    Table 1.2: Before First Normal Form

    After normalization, table 1.2 will appear as follows (refer to table 1.3):

    Table 1.3: First Normal Form
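    The 1NF fix can be sketched in a few lines: the multi-valued Student_ContactNumber column is expanded into one row per value. The sample numbers below are made up, since table 1.2's actual contents are not reproduced here:

```python
# Unnormalized rows: one student may carry several contact numbers
# packed into a single column, separated by semicolons.
unnormalized = [
    ("Asha",  "98xxxxxx01; 98xxxxxx02"),
    ("Citra", "98xxxxxx03"),
]

# 1NF: split the packed column so every row holds a single atomic value.
first_nf = [
    (name, number.strip())
    for name, numbers in unnormalized
    for number in numbers.split(";")
]
print(first_nf)
# [('Asha', '98xxxxxx01'), ('Asha', '98xxxxxx02'), ('Citra', '98xxxxxx03')]
```

    The student name repeats across the new rows, which is expected: 1NF removes multi-valued columns, and the later normal forms deal with the redundancy that repetition introduces.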

    Second normal form (2NF): for a table to be in the second normal form (2NF), it should follow these norms:

    It has to be in the 1NF.

    It should not have any partial dependency; all non-key attributes must be fully functionally dependent on the primary key.

    For example, the following table of a customer order is in 1NF as all attributes have atomic values, but it is not in 2NF as there are still partial dependencies. In this example, the combined key of Customer_Id and Order_Id forms the primary key. Customer_Name is dependent on Customer_Id alone, and there is no real link between a customer’s name and what he or she has ordered. Likewise, the order details and order date are dependent on the Order_Id, but not on the Customer_Id.

    Table 1.4: Before Second Normal Form

    To make table 1.4 compliant with 2NF, it needs to be split into three tables as follows (refer to table 1.5): customer details, order details, and a third table to keep track of all the orders for a customer.

    (a)

    (b)

    (c)

    Table 1.5: The Second Normal Form: (a) Customer details, (b) Order details, (c) Tracking order
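    The three-table decomposition of table 1.5 can be sketched with sqlite3. The column names and sample values are assumptions for illustration, not the book's exact data:

```python
import sqlite3

# 2NF decomposition: customer details, order details, and a linking
# table whose composite key tracks which customer placed which order.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE orders   (order_id INTEGER PRIMARY KEY, order_detail TEXT, order_date TEXT);
CREATE TABLE customer_order (customer_id INTEGER, order_id INTEGER,
                             PRIMARY KEY (customer_id, order_id));
INSERT INTO customer VALUES (1, 'Asha');
INSERT INTO orders   VALUES (101, 'Laptop', '2021-01-05');
INSERT INTO customer_order VALUES (1, 101);
""")

# Joining the three tables reproduces the original wide row, but the
# customer name is now stored once rather than once per order.
row = con.execute("""
    SELECT c.customer_name, o.order_detail, o.order_date
    FROM customer_order co
    JOIN customer c ON c.customer_id = co.customer_id
    JOIN orders   o ON o.order_id    = co.order_id
""").fetchone()
print(row)  # ('Asha', 'Laptop', '2021-01-05')
```

    Because Customer_Name now lives only in the customer table, renaming a customer is a single-row update instead of a change to every order row, which is exactly the update anomaly 2NF removes.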

    Third normal form (3NF): a table is said to be in the third normal form when it meets the following criteria:

    It is in the 2NF and 1NF.

    It should not have any transitive functional dependency: every transitive dependency of a non-prime attribute on any super key should be eliminated. An attribute that is not part of any candidate key is known as a non-prime attribute.

    In the following tables, super keys are Student_Id; Student_Id + Student_Name; Student_Id + Student_Name + Student_Pincode; and so on, and the candidate key is Student_Id. Non-prime attributes: all attributes except Student_Id are non-prime as they are not part of any candidate key. Here, Student_State, Student_City, and Student_District are dependent on Student_Pincode, and Student_Pincode is dependent on Student_Id, and that makes the non-prime attributes Student_State, Student_City,
