Understanding Azure Data Factory: Operationalizing Big Data and Advanced Analytics Solutions
About this ebook

Improve your analytics and data platform to solve major challenges, including operationalizing big data and advanced analytics workloads on Azure. You will learn how to monitor complex pipelines, set alerts, and extend your organization's custom monitoring requirements.

This book starts with an overview of Azure Data Factory as a hybrid ETL/ELT orchestration service on Azure. It then dives into the data movement and connectivity capabilities of Azure Data Factory. You will learn about its support for hybrid data integration across disparate sources such as on-premises systems, other clouds, and SaaS applications. Detailed guidance is provided on transforming data and on control flow. Operationalizing pipelines, including ETL with SSIS, is demonstrated, and you will learn how to use Azure Data Factory to run existing SSIS packages. As you advance through the book, you will wrap up by learning how to create a single pane for end-to-end monitoring, which is a key skill in building advanced analytics and Big Data pipelines.
 
What You'll Learn
  • Understand data integration on Azure cloud
  • Build and operationalize an ADF pipeline
  • Modernize a data warehouse
  • Be aware of performance and security considerations while moving data 

Who This Book Is For
Data engineers and big data developers. ETL (extract, transform, load) developers will also find the book useful, as it demonstrates various data operations.
Language: English
Publisher: Apress
Release date: Dec 18, 2018
ISBN: 9781484241226

    Book preview

    Understanding Azure Data Factory - Sudhir Rawat

    © Sudhir Rawat and Abhishek Narain 2019

    Sudhir Rawat and Abhishek Narain, Understanding Azure Data Factory, https://doi.org/10.1007/978-1-4842-4122-6_1

    1. Introduction to Data Analytics

    Sudhir Rawat¹ and Abhishek Narain²

    (1) Bangalore, India
    (2) Shanghai, China

    The demand for Big Data analytics services is greater than ever before, and this trend will only continue—exponentially so—as data analytics platforms evolve over time. This is a great time to be a data engineer or a data scientist with so many options of analytics platforms to select from.

    The purpose of this book is to give you the nitty-gritty details of operationalizing Big Data and advanced analytics solutions on Microsoft Azure.

    This book guides you through using Azure Data Factory to coordinate data movement; to perform transformations using technologies such as Hadoop (HDInsight), SQL, Azure Data Lake Analytics, and Databricks over data from different kinds of storage and Cosmos DB; and to execute custom activities for specific tasks (coded in C#). You will learn how to create data pipelines that allow you to group activities to perform a certain task. This book is hands-on and scenario-driven, and it builds on the knowledge gained in each chapter.

    The book also highlights best practices for performance and security, which will be helpful when architecting and developing extract-transform-load (ETL), extract-load-transform (ELT), and advanced analytics projects on Azure.

    This book is ideal for data engineers and data scientists who want to gain advanced knowledge in Azure Data Factory (a serverless ETL/ELT service on Azure).

    What Is Big Data?

    Big Data can be defined by the following characteristics:

    Volume: As the name says, Big Data consists of extremely large datasets that exceed the processing capacity of conventional systems such as Microsoft SQL Server, Oracle, and so on. Such data is generated through various data sources such as web applications, the Internet of Things (IoT), social media, and line-of-business applications.

    Variety: These sources typically send data in a variety of formats such as text, documents (JSON, XML), images, and video.

    Velocity: This is the speed at which data is generated by such sources. High velocity adds to Big Data. For example, a factory installs sensors to monitor machine temperature and avoid damage; such sensors send events every second, or even every millisecond. IoT-enabled sites generally have many such sensors, each sending data this frequently.

    Veracity: This is the quality of the data captured from various sources. Systems also generate biased, noisy, and abnormal data, which adds to Big Data. Low-quality data not only adds to the volume but also adds the responsibility of correcting it to avoid presenting wrong information to business users.

    Let’s think about a fictitious retail company called AdventureWorks, which has a customer base across the globe. AdventureWorks has an e-commerce web site and mobile applications that enable users to shop online, lodge complaints, give feedback, apply for product returns, and so on. To provide inventory/products to users, it relies on a business-to-business (B2B) model and partners with vendors (other businesses) that want to list their products on the AdventureWorks e-commerce applications. AdventureWorks also has sensors installed on its delivery vans to collect various telemetry data; for example, it provides customers with up-to-date information on consignment delivery and sends alerts to drivers in case of any issue, such as a high temperature in the delivery van’s engine. The company also sends photographers to various trekking sites. All this data is sent back to the company so it can do image classification to understand the gadgets in demand, which helps AdventureWorks stock the relevant items. AdventureWorks also captures feeds from social media to catch any feedback, comments, or complaints about the company.

    To get some valuable insights from the huge volume of data, you must choose a distributed and scalable platform that can process the Big Data. Big Data has great potential for changing the way organizations use information to enhance the customer experience, discover patterns in data, and transform their businesses with the insights.

    Why Big Data?

    Data is the new currency. Data volumes have been increasing drastically over time. Data is being generated from traditional point-of-sale systems, modern e-commerce applications, social sources like Twitter, and IoT sensors/wearables from across the globe. The challenge for any organization today is to analyze this diverse dataset to make more informed decisions that are predictive and holistic rather than reactive and disconnected.

    Big Data analytics is used not only by modern organizations to get valuable insights but also, thanks to pay-as-you-go cloud offerings, by organizations with decades-old data that was previously too expensive to process. As an example, with Microsoft Azure you can easily spin up a 100-node Apache Spark cluster (for Big Data analytics) in less than ten minutes and pay only for the time your job runs on those clusters, getting both cloud scale and cost savings in a Big Data analytics project.

    Big Data Analytics on Microsoft Azure

    Today practically every business is moving to the cloud for compelling reasons such as no up-front costs, virtually infinite scale, high performance, and so on. Businesses with sensitive data that can’t be moved to the cloud can choose a hybrid approach. The Microsoft cloud (aka Azure) provides three types of services:

    • Infrastructure as a service (IaaS)

    • Platform as a service (PaaS)

    • Software as a service (SaaS)

    It seems like every organization on this planet is moving to PaaS, which gives companies more time to focus on their business while innovating, improving the customer experience, and saving money.

    Microsoft Azure offers a wide range of cloud services for data analysis. We can broadly categorize them under storage and compute:

    • Azure SQL Data Warehouse, a cloud-based, massively parallel-processing-enabled enterprise data warehouse

    • Azure Blob Storage, massively scalable object storage for unstructured data that can be used to search for hidden insights through Big Data analytics

    • Azure Data Lake Store, a massively scalable data store (for unstructured, semistructured, and structured data) built to the open HDFS standard

    • Azure Data Lake Analytics, a distributed analytics service that makes Big Data analytics easy and supports programs written in U-SQL, R, Python, and .NET

    • Azure Analysis Services, an enterprise-grade data modeling tool on Azure (based on SQL Server Analysis Services)

    • Azure HDInsight, a fully managed, full-spectrum open source analytics service for enterprises (Hadoop, Spark, Hive, LLAP, Storm, and more)

    • Azure Databricks, a Spark-based high-performance analytics platform optimized for Azure

    • Azure Machine Learning, an open and elastic AI development tool for finding patterns in existing data and generating models for prediction

    • Azure Data Factory, a hybrid and scalable data integration (ETL) service for Big Data and advanced analytics solutions

    • Azure Cosmos DB, a globally distributed database service with elastic, independently scalable throughput and storage; it also offers throughput, latency, availability, and consistency guarantees with comprehensive service-level agreements (SLAs), something no other database service offers at the moment

    What Is Azure Data Factory?

    Big Data requires a service that can help you orchestrate and operationalize complex processes that in turn refine enormous structured and semistructured data into actionable business insights.

    Azure Data Factory (ADF) is a cloud-based data integration service that acts as the glue in your Big Data or advanced analytics solution, ensuring your complex workflows integrate with the various dependent services required in your solution. It provides a single pane for monitoring all your data movements and complex data processing jobs. Simply put, it is a serverless, managed cloud service built for complex hybrid ETL, ELT, and data integration projects (data integration as a service).

    Using Azure Data Factory, you can create and schedule data-driven workflows (called pipelines ) that can ingest data from disparate data stores. It can process and transform the data by using compute services such as Azure HDInsight Hadoop, Spark, Azure Data Lake Analytics, and Azure Machine Learning (Figure 1-1).

    Figure 1-1. Azure Data Factory
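    Pipelines run either on demand or on a schedule defined by a trigger. To give a first taste of the JSON authoring format that ADF uses for all of its artifacts, the following is a minimal sketch of an hourly schedule trigger; the trigger name, pipeline name, and start time are hypothetical placeholders, not values from this book:

    {
      "name": "HourlyTrigger",
      "properties": {
        "description": "Hypothetical example for illustration only",
        "type": "ScheduleTrigger",
        "typeProperties": {
          "recurrence": {
            "frequency": "Hour",
            "interval": 1,
            "startTime": "2019-01-01T00:00:00Z"
          }
        },
        "pipelines": [
          {
            "pipelineReference": {
              "referenceName": "IngestAndTransformPipeline",
              "type": "PipelineReference"
            }
          }
        ]
      }
    }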

    High-Level ADF Concepts

    An Azure subscription might have one or more ADF instances. ADF is composed of four key components, covered in the following sections. These components work together to provide the platform on which you can compose data-driven workflows with steps to move and transform data, or execute custom tasks using a custom activity, such as deleting files on Azure Storage after a transform or running additional business logic that is not offered out of the box within Azure Data Factory.

    Activity

    An activity represents an action or a processing step. For example, you use a copy activity to copy data between a source and a sink. Similarly, you can use a Databricks notebook activity to transform data using Azure Databricks. ADF supports three types of activities: data movement, data transformation, and control flow activities.
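    As a minimal sketch, a copy activity in ADF’s JSON authoring format looks like the following; the activity and dataset names here (CopyBlobToSql, InputBlobDataset, OutputSqlDataset) are hypothetical placeholders:

    {
      "name": "CopyBlobToSql",
      "description": "Hypothetical copy activity: Blob Storage source to Azure SQL sink",
      "type": "Copy",
      "inputs": [
        { "referenceName": "InputBlobDataset", "type": "DatasetReference" }
      ],
      "outputs": [
        { "referenceName": "OutputSqlDataset", "type": "DatasetReference" }
      ],
      "typeProperties": {
        "source": { "type": "BlobSource" },
        "sink": { "type": "SqlSink" }
      }
    }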

    Pipeline

    A pipeline is a logical grouping of activities. Typically, it will contain a set of activities trying to achieve the same end goal. For example, a pipeline can contain a group of activities ingesting data from disparate sources, including on-premises sources, and then running a Hive query on an on-demand HDInsight cluster to join and partition data for further analysis.

    The activities in a pipeline can be chained together to operate sequentially, or they can operate independently in parallel.
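    The following sketch shows how a pipeline groups activities and how sequential chaining is expressed through the dependsOn property; all names are hypothetical placeholders, and the Hive activity assumes an on-demand HDInsight linked service like the one described in this chapter:

    {
      "name": "IngestAndTransformPipeline",
      "properties": {
        "description": "Hypothetical pipeline: ingest, then partition with Hive",
        "activities": [
          {
            "name": "CopyRawData",
            "type": "Copy",
            "inputs": [ { "referenceName": "InputBlobDataset", "type": "DatasetReference" } ],
            "outputs": [ { "referenceName": "StagingBlobDataset", "type": "DatasetReference" } ],
            "typeProperties": {
              "source": { "type": "BlobSource" },
              "sink": { "type": "BlobSink" }
            }
          },
          {
            "name": "PartitionWithHive",
            "type": "HDInsightHive",
            "dependsOn": [
              { "activity": "CopyRawData", "dependencyConditions": [ "Succeeded" ] }
            ],
            "linkedServiceName": {
              "referenceName": "OnDemandHDInsightLinkedService",
              "type": "LinkedServiceReference"
            },
            "typeProperties": {
              "scriptPath": "scripts/partitiondata.hql",
              "scriptLinkedService": {
                "referenceName": "AzureStorageLinkedService",
                "type": "LinkedServiceReference"
              }
            }
          }
        ]
      }
    }

    Removing the dependsOn entry would let both activities run independently in parallel.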

    Datasets

    Datasets represent data structures within the data stores, which simply point to or reference the data you want to use in your activities as inputs or outputs.
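    For example, a dataset pointing to delimited text files in a Blob Storage folder might look like this minimal sketch; the dataset name, linked service name, and folder path are hypothetical placeholders:

    {
      "name": "InputBlobDataset",
      "properties": {
        "description": "Hypothetical dataset over CSV files in Blob Storage",
        "type": "AzureBlob",
        "linkedServiceName": {
          "referenceName": "AzureStorageLinkedService",
          "type": "LinkedServiceReference"
        },
        "typeProperties": {
          "folderPath": "salesdata/raw",
          "format": { "type": "TextFormat", "columnDelimiter": "," }
        }
      }
    }

    Note that the dataset holds no credentials; the connection details live in the linked service it references.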

    Linked Service

    A linked service consists of the connection details either to a data source like a file from Azure Blob Storage or a table from Azure SQL or to a compute service such as HDInsight, Azure Databricks, Azure Data Lake Analytics, and Azure Batch.
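    Continuing the same hypothetical example, here is a minimal sketch of an Azure Storage linked service; the name is a placeholder, and the <account> and <key> tokens must be replaced with real values (ideally referenced from Azure Key Vault rather than stored inline):

    {
      "name": "AzureStorageLinkedService",
      "properties": {
        "description": "Hypothetical linked service to an Azure Storage account",
        "type": "AzureStorage",
        "typeProperties": {
          "connectionString": {
            "type": "SecureString",
            "value": "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
          }
        }
      }
    }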

    Integration Runtime

    The integration runtime (IR) is the underlying compute infrastructure used by ADF. It is the compute where data movement, activity dispatch, or SSIS package execution happens. It comes in three types: Azure, self-hosted, and Azure-SSIS (Azure SQL Server Integration Services); see Figure 1-2. (A minimal sketch of selecting an IR from a linked service follows the figure.)

    Figure 1-2. Relationship between ADF components
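    A linked service selects its integration runtime through the connectVia property; when connectVia is omitted, ADF uses the Azure integration runtime. The following minimal sketch assumes a self-hosted IR named MySelfHostedIR (a hypothetical placeholder) already registered so ADF can reach an on-premises SQL Server:

    {
      "name": "OnPremSqlServerLinkedService",
      "properties": {
        "description": "Hypothetical linked service reached through a self-hosted IR",
        "type": "SqlServer",
        "typeProperties": {
          "connectionString": {
            "type": "SecureString",
            "value": "Server=<server>;Database=<database>;Integrated Security=True"
          }
        },
        "connectVia": {
          "referenceName": "MySelfHostedIR",
          "type": "IntegrationRuntimeReference"
        }
      }
    }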

    When to Use ADF?

    The following are examples of when you should use ADF:

    Building a Big Data analytics solution on Microsoft Azure that relies on multiple technologies for handling large numbers of diverse datasets. ADF offers a way to create and run an ADF pipeline in the cloud.

    Building a modern data warehouse solution that relies on technologies such as SQL Server, SQL Server Integration Services (SSIS), or SQL Server Analysis Services (SSAS); see Figure 1-3. ADF provides the ability to run SSIS packages on Azure or to build a modern ETL/ELT pipeline that accesses both on-premises and cloud data services. (A minimal sketch of running an SSIS package from ADF follows Figure 1-3.)

    Migrating or copying data from a physical server to the cloud or from a non-Azure cloud to Azure (Blob Storage, Data Lake Storage, SQL, Cosmos DB). ADF can be used to migrate both structured and binary data.

    You will learn more about the ADF constructs in Chapter 2.

    Figure 1-3. A typical modern data warehouse solution
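    As mentioned in the list above, running an existing SSIS package from ADF is just another activity in a pipeline. The following minimal sketch assumes an Azure-SSIS integration runtime named MyAzureSsisIR and a package already deployed to SSISDB; all names and the package path are hypothetical placeholders:

    {
      "name": "RunExistingSsisPackage",
      "description": "Hypothetical activity executing an SSIS package on the Azure-SSIS IR",
      "type": "ExecuteSSISPackage",
      "typeProperties": {
        "packageLocation": {
          "packagePath": "ETLFolder/ETLProject/LoadDimensions.dtsx"
        },
        "connectVia": {
          "referenceName": "MyAzureSsisIR",
          "type": "IntegrationRuntimeReference"
        }
      }
    }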

    Why ADF?

    The following are reasons why you should use ADF:

    Cost effective: ADF is serverless, and the billing is based on factors such as the number of activities run, the data movement duration, and the SSIS package execution duration. You can find the latest pricing details at https://aka.ms/adfpricing.

    For example, if you run your ETL/ELT pipeline hourly with data movement of 100GB per run (at 200MBps bandwidth, 100GB takes about 500 seconds, or roughly 8 minutes), then ADF would bill you no more than $12 for the month’s execution (720 pipeline runs).

    Note: The charges for any other services (HDInsight, Azure Data Lake Analytics) are not considered in this calculation; it covers solely the ADF orchestration and data movement cost. By contrast, non-Microsoft ETL/ELT tools may offer similar capabilities at a much higher cost.

    On-demand compute: ADF provides additional cost-saving functionality such as on-demand provisioning of HDInsight Hadoop clusters. It takes care of provisioning and tearing down the cluster once the job has executed, saving you a significant amount of money.
