Ebook, 693 pages, 5 hours

Data Mesh in Action

Name: Data Mesh in Action
Author: Jacek Majchrzak
ISBN: 9781638351849

By Jacek Majchrzak, Sven Balnojan and Marian Siwiak

About this ebook

Revolutionize the way your organization approaches data with a data mesh! This new decentralized architecture outpaces monolithic lakes and warehouses and can work for a company of any size.

In Data Mesh in Action you will learn how to:

    Implement a data mesh in your organization
    Turn data into a data product
    Move from your current data architecture to a data mesh
    Identify data domains, and decompose an organization into smaller, manageable domains
    Set up the central governance and local governance levels over data
    Balance responsibilities between the two levels of governance
    Establish a platform that allows efficient connection of distributed data products and automated governance

Data Mesh in Action reveals how this groundbreaking architecture looks for both small startups and large enterprises. You won’t need any new technology—this book shows you how to start implementing a data mesh with flexible processes and organizational change. You’ll explore both an extended case study and multiple real-world examples. As you go, you’ll be expertly guided through discussions around Socio-Technical Architecture and Domain-Driven Design with the goal of building a sleek data-as-a-product system. Plus, dozens of workshop techniques for both in-person and remote meetings help you onboard colleagues and drive a successful transition.

About the technology
Business increasingly relies on efficiently storing and accessing large volumes of data. The data mesh is a new way to decentralize data management that radically improves security and discoverability. A well-designed data mesh simplifies self-service data consumption and reduces the bottlenecks created by monolithic data architectures.

About the book
Data Mesh in Action teaches you pragmatic ways to decentralize your data and organize it into an effective data mesh. You’ll start by building a minimum viable data product, which you’ll expand into a self-service data platform, chapter by chapter. You’ll love the book’s unique “sliders” that adjust the mesh to meet your specific needs. You’ll also learn processes and leadership techniques that will change the way you and your colleagues think about data.

What's inside

    Decompose an organization into manageable domains
    Turn data into a data product
    Set up central and local governance levels
    Build a fit-for-purpose data platform
    Improve management, initiation, and support techniques

About the reader
For data professionals. Requires no specific programming stack or data platform.

About the author
Jacek Majchrzak is a hands-on lead data architect. Dr. Sven Balnojan manages data products and teams. Dr. Marian Siwiak is a data scientist and a management consultant for IT, scientific, and technical projects.

Table of Contents

PART 1 FOUNDATIONS
1 The what and why of the data mesh
2 Is a data mesh right for you?
3 Kickstart your data mesh MVP in a month
PART 2 THE FOUR PRINCIPLES IN PRACTICE
4 Domain ownership
5 Data as a product
6 Federated computational governance
7 The self-serve data platform
PART 3 INFRASTRUCTURE AND TECHNICAL ARCHITECTURE
8 Comparing self-serve data platforms
9 Solution architecture design

Computers

Language: English

Publisher: Manning

Release date: Mar 21, 2023

ISBN: 9781638351849

Author

Jacek Majchrzak

Jacek Majchrzak is a hands-on lead architect in the area of drug discovery where he implements the data mesh idea. Jacek is a workshop facilitator with a strong focus on domain-driven design, software architecture and socio-technical systems design.

byData Engineering Podcast
0 ratings
0% found this document useful
Building An Internal Database As A Service Platform At Cloudflare: Data persistence is one of the most challenging aspects of computer systems. In the era of the cloud most developers rely on hosted services to manage their databases, but what if you are a cloud service? In this episode Vignesh Ravichandran explains how his team at Cloudflare provides PostgreSQL as a service to their developers for low latency and high uptime services at global scale. This is an interesting and insightful look at pragmatic engineering for reliability and scale.
Podcast episode
Building An Internal Database As A Service Platform At Cloudflare: Data persistence is one of the most challenging aspects of computer systems. In the era of the cloud most developers rely on hosted services to manage their databases, but what if you are a cloud service? In this episode Vignesh Ravichandran explains how his team at Cloudflare provides PostgreSQL as a service to their developers for low latency and high uptime services at global scale. This is an interesting and insightful look at pragmatic engineering for reliability and scale.
byData Engineering Podcast
0 ratings
0% found this document useful
Aligning Data Security With Business Productivity To Deploy Analytics Safely And At Speed: As with all aspects of technology, security is a critical element of data applications, and the different controls can be at cross purposes with productivity. In this episode Yoav Cohen from Satori shares his experiences as a practitioner in the space of data security and how to align with the needs of engineers and business users. He also explains why data security is distinct from application security and some methods for reducing the challenge of working across different data systems.
Podcast episode
Aligning Data Security With Business Productivity To Deploy Analytics Safely And At Speed: As with all aspects of technology, security is a critical element of data applications, and the different controls can be at cross purposes with productivity. In this episode Yoav Cohen from Satori shares his experiences as a practitioner in the space of data security and how to align with the needs of engineers and business users. He also explains why data security is distinct from application security and some methods for reducing the challenge of working across different data systems.
byData Engineering Podcast
0 ratings
0% found this document useful
Eliminate The Overhead In Your Data Integration With The Open Source dlt Library: Cloud data warehouses and the introduction of the ELT paradigm has led to the creation of multiple options for flexible data integration, with a roughly equal distribution of commercial and open source options. The challenge is that most of those options are complex to operate and exist in their own silo. The dlt project was created to eliminate overhead and bring data integration into your full control as a library component of your overall data system. In this episode Adrian Brudaru explains how it works, the benefits that it provides over other data integration solutions, and how you can start building pipelines today.
Podcast episode
Eliminate The Overhead In Your Data Integration With The Open Source dlt Library: Cloud data warehouses and the introduction of the ELT paradigm has led to the creation of multiple options for flexible data integration, with a roughly equal distribution of commercial and open source options. The challenge is that most of those options are complex to operate and exist in their own silo. The dlt project was created to eliminate overhead and bring data integration into your full control as a library component of your overall data system. In this episode Adrian Brudaru explains how it works, the benefits that it provides over other data integration solutions, and how you can start building pipelines today.
byData Engineering Podcast
0 ratings
0% found this document useful
Episode 45: Everybody Needs Backup and Recovery: Do you have to deal with data protection? Do you usually mess it up? Some people think data protection architecture is broken and requires too many dependencies. By the time a business needs to backup a lot of data, it’s a complex problem to go back in ti
Podcast episode
Episode 45: Everybody Needs Backup and Recovery: Do you have to deal with data protection? Do you usually mess it up? Some people think data protection architecture is broken and requires too many dependencies. By the time a business needs to backup a lot of data, it’s a complex problem to go back in ti
byScreaming in the Cloud
0 ratings
0% found this document useful
EP 195 - Affordably Manage the Deluge of Unstructured Data: In this week’s episode, we have , the Chief Development Officer at . Quantum helps organizations in harnessing the potential of their expanding unstructured data, offering an affordable solution for storing data for decades to come. During our...
Podcast episode
EP 195 - Affordably Manage the Deluge of Unstructured Data: In this week’s episode, we have , the Chief Development Officer at . Quantum helps organizations in harnessing the potential of their expanding unstructured data, offering an affordable solution for storing data for decades to come. During our...
byIndustrial IoT Spotlight
0 ratings
0% found this document useful
Cloud Native Data Security As Code With Cyral - Episode 156: An interview about the Cyral platform and how it enforces data security as code for protecting databases and object storage in the cloud.
Podcast episode
Cloud Native Data Security As Code With Cyral - Episode 156: An interview about the Cyral platform and how it enforces data security as code for protecting databases and object storage in the cloud.
byData Engineering Podcast
0 ratings
0% found this document useful

Skip carousel

Book preview

Data Mesh in Action - Jacek Majchrzak

inside front cover

Data mesh development elements—data product development cycle details

Data Mesh in Action

Jacek Majchrzak, Sven Balnojan, and Marian Siwiak, with Mariusz Sieraczkiewicz

Foreword by Jean-Georges Perrin

To comment go to liveBook

Manning

Shelter Island

For more information on this and other Manning titles go to

www.manning.com

Copyright

For online information and ordering of these and other Manning books, please visit www.manning.com. The publisher offers discounts on these books when ordered in quantity.

For more information, please contact

Special Sales Department

Manning Publications Co.

20 Baldwin Road

PO Box 761

Shelter Island, NY 11964

Email: orders@manning.com

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

♾ Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

ISBN: 9781633439979

brief contents

Part 1. Foundations

1 The what and why of the data mesh

2 Is a data mesh right for you?

3 Kickstart your data mesh MVP in a month

Part 2. The four principles in practice

4 Domain ownership

5 Data as a product

6 Federated computational governance

7 The self-serve data platform

Part 3. Infrastructure and technical architecture

8 Comparing self-serve data platforms

9 Solution architecture design

Appendix A.

Appendix B.

Appendix C.

Appendix D.

Front matter

foreword

preface

acknowledgments

about this book

about the authors

about the cover illustration

Part 1. Foundations

1 The what and why of the data mesh

1.1 Data mesh

1.2 Why the data mesh?

Alternatives

Data warehouses and data lakes inside the data mesh

Data mesh benefits

1.3 Use case: A snow-shoveling business

1.4 Data mesh principles

Domain-oriented decentralized data ownership and architecture

Data as a product

Federated computational governance

Self-serve data infrastructure as a platform

1.5 Back to snow shoveling

1.6 Socio-technical architecture

Conway’s law

Team topologies

Cognitive load

1.7 Data mesh challenges

Technological challenges

Data management challenges

Organizational challenges

2 Is a data mesh right for you?

2.1 Analyzing data mesh drivers

Business drivers

Organizational drivers

Domain-data drivers

Minor organizational drivers

Is a data mesh a good fit for me?

2.2 Data mesh alternatives and complementary solutions

Enterprise data warehouse

Data lake

Data lakehouse

Data fabric

Data mesh vs. the rest of the world

2.3 Understanding a data mesh implementation effort

The data mesh development cycle

Development cycle in the shoveling example

Enabling the team

Development cycle in detail

3 Kickstart your data mesh MVP in a month

3.1 Getting the lay of the land

Drawing a system landscape diagram

Performing stakeholder analysis

3.2 Identifying candidates for the MVP implementation team

Choosing development teams

Choosing the cooperation model

Choosing a data governance team

3.3 Setting up MVP governance

Defining data mesh value statement(s)

Defining data governance policies

Federating data governance

3.4 Developing minimal data products

Identifying domain-oriented datasets

Choosing data product owners

Deciding on the minimum viable data product description

Developing the simplest tools to expose your data

3.5 Setting up the minimal platform

Ensuring platform-forced governability

Ensuring platform security

Part 2. The four principles in practice

4 Domain ownership

4.1 Capturing and analyzing domains

Domain-driven design 101

Invite the right people

Choose the correct workshop technique

4.2 Applying ownership using domain decomposition

Domain, subdomain, and business capability

Decompose domains using business capability modeling

How are domains and business capabilities related to data?

Assign responsibilities to the data-product-owning team

Choose the right team to own data

4.3 Applying ownership using data use cases

Data use cases

Model and bounded context

Set up boundaries of use-case-driven data products

Choose the right team to own data

4.4 Applying ownership using design heuristics

What is a heuristic?

Using design heuristics

Designing heuristics and possible boundaries

4.5 Final landscape: The mesh of interconnected data products

Messflix data mesh

Data products form a mesh

Is it already a data mesh?

5 Data as a product

5.1 Applying product thinking

Product thinking analysis

Data product canvas

5.2 What is a data product?

Data product definition

Product, not project

What can be a data product?

5.3 Data product ownership

Data product owner

Data product owner responsibilities

An Agile DevOps team as a base for data product dev team

Data product owner and product owner

5.4 Conceptual architecture of a data product

External architecture view

Internal architecture view

5.5 Data product fundamental characteristics

Self-described data product

Introduction to metadata

Metadata as code

Data product metadata

Domain dataset metadata

Other kinds of metadata

5.6 Additional data product characteristics: FAIR and immutability

Findability

Accessibility

Interoperable

Reusable

Immutable

5.7 Data contracts and sharing agreements inside the data mesh

Data contracts and sharing agreements

Implementing data contracts and sharing agreements

6 Federated computational governance

6.1 Data governance in a nutshell

6.2 Benefits of data governance

Business value perspective

Data usability perspective

Data control perspective

6.3 Planning data governance outcomes

Hierarchy of data governance outcomes

Strategic-level outcomes

Tactical-level outcomes

Implementation-level outcomes

6.4 Federating data governance

Thinking of data governance in terms of sliders

Extreme ends of data governance models

Federated data governance model

Setting-up governance team operations

6.5 Making data governance computational

Making policies computational

Automating policy checks

7 The self-serve data platform

7.1 The MVP platform

Platform definition

Platform thinking

7.2 Improvements with X as a service

X as a service explained

X as a service applied

7.3 Improvements with platform architecture

Platform architecture explained

Platform architecture applied

7.4 Improvements for the data producers

Part 3. Infrastructure and technical architecture

8 Comparing self-serve data platforms

8.1 Data mesh on Google Cloud Platform

Self-serve data platform architecture

Identifying the components of the platform

Identifying the components of the data product

Workflows

Variations

Relation to data mesh ideas

GCP architecture summary

8.2 Data mesh on AWS

Self-serve data platform architecture

Identifying the components of the platform

Identifying the components of the data products

Workflows

Relation to data mesh ideas

Variations

AWS architecture summary

8.3 Data mesh on Databricks

Self-serve data platform architecture

Identifying the components of the platform

Identifying the components of the data product

Workflow considerations

Variations

Databricks architecture summary

8.4 Data mesh on Kafka

Self-serve data platform architecture

Identifying the components

Considerations

Kafka architecture summary

9 Solution architecture design

9.1 Capturing and understanding the current state

What is software architecture?

How to document architecture: The C4 model

9.2 Understanding architectural drivers of a data product design

Architectural drivers

Capturing architectural drivers for a data-product design

9.3 Designing the future architecture of a data product and related systems

Design session

File-based data product: Spreadsheet

From monolith and microservice to a data product

Exposing data for stream processing and batch processing

Appendix A.

Appendix B.

Appendix C.

Appendix D.

index

front matter

foreword

The data mesh is to data as agile is to software engineering, or as microservices are to architecture patterns. It will be an essential component of your future data strategy. Data Mesh in Action addresses both the technology of the data mesh and the methodology your organization can follow to implement it.

This book teleports you into the seat of the chief architect on a data mesh project. The authors will coach you through the chaotic process of your first data product. As you gain more and more of those components, your mesh will build itself. The authors’ collective experience drives this transformation. Your responsibility will be to pick, choose, and adapt this framework to your needs and organization.

The data mesh is based on four key principles: domain ownership, data as a product, federated computational governance, and a self-serve data platform. The book details the organizational impact of these principles, as well as their technology, at great length. Individually, all those principles are well-known to engineers and architects; the real (r)evolution of the data mesh is its ability to combine them and deliver a global approach to building modern data platforms.

In my more than 15 years of building hybrid data platforms, I have always been missing something. Whether it was due to the strict approach of ingesting data in a warehouse or the lack of governance of a lake, to name two popular patterns, there was always this feeling of “it ain’t gonna work.” The mesh is different. It does not focus solely on technology; it puts governance and quality at the center and allocates ownership to the real owner, not some central commanding and demanding group. As a result, with adequate self-service tools, the data mesh will liberate the forces of innovation in your organization. And that is what this book will help you achieve.

—Jean-Georges Perrin,

Intelligence platform lead at PayPal,

president and cofounder of AIDAUG,

and Lifetime IBM Champion

preface

Each one of us authors has experienced—at length and at different companies—the old way of doing data, usually through centralized data lakes and data warehouses in combination with a set of central teams organized inside an analytics function. The old way basically looked like this:

Multiple decentralized development teams have data that is accessible through storage systems like a shared drive, a decentralized database, a Representational State Transfer (REST) API, or any other interface.

One or more centralized data teams are tasked with collecting this data into one monolithic pot. This is either a data lake or a data warehouse.

The same set of teams is tasked with transforming this data into something useful.

Multiple decentralized analysts, development teams, or machine learning (ML) teams pick up that transformed data and convert it into value in the form of reports, recommendation systems, or anything else they can think of.

We learned the hard way that this concept has its limits, producing a bottleneck in terms of both technology and team capacities. We all saw companies struggling to get the flow from data to value to be as productive as the companies needed it to be. Then the data mesh and the ideas behind it appeared on the horizon.

The data mesh is a decentralization paradigm. It decentralizes the ownership of data, its transformation into information, and its serving. By these means, it aims to extract more value from data by removing bottlenecks in the data value stream.

The concept of the data mesh appeared on the stage in 2019 and has since lit not just the data world, but the whole technology world, on fire. The data mesh concept breaks with the current world of data, which usually treats data as a by-product of software components. This new approach turns the spotlight on data producers and gives them the responsibility to handle the data just as they would handle their software.

With this, the data mesh takes the same journey software components have taken, with microservices architectures and with the DevOps movement. It takes the same journey frontends are currently taking with microfrontends. And just as in these examples, we believe that the data mesh is the right approach to finally gain the flexibility to extract value from our data at scale, be that in business intelligence (BI), ML, or any other use case you can think of.

The data mesh concept is often referred to as a socio-technical paradigm shift: its core is not about technology but about the alignment of people, processes, and organizations. This significant complexity is why we wrote this book. However, we don’t just present the available theoretical knowledge that is out there; we focus on parts of the data mesh that are, in our experience, critical for successful implementation. We have organized those parts into a digestible resource to help you put a data mesh in action!

To guide you through the process, we’ve prepared hands-on examples with a lot of architecture sketches, describing various technologies, workshop techniques, team organization forms, and the like. After reading this book, you should be able to do the following:

Evaluate whether a data mesh will suit your organization’s business needs

Lay the groundwork for data mesh development

Develop a minimal data mesh to start your journey

Keep iteratively developing and expanding your data mesh

Don’t expect to find a lot of code in this book, other than a little JavaScript Object Notation (JSON) here and there. That’s because we truly believe the magic is not in the technology, but in the people, processes, and organizations. But, of course, you can expect to find a lot of technology inside this book in the form of deep architecture sketches with reference to various technologies and cloud providers, explanations, and blueprints inspired by multiple real-world examples.

That said, we don’t believe in a black-and-white implementation of the data mesh idea. This book will help you adjust the data mesh idea to your company by offering a lot of degrees of freedom, shortcuts, and a healthy level of pragmatism.

To tie together our experience, we will use an imaginary company called Messflix LLC, which resembles a lot of what we’ve seen out there in the data world. This company will be our go-to example as we go through the mess-to-mesh journey; however, since we also focus on making the data mesh adaptable to many types of companies, not just one, this is not the only example we utilize throughout the book. Later in this front matter, we provide a brief introduction to Messflix by taking a look at the data mess the company has gotten itself into.

acknowledgments

First, we would like to express our gratitude to the community engaged with data mesh development. Their discussions and openness about problems and challenges helped us broaden our perspectives and put our particular experiences into the generalized framework you’ll find in this book.

We owe our thanks to the wonderful people at Manning who made this book possible: Publisher Marjan Bace, Development Editor Ian Hough, and last but not least, Acquisitions Editor Andrew Waldron. Without their patience with our ever-evolving view on the data mesh, and their ability to make us synthesize it into a coherent view, we wouldn’t have been able to finish Data Mesh in Action in a form we could so proudly present to you. We would also like to thank the marketing, editorial, and production teams, without whom this book would gather dust in a Manning drawer.

A heartfelt thanks also to Michael Jensen and Al Krinker for technical reviews, which allowed us to further condense and clarify data mesh concepts.

We would also like to thank all our reviewers, who trusted us and invested their time in reading this book, even when no one was sure it would make it to publication. To Alain Couniot, Arnaud Castelltort, Arnaud Estève, Jean-Georges Perrin, Juan Gabriel Guzmán Guerra, Mary Anne Thygesen, Massimo dr, Matthias Busch, Mike Fowler, Milan Sarenac, Nathan B. Crocker, Pradeep Bhattiprolu, Rahul Jain, Richard Vaughan, Salil Athalye, Sampath Chaparala, Shiroshica Kulatilake, Simon Tschöke, Stefano Ongarello, Sumih Damodaran, Suriyanto Bongso, and Yi Wei, your suggestions helped make this a better book.

about this book

This book serves two purposes. First, it organizes and presents knowledge about the new socio-technological paradigm of the data mesh. Second, it will help you implement a data mesh. From considering whether the data mesh is a suitable solution for your organization, to laying the groundwork, to developing a minimum viable product (MVP), to implementing data mesh principles, this book provides the tools needed to get you well on your way on your data mesh journey.

Who should read this book?

The most general description of our reader is someone who is involved in extracting value from data. However, because that describes almost everyone in our modern economy, we’ll outline the benefits this book will bring to various audiences.

The first group is people involved in creating, managing, and utilizing data within companies that have the following:

High socio-technological complexity (e.g., big corporations)

Complex data use cases

Many and diverse data sources

This encompasses, but is not limited to, roles including data architects, data engineers, software architects, tech leads, and senior developers.

The more you feel these characteristics apply to your business, the more likely it is that a data mesh could be a good solution. This book will help you understand data mesh concepts, including whose cooperation you need to secure, and what steps to take in both your organization and technical environment to move from a data mess to a data mesh.

Beyond that, as the data mesh is a company-wide transformation process, the book’s content will be directly useful to executive-level personnel, including the technical C-suite, engineering directors and managers, enterprise architects, chief and lead architects, and solution/program owners. This book will help you decide to what extent, and with what priority, you should shift your company’s data environment toward a data mesh, and help you plan the change management.

How this book is organized: A road map

While the book is meant to be read linearly, it is broken into three main parts and allows you to skip sections. The first part is a quick and hands-on introduction, the second explains the four principles of the data mesh in detail, and the third tackles the technical side of things in detail as well as the complete enterprise journey.

Part 1: Foundations

The goal of the first part of the book is to familiarize you with the data mesh paradigm as quickly as possible. To do so, we first go through the basics of the data mesh and then get our hands dirty by building our first data mesh within a month.

Chapter 1: The what and why of the data mesh

This chapter gives the overview needed to put the rest of the book into the proper context, including why you might want to consider following the data mesh mindset shift as well as a short explanation of the four key principles detailed in part 2.

Chapter 2: Is a data mesh right for you?

This chapter provides you with the context of the data mesh implementation and the drivers to consider when deciding on the transformation. It helps you decide whether you want to start the journey now and to identify your place on the data maturity scale. This helps you to match your data mesh journey to your particular situation.

Chapter 3: Kickstart your data mesh MVP in a month

This chapter is a hands-on example of how to go about building an MVP. The Messflix MVP focuses on the organizational challenges and stays light on the technology side of things, as an MVP should. The technology details will be picked up later. The chapter provides you with tools like stakeholder mappings and FAIR principles (findable, accessible, interoperable, reusable) to get you started.

Part 2: The four principles in practice

The goal of the second part of the book is to provide you with the tools to tackle the four principles of the data mesh so you can advance your data mesh beyond the first month.

Chapter 4: Domain ownership

This chapter is all about domains and business capabilities and how you can identify suitable owners for data inside a company. It provides you with a lot of workshop techniques, including domain storytelling.

Chapter 5: Domain data as a product

Data is often treated as a by-product. This chapter is about changing to a product perspective called data as a product. The chapter provides examples of data products from Messflix and explains in detail concepts like the data product canvas and data ports.

Chapter 6: Federated computational governance

This chapter tackles data governance in the data mesh context. Inside data meshes, this is called federated computational governance, because it balances central and distributed governance aspects and relies on the automated execution needed for the data mesh to unfold. This chapter contains a discussion of centralized versus decentralized aspects, hands-on examples from Messflix, and a guide for setting up a governance team.

Chapter 7: The self-serve data platform

The last chapter on data mesh principles covers the platform, the enabling technology that makes the data mesh work. The chapter works through three iterations on our data platform for Messflix and explains important concepts like platform thinking along with these examples.

Part 3: Infrastructure and technical architecture

The third part focuses on all things technical. We break out of the Messflix example to highlight various architectures and discuss multiple options for moving from your existing structure to a data mesh.

Chapter 8: Comparing self-serve data platforms

This chapter explains blueprints for data mesh platforms that fit various cloud providers as well as different sizes of companies.

Chapter 9: Solution architecture design

In this chapter, we focus on the migration from your existing system to various kinds of architectures step by step and component by component. We talk about data lakes, data warehouses, REST APIs, and more.

How to use this book

We don’t want to present just another theory of the data mesh. This book is more of a structured, collective diary of actions leading to data mesh development in various environments. The emphasis is on “actions leading to.” We arrived at the data mesh after a long and often painful journey through multiple other solutions. Over the years, we’ve been testing, researching, discussing, and, last but not least, failing a lot in the process. In this book, we share with you a summary of our “I wish someone had told me earlier” insights. We hope you will be able to immediately put the information you get out of it, well, in action.

Depending on your goal, there are a few focal points you could choose while reading this book. If your interest is purely informational, and your goal is to be able to explain the concepts to your team, your management, or your company, we recommend you put a lot of focus on chapters 1 and 2, which provide a quick overview, as well as the MVP presented in chapter 3. In addition, by reading through chapter 9 for a deeper dive into the reasons for this paradigm shift and a lighter look into part 2, you will be well equipped to explain the data mesh paradigm to someone else.

If you want to launch a larger initiative inside your company, you’ll need to be convincing. In that case, we recommend you take a deep dive into the entirety of chapter 9 and pay close attention to chapter 3, which offers insight into the question of whether you should start this journey at all. Chapter 4, presenting the full-scale data mesh MVP development, and chapter 2, offering a quick glance into a lightweight application of data mesh principles, will allow you to balance the big-picture view with notes on requirements of quick implementation and getting results fast. Altogether, this should give you enough persuasive material to get top-level buy-in.

If you’re interested in the technical side of things, like automated governance and the self-serve platform, chapters 5 to 8 will provide you with a lot of interesting content to dig through.

If you work inside a development team, we particularly recommend that you turn your attention to chapter 4. This chapter explains exactly what is broken in the current mode of thinking and should also help you advance your ways of working without ever touching the data mesh concept. Additionally, we recommend chapter 8, as it explains possible architecture alternatives for serving data from a development team’s point of view.

If you want to advance the way you work inside your data team, you could focus on chapters 3 and 4 to deeply understand the source of your current troubles. You could also focus on chapter 7 to understand what platform thinking in a data context means. Both could help you advance your ways of working without actually adopting a full data mesh approach inside the company.

We’re sure there are many more reasons for you to open up this book; these are simply a few possible ways you could go about putting this book into use.

The Messflix case study

To help you conceptualize the practical aspects of putting a data mesh in action, we combined our experiences and merged them into a single data mesh journey of Messflix LLC.

Messflix, a movie- and TV-show streaming platform, just hit a wall. A data wall. The company has all the data in the world but complains about not even being able to build a proper recommendation system for its movies and shows. The competition seems to be able to get it done; in fact, the competition is famous for being the first movers in a lot of technology sectors.

Other companies in equally complex industries seem to be able to put their data to work. Messflix does work with data, and analysts are able to get some insights from it, but the organization’s leaders don’t feel like they can call themselves data driven.

The data science trial runs seem to all end in pretty prototypes with no clear business value. The data scientists tell their managers that it’s because the product team just doesn’t want to put these great prototypes on the roadmap, or, in another instance, because the data from the source is way too messy and inconsistent.

In short, Messflix hopefully sounds like your average business, which for some reason doesn’t feel like it’s able to let the right data flow to the right use cases. The data landscape, just like the technology landscape, has grown organically over time and has become quite complex.

The two key technology components of Messflix are its Messflix Streaming Platform and Hitchcock Movie Maker. The streaming platform does just what it says: enable subscribers to watch shows and movies. The movie maker is a set of tools helping the movie production teams choose good movie topics, themes, and content.

Additionally, Messflix has a data lake with an analytics platform on top of it taking data from everywhere. A few teams manage these components. The teams Orange and White together operate a few of the Hitchcock Movie Maker tools. Team Green is all about the subscriptions, the log-in processes, etc., and team Yellow is responsible for getting things on the screen inside the streaming platform. Figure 1 depicts a rough architecture sketch of a few of these components before we briefly discuss how data is currently handled at Messflix.

Figure 1 The main Messflix software components. The Data team handles a large variety of data sources and responsibilities.

The Data team gets data into the data warehouse from a few different places—for example, cost statements from the Hitchcock Movie Maker and subscriptions from the subscriptions service. The team also gets streaming data and subscription profiles from the data lake.

Then the Data team does some number crunching to transform this data into information for fraud analysis and business decisions.

Finally, this information is used by decentralized units to make those business decisions and for other use cases. This is currently a centralized workflow: the Data team sits in the middle.
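The centralized workflow described in the preceding paragraphs can be sketched as a tiny directed graph. This is only an illustration under stated assumptions: the `FLOWS` list paraphrases the sources and uses named in the text, and the helper names `intermediaries` and `direct_links` are hypothetical, not part of any Messflix system. The sketch shows that the Data team is the only node that both receives and passes on data, and that no flow bypasses it.

```python
# Hypothetical edges (producer -> consumer), paraphrased from the text.
FLOWS = [
    ("Hitchcock Movie Maker (cost statements)", "Data team"),
    ("Subscriptions service", "Data team"),
    ("Data lake (streaming data, subscription profiles)", "Data team"),
    ("Data team", "Fraud analysis"),
    ("Data team", "Business decisions"),
]


def intermediaries(flows):
    """Return nodes that both receive and pass on data, sorted by name."""
    producers = {src for src, _ in flows}
    consumers = {dst for _, dst in flows}
    return sorted(producers & consumers)


def direct_links(flows, hub):
    """Return the flows that do not touch the given hub at all."""
    return [(src, dst) for src, dst in flows if hub not in (src, dst)]


if __name__ == "__main__":
    print(intermediaries(FLOWS))             # ['Data team']
    print(direct_links(FLOWS, "Data team"))  # []
```

In a decentralized setup, one would expect `direct_links` to return a non-empty list, because producing teams would serve consumers without going through a single central team.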

No matter where you’re coming from and where you want to go, you will find yourself somewhere along the Messflix journey. So let’s take one final look at the complete journey Messflix is going through.

No data journey is a simple straight line. Likewise, we don’t pretend that the Messflix journey is a simple linear progression of a series of steps. You’ll see different approaches in the chapters and ways to make the data mesh fit your company, even though the Messflix example illustrates one main thread to guide you.

You can follow that main thread used by Messflix throughout chapters 2 through 6 and chapter 9. Table 1 gives you an overview of the stages of the company, highlighting two dimensions along the journey to a data mesh. The first is the number of organizational units and teams affected. The second is the types of company responsibilities that are decentralized.

The core of the data mesh paradigm shift is the decentralization of the responsibility for data. But responsibility for data today is practically split into multiple parts, all of which need to be decentralized. Thus we highlight all four kinds of responsibility for data in table 1; each corresponds to one of the principles presented in part 2.

Table 1 The Messflix journey
