42 min listen
Be Confident In Your Data Integration By Quickly Validating Matching Records With data-diff
Length:
71 minutes
Released:
Jul 3, 2022
Format:
Podcast episode
Description
A perennial challenge for data engineers is ensuring that information is integrated reliably. While it is straightforward to know whether a synchronization process succeeded, it is not always clear whether every record was copied correctly. To quickly identify whether and how two data systems are out of sync, Gleb Mezhanskiy and Simon Eskildsen partnered to create the open source data-diff utility. In this episode they explain how the utility is implemented to run quickly and how you can start using it in your own data workflows to ensure that your data warehouse isn't missing any records from your source systems.
Titles in the series (100)
Rebuilding Yelp's Data Pipeline with Justin Cunningham - Episode 5 by Data Engineering Podcast