Data Lineage For Your Pipelines

From Data Engineering Podcast


Length: 49 minutes
Released: May 27, 2019
Format: Podcast episode

Description

Some problems in data are well defined and benefit from a ready-made set of tools. For everything else, there's Pachyderm, the platform for data science that is built to scale. In this episode, Joe Doliner, CEO and co-founder, explains how Pachyderm started as an attempt to make data provenance easier to track, how the platform is architected and used today, and how its underlying principles show up in the workflows of data engineers and data scientists as they collaborate on data projects. He also shares his thoughts on the company's recent round of fundraising and where the future will take them. If you are looking for a set of tools for building your data science workflows, then Pachyderm is a solid choice, featuring data versioning, first-class tracking of data lineage, and language-agnostic data pipelines.
Weekly deep dives on data management with the engineers and entrepreneurs who are shaping the industry