33 min listen
Data Lineage For Your Pipelines
Length:
49 minutes
Released:
May 27, 2019
Format:
Podcast episode
Description
Some problems in data are well defined and benefit from a ready-made set of tools. For everything else, there's Pachyderm, the platform for data science that is built to scale. In this episode Joe Doliner, CEO and co-founder of Pachyderm, explains how the project started as an attempt to make data provenance easier to track, how the platform is architected and used today, and how its underlying principles show up in the workflows of data engineers and data scientists as they collaborate on data projects. He also shares his thoughts on the company's recent round of fundraising and where the future will take them. If you are looking for a set of tools for building your data science workflows, then Pachyderm is a solid choice, featuring data versioning, first-class tracking of data lineage, and language-agnostic data pipelines.