95: Data Science Pipeline Testing with Great Expectations - Abe Gong

From Test & Code


Length: 23 minutes
Released: Nov 30, 2019
Format: Podcast episode

Description

Data science and machine learning are affecting more of our lives every day. Decisions based on data science and machine learning depend heavily on the quality of the data and of the data pipeline.
Some of the software in the pipeline can be tested to some extent with traditional testing tools, like pytest.
But what about the data? The data entering the pipeline, and at various stages along the pipeline, should be validated.
That's where pipeline tests come in.
Pipeline tests are applied to data. Pipeline tests help you guard against upstream data changes and monitor data quality.
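As a rough illustration of the idea, a pipeline test can be sketched in plain Python as a set of named checks run against each batch of data. This is not the Great Expectations API; `validate_batch`, `EXPECTATIONS`, and the `user_id` and `age` fields are hypothetical names chosen for the example.

```python
from typing import Any, Callable

Row = dict[str, Any]
# An expectation here is a descriptive name plus a predicate applied to each row.
Expectation = tuple[str, Callable[[Row], bool]]


def validate_batch(
    rows: list[Row], expectations: list[Expectation]
) -> dict[str, list[int]]:
    """Return, for each expectation name, the indexes of the rows that fail it."""
    failures: dict[str, list[int]] = {name: [] for name, _ in expectations}
    for index, row in enumerate(rows):
        for name, predicate in expectations:
            if not predicate(row):
                failures[name].append(index)
    return failures


# Hypothetical expectations for a hypothetical user table.
EXPECTATIONS: list[Expectation] = [
    ("user_id is not null", lambda row: row.get("user_id") is not None),
    (
        "age is between 0 and 120",
        lambda row: isinstance(row.get("age"), int) and 0 <= row["age"] <= 120,
    ),
]

# A small batch as it might arrive from an upstream source.
batch = [
    {"user_id": 1, "age": 34},
    {"user_id": None, "age": 29},  # missing identifier
    {"user_id": 3, "age": 250},    # implausible age
]

report = validate_batch(batch, EXPECTATIONS)
for name, failed_rows in report.items():
    status = "ok" if not failed_rows else f"failed on rows {failed_rows}"
    print(f"{name}: {status}")
```

Unlike a pytest test, which exercises code, a check like this runs on the data itself each time a new batch arrives, so an upstream change shows up as a failed expectation rather than as a silent shift in results.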
Abe Gong and Superconductive are building an open source project called Great Expectations. It's a tool to help you build pipeline tests.
This is quite an interesting idea, and I hope it gains traction and takes off.
Special Guest: Abe Gong.

Titles in the series (100)

Test & Code is a weekly podcast hosted by Brian Okken. The show covers software engineering, development, testing, Python programming, and related topics. When we get into implementation specifics, that's usually Python, such as Python packaging, tox, pytest, and unittest. However, well over half of the topics are language agnostic, such as data science, DevOps, TDD, public speaking, mentoring, feature testing, NoSQL databases, end-to-end testing, automation, continuous integration, development methods, Selenium, and the testing pyramid.