Discover this podcast and so much more

Podcasts are free to enjoy without a subscription. We also offer ebooks, audiobooks, and so much more for just $11.99/month.

[MINI] Data Provenance

[MINI] Data Provenance

FromData Skeptic


[MINI] Data Provenance

FromData Skeptic

ratings:
Length:
11 minutes
Released:
Jan 9, 2015
Format:
Podcast episode

Description

This episode introduces a high level discussion on the topic of Data Provenance, with more MINI episodes to follow to get into specific topics. Thanks to listener Sara L who wrote in to point out the Data Skeptic Podcast has focused alot about using data to be skeptical, but not necessarily being skeptical of data.
Data Provenance is the concept of knowing the full origin of your dataset. Where did it come from? Who collected it? How as it collected? Does it combine independent sources or one singular source? What are the error bounds on the way it was measured? These are just some of the questions one should ask to understand their data. After all, if the antecedent of an argument is built on dubious grounds, the consequent of the argument is equally dubious.
For a more technical discussion than what we get into in this mini epiosode, I recommend A Survey of Data Provenance Techniques by authors Simmhan, Plale, and Gannon.
Released:
Jan 9, 2015
Format:
Podcast episode

Titles in the series (100)

Data Skeptic is a data science podcast exploring machine learning, statistics, artificial intelligence, and other data topics through short tutorials and interviews with domain experts.