Performing Fast Data Analytics Using Apache Kudu - Episode 64

From Data Engineering Podcast


Length: 51 minutes
Released: Jan 7, 2019
Format: Podcast episode

Description

The Hadoop platform is purpose-built for processing large, slow-moving data in long-running batch jobs. As the ecosystem around it has grown, so has the need for fast analytics on fast-moving data. To fill this need, the Kudu project was created with a column-oriented table format tuned for high volumes of writes and rapid query execution across those tables. For a perfect pairing, its developers made it easy to connect to the Impala SQL engine. In this episode, Brock Noland and Jordan Birdsell from PhData explain how Kudu is architected, how it compares to other storage systems in the Hadoop orbit, and how to start integrating it into your analytics pipeline.

Titles in the series (100)

Weekly deep dives on data management with the engineers and entrepreneurs who are shaping the industry