Discover this podcast and so much more

Podcasts are free to enjoy without a subscription. We also offer ebooks, audiobooks, and so much more for just $11.99/month.

A High Performance Platform For The Full Big Data Lifecycle

A High Performance Platform For The Full Big Data Lifecycle

FromData Engineering Podcast


A High Performance Platform For The Full Big Data Lifecycle

FromData Engineering Podcast

ratings:
Length:
74 minutes
Released:
Aug 19, 2019
Format:
Podcast episode

Description

Managing big data projects at scale is a perennial problem, with a wide variety of solutions that have evolved over the past 20 years. One of the early entrants that predates Hadoop and has since been open sourced is the HPCC (High Performance Computing Cluster) system. Designed as a fully integrated platform to meet the needs of enterprise grade analytics it provides a solution for the full lifecycle of data at massive scale. In this episode Flavio Villanustre, VP of infrastructure and products at HPCC Systems, shares the history of the platform, how it is architected for scale and speed, and the unique solutions that it provides for enterprise grade data analytics. He also discusses the motivations for open sourcing the platform, the detailed workflow that it enables, and how you can try it for your own projects. This was an interesting view of how a well engineered product can survive massive evolutionary shifts in the industry while remaining relevant and useful.
Released:
Aug 19, 2019
Format:
Podcast episode

Titles in the series (100)

Weekly deep dives on data management with the engineers and entrepreneurs who are shaping the industry