Large-Scale Entity Resolution - Sonal Goyal

From DataTalks.Club


Length: 53 minutes
Released: Oct 28, 2022
Format: Podcast episode

Description

We talked about:


Sonal’s background
How the idea for Zingg came about
What Zingg is
The difference between entity resolution and identity resolution
How duplicate detection relates to entity resolution
How Sonal decided to start working on Zingg
How Zingg works
What Zingg runs on
Switching from consultancy to working on a new open source solution
Why Zingg is open source
Open source licensing
Working on Zingg initially vs now
Zingg’s current and future team
Sonal’s biggest current challenge
Avoiding problems with entity/identity resolution through database design
Identity resolution vs basic joins, data fusions, and fuzzy joins
Deterministic matching vs probabilistic machine learning
Identity and entity resolution applications for fraud detection
Graph algorithms vs classic ML in entity resolution
Identity resolution success stories
What Sonal would do differently given the chance to start over with Zingg
Advice for those seeking to realize their own solution to a data problem
Reading suggestion from Sonal
Conclusion


Links:

Open-Source Spotlight demo "Zingg": https://www.youtube.com/watch?v=zOabyZxN9b0
Creative Selection: Inside Apple's Design Process During the Golden Age of Steve Jobs book: https://www.amazon.com/Creative-Selection-Inside-Apples-Process/dp/1250194466


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
DataTalks.Club - the place to talk about data!