759: Full Encoder-Decoder Transformers Fully Explained, with Kirill Eremenko

From Super Data Science: ML & AI Podcast with Jon Krohn

Length: 103 minutes
Released: Feb 20, 2024
Format: Podcast episode

Description

Encoders, cross-attention, and masking for LLMs: SuperDataScience Founder Kirill Eremenko returns to the SuperDataScience podcast to speak with Jon Krohn about transformer architectures and why they are a new frontier for generative AI. If you’re interested in applying LLMs to your business portfolio, you’ll want to pay close attention to this episode!

This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/), by Oracle NetSuite business software (netsuite.com/superdata), and by Intel and HPE Ezmeral Software Solutions (http://hpe.com/ezmeral/chatbots). Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.

In this episode you will learn:
• How decoder-only transformers work [15:51]
• How cross-attention works in transformers [41:05]
• How encoders and decoders work together (an example) [52:46]
• How encoder-only architectures excel at understanding natural language [1:20:34]
• The importance of masking during self-attention [1:27:08]

Additional materials: www.superdatascience.com/759

Titles in the series (64)

The Super Data Science podcast with Jon Krohn brings you the latest and most important machine learning, artificial intelligence, and broader data-world topics from across both academia and industry. As the quantity of data on our planet doubles every couple of years and this trend is set to continue for decades to come, there's an unprecedented opportunity for you to make an enormous impact in your lifetime. Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy. We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, and commercialization: everything you need to crush it with data science.