Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

From Machine Learning Street Talk (MLST)


Length:
100 minutes
Released:
May 19, 2020
Format:
Podcast episode

Description

In this episode of Machine Learning Street Talk, Tim Scarfe, Yannic Kilcher and Connor Shorten discuss large-scale transfer learning in natural language processing. The Text-to-Text Transfer Transformer (T5) paper from Google AI presents an exhaustive survey of what matters for transfer learning in NLP and what does not. In this conversation, we go through the paper's key takeaways: the text-to-text input/output format, architecture choice, dataset size and composition, fine-tuning strategy, and how best to use more computation.
Starting from these topics, we branch out into exciting ideas such as embodied cognition, meta-learning, and the measure of intelligence. We are still at the beginning of our podcast journey and really appreciate any feedback from our listeners. Is the chat too technical? Do you prefer group discussions, interviews with experts, or chats between the three of us? Thanks for watching, and if you haven’t already, please subscribe!
Links to papers and resources discussed in the chat:
Text-to-Text Transfer Transformer: https://arxiv.org/abs/1910.10683
Experience Grounds Language (relevant to divergent discussion about embodied cognition): https://arxiv.org/pdf/2004.10151.pdf
On the Measure of Intelligence: https://arxiv.org/abs/1911.01547
Train Large, Then Compress: https://arxiv.org/pdf/2002.11794.pdf
Scaling Laws for Neural Language Models: https://arxiv.org/pdf/2001.08361.pdf
The Illustrated Transformer: http://jalammar.github.io/illustrated...
ELECTRA: https://arxiv.org/pdf/2003.10555.pdf
Transformer-XL: https://arxiv.org/pdf/1901.02860.pdf
Reformer: The Efficient Transformer: https://openreview.net/pdf?id=rkgNKkHtvB
The Evolved Transformer: https://arxiv.org/pdf/1901.11117.pdf
DistilBERT: https://arxiv.org/pdf/1910.01108.pdf
How to generate text (HIGHLY RECOMMEND): https://huggingface.co/blog/how-to-ge...
Tokenizers: https://blog.floydhub.com/tokenization-nlp/

This is the audio podcast for the ML Street Talk YouTube channel at https://www.youtube.com/c/MachineLearningStreetTalk. Thanks for checking us out! We think that scientists and engineers are the heroes of our generation. Each week we have a hard-hitting discussion with the leading thinkers in the AI space. Street Talk is unabashedly technical and non-commercial, so you will hear no annoying pitches. Corporate- and MBA-speak is banned on Street Talk; "data product" and "digital transformation" are banned, we promise :) Dr. Tim Scarfe, Dr. Yannic Kilcher and Dr. Keith Duggar.