E33: The Tiny Model Revolution with Ronen Eldan and Yuanzhi Li of Microsoft Research
From "The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis
Length:
121 minutes
Released:
Jun 6, 2023
Format:
Podcast episode
Description
Nathan Labenz sits down with Ronen Eldan and Yuanzhi Li of Microsoft Research to discuss TinyStories, the small natural language dataset they created. TinyStories is designed to reflect the full richness of natural language while staying small enough to support research on modest compute budgets. Using this dataset, they explored language model performance, behavior, and mechanisms by training a series of models ranging from just 1 million to 33 million parameters, the largest of which is still only about 2% of the scale of GPT-2. In this conversation, Nathan, Ronen, and Yuanzhi touch on LM reasoning, emergence, interpretability, and which of these insights can be extended to LLMs.
LINKS:
Tiny Stories paper: https://huggingface.co/papers/2305.07759
TIMESTAMPS:
(00:00) Episode Preview
(07:12) The inspiration for the Tiny Stories project
(15:07) Sponsor: Omneky
(15:44) Creating the Tiny Stories dataset
(21:27) GPT-4 vs GPT-3.5
(24:13) Did the TinyStories team try any other versions of GPT-4?
(29:23) Curriculum models and weirder curriculums
(35:34) What does reasoning mean?
(46:27) What does emergence mean?
(01:01:44) The curriculum development space
(01:11:40) The similarities between models and human development
(01:20:12) Fewer layers vs. more layers
(01:29:22) Attention heads
(01:33:40) Semantic attention head
(01:36:54) Neuron technique used in developing the TinyStories model
(01:52:20) Interpretability work that inspires Ronen and Yuanzhi
TWITTER:
@CogRev_Podcast
@EldanRonen (Ronen)
@labenz (Nathan)
@eriktorenberg (Erik)
Thank you Omneky for sponsoring The Cognitive Revolution. Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.
Music Credit: MusicLM
More show notes and reading material released in our Substack: https://cognitiverevolution.substack.com