E33: The Tiny Model Revolution with Ronen Eldan and Yuanzhi Li of Microsoft Research

From "The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis


Length: 121 minutes
Released: Jun 6, 2023
Format: Podcast episode

Description

Nathan Labenz sits down with Ronen Eldan and Yuanzhi Li of Microsoft Research to discuss TinyStories, the small natural language dataset they created. TinyStories is designed to reflect the full richness of natural language while remaining small enough to support research on modest compute budgets. Using this dataset, they began to explore aspects of language model performance, behavior, and mechanism by training a series of models that range in size from just 1 million to a maximum of 33 million parameters, which is still only about 2% of the scale of GPT-2. In this conversation, Nathan, Ronen, and Yuanzhi touch on LM reasoning, emergence, interpretability, and what understanding can be extended to LLMs.

LINKS:
TinyStories paper: https://huggingface.co/papers/2305.07759

TIMESTAMPS:
(00:00) Episode Preview
(07:12) The inspiration for the Tiny Stories project
(15:07) Sponsor: Omneky
(15:44) Creating the Tiny Stories dataset
(21:27) GPT-4 vs GPT-3.5
(24:13) Did the TinyStories team try any other versions of GPT-4?
(29:23) Curriculum models and weirder curriculums
(35:34) What does reasoning mean?
(46:27) What does emergence mean?
(01:01:44) The curriculum development space
(01:11:40) The similarities between models and human development
(01:20:12) Fewer layers vs. more layers
(01:29:22) Attention heads
(01:33:40) Semantic attention head
(01:36:54) Neuron technique used in developing the TinyStories model
(01:52:20) Interpretability work that inspires Ronen and Yuanzhi

TWITTER:
@CogRev_Podcast
@EldanRonen (Ronen)
@labenz (Nathan)
@eriktorenberg (Erik)

Thank you Omneky for sponsoring The Cognitive Revolution. Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.

Music Credit: MusicLM
More show notes and reading material released in our Substack: https://cognitiverevolution.substack.com

A weekly podcast where hosts Erik Torenberg and Nathan Labenz interview the builders on the edge of AI and explore the dramatic shift it will unlock in the coming years.