
The Future of the Transformer Part 2 with Trey Kollmer

From "The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Length: 65 minutes
Released: Oct 20, 2023
Format: Podcast episode

Description

Trey Kollmer returns to discuss the latest AI research revelations with Nathan Labenz. They explore how new techniques will shave 10% off global compute needs, how analogical prompting beats few-shot prompting, and how compressive historical records can increase LLM memory and retention abilities. If you need an ERP platform, check out our sponsor NetSuite: http://netsuite.com/cognitive.

SPONSORS: NetSuite | Omneky
NetSuite has 25 years of experience providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform, head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist.
Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with the click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.

LINKS:
The show outline: https://docs.google.com/document/d/1oiSu9X4EVNMf90aRnk4mrfogSmq3QUsRg4GtI95mMCw/edit
Think Before You Speak: https://browse.arxiv.org/pdf/2310.02226.pdf
SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking: https://arxiv.org/pdf/2306.05426.pdf
StreamingLLMs: https://arxiv.org/abs/2309.17453
Large Language Models as Analogical Reasoners: https://arxiv.org/abs/2310.01714
Ring Attention: https://arxiv.org/abs/2310.01889

TIMESTAMPS:
(00:00:00) - Episode Preview
(00:01:11) - Paper: Think Before You Speak
(00:03:13) - Multimodal models for combining vision and language
(00:04:19) - Backspace Paper
(00:06:25) - Chain of thought prompting for step-by-step reasoning
(00:09:14) - Backspacing in language models to correct mistakes
(00:12:05) - Attention sinks for expanding context length
(00:12:41) - Paper: Large Language Models as Analogical Reasoners
(00:15:24) - Pause tokens for language models to "think"
(00:18:23) - Analogical prompting to recall relevant examples
(00:20:52) - Long context windows for language models
(00:23:20) - Markdown works best for OpenAI
(00:24:23) - Ring attention to break memory constraints
(00:26:15) - Paper: StreamingLLMs
(00:27:46) - Potential for superhuman performance with longer contexts
(00:31:01) - Dynamic context window adjustment at runtime
(00:33:53) - Retention and memory capabilities for transformers
(00:37:12) - Planning algorithms combined with memory and scale
(00:39:49) - Paper: Ring Attention
(00:42:35) - Executive assistant prompting and critique
(00:45:23) - Self-RAG for language models to find own examples
(00:48:02) - Timelines and predictions for future capabilities
(00:50:37) - Applications like analyzing long texts and scripts
(00:53:15) - Local versus global attention in transformers
(00:55:59) - Architectural changes versus just training adjustments
(00:58:41) - Pre-training strategies like random start points

A weekly podcast where hosts Erik Torenberg and Nathan Labenz interview the builders on the edge of AI and explore the dramatic shift it will unlock in the coming years.