
Llama 2: The New Open LLM SOTA (ft. Nathan Lambert, Matt Bornstein, Anton Troynikov, Russell Kaplan, Whole Mars Catalog et al.)

From Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0



Length: 80 minutes
Released: Jul 19, 2023
Format: Podcast episode

Description

As first discussed on our May Emergency pod and leaked 4 days ago, Llama (renamed from LLaMA) was upgraded to Llama 2 (pretrained on 2 trillion tokens with 2x the context length, bigger than any dataset discussed in Datasets 101, plus ~$20m worth of RLHF/preference annotation) and released for commercial use today. It immediately displaced Falcon-40B as the leading open LLM and was immediately converted/quantized to GGML and other formats. Llama 2 seems to outperform all other open source models in their equivalent weight class.

Why are open models important? The intersection of Open Source and AI is one of the oldest themes on this publication, and there has been a raging debate over the security and reliability of the OpenAI models and APIs. Users have reported GPT-4's quality going down (repeatedly denied, though as of today given some supporting data from Databricks) and have complained about API reliability and rapid deprecation schedules. Last, and surely biggest: there are entire classes of businesses and government/healthcare/military organizations that categorically cannot send any of their sensitive data to an external API provider, even if it is OpenAI through Azure.
The only way to have total control is to own and serve your own models, and Llama 2 now pushes the open state of the art forward: your own GPT-3.5-quality model, though it is nowhere near Claude 2 or GPT-4.

As we do with breaking news, we got on to Twitter Spaces again to chat with two scheduled guests:

* Nathan Lambert, ML Researcher at Huggingface and author of Interconnects, who had the best summary of the Llama 2 paper
* Matt Bornstein, organizer of the a16z infra team that launched Llama2.ai (source here), who has been coding up a storm with AI demo apps, unusual for VCs

as well as Anton Troynikov of Chroma, Russell Kaplan of Scale AI, and Omar Qazi of the Whole Mars Catalog. Enjoy!

Show Notes

* Official links
  * Website, Paper
  * GitHub (Llama 2 commit)
  * Azure Partnership
  * Use policy, Statement of Support for Open Approach
* Where to try
  * Llama2.ai (source), Perplexity Llama Chat
  * https://replicate.com/a16z-infra/llama13b-v2-chat
  * https://huggingface.co/spaces/ysharma/Explore_llamav2_with_TGI
  * Dev ports - simonw llm-replicate, ggml using llama.cpp (7B, 13B) or pinokio, ollama, Core ML port
* Timeline
  * 24 Feb - LLaMA 1 announced
  * 6 May - our No Moats podcast - first mention of Zuck opening up Llama
  * 14 July - Llama 2 leaked
  * 18 July - Llama 2 announced
* Community notes
  * Nathan's research paper recap
  * 638 LOC, 4 dependencies
  * Usage restrictions - MAU restriction, derivative models
  * Grouped Query Attention
  * System prompt
  * 2 trillion token dataset
  * >$20m price tag (rlhf, jimfan)
  * Separate models for safety and helpfulness (jimfan)
  * Mistral AI founders left out of paper

Timestamps

* [00:02:30] Introducing the speakers
* [00:03:32] Nathan Lambert intro
* [00:04:48] General Summary of Llama 2
* [00:05:57] Sarah Silverman killed Dataset Transparency?
* [00:08:48] Simon's Recap of Llama 2
* [00:11:43] Matt's Intro
* [00:12:59] a16z Infra's new AI team?
* [00:15:10] Alessio's recap of Llama 2
* [00:17:26] Datasets 101 Followup
* [00:18:14] Context Length 4k
* [00:20:35] Open-ish Source? Usage Policy and Restrictions
* [00:23:38] Huggingface Responsible AI License
* [00:24:57] Pretraining Llama 2 Base Model beyond Chinchilla
* [00:29:55] Llama 2 is incomplete? Race to publish
* [00:31:40] Come for the Llama, stay for the (Meta) drama
* [00:33:22] Language Translation
* [00:35:10] Llama2's coding abilities
* [00:35:59] Why we want to know about the training data
* [00:37:45] The importance of Meta pushing forward Truly Open AI
* [00:40:59] Llama 2 as Enabler of Startups
* [00:43:59] Where you can try Llama 2
* [00:44:25] Do you need dataset transparency if you have evals?
* [00:45:56] >$20m cost of Llama 2 is primarily preference data collection
* [00:48:59] Do we even need human annotators?
* [00:49:42] Models Rating Models
* [00:53:32] How to get Code preference data
* [00:54:34] Lla
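One of the community notes above is Grouped Query Attention, which Llama 2 uses in its larger models: several query heads share a single key/value head, shrinking the KV cache at inference time. A minimal NumPy sketch of the idea (shapes and the function name are illustrative, not Meta's implementation):

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Toy grouped-query attention.

    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d), where
    n_q_heads is a multiple of n_kv_heads. Each group of query
    heads attends against one shared K/V head.
    """
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads  # query heads per shared K/V head
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group  # index of the K/V head this query head reuses
        scores = q[h] @ k[kv].T / np.sqrt(d)
        # numerically stable softmax over the key dimension
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[h] = weights @ v[kv]
    return out
```

With `n_kv_heads == n_q_heads` this reduces to ordinary multi-head attention; with `n_kv_heads == 1` it is multi-query attention, so GQA interpolates between the two.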
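Another community note is the system prompt: the chat-tuned Llama 2 models expect their input wrapped in a specific template with `[INST]` and `<<SYS>>` markers. A sketch of a single-turn prompt builder based on that template (the helper name is ours; check the official repo for the multi-turn form):

```python
def format_llama2_prompt(system_prompt: str, user_msg: str) -> str:
    """Wrap a system prompt and one user turn in Llama 2's chat template.

    The chat models were fine-tuned with the system prompt inside a
    <<SYS>> block at the start of the first [INST] instruction.
    """
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_msg} [/INST]"
    )
```

Sending a bare string without this wrapping tends to noticeably degrade the chat models' behavior, which is why most of the hosted demos above apply it for you.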

Titles in the series (67)

The podcast by and for AI Engineers! We are the first place over 50k developers hear news and interviews about Software 3.0: Foundation Models changing every domain in Code Generation, Computer Vision, AI Agents, and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing to your first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from tiny (George Hotz), Databricks, Glean, Replit, Roboflow, MosaicML, UC Berkeley, OpenAI, and more. www.latent.space