ThursdAI Apr 4 - Weave, CMD R+, SWE-Agent, Everyone supports Tool Use + JAMBA deep dive with AI21

From ThursdAI - The top AI news from the past week


Length:
110 minutes
Released:
Apr 5, 2024
Format:
Podcast episode

Description

Happy first ThursdAI of April folks, did you have fun on April Fools? I hope you did. I made a poll on my feed and 70% did not participate in April Fools, which makes me a bit sad!

Well, all right, time to dive into the news of this week, and of course there is TONS of news, but I want to start with our own breaking news! That's right, we at Weights & Biases have breaking news of our own: today we launched our new product, called Weave! Weave is our new toolkit to track, version and evaluate LLM apps, so from now on, we have Models (what you probably know as Weights & Biases) and Weave. So if you're writing any kind of RAG system, or anything that uses Claude or OpenAI, Weave is for you!

I'll be focusing on Weave and I'll be sharing more on the topic, but today I encourage you to listen to the launch conversation I had with Tim & Scott from the Weave team here at WandB, as they and the rest of the team worked their ass off for this release and we want to celebrate the launch.

TL;DR of all topics covered:

* Open Source LLMs
  * Cohere - CommandR PLUS - 104B RAG optimized Sonnet competitor (Announcement, HF)
  * Princeton SWE-agent - OSS Devin - gets 12.29% on SWE-bench (Announcement, Github)
  * Jamba paper is out (Paper)
  * Mozilla LLamaFile now goes 5x faster on CPUs (Announcement, Blog)
  * DeepMind - Mixture of Depths paper (Thread, ArXiv)
* Big CO LLMs + APIs
  * Cloudflare AI updates (Blog)
  * Anthropic adds function calling support (Announcement, Docs)
  * Groq lands function calling (Announcement, Docs)
  * OpenAI is now open to customers without login requirements
  * Replit Code Repair - 7B finetune of DeepSeek that outperforms Opus (X)
  * Google announced Gemini Prices + Logan joins (X)
* This week's Buzz - oh so much BUZZ!
  * Weave launch! Check Weave out! (Weave Docs, Github)
  * Sign up with Promo Code THURSDAI at fullyconnected.com
* Voice & Audio
  * OpenAI Voice Engine will not be released to developers (Blog)
  * Stable Audio v2 dropped (Announcement, Try here)
  * Lightning Whisper MLX - 10x faster than whisper.cpp (Announcement, Github)
* AI Art & Diffusion & 3D
  * DALL-E now has in-painting (Announcement)
* Deep dive
  * Jamba deep dive with Roi Cohen from AI21 and Maxime Labonne

Open Source LLMs

Cohere releases Command R+, 104B RAG focused model (Blog)

Cohere surprised us, and just 2.5 weeks after releasing Command-R (which became very popular and is No. 10 on the LMSYS arena) gave us its big brother, Command R PLUS.

With 128K tokens in the context window, this model is multilingual as well, supporting 10 languages, and its tokenizer is even more efficient for those languages (a first!).

The main focus from Cohere is advanced function calling / tool use, and RAG of course, and this model specializes in those tasks, beating even GPT-4 Turbo.

It's clear that Cohere is positioning itself as the RAG leader, as evidenced by the accompanying tutorial on getting started with RAG apps, and this model further solidifies their place as the experts in this field. Congrats folks, and thanks for the open weights!

SWE-Agent from Princeton

Folks, remember Devin? The agent with a nice UI, born of a super cracked team, that got 13% on SWE-bench, a very hard (for LLMs) benchmark that requires solving real-world issues?

Well, now we have an open source agent that comes very, very close to that, called SWE-agent.

SWE-agent has a dedicated terminal and tools, and utilizes something called ACI (Agent Computer Interface), which allows the agent to navigate, search, and edit code. The dedicated terminal in a Docker environment really helps, as evidenced by a massive 12.3% score on SWE-bench, where GPT-4 gets only 1.4%!

Worth mentioning that SWE-bench is a very hard benchmark that was created by the folks who released SWE-agent, and here are some videos of them showing the agent off. This is truly an impressive achievement!

DeepMind publishes Mixture of Depths (arXiv)

Thanks to Hassan, who read the paper and wrote a deep dive. This paper by DeepMind shows their research into optimizing model inference. Apparently there's

Titles in the series (49)

Every ThursdAI, Alex Volkov hosts a panel of experts, AI engineers, data scientists and prompt spellcasters on Twitter Spaces to discuss everything major and important that happened in the world of AI over the past week. Topics include LLMs, open source, new capabilities, OpenAI, competitors in the AI space, new LLMs, AI art and diffusion, and much more. sub.thursdai.news