ThursdAI - Feb 15, 2024 - OpenAI changes the Video Game, Google changes the Context game, and other AI news from past week

From ThursdAI - The top AI news from the past week



Length: 118 minutes
Released: Feb 16, 2024
Format: Podcast episode

Description

Holy SH*T! These two words were said on this episode multiple times, way more than ever before, I want to say, and it's because we got 2 incredibly exciting breaking news announcements in a very short amount of time (in the span of 3 hours). The OpenAI announcement came as we were recording the space, so you'll get to hear our live reaction to this insanity.

We also had 3 deep dives, which I am posting in this week's episode. We chatted with Yi Tay and Max Bain from Reka, which trained and released a few new foundational multimodal models this week, and with Dome and Pablo from Stability, who released a new diffusion model called Stable Cascade. Finally, I had a great time hanging with Swyx (from Latent Space), got a chance to turn the microphone back at him, and had a conversation about Swyx's background, Latent Space, and AI Engineer.

I was also very happy to be in SF today of all days, as my day is not over yet: there's still an event which we cohost together with A16Z, folks from Nous Research, Ollama and a bunch of other great folks, just look at all these logos! Open Source FTW

TL;DR of all topics covered:

* Breaking AI News
  * OpenAI releases SORA - text to video generation (Sora Blogpost with examples)
  * Google teases Gemini 1.5 with a whopping 1 MILLION tokens context window (X, Blog)
* Open Source LLMs
  * Nvidia releases Chat With RTX local models (Blog, Download)
  * Cohere open sources Aya 101 - a 12.8B model supporting 101 languages (X, HuggingFace)
  * Nomic releases Nomic Embed 1.5 + with Matryoshka embeddings (X)
* Big CO LLMs + APIs
  * Andrej Karpathy leaves OpenAI (Announcement)
  * OpenAI adds memory to ChatGPT (X)
* This week's Buzz (What I learned at WandB this week)
  * We launched a new course with Hamel Husain on enterprise model management (Course)
* Vision & Video
  * Reka releases Reka-Flash, 21B & Reka Edge MM models (Blog, Demo)
* Voice & Audio
  * WhisperKit runs on WatchOS now! (X)
* AI Art & Diffusion & 3D
  * Stability releases Stable Cascade - new AI model based on Würstchen v3 (Blog, Demo)
* Tools & Others
  * Goody2ai - A very good and aligned AI that does NOT want to break the rules (try it)

Let's start with Breaking News (in the order of how they happened)

Google teases Gemini 1.5 with a whopping 1M context window

This morning, Jeff Dean released a thread full of crazy multimodal examples of their new Gemini 1.5 model, which can handle up to 1M tokens in the context window. The closest to that model so far was Claude 2.1, and that was not multimodal. They also claim they are researching up to 10M tokens in the context window.

The thread was chock full of great examples, some of which highlighted the multimodality of this incredible model, like being able to pinpoint and give a timestamp of an exact moment in an hour-long movie, just by getting a sketch as input. This honestly blew me away. They were able to use the incredibly large context window, break down the WHOLE 1 hour movie into frames, provide additional text tokens on top of it, and the model had near perfect recall.

They used Greg Kamradt's needle-in-a-haystack analysis on text, video and audio and showed incredible, near perfect recall, which highlights how much advancement we got in the area of context windows. Just for reference, less than a year ago, we had this chart from Mosaic when they released MPT. The Y axis of that graph tops out at 60K, while the graph above goes to 1 MILLION, and we're less than a year apart. Not only that, Gemini Pro 1.5 is also multimodal.

I've got to give props to the Gemini team. This is quite a huge leap for them, and for the rest of the industry this is a significant jump in what users will expect going forward! No longer will we be told "hey, your context is too long".

A friend of the pod, Enrico Shippole, joined the stage (you may remember him from our deep dive into extending the Llama context window to 128K) and showed that a bunch of new research makes all of this possible for open source as well, so we're waiting for OSS to catch u

Titles in the series (49)

Every ThursdAI, Alex Volkov hosts a panel of experts, AI engineers, data scientists and prompt spellcasters on Twitter Spaces, as we discuss everything major and important that happened in the world of AI over the past week. Topics include LLMs, open source, new capabilities, OpenAI, competitors in the AI space, new LLM models, AI art and diffusion, and much more. sub.thursdai.news