42 min listen
A Deep Dive Into Generative's Newest Models: Gemini vs Mistral (Mixtral-8x7B)–Part I
From Deep Papers
Length:
48 minutes
Released:
Dec 27, 2023
Format:
Podcast episode
Description
For the last paper read of the year, Arize CPO & Co-Founder, Aparna Dhinakaran, is joined by Dat Ngo (ML Solutions Architect) and Aman Khan (Product Manager) for an exploration of the new kids on the block: Gemini and Mixtral-8x7B. There's a lot to cover, so this week's paper read is Part I in a series about Mixtral and Gemini. In Part I, we provide some background and context for Mixtral 8x7B from Mistral AI, a high-quality sparse mixture of experts (SMoE) model that outperforms Llama 2 70B on most benchmarks with 6x faster inference. Mixtral also matches or outperforms GPT-3.5 on most benchmarks. This open-source model was optimized through supervised fine-tuning and direct preference optimization. Stay tuned for Part II in January, where we'll build on this conversation and discuss Gemini, developed by teams at DeepMind and Google Research. Link to transcript and live recording: https://arize.com/blog/a-deep-dive-into-generatives-newest-models-mistral-mixtral-8x7b/ To learn more about ML observability, join the Arize AI Slack community or get the latest on our LinkedIn and Twitter.
Titles in the series (22)
Hungry Hungry Hippos - H3 by Deep Papers