RAG Is A Hack - with Jerry Liu from LlamaIndex

FromLatent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0

Start listening View podcast show

RAG Is A Hack - with Jerry Liu from LlamaIndex

FromLatent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0

ratings:

Length:

68 minutes

Released:

Oct 5, 2023

Format:

Podcast episode

Description

In October 2022, Robust Intelligence hosted an internal hackathon to play around with LLMs which led to the creation of two of the most important AI Engineering tools: LangChain ?⛓️ (our interview with Harrison here) and LlamaIndex ? by Jerry Liu, which we’ll cover today. In less than a year, LlamaIndex has crossed 600,000 monthly downloads, raised $8.5M from Greylock, has a fast growing open source community that contributes to LlamaHub, and it doesn’t seem to be slowing down.LlamaIndex’s Origin (aka GPT Tree Index)Jerry struggled to make large amounts of data work with GPT-3 (which had a 4,096 tokens context window). Today LlamaIndex is at the forefront of the RAG wave (Retrieval Augmented Generation), but in the beginning Jerry wasn’t focused on embeddings and search, but rather on understanding how models could summarize, link, and reason about data. On November 5th, Jerry pushed the first version to Github under the name “GPT Tree Index”: The GPT Tree Index first takes in a large dataset of unprocessed text data as input. It then builds up a tree-index in a bottom-up fashion; each parent node is able to summarize the children nodes using a general summarization prompt; each intermediate node containing summary text summarizing the components below. Once the index is built, it can be saved to disk and loaded for future use.Then, say the user wants to use GPT-3 to answer a question. Using a query prompt template, GPT-3 will be able to recursively perform tree traversal in a top-down fashion in order to answer a question. For example, in the very beginning GPT-3 is tasked with selecting between *n* top-level nodes which best answers a provided query, by outputting a number as a multiple-choice problem. The GPT Tree Index then uses the number to select the corresponding node, and the process repeats recursively among the children nodes until a leaf node is reached.[…]How is this better than an embeddings-based approach / other state-of-the-art QA and retrieval methods?The intent is not to compete against existing methods. A simpler embedding-based technique could be to just encode each chunk as an embedding and do a simple question-document embedding look-up to retrieve the result. This project is a simple exercise to test how GPT can organize and lookup information.The project attracted a lot of attention early on (the announcement tweet has ~330 likes), but it wasn’t until ~February 2023 that the open source community really started to explode, which was around the same time that LlamaHub was released. LlamaHub made it easy for developers to import data from Google Drive, Discord, Slack, databases, and more into their LlamaIndex projects. What is LlamaIndex? As we mentioned, LlamaIndex is leading the charge in the development of the RAG stack. RAG boils down to two parts:* Indexing (i.e. how do you load and index the data in your knowledge base)* Querying (i.e. how do you surface the data and fit it in the model context) IndexingTo get your data from all your sources to your RAG knowledge base, you can leverage a few tools: * Documents / Nodes: A Document is a generic container around any data source - for instance, a PDF, an API output, or retrieved data from a database. A Node is the atomic unit of data in LlamaIndex and represents a “chunk” of a source Document (i.e. one Document has many Node) as well as its relationship to other Node objects.* Data Connectors: A data connector ingest data from different sources and turn them into Document representations (text and simple metadata). These connectors are offered through LlamaHub, and there are over 200 of them today.* Data Indexes: Once you’ve ingested your data, LlamaIndex will help you index the data into a format that’s easy to retrieve. There are many types of indexes (Summary, Tree, Vector, etc). Under the hood, LlamaIndex parses the raw documents into intermediate representations, calculates vector embeddings, and infers metadata. The most commonly used index is t

Released:

Oct 5, 2023

Format:

Podcast episode

Titles in the series (67)

The podcast by and for AI Engineers! We are the first place over 50k developers hear news and interviews about Software 3.0 - Foundation Models changing every domain in Code Generation, Computer Vision, AI Agents, and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. Striving to give you both the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from tiny (George Hotz), Databricks, Glean, Replit, Roboflow, MosaicML, UC Berkeley, OpenAI, and more. www.latent.space

Skip carousel

More Episodes from Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0

Skip carousel

Related podcast episodes

Skip carousel

Discover this podcast and so much more

RAG Is A Hack - with Jerry Liu from LlamaIndex

RAG Is A Hack - with Jerry Liu from LlamaIndex

Description

Titles in the series (67)

More Episodes from Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0

Related podcast episodes