Overview
In this tutorial, you'll build a complete RAG (Retrieval-Augmented Generation) pipeline from scratch. By the end, you'll have a working system that can answer questions about your own documents using an LLM grounded in real data.
Prerequisites
- Python 3.10+
- Basic familiarity with LLMs and embeddings
- An OpenAI API key (or any compatible provider)
Install the dependencies:

```shell
pip install langchain chromadb openai tiktoken
```
Step 1: Load Your Documents
LangChain provides document loaders for PDFs, text files, web pages, and more.
```python
from langchain.document_loaders import DirectoryLoader, TextLoader

# Recursively load every .txt file under ./docs
loader = DirectoryLoader('./docs', glob='**/*.txt', loader_cls=TextLoader)
documents = loader.load()
print(f'Loaded {len(documents)} documents')
```
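Conceptually, `DirectoryLoader` just walks the directory tree and hands each matching file to `TextLoader`, keeping the file path as metadata. A rough plain-Python sketch of that behavior (the `load_text_documents` helper is hypothetical, not part of LangChain):

```python
from pathlib import Path

def load_text_documents(root: str, pattern: str = "**/*.txt") -> list[dict]:
    """Walk `root`, read each file matching `pattern`, and keep the
    path as metadata -- roughly what DirectoryLoader + TextLoader do."""
    docs = []
    for path in sorted(Path(root).glob(pattern)):
        docs.append({
            "page_content": path.read_text(encoding="utf-8"),
            "metadata": {"source": str(path)},
        })
    return docs
```

The `source` metadata recorded here is what later lets you trace an answer back to the file it came from.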
Step 2: Split into Chunks
LLMs have finite context windows, and retrieval works best over short, focused passages. Split documents into overlapping chunks so that no passage is cut off mid-thought at a chunk boundary.
```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# ~500-character chunks; each chunk shares 50 characters with its neighbor
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)
print(f'Created {len(chunks)} chunks')
```
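To see what overlap buys you, here is a minimal sliding-window splitter in plain Python. This is an illustrative sketch (the `chunk_text` helper is hypothetical), not LangChain's implementation, which additionally tries to break on paragraph and sentence boundaries:

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Slide a window of `chunk_size` characters across `text`, stepping
    by chunk_size - chunk_overlap so consecutive chunks share context."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last window already reached the end of the text
    return chunks
```

Because each chunk repeats the tail of the previous one, a sentence that straddles a boundary still appears whole in at least one chunk.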
Step 3: Create Embeddings and Store in ChromaDB
```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# Embed every chunk and index the vectors in a local Chroma collection
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings, persist_directory='./chroma_db')
vectorstore.persist()  # write the index to disk so it survives restarts
```
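Under the hood, the vector store answers a query by embedding it and ranking stored chunks by vector similarity. A toy pure-Python version of that ranking, using cosine similarity (the `top_k` helper and the 2-d vectors are illustrative, not what Chroma actually stores):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Return the k chunk texts whose embeddings are most similar to the query."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

Real stores like Chroma use approximate nearest-neighbor indexes rather than a full sort, but the ranking criterion is the same idea.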
Step 4: Build the Retrieval Chain
```python
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

# temperature=0 keeps answers deterministic and closely tied to the context
llm = ChatOpenAI(model_name='gpt-4', temperature=0)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type='stuff',  # 'stuff' = put all retrieved chunks into one prompt
    retriever=vectorstore.as_retriever(search_kwargs={'k': 3}),  # top-3 chunks
)
```
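The `chain_type='stuff'` strategy is the simplest one: every retrieved chunk is concatenated ("stuffed") into a single prompt along with the question. A minimal sketch of that assembly step, with a hypothetical template (LangChain's actual prompt wording differs):

```python
def build_stuff_prompt(question: str, chunks: list[str]) -> str:
    """Concatenate all retrieved chunks into one context block, then
    append the user's question -- the essence of the 'stuff' chain type."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

This works well while the retrieved chunks fit in the model's context window; for larger result sets, LangChain offers alternatives such as `map_reduce` that summarize chunks before combining them.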
Step 5: Query Your Pipeline
```python
result = qa_chain.run('What are the main findings in the dataset?')
print(result)
```
The LLM now answers using the retrieved context from your documents rather than relying solely on its training data.
Next Steps
- Add metadata filtering for more precise retrieval
- Experiment with different chunk sizes and overlap
- Try alternative vector databases like Pinecone or Weaviate
- Add a streaming web interface with FastAPI or Gradio
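On the first of those next steps: metadata filtering means restricting the candidate set by chunk metadata (source file, date, section) before similarity ranking. Real vector stores apply the filter inside the index itself, but conceptually it is a pre-filter like this illustrative sketch (the `filter_by_metadata` helper is hypothetical):

```python
def filter_by_metadata(chunks: list[dict], **criteria) -> list[dict]:
    """Keep only chunks whose metadata matches every given key/value pair.
    Vector databases such as Chroma or Pinecone perform the equivalent
    filtering inside the index before ranking by similarity."""
    return [
        chunk for chunk in chunks
        if all(chunk["metadata"].get(key) == value for key, value in criteria.items())
    ]
```

Filtering first both sharpens relevance and keeps unrelated documents from ever reaching the LLM's prompt.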