Monday, September 2, 2024
Large Language Models (LLMs) have revolutionized how we build AI applications. In this guide, we'll explore how to combine OpenAI's powerful models with vector databases and LangChain to create sophisticated AI applications. We'll cover everything from basic setup to advanced patterns for production deployment.
First, let's install the required dependencies:
pip install openai langchain langchain-openai langchain-community chromadb python-dotenv ratelimit prometheus-client
Next, let's set up the OpenAI integration, loading the API key from the environment:
from dotenv import load_dotenv
import os
from openai import OpenAI
load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
def get_completion(prompt: str, model: str = "gpt-3.5-turbo") -> str:
    # Send the prompt as a single user message and return the reply text
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0.7
    )
    return response.choices[0].message.content
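Because `os.getenv` returns `None` when a variable is missing, a misconfigured environment only shows up once the first request fails. The sketch below is one way to fail fast instead. `require_env` is a hypothetical helper name for this tutorial, not part of the OpenAI SDK or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for this tutorial, not a library API.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Missing environment variable {name!r}; add it to your .env file."
        )
    return value
```

With this in place, the client could be created as `OpenAI(api_key=require_env("OPENAI_API_KEY"))`, so a missing key is reported at startup.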
We'll use Chroma as our vector store. Here's how to set it up:
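import chromadb

# A persistent client stores the index on disk under ./db (Chroma 0.4+ API).
# It gets its own name so it does not overwrite the OpenAI `client` above.
chroma_client = chromadb.PersistentClient(path="db")
collection = chroma_client.get_or_create_collection(
    name="documents",
    metadata={"hnsw:space": "cosine"}  # rank matches by cosine distance
)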
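The `"hnsw:space": "cosine"` setting selects cosine distance as the similarity metric, where smaller values mean closer matches. The following plain-Python sketch illustrates the metric itself. It is not Chroma's implementation, and `cosine_distance` is a name chosen here for illustration.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance = 1 - cosine similarity (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)
```

The metric ignores vector length: `[1, 0]` and `[2, 0]` have distance 0, while orthogonal vectors have distance 1.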
LangChain helps orchestrate the interaction between OpenAI and our vector store:
# Recent LangChain releases ship these integrations in separate packages:
# pip install langchain-openai langchain-community
from langchain_openai import OpenAIEmbeddings
from langchain_openai import OpenAI  # LangChain's LLM wrapper, not the openai SDK client class
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import CharacterTextSplitter
from langchain.chains import RetrievalQA

embeddings = OpenAIEmbeddings()
text_splitter = CharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)

def create_knowledge_base(documents):
    # Split documents into overlapping chunks, then embed and persist them
    texts = text_splitter.split_documents(documents)
    vectorstore = Chroma.from_documents(
        documents=texts,
        embedding=embeddings,
        persist_directory="db"
    )
    return vectorstore
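To build intuition for `chunk_size` and `chunk_overlap`, here is a simplified sliding-window splitter in plain Python. It is only an illustration of how overlap works: LangChain's `CharacterTextSplitter` splits on a separator first and then merges pieces, so its chunk boundaries will differ. `chunk_text` is a hypothetical name.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size windows that overlap by `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With the defaults, each window advances 800 characters, so the last 200 characters of one chunk reappear at the start of the next. That overlap keeps sentences that straddle a boundary retrievable from at least one chunk.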
Let's combine these components to create a question-answering system:
def create_qa_chain(vectorstore):
    qa_chain = RetrievalQA.from_chain_type(
        llm=OpenAI(),  # the LangChain LLM wrapper imported above
        chain_type="stuff",  # pass all retrieved chunks to the model in one prompt
        retriever=vectorstore.as_retriever(),
        return_source_documents=True
    )
    return qa_chain

def query_documents(qa_chain, query: str):
    # invoke() replaces the deprecated qa_chain({...}) call style
    response = qa_chain.invoke({"query": query})
    return {
        "answer": response["result"],
        "sources": [doc.page_content for doc in response["source_documents"]]
    }
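The `"stuff"` chain type places every retrieved chunk into a single prompt. The sketch below shows the idea in plain Python. The template wording is an assumption made for illustration, not LangChain's actual prompt, and `build_stuffed_prompt` is a hypothetical name.

```python
from typing import List


def build_stuffed_prompt(question: str, documents: List[str]) -> str:
    """Concatenate all retrieved documents into one prompt, "stuff"-style."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Because everything lands in one prompt, the combined chunks must fit in the model's context window. Retrieving fewer chunks, for example with `vectorstore.as_retriever(search_kwargs={"k": 4})`, is one way to keep the prompt small.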
Always implement robust error handling:
from typing import Optional, Dict, Any
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def safe_query(qa_chain, query: str) -> Optional[Dict[str, Any]]:
    try:
        return query_documents(qa_chain, query)
    except Exception:
        # logger.exception records the message together with the traceback
        logger.exception("Error processing query")
        return None
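`safe_query` gives up after the first failure, yet transient problems such as timeouts often succeed on a second attempt. A minimal sketch of retrying with exponential backoff follows. `retry_with_backoff` and its parameters are hypothetical names, and catching every `Exception` is a simplification; in practice you would likely retry only the transient error types your client raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on any exception with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

A call could then look like `retry_with_backoff(lambda: query_documents(qa_chain, query))`. The `sleep` parameter exists so tests can replace the real delay.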
Implement rate limiting and caching to optimize API usage:
from functools import lru_cache
from ratelimit import limits, sleep_and_retry

ONE_MINUTE = 60
MAX_CALLS_PER_MINUTE = 60

# The cache is the outermost decorator so that cache hits return immediately
# and only real API calls count against the rate limit.
@lru_cache(maxsize=1000)
@sleep_and_retry
@limits(calls=MAX_CALLS_PER_MINUTE, period=ONE_MINUTE)
def cached_completion(prompt: str) -> str:
    return get_completion(prompt)
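To see what the cache saves, the sketch below swaps the real API call for a stub so it runs without an API key. `fake_completion` and `cached_fake_completion` are hypothetical stand-ins for `get_completion` and `cached_completion`.

```python
from functools import lru_cache
from typing import List

seen_prompts: List[str] = []  # records every call that reaches the (stubbed) model


def fake_completion(prompt: str) -> str:
    """Stand-in for get_completion so the example runs offline."""
    seen_prompts.append(prompt)
    return f"echo: {prompt}"


@lru_cache(maxsize=1000)
def cached_fake_completion(prompt: str) -> str:
    return fake_completion(prompt)


cached_fake_completion("hello")
cached_fake_completion("hello")  # served from the cache
cached_fake_completion("world")
print(len(seen_prompts))                    # 2 underlying calls for 3 requests
print(cached_fake_completion.cache_info())  # hits=1, misses=2
```

Two caveats apply: `lru_cache` only matches identical prompt strings and lives in a single process, and with `temperature=0.7` it pins whichever answer was sampled first for a given prompt.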
Use environment variables for configuration:
# .env
OPENAI_API_KEY=your-api-key
EMBEDDING_MODEL=text-embedding-ada-002
COMPLETION_MODEL=gpt-3.5-turbo
MAX_TOKENS=500
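Values read from the environment are always strings, so numeric settings such as `MAX_TOKENS` need an explicit conversion. Here is one possible way to collect the settings above into a typed object. `AppConfig` and `load_config` are hypothetical names, and the defaults simply mirror the sample `.env` file.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    embedding_model: str
    completion_model: str
    max_tokens: int


def load_config() -> AppConfig:
    """Read settings from the environment, falling back to the sample values."""
    return AppConfig(
        embedding_model=os.getenv("EMBEDDING_MODEL", "text-embedding-ada-002"),
        completion_model=os.getenv("COMPLETION_MODEL", "gpt-3.5-turbo"),
        max_tokens=int(os.getenv("MAX_TOKENS", "500")),  # env values are strings
    )
```

Call `load_dotenv()` before `load_config()` so the values from `.env` are present in the process environment.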
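Process independent queries concurrently with asyncio:
import asyncio
from typing import List

async def async_process_queries(qa_chain, queries: List[str]):
    # safe_query is synchronous, so run each call in a worker thread
    tasks = [asyncio.to_thread(safe_query, qa_chain, query) for query in queries]
    return await asyncio.gather(*tasks)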
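Gathering every query at once can trip API rate limits when the list is long. The sketch below caps the number of in-flight calls with a semaphore. `gather_limited` and `max_concurrency` are hypothetical names, and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
from typing import Callable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def gather_limited(
    func: Callable[[T], R], items: List[T], max_concurrency: int = 5
) -> List[R]:
    """Run a blocking `func` over `items`, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item: T) -> R:
        async with semaphore:  # wait here while too many calls are in flight
            return await asyncio.to_thread(func, item)

    # gather returns results in the same order as `items`
    return await asyncio.gather(*(run_one(item) for item in items))
```

For the question-answering chain this might be called as `asyncio.run(gather_limited(lambda q: safe_query(qa_chain, q), queries))`.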
Expose basic metrics with Prometheus to monitor latency and throughput:
import prometheus_client as prom

query_latency = prom.Histogram('query_latency_seconds', 'Time spent processing queries')
query_counter = prom.Counter('queries_total', 'Total number of queries processed')
Track token usage to keep an eye on cost:
def estimate_tokens(text: str) -> int:
    # Rough estimate: about 1.3 tokens per whitespace-separated word
    return int(len(text.split()) * 1.3)

def track_usage(prompt: str, response: str):
    prompt_tokens = estimate_tokens(prompt)
    response_tokens = estimate_tokens(response)
    logger.info(f"Usage - Prompt: {prompt_tokens}, Response: {response_tokens}")
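Logging each call is useful, but a running total is often easier to reason about. The sketch below accumulates the same rough estimate across calls. `UsageTracker` is a hypothetical class, and `estimate_tokens` is restated so the block runs on its own.

```python
from dataclasses import dataclass


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 1.3 tokens per whitespace-separated word."""
    return int(len(text.split()) * 1.3)


@dataclass
class UsageTracker:
    """Accumulates estimated token counts across calls."""

    prompt_tokens: int = 0
    response_tokens: int = 0

    def record(self, prompt: str, response: str) -> None:
        self.prompt_tokens += estimate_tokens(prompt)
        self.response_tokens += estimate_tokens(response)

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.response_tokens
```

The word-count heuristic is only an approximation. The chat completions response also reports exact counts in `response.usage`, which is the better source when it is available.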
Building LLM applications requires careful consideration of various components and their integration. By following these patterns and best practices, you can create robust, production-ready applications that leverage the power of OpenAI's models while maintaining scalability and reliability.
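Remember to handle errors, rate-limit and cache API calls, keep configuration in environment variables, and monitor latency and token usage.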
The complete code for this tutorial is available on GitHub.