Estimated reading time: 5 minutes

Implementing RAG with a vector database

import os
from typing import List, Tuple
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

# Set your OpenAI API key (replace with your actual API key or use a .env file)
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"  # Replace with your actual API key

def load_data(data_path: str) -> str:
    """
    Loads data from a file. Supports plain text and markdown files. For other file types,
    add appropriate loaders.

    Args:
        data_path: Path to the data file.

    Returns:
        The loaded data as a string.
    """
    try:
        with open(data_path, "r", encoding="utf-8") as f:
            data = f.read()
        return data
    except Exception as e:
        print(f"Error loading data from {data_path}: {e}")
        return ""

def chunk_data(data: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """
    Splits the data into chunks.

    Args:
        data: The data to be chunked.
        chunk_size: The size of each chunk.
        chunk_overlap: The overlap between chunks.

    Returns:
        A list of text chunks.
    """
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size, chunk_overlap=chunk_overlap
    )
    chunks = text_splitter.split_text(data)
    return chunks

def create_embeddings(chunks: List[str]) -> OpenAIEmbeddings:
    """
    Creates embeddings for the text chunks using OpenAI.

    Args:
        chunks: A list of text chunks.

    Returns:
        An OpenAIEmbeddings object.
    """
    embeddings = OpenAIEmbeddings()
    return embeddings

def create_vector_store(
    chunks: List[str], embeddings: OpenAIEmbeddings
) -> FAISS:
    """
    Creates a vector store from the text chunks and embeddings using FAISS.

    Args:
        chunks: A list of text chunks.
        embeddings: An OpenAIEmbeddings object.

    Returns:
        A FAISS vector store.
    """
    vector_store = FAISS.from_texts(chunks, embeddings)
    return vector_store

def create_rag_chain(
    vector_store: FAISS, llm: OpenAI = OpenAI(temperature=0)
) -> RetrievalQA:
    """
    Creates a RAG chain using the vector store and a language model.

    Args:
        vector_store: A FAISS vector store.
        llm: A language model (default: OpenAI with temperature=0).

    Returns:
        A RetrievalQA chain.
    """
    rag_chain = RetrievalQA.from_chain_type(
        llm=llm, chain_type="stuff", retriever=vector_store.as_retriever()
    )
    return rag_chain

def rag_query(rag_chain: RetrievalQA, query: str) -> str:
    """
    Queries the RAG chain.

    Args:
        rag_chain: A RetrievalQA chain.
        query: The query string.

    Returns:
        The answer from the RAG chain.
    """
    answer = rag_chain.run(query)
    return answer

def main(data_path: str, query: str) -> str:
    """
    Main function to run the RAG process.

    Args:
        data_path: Path to the data file.
        query: The query string.

    Returns:
        The answer to the query using RAG.
    """
    data = load_data(data_path)
    if not data:
        return "No data loaded. Please check the data path."
    chunks = chunk_data(data)
    embeddings = create_embeddings(chunks)
    vector_store = create_vector_store(chunks, embeddings)
    rag_chain = create_rag_chain(vector_store)
    answer = rag_query(rag_chain, query)
    return answer

if __name__ == "__main__":
    # Example usage
    data_path = "data/my_data.txt"  # Replace with your data file
    query = "What is the main topic of this document?"
    answer = main(data_path, query)
    print(f"Query: {query}")
    print(f"Answer: {answer}")

Explanation:

  1. Import Libraries: Imports necessary libraries, including os, typing, Langchain modules for embeddings, vector stores, text splitting, RAG chains, and LLMs.
  2. load_data(data_path):
  • Loads data from a file.
  • Supports text and markdown files. You can extend it to handle other file types.
  • Handles potential file loading errors.
  3. chunk_data(data, chunk_size, chunk_overlap):
  • Splits the input text into smaller, overlapping chunks.
  • This is crucial for handling long documents and improving retrieval accuracy.
  4. create_embeddings(chunks):
  • Creates the OpenAI embeddings model used to generate numerical representations (embeddings) of the text chunks.
  • Embeddings capture the semantic meaning of the text.
  5. create_vector_store(chunks, embeddings):
  • Creates a vector store (FAISS) to store the text chunks and their corresponding embeddings.
  • FAISS allows for efficient similarity search, which is essential for retrieval.
  6. create_rag_chain(vector_store, llm):
  • Creates a RAG chain using Langchain’s RetrievalQA class.
  • This chain combines the vector store (for retrieval) with a language model (for generation).
  • The stuff chain type is used, which passes all retrieved documents to the LLM in the prompt. Other chain types (such as map_reduce and refine) are available for different use cases.
  7. rag_query(rag_chain, query):
  • Executes a query against the RAG chain.
  • The chain retrieves relevant chunks from the vector store and uses the LLM to generate an answer based on the retrieved information (see the retrieval sketch after this list).
  8. main(data_path, query):
  • Orchestrates the entire RAG process: loads data, chunks it, creates embeddings and a vector store, creates the RAG chain, and queries it.
  9. if __name__ == "__main__":
  • Provides an example of how to use the main function.
  • Replace "data/my_data.txt" with the actual path to your data file and modify the query.
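
To see what the retrieval step returns before any generation happens, you can query the vector store directly. A minimal sketch, assuming the vector_store built by create_vector_store() above and an example query:

# Fetch the chunks most similar to the query; each result is a Document
# whose page_content holds the chunk text.
docs = vector_store.similarity_search("What is the main topic of this document?", k=3)
for i, doc in enumerate(docs, start=1):
    print(f"--- Retrieved chunk {i} ---")
    print(doc.page_content[:200])  # preview the first 200 characters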

Key Points:

  • Vector Database: A vector database (FAISS, in this example) is essential for efficient retrieval of relevant information based on semantic similarity.
  • Embeddings: Embeddings are numerical representations of text that capture its meaning. OpenAI’s embedding models are used here, but others are available.
  • Chunking: Chunking is necessary to break down large documents into smaller, more manageable pieces that can be effectively processed by the LLM.
  • RAG Chain: The RAG chain orchestrates the retrieval and generation steps, combining the capabilities of the vector store and the LLM.
  • Prompt Engineering: The retrieved information is combined with the user’s query in a prompt that is passed to the LLM. Effective prompt engineering is crucial for getting good results.
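
One way to apply prompt engineering with RetrievalQA is to pass a custom prompt through chain_type_kwargs. A minimal sketch for the "stuff" chain type, reusing the vector_store from the code above; the template wording is only an illustration:

from langchain.prompts import PromptTemplate

# {context} receives the retrieved chunks and {question} receives the user's query.
template = """Use the following context to answer the question.
If the answer is not in the context, say you don't know.

Context:
{context}

Question: {question}
Answer:"""

custom_prompt = PromptTemplate(template=template, input_variables=["context", "question"])

rag_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0),
    chain_type="stuff",
    retriever=vector_store.as_retriever(),
    chain_type_kwargs={"prompt": custom_prompt},
)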

Remember to:

  • Replace "YOUR_OPENAI_API_KEY" with your actual OpenAI API key. Consider using a .env file for secure storage of your API key (see the sketch after this list).
  • Replace “data/my_data.txt” with the path to your data file.
  • Modify the query to ask a question about your data.
  • Install the required libraries: langchain, openai, and faiss-cpu (or faiss-gpu if you have a compatible GPU): pip install langchain openai faiss-cpu
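
For the .env approach mentioned above, one common pattern uses the python-dotenv package (pip install python-dotenv). A minimal sketch:

# Contents of a .env file kept out of version control:
# OPENAI_API_KEY=sk-...

from dotenv import load_dotenv

load_dotenv()  # loads OPENAI_API_KEY from .env into the process environment
# OpenAIEmbeddings and OpenAI read the key from the environment automatically,
# so the hard-coded os.environ assignment at the top of the script can be removed.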
