RAG Chatbot Full Guide#

1. Introduction & Overview#

This guide will walk you through building a state-of-the-art Retrieval-Augmented Generation (RAG) chatbot using Jac Cloud, Jac-Streamlit, LangChain, ChromaDB, and modern LLMs. You'll learn to:

  • Upload and index your own documents (PDFs)
  • Chat with an AI assistant that uses both your documents and LLMs
  • Add advanced dialogue routing for smarter conversations

2. Features & Architecture#

  • Document Upload & Ingestion: Upload PDFs, which are processed and indexed for semantic search.
  • Retrieval-Augmented Generation: Combines LLMs with document retrieval for context-aware answers.
  • Web Search Integration: Optionally augments responses with real-time web search results.
  • Streamlit Frontend: User-friendly chat interface.
  • Dialogue Routing: Classifies queries and routes them to the best model (RAG or QA).
  • Session Management: Maintains chat history and user sessions.

Project Structure:

  • client.jac: Streamlit frontend for chat and document upload
  • server.jac: Jac Cloud API server, session, LLM, and web search logic
  • rag.jac: RAG engine for document loading, splitting, embedding, and vector search
  • docs/: Example PDFs for testing

3. Full Source Code#

client.jac:

import streamlit as st;
import requests;
import base64;


def bootstrap_frontend(token: str) {
    st.set_page_config(layout="wide");
    st.title("Welcome to your RAG Chatbot!");
    # Initialize chat history
    if "messages" not in st.session_state {
        st.session_state.messages = [];
    }

    uploaded_file = st.file_uploader('Upload PDF');
    if uploaded_file {
        file_b64 = base64.b64encode(uploaded_file.read()).decode('utf-8');
        response = requests.post(
            "http://localhost:8000/walker/upload_pdf",
            json={"file_name": uploaded_file.name, "file_data": file_b64},
            headers={"Authorization": f"Bearer {token}"}
        );
        if response.status_code == 200 {
            st.success(f"Uploaded {uploaded_file.name}");
        } else {
            st.error(f"Failed to upload {uploaded_file.name}");
        }
    }

    if prompt := st.chat_input("What is up?") {
        # Add user message to chat history
        st.session_state.messages.append({"role": "user", "content": prompt});

        # Display user message in chat message container
        with st.chat_message("user") {
            st.markdown(prompt);
        }
        # Display assistant response in chat message container
        with st.chat_message("assistant") {

            # Call walker API
            response = requests.post(
                "http://localhost:8000/walker/interact",
                json={"message": prompt, "session_id": "123"},
                headers={"Authorization": f"Bearer {token}"}
            );

            if response.status_code == 200 {
                response = response.json();
                print("response is",response);
                st.write(response["reports"][0]["response"]);

                # Add assistant response to chat history
                st.session_state.messages.append({"role": "assistant", "content": response["reports"][0]["response"]});
            }
        }
    }
}

with entry {

    INSTANCE_URL = "http://localhost:8000";
    TEST_USER_EMAIL = "test@mail.com";
    TEST_USER_PASSWORD = "password";

    response = requests.post(
        f"{INSTANCE_URL}/user/login",
        json={"email": TEST_USER_EMAIL, "password": TEST_USER_PASSWORD}
    );

    if response.status_code != 200 {
        # Try registering the user if login fails
        response = requests.post(
            f"{INSTANCE_URL}/user/register",
            json={
                "email": TEST_USER_EMAIL,
                "password": TEST_USER_PASSWORD
            }
        );
        assert response.status_code == 201;

        response = requests.post(
            f"{INSTANCE_URL}/user/login",
            json={"email": TEST_USER_EMAIL, "password": TEST_USER_PASSWORD}
        );
        assert response.status_code == 200;
    }

    token = response.json()["token"];

    print("Token:", token);

    bootstrap_frontend(token);
}

rag.jac:

import os;
import from langchain_community.document_loaders {PyPDFDirectoryLoader, PyPDFLoader}
import from langchain_text_splitters {RecursiveCharacterTextSplitter}
import from langchain.schema.document {Document}
import from langchain_openai {OpenAIEmbeddings}
import from langchain_community.vectorstores.chroma {Chroma}


obj RagEngine {
    has file_path: str = "docs";
    has chroma_path: str = "chroma";

    def postinit {
        if not os.path.exists(self.file_path) {
            os.makedirs(self.file_path);
        }
        documents: list = self.load_documents();
        chunks: list = self.split_documents(documents);
        self.add_to_chroma(chunks);
        print("Documents loaded from", self.file_path);
    }

    def load_documents {
        document_loader = PyPDFDirectoryLoader(self.file_path);
        return document_loader.load();
    }

    def load_document(file_path: str) {
        loader = PyPDFLoader(file_path);
        return loader.load();
    }

    def add_file(file_path: str) {
        documents = self.load_document(file_path);
        chunks = self.split_documents(documents);
        self.add_to_chroma(chunks);
    }

    def split_documents(documents: list[Document]) {
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=800,
            chunk_overlap=80,
            length_function=len,
            is_separator_regex=False
        );
        return text_splitter.split_documents(documents);
    }

    def get_embedding_function {
        embeddings = OpenAIEmbeddings();
        return embeddings;
    }

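    # Assign each chunk a deterministic ID of the form
    # <source-path>:<page>:<chunk-index> (e.g. docs/example.pdf:3:1),
    # so re-ingesting the same file does not create duplicates in Chroma.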
    def add_chunk_id(chunks: list[Document]) {
        last_page_id = None;
        current_chunk_index = 0;

        for chunk in chunks {
            source = chunk.metadata.get('source');
            page = chunk.metadata.get('page');
            current_page_id = f'{source}:{page}';

            if current_page_id == last_page_id {
                current_chunk_index += 1;
            } else {
                current_chunk_index = 0;
            }

            chunk_id = f'{current_page_id}:{current_chunk_index}';
            last_page_id = current_page_id;

            chunk.metadata['id'] = chunk_id;
        }

        return chunks;
    }

    def add_to_chroma(chunks: list[Document]) {
        db = Chroma(persist_directory=self.chroma_path, embedding_function=self.get_embedding_function());
        chunks_with_ids = self.add_chunk_id(chunks);

        existing_items = db.get(include=[]);
        existing_ids = set(existing_items['ids']);

        new_chunks = [];
        for chunk in chunks_with_ids {
            if chunk.metadata['id'] not in existing_ids {
                new_chunks.append(chunk);
            }
        }

        if len(new_chunks) {
            print('adding new documents');
            new_chunk_ids = [chunk.metadata['id'] for chunk in new_chunks];
            db.add_documents(new_chunks, ids=new_chunk_ids);
        } else {
            print('no new documents to add');
        }
    }

    def get_from_chroma(query: str, chunk_nos: int=5) {
        db = Chroma(
            persist_directory=self.chroma_path,
            embedding_function=self.get_embedding_function()
        );
        results = db.similarity_search_with_score(query, k=chunk_nos);
        return results;
    }
}

server.jac:

import from mtllm.llms {OpenAI}
import from rag {RagEngine}
import os;
import base64;
import requests;

glob rag_engine: RagEngine = RagEngine();

glob llm = OpenAI(model_name='gpt-4o');

glob SERPER_API_KEY: str = os.getenv('SERPER_API_KEY', '');

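# Thin wrapper around the Serper.dev search API: posts the query and
# summarizes the top three organic results (title, link, snippet).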
obj WebSearch {
    has api_key: str = SERPER_API_KEY;
    has base_url: str = "https://google.serper.dev/search";

    def search(query: str) {
        headers = {"X-API-KEY": self.api_key, "Content-Type": "application/json"};
        payload = {"q": query};
        resp = requests.post(self.base_url, headers=headers, json=payload);
        if resp.status_code == 200 {
            data = resp.json();
            summary = "";
            results = data.get("organic", []) if isinstance(data, dict) else [];
            for r in results[:3] {
                summary += f"{r.get('title', '')}: {r.get('link', '')}\n";
                if r.get('snippet') {
                    summary += f"{r['snippet']}\n";
                }
            }
            return summary;
        }
        return f"Serper request failed: {resp.status_code}";
    }
}

glob web_search: WebSearch = WebSearch();

node Session {
    has id: str;
    has chat_history: list[dict];
    has status: int = 1;

    def respond(message: str, chat_history: list[dict], agent_role: str, context: dict) -> str by llm();
}


walker interact {
    has message: str;
    has session_id: str;

    can init_session with `root entry {
        visit [-->](`?Session)(?id == self.session_id) else {
            session_node = here ++> Session(id=self.session_id, chat_history=[], status=1);
            print("Session Node Created");

            visit session_node;
        }
    }

    can chat with Session entry {
        here.chat_history.append({"role": "user", "content": self.message});
        docs = rag_engine.get_from_chroma(query=self.message);
        web = web_search.search(query=self.message);
        context = {"docs": docs, "web": web};
        response = here.respond(
            message=self.message,
            chat_history=here.chat_history,
            agent_role="You are a conversation agent designed to help users with their queries based on the documents provided and web search results",
            context=context
        );

        here.chat_history.append({"role": "assistant", "content": response});

        report {"response": response};
    }
}

walker upload_pdf {
    has file_name: str;
    has file_data: str;

    can save_doc with `root entry {
        if not os.path.exists(rag_engine.file_path) {
            os.makedirs(rag_engine.file_path);
        }
        file_path = os.path.join(rag_engine.file_path, self.file_name);
        data = base64.b64decode(self.file_data.encode('utf-8'));
        with open(file_path, 'wb') as f {
            f.write(data);
        }
        rag_engine.add_file(file_path);
        report {"status": "uploaded"};
    }
}

4. Setup Instructions#

  1. Install dependencies (Python 3.12+ recommended):
    pip install jaclang jac-cloud jac-streamlit mtllm langchain-openai langchain-community chromadb pypdf
    
  2. Set environment variables (for LLMs and web search):
    export OPENAI_API_KEY=<your-openai-key>
    export SERPER_API_KEY=<your-serper-key>
    
    Get a free Serper API key at serper.dev.
  3. Start the Jac Chatbot server:
    jac serve server.jac
    
  4. Run the Streamlit frontend:
    jac run client.jac
    
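Once both processes are running, you can sanity-check the server before opening the UI (a minimal sketch; assumes the default port 8000):

import requests;

with entry {
    # The auto-generated OpenAPI docs page should return HTTP 200.
    print(requests.get("http://localhost:8000/docs").status_code);
}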

5. Streamlit Frontend (client.jac)#

The frontend is built with Jac-Streamlit and handles authentication, PDF upload, and chat. Here’s how it works:

Authentication and Token Handling:

response = requests.post(
    f"{INSTANCE_URL}/user/login",
    json={"email": TEST_USER_EMAIL, "password": TEST_USER_PASSWORD}
);
if response.status_code != 200 {
    # Try registering the user if login fails
    response = requests.post(
        f"{INSTANCE_URL}/user/register",
        json={"email": TEST_USER_EMAIL, "password": TEST_USER_PASSWORD}
    );
    ...
}
token = response.json()["token"];
- The app tries to log in a test user. If not found, it registers and logs in, then retrieves the token for API calls.

PDF Upload:

uploaded_file = st.file_uploader('Upload PDF');
if uploaded_file {
    file_b64 = base64.b64encode(uploaded_file.read()).decode('utf-8');
    response = requests.post(
        "http://localhost:8000/walker/upload_pdf",
        json={"file_name": uploaded_file.name, "file_data": file_b64},
        headers={"Authorization": f"Bearer {token}"}
    );
    ...
}
- Lets users upload PDFs, which are base64-encoded and sent to the backend for processing.
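
The upload endpoint can also be exercised without the UI. A minimal sketch that sends a local file directly (the file path and token placeholder are illustrative; assumes the server from above is running):

import requests;
import base64;

with entry {
    token = "<token from /user/login>";
    # Read and base64-encode the PDF, exactly as the Streamlit client does.
    with open("docs/example.pdf", 'rb') as f {
        data = f.read();
    }
    file_b64 = base64.b64encode(data).decode('utf-8');
    response = requests.post(
        "http://localhost:8000/walker/upload_pdf",
        json={"file_name": "example.pdf", "file_data": file_b64},
        headers={"Authorization": f"Bearer {token}"}
    );
    print(response.status_code);
}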

Chat Logic:

if prompt := st.chat_input("What is up?") {
    st.session_state.messages.append({"role": "user", "content": prompt});
    ...
    response = requests.post("http://localhost:8000/walker/interact", ...);
    ...
    st.session_state.messages.append({"role": "assistant", "content": response["reports"][0]["response"]});
}
- Captures user input, sends it to the backend, and displays both user and assistant messages in the chat UI.

6. RAG Engine (rag.jac)#

The RAG engine manages document ingestion, chunking, embedding, and retrieval.

Document Loading and Chunking:

def load_documents {
    document_loader = PyPDFDirectoryLoader(self.file_path);
    return document_loader.load();
}
def split_documents(documents: list[Document]) {
    text_splitter = RecursiveCharacterTextSplitter(...);
    return text_splitter.split_documents(documents);
}
- Loads all PDFs from the docs directory and splits them into manageable chunks for embedding and retrieval.
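
To get a feel for the chunking, the splitter can be run standalone on a plain string (a sketch using the same parameters as rag.jac; the sample text is illustrative):

import from langchain_text_splitters {RecursiveCharacterTextSplitter}

with entry {
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,
        chunk_overlap=80,
        length_function=len,
        is_separator_regex=False
    );
    # Consecutive chunks share up to 80 characters of overlapping context.
    chunks = splitter.split_text("Lorem ipsum dolor sit amet. " * 100);
    print(len(chunks), "chunks");
}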

Embedding and Storing:

def get_embedding_function {
    embeddings = OpenAIEmbeddings();
    return embeddings;
}
def add_to_chroma(chunks: list[Document]) {
    db = Chroma(persist_directory=self.chroma_path, embedding_function=self.get_embedding_function());
    ...
    db.add_documents(new_chunks, ids=new_chunk_ids);
}
- Generates embeddings for each chunk and stores them in ChromaDB, using unique IDs to avoid duplicates.

Semantic Search:

def get_from_chroma(query: str, chunk_nos: int=5) {
    db = Chroma(...);
    results = db.similarity_search_with_score(query, k=chunk_nos);
    return results;
}
- Retrieves the most relevant document chunks for a user query using vector similarity search.
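
In LangChain, similarity_search_with_score returns (Document, score) pairs, where lower scores indicate closer matches under Chroma's default distance metric. A sketch of consuming the results (the query is illustrative):

results = rag_engine.get_from_chroma(query="What is RAG?", chunk_nos=3);
for pair in results {
    doc = pair[0];
    score = pair[1];
    print(score, doc.metadata.get('id'));
    print(doc.page_content[:100]);
}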

7. Backend Logic & Session Handling (server.jac)#

The backend manages sessions, chat history, and combines RAG and LLM responses.

Session Node:

node Session {
    has id: str;
    has chat_history: list[dict];
    ...
    def respond(message: str, chat_history: list[dict], agent_role: str, context: dict) -> str by llm();
}
- Each user session has a unique ID and chat history. The respond method uses an LLM to generate answers, optionally using context from documents and web search.
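
The `by llm()` clause comes from mtllm: instead of executing a hand-written body, Jac fulfils the call by prompting the configured model with the typed arguments and the expected return type. A minimal standalone sketch of the same pattern (hypothetical Greeter object; assumes OPENAI_API_KEY is set):

import from mtllm.llms {OpenAI}

glob llm = OpenAI(model_name='gpt-4o');

obj Greeter {
    # The body is synthesized by the LLM from the signature and arguments.
    def greet(name: str) -> str by llm();
}

with entry {
    print(Greeter().greet(name="Ada"));
}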

Chat Walker:

walker interact {
    has message: str;
    has session_id: str;
    ...
    can chat with Session entry {
        here.chat_history.append({"role": "user", "content": self.message});
        docs = rag_engine.get_from_chroma(query=self.message);
        web = web_search.search(query=self.message);
        context = {"docs": docs, "web": web};
        response = here.respond(..., context=context);
        here.chat_history.append({"role": "assistant", "content": response});
        report {"response": response};
    }
}
- Handles incoming chat messages, retrieves relevant docs and web results, and generates a response using the LLM.

PDF Upload Walker:

walker upload_pdf {
    has file_name: str;
    has file_data: str;
    ...
    can save_doc with `root entry {
        ...
        rag_engine.add_file(file_path);
        report {"status": "uploaded"};
    }
}
- Saves uploaded PDFs to disk and triggers ingestion into the RAG engine.
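
The base64 handling is symmetric with the client: the frontend encodes the file's raw bytes, and this walker decodes them and writes them back to disk. A self-contained sketch of the round-trip:

import base64;

with entry {
    data = b"%PDF-1.4 sample bytes";                      # raw file content
    encoded = base64.b64encode(data).decode('utf-8');     # what the client sends
    decoded = base64.b64decode(encoded.encode('utf-8'));  # what the walker writes
    assert decoded == data;
}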

8. Usage Guide#

  • Open the Streamlit app in your browser.
  • Upload one or more PDF files. The backend will process and index them.
  • Start chatting! Ask questions about the uploaded documents or pose general queries.
  • The bot will use both your documents and LLMs to answer.

9. API Endpoints (selected)#

  • POST /user/register — Register a new user
  • POST /user/login — Login and receive an access token
  • POST /walker/upload_pdf — Upload a PDF (requires Bearer token)
  • POST /walker/interact — Chat endpoint (requires Bearer token)

See http://localhost:8000/docs for full Swagger API documentation.
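
For example, the chat endpoint can be called directly (a sketch; assumes the server is running locally and the token came from /user/login):

import requests;

with entry {
    token = "<token from /user/login>";
    resp = requests.post(
        "http://localhost:8000/walker/interact",
        json={"message": "Summarize the uploaded PDF", "session_id": "123"},
        headers={"Authorization": f"Bearer {token}"}
    );
    # The walker's report comes back under "reports".
    print(resp.json()["reports"][0]["response"]);
}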

10. Advanced Features#

  • Web Search: If enabled and API key is set, the bot can augment answers with real-time web search results.
  • ChromaDB Vector Search: Efficient semantic search over your documents.
  • Session Management: Each chat session is tracked for context.
  • Extensible: Add new walkers or endpoints in Jac for custom logic.

11. Troubleshooting#

  • Ensure all dependencies are installed and compatible with your Python version.
  • If document upload fails, check server logs for errors.
  • For LLM/API issues, verify your API keys and environment variables.

For a quick overview, see Overview.