RAG Chatbot Full Guide#

1. Introduction & Overview#

This guide will walk you through building a state-of-the-art Retrieval-Augmented Generation (RAG) chatbot using Jac Cloud, Jac-Streamlit, LangChain, ChromaDB, and modern LLMs. You'll learn to:

  • Upload and index your own documents (PDFs)
  • Chat with an AI assistant that uses both your documents and LLMs
  • Add advanced dialogue routing for smarter conversations

2. Features & Architecture#

  • Document Upload & Ingestion: Upload PDFs, which are processed and indexed for semantic search.
  • Retrieval-Augmented Generation: Combines LLMs with document retrieval for context-aware answers.
  • Web Search Integration: Optionally augments responses with real-time web search results.
  • Streamlit Frontend: User-friendly chat interface.
  • Dialogue Routing: Classifies queries and routes them to the best model (RAG or QA).
  • Session Management: Maintains chat history and user sessions.

Project Structure:

  • client.jac: Streamlit frontend for chat and document upload
  • server.jac: Jac Cloud API server, session, LLM, and web search logic
  • rag.jac: RAG engine for document loading, splitting, embedding, and vector search
  • docs/: Example PDFs for testing

3. Full Source Code#

client.jac:

import streamlit as st;
import requests;
import base64;


def bootstrap_frontend(token: str) {
    st.set_page_config(layout="wide");
    st.title("Welcome to your RAG Chatbot!");
    # Initialize chat history
    if "messages" not in st.session_state {
        st.session_state.messages = [];
    }

    uploaded_file = st.file_uploader('Upload PDF');
    if uploaded_file {
        file_b64 = base64.b64encode(uploaded_file.read()).decode('utf-8');
        response = requests.post(
            "http://localhost:8000/walker/upload_pdf",
            json={"file_name": uploaded_file.name, "file_data": file_b64},
            headers={"Authorization": f"Bearer {token}"}
        );
        if response.status_code == 200 {
            st.success(f"Uploaded {uploaded_file.name}");
        } else {
            st.error(f"Failed to upload {uploaded_file.name}");
        }
    }

    if prompt := st.chat_input("What is up?") {
        # Add user message to chat history
        st.session_state.messages.append({"role": "user", "content": prompt});

        # Display user message in chat message container
        with st.chat_message("user") {
            st.markdown(prompt);
        }
        # Display assistant response in chat message container
        with st.chat_message("assistant") {

            # Call walker API
            response = requests.post(
                "http://localhost:8000/walker/interact",
                json={"message": prompt, "session_id": "123"},
                headers={"Authorization": f"Bearer {token}"}
            );

            if response.status_code == 200 {
                response = response.json();
                print("response is",response);
                st.write(response["reports"][0]["response"]);

                # Add assistant response to chat history
                st.session_state.messages.append({"role": "assistant", "content": response["reports"][0]["response"]});
            }
        }
    }
}

with entry {

    INSTANCE_URL = "http://localhost:8000";
    TEST_USER_EMAIL = "test@mail.com";
    TEST_USER_PASSWORD = "password";

    response = requests.post(
        f"{INSTANCE_URL}/user/login",
        json={"email": TEST_USER_EMAIL, "password": TEST_USER_PASSWORD}
    );

    if response.status_code != 200 {
        # Try registering the user if login fails
        response = requests.post(
            f"{INSTANCE_URL}/user/register",
            json={
                "email": TEST_USER_EMAIL,
                "password": TEST_USER_PASSWORD
            }
        );
        assert response.status_code == 201;

        response = requests.post(
            f"{INSTANCE_URL}/user/login",
            json={"email": TEST_USER_EMAIL, "password": TEST_USER_PASSWORD}
        );
        assert response.status_code == 200;
    }

    token = response.json()["token"];

    print("Token:", token);

    bootstrap_frontend(token);
}

rag.jac:

import os;
import from langchain_community.document_loaders {PyPDFDirectoryLoader, PyPDFLoader}
import from langchain_text_splitters {RecursiveCharacterTextSplitter}
import from langchain.schema.document {Document}
import from langchain_openai {OpenAIEmbeddings}
import from langchain_community.vectorstores.chroma {Chroma}


obj RagEngine {
    has file_path: str = "docs";
    has chroma_path: str = "chroma";

    def postinit {
        if not os.path.exists(self.file_path) {
            os.makedirs(self.file_path);
        }
        documents: list = self.load_documents();
        chunks: list = self.split_documents(documents);
        self.add_to_chroma(chunks);
        print("Documents loaded from", self.file_path);
    }

    def load_documents {
        document_loader = PyPDFDirectoryLoader(self.file_path);
        return document_loader.load();
    }

    def load_document(file_path: str) {
        loader = PyPDFLoader(file_path);
        return loader.load();
    }

    def add_file(file_path: str) {
        documents = self.load_document(file_path);
        chunks = self.split_documents(documents);
        self.add_to_chroma(chunks);
    }

    def split_documents(documents: list[Document]) {
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=800,
            chunk_overlap=80,
            length_function=len,
            is_separator_regex=False
        );
        return text_splitter.split_documents(documents);
    }

    def get_embedding_function {
        embeddings = OpenAIEmbeddings();
        return embeddings;
    }

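    # Assign each chunk a deterministic ID of the form
    # <source-path>:<page>:<chunk-index> (e.g. docs/example.pdf:3:1),
    # so re-ingesting the same file does not create duplicates in Chroma.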
    def add_chunk_id(chunks: list[Document]) {
        last_page_id = None;
        current_chunk_index = 0;

        for chunk in chunks {
            source = chunk.metadata.get('source');
            page = chunk.metadata.get('page');
            current_page_id = f'{source}:{page}';

            if current_page_id == last_page_id {
                current_chunk_index += 1;
            } else {
                current_chunk_index = 0;
            }

            chunk_id = f'{current_page_id}:{current_chunk_index}';
            last_page_id = current_page_id;

            chunk.metadata['id'] = chunk_id;
        }

        return chunks;
    }

    def add_to_chroma(chunks: list[Document]) {
        db = Chroma(persist_directory=self.chroma_path, embedding_function=self.get_embedding_function());
        chunks_with_ids = self.add_chunk_id(chunks);

        existing_items = db.get(include=[]);
        existing_ids = set(existing_items['ids']);

        new_chunks = [];
        for chunk in chunks_with_ids {
            if chunk.metadata['id'] not in existing_ids {
                new_chunks.append(chunk);
            }
        }

        if len(new_chunks) {
            print('adding new documents');
            new_chunk_ids = [chunk.metadata['id'] for chunk in new_chunks];
            db.add_documents(new_chunks, ids=new_chunk_ids);
        } else {
            print('no new documents to add');
        }
    }

    def get_from_chroma(query: str, chunk_nos: int=5) {
        db = Chroma(
            persist_directory=self.chroma_path,
            embedding_function=self.get_embedding_function()
        );
        results = db.similarity_search_with_score(query, k=chunk_nos);
        return results;
    }
}

server.jac:

import from mtllm.llms {OpenAI}
import from rag {RagEngine}
import os;
import base64;
import requests;

glob rag_engine: RagEngine = RagEngine();

glob llm = OpenAI(model_name='gpt-4o');

glob SERPER_API_KEY: str = os.getenv('SERPER_API_KEY', '');

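# Thin wrapper around the Serper.dev search API: posts the query and
# summarizes the top three organic results (title, link, snippet).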
obj WebSearch {
    has api_key: str = SERPER_API_KEY;
    has base_url: str = "https://google.serper.dev/search";

    def search(query: str) {
        headers = {"X-API-KEY": self.api_key, "Content-Type": "application/json"};
        payload = {"q": query};
        resp = requests.post(self.base_url, headers=headers, json=payload);
        if resp.status_code == 200 {
            data = resp.json();
            summary = "";
            results = data.get("organic", []) if isinstance(data, dict) else [];
            for r in results[:3] {
                summary += f"{r.get('title', '')}: {r.get('link', '')}\n";
                if r.get('snippet') {
                    summary += f"{r['snippet']}\n";
                }
            }
            return summary;
        }
        return f"Serper request failed: {resp.status_code}";
    }
}

glob web_search: WebSearch = WebSearch();

node Session {
    has id: str;
    has chat_history: list[dict];
    has status: int = 1;

    def respond(message: str, chat_history: list[dict], agent_role: str, context: dict) -> str by llm();
}


walker interact {
    has message: str;
    has session_id: str;

    can init_session with `root entry {
        visit [-->](`?Session)(?id == self.session_id) else {
            session_node = here ++> Session(id=self.session_id, chat_history=[], status=1);
            print("Session Node Created");

            visit session_node;
        }
    }

    can chat with Session entry {
        here.chat_history.append({"role": "user", "content": self.message});
        docs = rag_engine.get_from_chroma(query=self.message);
        web = web_search.search(query=self.message);
        context = {"docs": docs, "web": web};
        response = here.respond(
            message=self.message,
            chat_history=here.chat_history,
            agent_role="You are a conversation agent designed to help users with their queries based on the documents provided and web search results",
            context=context
        );

        here.chat_history.append({"role": "assistant", "content": response});

        report {"response": response};
    }
}

walker upload_pdf {
    has file_name: str;
    has file_data: str;

    can save_doc with `root entry {
        if not os.path.exists(rag_engine.file_path) {
            os.makedirs(rag_engine.file_path);
        }
        file_path = os.path.join(rag_engine.file_path, self.file_name);
        data = base64.b64decode(self.file_data.encode('utf-8'));
        with open(file_path, 'wb') as f {
            f.write(data);
        }
        rag_engine.add_file(file_path);
        report {"status": "uploaded"};
    }
}

4. Setup Instructions#

  1. Install dependencies (Python 3.12+ recommended):
    pip install jaclang jac-cloud jac-streamlit mtllm langchain-openai langchain-community chromadb pypdf
    
  2. Set environment variables (for LLMs and web search):
    export OPENAI_API_KEY=<your-openai-key>
    export SERPER_API_KEY=<your-serper-key>
    
    Get a free Serper API key at serper.dev.
  3. Start the Jac Chatbot server:
    jac serve server.jac
    
  4. Run the Streamlit frontend:
    jac run client.jac
    
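Once both processes are running, you can sanity-check the server before opening the UI (a minimal sketch; assumes the default port 8000):

import requests;

with entry {
    # The auto-generated OpenAPI docs page should return HTTP 200.
    print(requests.get("http://localhost:8000/docs").status_code);
}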

5. Streamlit Frontend (client.jac)#

The frontend is built with Jac-Streamlit and handles authentication, PDF upload, and chat. Here’s how it works:

Authentication and Token Handling:

response = requests.post(
    f"{INSTANCE_URL}/user/login",
    json={"email": TEST_USER_EMAIL, "password": TEST_USER_PASSWORD}
);
if response.status_code != 200 {
    # Try registering the user if login fails
    response = requests.post(
        f"{INSTANCE_URL}/user/register",
        json={"email": TEST_USER_EMAIL, "password": TEST_USER_PASSWORD}
    );
    ...
}
token = response.json()["token"];
- The app tries to log in a test user. If not found, it registers and logs in, then retrieves the token for API calls.

PDF Upload:

uploaded_file = st.file_uploader('Upload PDF');
if uploaded_file {
    file_b64 = base64.b64encode(uploaded_file.read()).decode('utf-8');
    response = requests.post(
        "http://localhost:8000/walker/upload_pdf",
        json={"file_name": uploaded_file.name, "file_data": file_b64},
        headers={"Authorization": f"Bearer {token}"}
    );
    ...
}
- Lets users upload PDFs, which are base64-encoded and sent to the backend for processing.
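
The upload endpoint can also be exercised without the UI. A minimal sketch that sends a local file directly (the file path and token placeholder are illustrative; assumes the server from above is running):

import requests;
import base64;

with entry {
    token = "<token from /user/login>";
    # Read and base64-encode the PDF, exactly as the Streamlit client does.
    with open("docs/example.pdf", 'rb') as f {
        data = f.read();
    }
    file_b64 = base64.b64encode(data).decode('utf-8');
    response = requests.post(
        "http://localhost:8000/walker/upload_pdf",
        json={"file_name": "example.pdf", "file_data": file_b64},
        headers={"Authorization": f"Bearer {token}"}
    );
    print(response.status_code);
}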

Chat Logic:

if prompt := st.chat_input("What is up?") {
    st.session_state.messages.append({"role": "user", "content": prompt});
    ...
    response = requests.post("http://localhost:8000/walker/interact", ...);
    ...
    st.session_state.messages.append({"role": "assistant", "content": response["reports"][0]["response"]});
}
- Captures user input, sends it to the backend, and displays both user and assistant messages in the chat UI.

6. RAG Engine (rag.jac)#

The RAG engine manages document ingestion, chunking, embedding, and retrieval.

Document Loading and Chunking:

def load_documents {
    document_loader = PyPDFDirectoryLoader(self.file_path);
    return document_loader.load();
}
def split_documents(documents: list[Document]) {
    text_splitter = RecursiveCharacterTextSplitter(...);
    return text_splitter.split_documents(documents);
}
- Loads all PDFs from the docs directory and splits them into manageable chunks for embedding and retrieval.
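
To get a feel for the chunking, the splitter can be run standalone on a plain string (a sketch using the same parameters as rag.jac; the sample text is illustrative):

import from langchain_text_splitters {RecursiveCharacterTextSplitter}

with entry {
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,
        chunk_overlap=80,
        length_function=len,
        is_separator_regex=False
    );
    # Consecutive chunks share up to 80 characters of overlapping context.
    chunks = splitter.split_text("Lorem ipsum dolor sit amet. " * 100);
    print(len(chunks), "chunks");
}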

Embedding and Storing:

def get_embedding_function {
    embeddings = OpenAIEmbeddings();
    return embeddings;
}
def add_to_chroma(chunks: list[Document]) {
    db = Chroma(persist_directory=self.chroma_path, embedding_function=self.get_embedding_function());
    ...
    db.add_documents(new_chunks, ids=new_chunk_ids);
}
- Generates embeddings for each chunk and stores them in ChromaDB, using unique IDs to avoid duplicates.

Semantic Search:

def get_from_chroma(query: str, chunk_nos: int=5) {
    db = Chroma(...);
    results = db.similarity_search_with_score(query, k=chunk_nos);
    return results;
}
- Retrieves the most relevant document chunks for a user query using vector similarity search.
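
In LangChain, similarity_search_with_score returns (Document, score) pairs, where lower scores indicate closer matches under Chroma's default distance metric. A sketch of consuming the results (the query is illustrative):

results = rag_engine.get_from_chroma(query="What is RAG?", chunk_nos=3);
for pair in results {
    doc = pair[0];
    score = pair[1];
    print(score, doc.metadata.get('id'));
    print(doc.page_content[:100]);
}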

7. Backend Logic & Session Handling (server.jac)#

The backend manages sessions, chat history, and combines RAG and LLM responses.

Session Node:

node Session {
    has id: str;
    has chat_history: list[dict];
    ...
    def respond(message: str, chat_history: list[dict], agent_role: str, context: dict) -> str by llm();
}
- Each user session has a unique ID and chat history. The respond method uses an LLM to generate answers, optionally using context from documents and web search.
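
The `by llm()` clause comes from mtllm: instead of executing a hand-written body, Jac fulfils the call by prompting the configured model with the typed arguments and the expected return type. A minimal standalone sketch of the same pattern (hypothetical Greeter object; assumes OPENAI_API_KEY is set):

import from mtllm.llms {OpenAI}

glob llm = OpenAI(model_name='gpt-4o');

obj Greeter {
    # The body is synthesized by the LLM from the signature and arguments.
    def greet(name: str) -> str by llm();
}

with entry {
    print(Greeter().greet(name="Ada"));
}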

Chat Walker:

walker interact {
    has message: str;
    has session_id: str;
    ...
    can chat with Session entry {
        here.chat_history.append({"role": "user", "content": self.message});
        docs = rag_engine.get_from_chroma(query=self.message);
        web = web_search.search(query=self.message);
        context = {"docs": docs, "web": web};
        response = here.respond(..., context=context);
        here.chat_history.append({"role": "assistant", "content": response});
        report {"response": response};
    }
}
- Handles incoming chat messages, retrieves relevant docs and web results, and generates a response using the LLM.

PDF Upload Walker:

walker upload_pdf {
    has file_name: str;
    has file_data: str;
    ...
    can save_doc with `root entry {
        ...
        rag_engine.add_file(file_path);
        report {"status": "uploaded"};
    }
}
- Saves uploaded PDFs to disk and triggers ingestion into the RAG engine.
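
The base64 handling is symmetric with the client: the frontend encodes the file's raw bytes, and this walker decodes them and writes them back to disk. A self-contained sketch of the round-trip:

import base64;

with entry {
    data = b"%PDF-1.4 sample bytes";                      # raw file content
    encoded = base64.b64encode(data).decode('utf-8');     # what the client sends
    decoded = base64.b64decode(encoded.encode('utf-8'));  # what the walker writes
    assert decoded == data;
}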

8. Usage Guide#

  • Open the Streamlit app in your browser.
  • Upload one or more PDF files. The backend will process and index them.
  • Start chatting! Ask questions about the uploaded documents or pose general queries.
  • The bot will use both your documents and LLMs to answer.

9. API Endpoints (selected)#

  • POST /user/register — Register a new user
  • POST /user/login — Login and receive an access token
  • POST /walker/upload_pdf — Upload a PDF (requires Bearer token)
  • POST /walker/interact — Chat endpoint (requires Bearer token)

See http://localhost:8000/docs for full Swagger API documentation.
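
For example, the chat endpoint can be called directly (a sketch; assumes the server is running locally and the token came from /user/login):

import requests;

with entry {
    token = "<token from /user/login>";
    resp = requests.post(
        "http://localhost:8000/walker/interact",
        json={"message": "Summarize the uploaded PDF", "session_id": "123"},
        headers={"Authorization": f"Bearer {token}"}
    );
    # The walker's report comes back under "reports".
    print(resp.json()["reports"][0]["response"]);
}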

10. Advanced Features#

  • Web Search: If enabled and API key is set, the bot can augment answers with real-time web search results.
  • ChromaDB Vector Search: Efficient semantic search over your documents.
  • Session Management: Each chat session is tracked for context.
  • Extensible: Add new walkers or endpoints in Jac for custom logic.

11. Troubleshooting#

  • Ensure all dependencies are installed and compatible with your Python version.
  • If document upload fails, check server logs for errors.
  • For LLM/API issues, verify your API keys and environment variables.

For a quick overview, see Overview.