This tutorial demonstrates how to build a 100% local microeconomics chatbot using Google's open-source Gemma 3 model and Retrieval-Augmented Generation (RAG). By leveraging Kolosal AI for local inference and BM25 for document retrieval, users can create a private, cost-effective AI assistant that provides context-aware answers based on a local economics knowledge base.
What You'll Build
A microeconomics Q&A chatbot that:
- Runs 100% locally on your machine using Gemma 3 with Kolosal AI
- Retrieves economics context from your custom documents using BM25
- Uses RAG (Retrieval-Augmented Generation) to generate accurate, grounded answers
- Is deployed via Streamlit inside a Docker container
System Architecture
The chatbot pipeline is composed of:
- User Query: The user types a question (e.g. "What is price elasticity?")
- Query Optimizer: The local LLM rewrites the query into optimized search terms
- BM25 Retriever: Finds the top 3 relevant documents from your economics notes
- Answer Synthesizer: The LLM generates an answer using the question and retrieved docs
- Response: Answer is streamed back to the UI with sources shown
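The pipeline above can be sketched in a few dozen lines of Python. This is an illustrative outline, not the repository's actual code: the function names are hypothetical, BM25 is implemented inline for self-containment (the repo may use a library retriever instead), and the LLM is passed in as a plain callable so any Kolosal AI chat function can be plugged in.

```python
# Illustrative sketch of the four-stage RAG pipeline described above.
# Names (bm25_scores, answer) are hypothetical, not the repo's identifiers.
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the query with classic Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                      # document frequency per term
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def answer(question, corpus, llm, top_k=3):
    """RAG loop: rewrite the query, retrieve top-k docs, synthesize an answer.

    `llm` is any callable prompt -> str (e.g. a Kolosal AI chat call).
    """
    # 1. Query optimizer: ask the LLM for better search terms
    search_terms = llm(f"Rewrite as search keywords: {question}")
    # 2. BM25 retriever over the tokenized corpus
    tokenized = [doc.lower().split() for doc in corpus]
    scores = bm25_scores(search_terms.lower().split(), tokenized)
    ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
    context = [corpus[i] for i in ranked[:top_k]]
    # 3. Answer synthesizer: ground the LLM in the retrieved docs
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    return llm(prompt), context
```

Passing the LLM as a callable keeps the retrieval logic testable without a running model server.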
Key Technologies
- Gemma 3: Google's lightweight open-weight LLM family; the 1B variant runs locally via Kolosal AI
- Kolosal AI: Local inference engine with OpenAI-compatible API
- BM25 Retriever: Classic sparse retriever for fast document lookup
- Streamlit: Web UI to chat with the bot
- Docker: Isolated deployment environment
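Because Kolosal AI exposes an OpenAI-compatible API, the app can talk to it with a plain HTTP POST. The sketch below uses only the standard library; the base URL, port, and model name are assumptions, so check your Kolosal AI configuration for the actual values.

```python
# Minimal sketch of a chat completion call against Kolosal AI's
# OpenAI-compatible endpoint. URL and model name are assumed values.
import json
import urllib.request

KOLOSAL_URL = "http://localhost:8084/v1/chat/completions"  # assumed port

def build_chat_request(messages, model="gemma-3-1b", stream=True):
    """Assemble the JSON body for an OpenAI-style chat completion call."""
    return {"model": model, "messages": messages, "stream": stream}

def chat(messages, model="gemma-3-1b"):
    """POST the request and return the assistant's reply (non-streaming)."""
    body = json.dumps(build_chat_request(messages, model, stream=False)).encode()
    req = urllib.request.Request(
        KOLOSAL_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

In the actual app, `stream=True` lets the Streamlit UI render tokens as they arrive instead of waiting for the full answer.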
Run the Chatbot
To run the chatbot on your machine:
```shell
# Clone the repo
git clone https://github.com/FarrelRamdhani/Microeconomic-Chatbot.git
cd Microeconomic-Chatbot

# Build and run the container
docker build -t microeconomic-chatbot .
docker run -p 8501:8501 microeconomic-chatbot
```

Then open http://localhost:8501 in your browser.
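The repository ships its own Dockerfile; a minimal equivalent for a Streamlit app might look like the following (the Python version, file names, and entry point are assumptions, not the repo's actual contents):

```dockerfile
# Illustrative Dockerfile for a Streamlit app; paths and versions assumed
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
```

Binding to `0.0.0.0` is what makes the app reachable from outside the container once port 8501 is published with `-p 8501:8501`.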
Try It Yourself
Once deployed, try asking the chatbot questions like:
- "What is the difference between monopoly and perfect competition?"
- "How does price elasticity affect consumer behavior?"
- "Can you give an example of opportunity cost?"