How to build a custom Q&A chatbot using Ollama, LangChain, and Streamlit.
A local Ollama server generates answers, LangChain handles prompt construction and retrieval, and a lightweight BM25 retriever keeps the FAQ search local and Python 3.14-compatible. The assistant is tuned for Sunrise Realty Group and answers only from the supplied real-estate FAQ context.

    brew install pyenv
    pyenv install 3.14.3
    pyenv local 3.14.3

Installed via requirements.txt:
- LangChain: Framework to interface with LLMs and orchestrate prompt chaining.
- Ollama: Local language model and embedding runtime.
- BM25: Lightweight keyword-based retrieval over the local FAQ content.
- python-dotenv: Loads environment variables.
- Streamlit: Interactive UI framework.
- watchdog: Improves Streamlit file watching and local dev responsiveness.
- Others: colorama, requests, dateutil.

macOS/Linux:

    python3 -m venv env
    source env/bin/activate

Windows:

    python -m venv env
    env\Scripts\activate

Then install the dependencies:

    pip install -r requirements.txt

If you already have an older virtualenv, recreate it after switching Python versions:

    rm -rf env
    python3 -m venv env
    source env/bin/activate
    pip install -r requirements.txt

Make sure your local Ollama server is running and the required models are already available:

    ollama serve
    ollama pull gemma3:4b

Then duplicate the template:

    cp .env.example .env

Default .env values:

    OLLAMA_BASE_URL=http://localhost:11434
    LANGUAGE_MODEL=gemma3:4b

Run the CLI chatbot:

    python main.py

Run the Streamlit UI:

    streamlit run app.py

Alternative (minimalist UI):

    streamlit run app-nb.py

Then open http://localhost:8501

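
The two `.env` values above can be read with a small helper like the one below. This is a minimal sketch, not the project's actual loading code: the function name `load_settings` and the `DEFAULTS` dictionary are hypothetical, and only the two variable names and their default values come from this README.

```python
import os

# Defaults mirror the .env values documented above.
DEFAULTS = {
    "OLLAMA_BASE_URL": "http://localhost:11434",
    "LANGUAGE_MODEL": "gemma3:4b",
}


def load_settings(env=None):
    """Return Ollama settings, falling back to the documented defaults.

    `env` is any mapping of variable names to strings; it defaults to
    os.environ. (Hypothetical helper, not part of the project.)
    """
    env = os.environ if env is None else env
    settings = {}
    for key, default in DEFAULTS.items():
        # Treat an empty or whitespace-only value as unset.
        value = (env.get(key) or "").strip()
        settings[key] = value or default
    # Drop a trailing slash so API paths can be appended consistently.
    settings["OLLAMA_BASE_URL"] = settings["OLLAMA_BASE_URL"].rstrip("/")
    return settings
```

When python-dotenv is installed, calling `load_dotenv()` (from the `dotenv` module) before `load_settings()` copies the entries in `.env` into the process environment first.
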
| Component | Purpose |
|---|---|
| LangChain | Manages prompt templates, chaining, and LLM interactions. |
| Ollama | Serves the local chat model. |
| BM25Retriever | Retrieves relevant FAQ chunks without loading an embedding model. |
| Streamlit | Builds a user-friendly, interactive web interface. |
| Docker | Containers for environment consistency and ease of deployment. |
| Docker Compose | Orchestrates CLI and UI services simultaneously with shared config. |
| dotenv | Loads settings such as the Ollama URL and model name from `.env` in local development. |

- Document Ingestion
  - Raw text (`faq_real_estate.txt`) is loaded and split with `RecursiveCharacterTextSplitter` into overlapping chunks for retrieval.
- Retrieval
  - Chunks are indexed with a BM25 retriever for fast local keyword search without loading a separate embedding model.
- Query Flow
  - User questions are matched against the FAQ chunks, and the most relevant passages are passed as context.
- Prompt Assembly & LLM Output
  - LangChain constructs a system + human prompt that tells the assistant to act as a Sunrise Realty Group real-estate assistant, stay grounded in the provided context, and refuse unsupported or unrelated questions.
- Response Output
  - The chatbot returns a refined, context-aware response through CLI or Streamlit UI.
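
To make the ingestion, retrieval, and query steps above concrete, here is a dependency-free sketch of the same idea. It is a simplified stand-in, not the code LangChain runs: `split_text` is a plain sliding window (the real `RecursiveCharacterTextSplitter` prefers to break on separators), `bm25_rank` uses one common BM25 variant whose scoring details may differ from `BM25Retriever`, and all function names and sample sentences are made up for illustration.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=300, chunk_overlap=50):
    """Cut text into fixed-size chunks; each chunk repeats the previous tail.

    Assumes chunk_overlap < chunk_size.
    """
    if not text:
        return []
    step = chunk_size - chunk_overlap
    stop = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, stop, step)]


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, chunks, k=4, k1=1.5, b=0.75):
    """Return up to k chunks with a positive BM25 score, best first."""
    docs = [tokenize(chunk) for chunk in chunks]
    n = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n if n else 0.0
    # Document frequency: in how many chunks each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    query_terms = set(tokenize(query))

    scored = []
    for chunk, doc in zip(chunks, docs):
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant: rare terms weigh more.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long chunks get less credit per hit.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, chunk))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:k] if score > 0]


def build_context(chunks):
    """Join the retrieved chunks into the text that fills {context}."""
    return "\n\n".join(chunks)


if __name__ == "__main__":
    # Invented sample chunks, not taken from faq_real_estate.txt.
    sample = [
        "Closing costs are fees paid when a sale is finalized.",
        "Open houses are held on weekends.",
        "A pre-approval letter shows how much a lender may offer.",
    ]
    print(build_context(bm25_rank("What are the closing costs?", sample, k=2)))
```

Because BM25 matches on shared words only, a question phrased with synonyms that never appear in the FAQ will retrieve nothing in this sketch, which is the trade-off against embedding-based search.
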

Project structure:

    .
    ├── app.py               # Streamlit app (model selector)
    ├── app-nb.py            # Streamlit app (simplified)
    ├── main.py              # CLI chatbot + core logic
    ├── Dockerfile
    ├── docker-compose.yml
    ├── docs/
    │   └── faq_real_estate.txt
    ├── requirements.txt
    └── .env.example

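Core logic (main.py):

    # Import paths assume the split langchain-* packages; adjust to your installed versions.
    from langchain_community.document_loaders import TextLoader
    from langchain_community.retrievers import BM25Retriever
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import (
        ChatPromptTemplate,
        HumanMessagePromptTemplate,
        SystemMessagePromptTemplate,
    )
    from langchain_core.runnables import RunnablePassthrough
    from langchain_ollama import ChatOllama
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    # Load the FAQ and split it into overlapping chunks.
    raw_documents = TextLoader("./docs/faq_real_estate.txt").load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
    documents = text_splitter.split_documents(raw_documents)

    # Index the chunks with BM25 and return the top 4 matches per query.
    retriever = BM25Retriever.from_documents(documents)
    retriever.k = 4

    # System prompt: answer only from the retrieved context, otherwise refuse.
    template = (
        "You are a knowledgeable and friendly real estate assistant at Sunrise Realty Group.\n"
        "Use only the information provided in the context.\n"
        "If the question cannot be answered from the context or is unrelated to real estate, reply with:\n"
        "'I'm sorry, but I don't have information about that based on the provided materials.'\n"
        "Context:\n{context}"
    )
    chat_prompt = ChatPromptTemplate.from_messages([
        SystemMessagePromptTemplate.from_template(template),
        HumanMessagePromptTemplate.from_template("{question}")
    ])

    # Retrieve context, fill the prompt, call the local model, and return plain text.
    chain = (
        {"context": retriever, "question": RunnablePassthrough()}
        | chat_prompt
        | ChatOllama(model="gemma3:4b")
        | StrOutputParser()
    )
    response = chain.invoke("What are the closing costs?")

Build and run the CLI image:

    docker build -t custom-chatbot-cli .
    docker run -it --rm --env-file .env custom-chatbot-cli

Start the CLI and UI services with Docker Compose:

    docker-compose up --build

For Docker Compose, the app containers default to http://host.docker.internal:11434 so they can reach an Ollama server running on your host machine.
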
Rebuild with changes:

    docker-compose up --build --force-recreate

Make builds faster by listing these paths in `.dockerignore`:

    env/
    .idea/
    __pycache__/

- Real Estate Agents: e.g., Sunrise Realty FAQ bot
- Internal Knowledgebase: HR, IT support, SOPs
- Legal/Compliance Q&A: Clause-specific search
- Education: Course notes and FAQ retrieval

- Swap out `faq_real_estate.txt` with any domain-specific `.txt` content in `docs/`.
- Update the prompt template in `main.py` to reflect your brand tone.
- Replace BM25 with Chroma, FAISS, or Weaviate if you later want semantic search or persistence.
- Replace `OllamaEmbeddings` with another local or hosted embedding model if needed.
- Store chat history with SQLite or connect Streamlit to Supabase for persistence.
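
For the SQLite option, chat history could be stored along these lines. This is a hedged sketch using only Python's built-in `sqlite3` module: the table layout, the `chat_history.db` file name, and the helper names are all assumptions, since the project does not ship a persistence layer.

```python
import sqlite3
from datetime import datetime, timezone


def open_history(path="chat_history.db"):
    """Open (or create) the history database; the file name is a placeholder."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session_id TEXT NOT NULL, "
        "role TEXT NOT NULL, "
        "content TEXT NOT NULL, "
        "created_at TEXT NOT NULL)"
    )
    return conn


def save_message(conn, session_id, role, content):
    """Append one message (role is e.g. 'user' or 'assistant')."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content, created_at) "
        "VALUES (?, ?, ?, ?)",
        (session_id, role, content, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def load_history(conn, session_id):
    """Return (role, content) pairs for one session, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return [(role, content) for role, content in rows]
```

In a Streamlit app, `load_history` could repopulate the conversation when a session starts, and `save_message` could run after each question and answer.
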