A Flask-based chatbot platform that combines retrieval augmented generation (RAG), feedback collection, and lightweight admin tools. The backend stores question-answer pairs in PostgreSQL, retrieves related context with FAISS embeddings, and generates answers through an Ollama-hosted model.
- Flask API (`app/app.py`) with blueprints for chat, feedback, autocomplete, admin, chat history, and suggested questions.
- PostgreSQL schema managed through SQLAlchemy models (`app/db/models.py`) and helper scripts (`scripts/create_db.py`, `scripts/db_utils.py`).
- Vector search (FAISS) loaded from `data/embeddings/faiss_index` via LangChain (`scripts/rag_utils.py`) and autocomplete utilities (`scripts/faiss_utils.py`).
- LLM inference routed to an Ollama server using a Qwen model (`scripts/model_inference.py`). Generated answers can be stored for later review.
- Client assets: minimal HTML/JS in `app/templates/chat.html` and `app/static/chat.js` for manual testing, plus React dependencies listed in `package.json` for richer front-end work.
- Python 3.10+ and Node.js (if you plan to build a React UI)
- PostgreSQL running locally (default URI: `postgresql://postgres:123456@localhost:5432/hu_chatbot2`)
- An Ollama server listening on `http://localhost:11434/v1` with the `hf.co/unsloth/Qwen3-4B-Instruct-2507-GGUF:Q8_0` model pulled or aliased
- FAISS index files at `data/embeddings/faiss_index` (see Data preparation)
- Create a virtual environment and install backend dependencies:

      python -m venv .venv
      source .venv/bin/activate
      pip install -r requirements.txt

- Configure PostgreSQL access. Update `app/app.py` or set the `SQLALCHEMY_DATABASE_URI` environment variable before launching (defaults to `postgresql://postgres:123456@localhost:5432/hu_chatbot2`).
- Initialize the database tables:

      python -m scripts.create_db

- Prepare vector data (see below) so RAG and autocomplete can run.
- Start the Flask app:

      python -m app.app

  The development server binds to `http://0.0.0.0:5000`.
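The database configuration step could be implemented along the following lines. This is a minimal sketch, assuming the app reads the URI once at startup; `resolve_database_uri` is a hypothetical helper name and is not taken from the repository.

```python
import os

# Default URI quoted in this README; intended for local development only.
DEFAULT_DATABASE_URI = "postgresql://postgres:123456@localhost:5432/hu_chatbot2"


def resolve_database_uri(environ=None):
    """Return SQLALCHEMY_DATABASE_URI if it is set and non-blank, else the default."""
    # Accept an explicit mapping so the function can be exercised without
    # touching the real process environment.
    env = os.environ if environ is None else environ
    uri = env.get("SQLALCHEMY_DATABASE_URI", "").strip()
    return uri or DEFAULT_DATABASE_URI
```

If `app/app.py` follows the usual Flask-SQLAlchemy pattern, the result would be assigned with `app.config["SQLALCHEMY_DATABASE_URI"] = resolve_database_uri()` before the database object is initialized.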
- Embeddings / RAG context: Build a FAISS index into `data/embeddings/faiss_index` using scripts such as `scripts/create_faiss_from_pdf.py`, `scripts/generate_questions_from_pdfs.py`, or `scripts/augment_faiss_with_paraphrases.py`, depending on your source documents.
- Autocomplete index: Use `scripts/create_autocomplete_index.py` to generate question embeddings for fast prefix matching.
- Manual Q&A loading: Import curated pairs with `scripts/load_manual_questions.py` or `scripts/load_reference_answers.py` before fine-tuning or review.
- Fine-tuning dataset export: Admin endpoints can write approved answers to `fine_tuning_data.jsonl`; the file path defaults to the repository root.
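To illustrate the fine-tuning export, the sketch below writes approved pairs as JSON Lines, one object per line. The record fields (`question`, `answer`, `approved`) and the function name are assumptions made for illustration; the actual schema is defined by the admin routes and the SQLAlchemy models.

```python
import json
from pathlib import Path


def export_fine_tuning_data(pairs, path="fine_tuning_data.jsonl"):
    """Write approved Q&A pairs to a JSONL file and return how many were written.

    `pairs` is an iterable of dicts with hypothetical keys
    "question", "answer", and "approved".
    """
    written = 0
    with Path(path).open("w", encoding="utf-8") as handle:
        for pair in pairs:
            # Skip anything that has not been approved by a reviewer.
            if not pair.get("approved"):
                continue
            record = {"question": pair["question"], "answer": pair["answer"]}
            # ensure_ascii=False keeps non-ASCII text readable in the file.
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            written += 1
    return written
```

Writing one self-contained object per line keeps the file appendable and lets downstream tools stream it without loading everything into memory.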
- `POST /chat`: Main chat endpoint; logs user/bot messages and runs DB lookup, RAG retrieval, and Ollama generation.
- `POST /api/autocomplete`: Returns up to 5 similar questions for a given prefix.
- `POST /api/suggested_questions`: Combines FAISS-based similar questions with random prompts from `data/finetune/so.jsonl`.
- `POST /api/feedback`: Records like/dislike feedback linked to a `question_id`.
- `GET /api/chat_sessions` and `GET /api/chat_history/<session_id>`: Fetch stored chat logs per session.
- Admin utilities (prefixed with `/api/`): review unapproved answers, approve/update/reject answers, export fine-tuning data, fetch analytics, generate alternative answers, and retrieve Q&A lists (`app/routes/admin_routes.py`).
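The endpoints above can be exercised from a small standard-library client. The sketch below only shows how a JSON POST might be built and sent; the request body fields are not documented here, so any payload keys other than `question_id` are assumptions to be checked against the route handlers.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # default Flask address from this README


def build_post(path, payload, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for an endpoint path."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires a running server); "message" is an assumed field name:
#   reply = send(build_post("/chat", {"message": "How do I reset my password?"}))
```

Separating request construction from sending makes it easy to inspect the URL and body before any network traffic occurs.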
- Chat logic: `app/chat/utils.py` normalizes input, checks curated answers, falls back to FAISS context, evaluates answer quality, and logs messages.
- Model selection: Update `MODEL_MAP` in `scripts/model_inference.py` if you swap Ollama models.
- Quality controls: `scripts/quality_utils.py`, `scripts/quality_decision.py`, and `scripts/evaluate_answer_quality_llm.py` help benchmark generated answers and mark low-quality responses.
- Testing the UI: Open `app/templates/chat.html` in a browser while the server runs to submit questions via the lightweight JS client.
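The order of operations in the chat logic (normalize, check curated answers, fall back to retrieved context, then generate) can be sketched as below. This is an illustration and not the code in `app/chat/utils.py`: the function names, the dictionary-based curated lookup, and the source labels are all assumptions, and quality evaluation and logging are omitted.

```python
import re


def normalize(text):
    """Lowercase, trim, and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text).strip().lower()


def answer_question(question, curated, retrieve, generate):
    """Return (answer, source) using curated answers first, then retrieval plus generation.

    curated:  dict mapping normalized questions to stored answers
    retrieve: callable(question) -> list of context strings (e.g. FAISS hits)
    generate: callable(question, context) -> answer string (e.g. an Ollama call)
    """
    key = normalize(question)
    # 1. A curated answer wins outright; no model call is needed.
    if key in curated:
        return curated[key], "curated"
    # 2. Otherwise gather context and let the model answer with it.
    context = retrieve(key)
    answer = generate(key, context)
    # 3. Label the result so callers can tell whether context was available.
    return answer, "rag" if context else "model_only"
```

Passing `retrieve` and `generate` as callables keeps the flow testable without a FAISS index or an Ollama server.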
React dependencies are declared in package.json. If you build a richer front end:
- Initialize the client (e.g., in a `client/` directory) with your preferred React tooling.
- Install packages with `npm install`.
- Point API calls to the Flask server (default `http://localhost:5000`).
- Ensure PostgreSQL extensions that provide `similarity` (e.g., `pg_trgm`) are available if fuzzy matching fails in `scripts/db_utils.get_answer_from_db`.
- Verify `data/embeddings/faiss_index` exists and is readable; otherwise `/chat` will fall back to suggestions without RAG context.
- If Ollama is unreachable, `/chat` requests will raise server errors. Test connectivity with the script footer in `scripts/model_inference.py`.
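The index and connectivity checks above can also be scripted before starting the app. The sketch below is a hypothetical preflight helper, not part of the repository. It assumes the index path is a directory (as produced by LangChain's `FAISS.save_local`) and only tests that the Ollama port accepts TCP connections, not that the model is loaded.

```python
import socket
from pathlib import Path
from urllib.parse import urlparse


def index_ready(index_dir="data/embeddings/faiss_index"):
    """True if the index directory exists and contains at least one file."""
    path = Path(index_dir)
    return path.is_dir() and any(child.is_file() for child in path.iterdir())


def host_port(url):
    """Extract (host, port) from a URL, defaulting the port by scheme."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def server_reachable(url="http://localhost:11434/v1", timeout=2.0):
    """True if a TCP connection to the URL's host and port succeeds."""
    try:
        with socket.create_connection(host_port(url), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Running `index_ready()` and `server_reachable()` from the repository root gives a quick yes/no answer for each dependency before `/chat` is called.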