Convert natural language questions into SQL queries using AI.
- Backend: FastAPI + Python
- Frontend: Next.js 14 + TypeScript + Tailwind CSS
- Database: MySQL + Redis Cache
- AI: OpenAI GPT-4
- Python 3.8+
- Node.js 18+
- MySQL running
- Redis running
- OpenAI API key
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
pip install -r query_engine/requirements.txt
# Configure environment
cp .env.example .env
# Edit .env with your credentials (OpenAI key, DB credentials, Redis settings)
# Run backend
python main.py

Backend runs on: http://localhost:8000
cd frontend
# Install dependencies
npm install
# Configure environment
cp .env.local.example .env.local
# Edit .env.local with your API URL (NEXT_PUBLIC_API_URL=http://localhost:8000)
# Run development server
npm run dev

Frontend runs on: http://localhost:3000
llm_sql/
├── backend/
│ ├── main.py (FastAPI server)
│ ├── requirements.txt
│ ├── .env.example
│ └── query_engine/ (SQL query generator)
│
├── frontend/
│ ├── app/ (Next.js pages)
│ ├── components/ (React components)
│ ├── lib/api.ts (API client)
│ ├── package.json
│ └── .env.local.example
│
└── assets/
- Terminal 1 - Start Backend:
  cd backend && source venv/bin/activate && python main.py
- Terminal 2 - Start Frontend:
  cd frontend && npm run dev
- Browser - Open:
  http://localhost:3000
- backend/.env.example - Backend configuration template
- frontend/.env.local.example - Frontend configuration template
Copy these files and fill in your credentials:
- OpenAI API key
- MySQL connection details
- Redis connection details
- API URL (frontend only)
cd backend
# Setup
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
pip install -r query_engine/requirements.txt
# Configure
cp .env.example .env
# Edit .env with your credentials
# Run
python main.py

Backend ready at: http://localhost:8000
API docs at: http://localhost:8000/docs
cd frontend
# Install dependencies
npm install
# Configure
cp .env.local.example .env.local
# Update NEXT_PUBLIC_API_URL if needed (default: http://localhost:8000)
# Run
npm run dev

Frontend ready at: http://localhost:3000
Open your browser to: http://localhost:3000
Browser (Next.js) http://localhost:3000
↓ HTTP + JSON
↓ TypeScript API client
↓
FastAPI Backend http://localhost:8000
├─ REST API Endpoints
├─ Session Management
└─ Query Routing
↓
Query Engine (Python + MCP)
├─ OpenAI GPT-4 Integration
├─ MySQL Database Access
└─ Redis Caching
✅ Natural Language to SQL
- Convert English questions to optimized SQL queries
- Uses OpenAI GPT-4 for intelligent query generation
✅ Intelligent Caching
- Redis-based caching for repeated queries
- Template management for common questions
- Fast retrieval of frequently asked questions
✅ Modern UI
- Beautiful, responsive chat interface
- Built with Next.js + TypeScript + Tailwind CSS
- Mobile-friendly design
- Dark mode ready (extensible)
✅ Type-Safe Development
- Full TypeScript support frontend & backend
- API response types
- Component prop types
- Better IDE support
✅ Session Management
- Conversation history per session
- Multiple independent sessions
- Clear history option
✅ Statistics & Analytics
- View query statistics
- Track cached queries
- Monitor usage patterns
- GET /health - Server health check
- POST /api/chat - Send a query (?message=How many users?&session_id=default)
- GET /api/history - Get conversation history
- POST /api/clear-history - Clear history
- GET /api/stats - Get usage statistics
- GET /api/suggestions - Get query suggestions
- GET /api/templates - Get cached queries
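As a quick sanity check, these endpoints can be exercised from Python. The paths and the message/session_id query parameters come from this README; the helper name and everything else below are illustrative assumptions, not the project's actual client code.

```python
# Hypothetical helper for the endpoints listed above.
from urllib.parse import urlencode

API_BASE = "http://localhost:8000"

def chat_url(message: str, session_id: str = "default") -> str:
    """Build the POST /api/chat URL with its query parameters."""
    return f"{API_BASE}/api/chat?" + urlencode(
        {"message": message, "session_id": session_id}
    )

# With the backend running, something like:
#   import requests
#   answer = requests.post(chat_url("How many users?")).json()
```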
- FastAPI - Modern Python web framework
- Uvicorn - ASGI server
- MCP (Model Context Protocol) - LLM integration
- OpenAI - GPT-4 API
- MySQL - Database
- Redis - Caching
- Python 3.8+
- Next.js 14 - React framework
- TypeScript - Type safety
- Tailwind CSS - Styling
- React Icons - Icon library
- Axios - HTTP client
- Node.js 18+
API_HOST=0.0.0.0
API_PORT=8000
DEBUG=False
OPENAI_API_KEY=sk-...
DB_HOST=localhost
DB_USER=root
DB_PASSWORD=password
DB_NAME=database
REDIS_HOST=localhost
REDIS_PORT=6379

NEXT_PUBLIC_API_URL=http://localhost:8000

- backend/README.md - Backend API setup & details
- frontend/README.md - Frontend setup & customization
- backend/query_engine/README.md - Query engine details
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt -r query_engine/requirements.txt
python main.py # Start server

cd frontend
npm install
npm run dev # Development
npm run build # Production build
npm run start # Production server
npm run type-check # TypeScript check

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt query_engine/requirements.txt ./
RUN pip install -r requirements.txt -r query_engine/requirements.txt
COPY . .
CMD ["python", "main.py"]

FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]

Error: "Connection refused" (MySQL/Redis)
# Check services running
mysql -u root -p -e "SELECT 1"
redis-cli ping

Error: "Module not found"
pip install -r requirements.txt -r query_engine/requirements.txt

Error: "Cannot find module"
rm -rf node_modules package-lock.json
npm install

Error: API calls failing
- Check NEXT_PUBLIC_API_URL in .env.local
- Verify backend is running
- Check browser console for CORS errors
pip install gunicorn uvicorn
gunicorn -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8000 main:app

vercel deploy

# Backend
API_PORT=8000
DEBUG=False
OPENAI_API_KEY=sk-prod-...
# Frontend
NEXT_PUBLIC_API_URL=https://api.yourdomain.com

- Enable Redis caching (recommended)
- Use a production ASGI setup (Gunicorn with Uvicorn workers)
- Enable Next.js output caching
- Use CDN for static files
- Database connection pooling
Contributions welcome! Please:
- Create a new branch
- Make your changes
- Test thoroughly
- Submit a pull request
See LICENSE file
- Backend: cd backend && python main.py
- Frontend: cd frontend && npm run dev
- Browser: http://localhost:3000
Enjoy natural language querying! 🚀
This project bridges the gap between natural language and SQL by leveraging Large Language Models (LLM) to automatically understand user questions and generate optimized database queries. The system includes intelligent caching, template management, and token cost tracking.
- Natural Language Processing: Convert plain English questions into SQL queries
- Intelligent Table Selection: LLM automatically identifies relevant tables needed to answer questions
- Smart Caching: Redis-based caching with TTL to avoid redundant queries and API calls
- Template System: Store and reuse common queries and their answers
- Cost Tracking: Monitor OpenAI API usage and estimated costs
- Error Handling: Robust error management for database and API failures
- MCP Server: Built on Model Context Protocol for seamless integration
┌─────────────────────────────────────────────────┐
│ User Question (Natural Language) │
└─────────────────────┬───────────────────────────┘
│
▼
┌─────────────────────────────┐
│ Template/Cache Check │
│ (Redis + templates.json) │
└─────────────────┬───────────┘
│
┌────────────────┴────────────────┐
│ Cache Hit? │
│ (Return Cached Answer) │
│ Cache Miss ↓ │
│ │
▼ │
┌────────────────────────┐ │
│ Get Relevant Tables │ │
│ (LLM: GPT-4o) │ │
└────────────┬───────────┘ │
│ │
▼ │
┌────────────────────────┐ │
│ Fetch Table Schema │ │
│ (MySQL Database) │ │
└────────────┬───────────┘ │
│ │
▼ │
┌────────────────────────┐ │
│ Generate SQL Query │ │
│ (LLM: GPT-4o) │ │
└────────────┬───────────┘ │
│ │
▼ │
┌────────────────────────┐ │
│ Execute SQL Query │────────────┐
│ (MySQL Database) │ │
└────────────┬───────────┘ │
│ │
▼ ▼
┌────────────────────────────────────────┐
│ Format Answer & Cache Result │
│ (Store in Redis + templates.json) │
└────────────┬─────────────────────────┘
│
▼
┌────────────────────────┐
│ Return Answer to User │
└────────────────────────┘
┌────────────────────────────────────────────────────────────────┐
│ User Browser │
│ (Web Chat Interface) │
│ http://localhost:5000 - Beautiful UI Interface │
└────────────────────┬───────────────────────────────────────────┘
│ HTTP Requests
│ (JSON over HTTP)
▼
┌─────────────────────────────┐
│ Flask Web Client │
│ (client.py) │
│ │
│ - REST API Endpoints │
│ - Session Management │
│ - Message Routing │
│ - Error Handling │
└────────────┬────────────────┘
│
│ MCP Protocol / Function Calls
│
▼
┌─────────────────────────────────────────────┐
│ MCP Server (Main Query Processor) │
│ (main.py) │
│ │
│ - Natural Language Understanding │
│ - Table Selection (LLM) │
│ - SQL Generation (LLM) │
│ - Query Execution │
│ - Result Caching │
└────────┬──────────────────┬─────────────────┘
│ │
┌────────▼──────┐ ┌──────▼──────────┐
│ MySQL DB │ │ Redis Cache │
│ │ │ │
│ - Execute │ │ - Store Results │
│ Queries │ │ - TTL Cache │
│ - Schema Ops │ │ - Fast Lookups │
└───────────────┘ └────────┬────────┘
│
┌────────▼─────────┐
│ templates.json │
│ │
│ - Query Templates │
│ - Preset Answers │
│ - Usage Stats │
└───────────────────┘
┌────────────────────────────────────────┐
│ OpenAI GPT-4o (External LLM) │
│ │
│ - Table Selection │
│ - SQL Generation │
│ - Answer Formatting │
└────────────────────────────────────────┘
The web interface includes:
- Chat Interface: Beautiful message-based UI
- Dark Mode: Toggle between light and dark themes
- Real-time Updates: Instant message display
- Query History: See all recent queries
- Statistics: Monitor usage and cache hits
- Templates: Quick suggestions for common queries
Add your screenshot to static/image.png for visual documentation
llm_sql/
├── main.py # Primary MCP server with core functionality
├── main.py # Alternative version with enhanced structure
├── client.py # Flask web interface client
├── database.py # MySQL database operations
├── cache.py # Redis caching utility
├── templates.json # Cached queries and answers storage
├── requirements.txt # Python dependencies
├── README.md # Main documentation
├── WEBINTERFACE.md # Web interface guide
├── static/ # Frontend assets
│ ├── style.css # CSS styling
│ ├── script.js # JavaScript frontend
│ └── image.png # Interface screenshot (add your image here)
└── templates/ # HTML templates
└── index.html # Chat interface HTML
- main.py: FastMCP server that orchestrates the entire workflow
  - ask_product_data(): Main tool for processing natural language questions
  - get_relevant_tables(): Identifies necessary tables using the LLM
  - get_sql_from_llm(): Generates SQL from natural language
  - Template management for caching
- main.py: Enhanced version (recommended for production)
- Better error handling
- Improved token usage tracking
- Extended template system with usage analytics
- More robust logging
- database.py: MySQL integration layer
  - get_table_list(): Retrieves all table names
  - get_specific_schema(): Gets schemas for targeted tables (token-efficient)
  - execute_read_query(): Executes SELECT queries safely
  - Connection pooling and error handling
- cache.py: Redis caching layer
  - get_cached_query(): Retrieves cached results
  - set_cached_query(): Stores query results with TTL
  - Environment-based configuration
- templates.json: Template storage
- Pre-computed query-answer pairs
- Stores raw database results
- LLM-generated answers
- Token usage information
- client.py: Flask web client
- RESTful API endpoints for chat
- Session management
- Query caching and statistics
- Error handling and logging
- WEBINTERFACE.md: Web interface documentation
- Setup and running instructions
- Feature overview
- Usage guide and examples
- Keyboard shortcuts
- Troubleshooting
- static/: Frontend assets
  - style.css: Professional CSS with dark mode
  - script.js: Interactive JavaScript with WebSocket support
  - image.png: Interface screenshot/diagram
- templates/: HTML templates
  - index.html: Main chat interface
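Since execute_read_query() is described as executing SELECT queries safely, a plausible guard is a statement-type whitelist. This is a sketch under that assumption; the actual check lives in database.py and may differ.

```python
# Sketch of a read-only guard such as execute_read_query() might apply.
# The allowed-statement set is an assumption, not the real implementation.
READ_ONLY_STATEMENTS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}

def is_read_only(sql: str) -> bool:
    """Accept only statements that cannot modify data."""
    stripped = sql.lstrip()
    if not stripped:
        return False
    first_word = stripped.split(None, 1)[0].upper()
    return first_word in READ_ONLY_STATEMENTS
```

Rejecting anything that is not explicitly read-only (rather than blacklisting writes) is the safer default for LLM-generated SQL.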
- Python 3.8+
- MySQL Server running
- Redis Server running
- OpenAI API key
cd /home/anaghk/Public/Code/llm_sql
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Create a .env file in the project root:
# OpenAI API Configuration
OPENAI_API_KEY=sk-your-api-key-here
# MySQL Database Configuration
DB_HOST=localhost
DB_USER=root
DB_PASSWORD=your_password
DB_NAME=your_database_name
# Redis Configuration
REDIS_HOST=localhost
REDIS_PORT=6379
CACHE_TTL=3600
# MCP Server Configuration
MCP_HOST=localhost
MCP_PORT=8000

# Test MySQL connection
python3 database.py
# Test Redis connection
python3 cache.py
# Test OpenAI API
python3 -c "from openai import AsyncOpenAI; print('OpenAI SDK loaded')"

The easiest way to get started is to run both the MCP server and web client together.
Terminal 1 - Start the MCP Server:
cd /home/anaghk/Public/Code/llm_sql
source venv/bin/activate
python3 main.py

Terminal 2 - Start the Web Client (in a new terminal):
cd /home/anaghk/Public/Code/llm_sql
source venv/bin/activate
python3 client.py

Then open in browser: http://localhost:5000
Create run_all.sh in the project root:
#!/bin/bash
echo "Starting LLM SQL Chat System..."
echo "================================"
# Activate virtual environment
source venv/bin/activate
# Start MCP Server in background
echo "[1/2] Starting MCP Server (main.py)..."
python3 main.py &
MCP_PID=$!
echo " PID: $MCP_PID"
# Wait for MCP server to start
sleep 2
# Start Web Client in background
echo "[2/2] Starting Web Client (client.py)..."
python3 client.py &
WEB_PID=$!
echo " PID: $WEB_PID"
echo ""
echo "✅ Both services started!"
echo " - MCP Server: Running (PID: $MCP_PID)"
echo " - Web Client: http://localhost:5000 (PID: $WEB_PID)"
echo ""
echo "Press Ctrl+C to stop all services..."
echo ""
# Stop both background services when the script is interrupted
trap 'kill $MCP_PID $WEB_PID 2>/dev/null' INT TERM
# Keep script running
wait

Make it executable and run:
chmod +x run_all.sh
./run_all.sh

Option 1: Using main.py (Standard)
source venv/bin/activate
python3 main.py

The MCP server will start and listen for connections.
Option 2: Using main.py (Recommended - Enhanced Version)
source venv/bin/activate
python3 main.py

Features better error handling and cost tracking.
Option 3: Run with Environment Reload
source venv/bin/activate
python3 main.py --reload

If the MCP server is already running elsewhere:
source venv/bin/activate
python3 client.py --host 0.0.0.0 --port 5000 --debug

Forwarding to a remote MCP server:
source venv/bin/activate
python3 client.py --host 0.0.0.0 --port 5000

Before running both services, verify:
- Virtual environment activated
- Dependencies installed: pip install -r requirements.txt
- .env file configured with all credentials
- MySQL server running
- Redis server running
- Ports 5000 (Flask) and any MCP ports are available
Terminal 1 - MCP Server:
WARNING:asyncio:... Creating LLM task...
Starting main server...
Server running...
Terminal 2 - Web Client:
[INFO] Using main (enhanced MCP server)
* Running on http://localhost:5000
* Press CTRL+C to quit
Once both are running, open: http://localhost:5000
Once the server is running, you can ask questions:
Question: "How many users are there?"
Response: (Cached) There are a total of 158 users.
Question: "From the order table, find the top 5 most-ordered products"
Response: Top products by order quantity with detailed breakdown
Question: "From the product table, find the 5 most recent products and give product_id and title"
Response: 5 most recently added products with IDs and titles
The system uses a two-tier caching approach:
| Cache Tier | Storage | TTL | Use Case |
|---|---|---|---|
| L1 Cache | templates.json | Permanent | Common/preset queries |
| L2 Cache | Redis | 1 hour (default) | Recently executed queries |
- Selective Schema Loading: Only fetches schemas for identified tables (not all tables)
- Template Reuse: Avoids re-processing identical questions
- Cost Tracking: Logs token usage per query for optimization
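The two-tier lookup described above can be sketched as follows. Plain dicts stand in for templates.json and Redis so only the control flow is shown; the names here are illustrative, not the actual functions in cache.py.

```python
import time

CACHE_TTL = 3600  # seconds, matching the default above

templates = {}  # L1: permanent, stands in for templates.json
l2_cache = {}   # L2: question -> (answer, expiry), stands in for Redis

def lookup(question: str):
    """Return (answer, source); source is 'template', 'redis', or 'miss'."""
    key = question.strip().lower()
    if key in templates:                  # L1 hit: permanent template
        return templates[key], "template"
    entry = l2_cache.get(key)
    if entry and entry[1] > time.time():  # L2 hit: within TTL
        return entry[0], "redis"
    return None, "miss"                   # fall through to the LLM

def store(question: str, answer: str):
    """Cache an LLM-produced answer with a TTL."""
    l2_cache[question.strip().lower()] = (answer, time.time() + CACHE_TTL)
```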
- Table Selection: LLM identifies relevant tables from the schema
- Schema Fetch: Only retrieve schemas for selected tables
- Query Generation: LLM creates optimal SQL
- Result Processing: Format and cache results
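The two LLM calls in this pipeline boil down to two prompts. The wording below is an assumption for illustration; the actual prompts live in main.py.

```python
def table_selection_prompt(question: str, tables: list) -> str:
    """Step 1: ask the LLM which tables matter (keeps step 3 token-cheap)."""
    return (
        "Available tables: " + ", ".join(tables) + "\n"
        f"Which tables are needed to answer: {question}\n"
        "Reply with a comma-separated list of table names only."
    )

def sql_generation_prompt(question: str, schemas: dict) -> str:
    """Step 3: generate SQL from only the schemas selected in step 1."""
    schema_text = "\n".join(f"{t}: {s}" for t, s in schemas.items())
    return (
        f"Schemas:\n{schema_text}\n"
        f"Write a single MySQL SELECT query that answers: {question}"
    )
```

Sending only the selected schemas in the second prompt is what makes the selective schema loading above pay off.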
Input:
- question (string): Natural language question about product data
Output:
- String: Formatted answer with query results
Example:
result = await ask_product_data("How many orders are there?")

DB_HOST=localhost # MySQL server hostname
DB_USER=root # MySQL username
DB_PASSWORD=your_pwd # MySQL password
DB_NAME=database_name # Database to query

REDIS_HOST=localhost # Redis server hostname
REDIS_PORT=6379 # Redis port
CACHE_TTL=3600 # Cache time-to-live in seconds

OPENAI_API_KEY=sk-... # Your OpenAI API key
OPENAI_MODEL=gpt-4o # Model to use (default: gpt-4o)

- First query: ~3-5 seconds (LLM processing + DB execution)
- Cached query: ~50-100ms (Redis lookup)
- Template query: <10ms (JSON lookup)
- Average simple query: 500-800 tokens
- Complex aggregation: 1000-1500 tokens
- Cost per query: ~$0.005-$0.01 (with GPT-4o)
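The per-query figure is simple arithmetic over token counts. The per-token prices below are assumptions for illustration; check OpenAI's current gpt-4o pricing before relying on them.

```python
# Assumed USD per-token prices (verify against current OpenAI pricing).
PRICE_INPUT = 2.50 / 1_000_000
PRICE_OUTPUT = 10.00 / 1_000_000

def query_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost of one LLM call."""
    return prompt_tokens * PRICE_INPUT + completion_tokens * PRICE_OUTPUT
```

For example, a simple query near 700 input and 100 output tokens lands well under a cent, which is why caching repeated questions dominates the savings.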
- Use Redis with adequate memory
- Regularly archive old templates
- Monitor token usage via cost tracking in main.py
- Index frequently queried columns in MySQL
| Issue | Cause | Solution |
|---|---|---|
| "Database connection failed" | MySQL not running | Start MySQL: sudo service mysql start |
| "No cached result" first query | Cache empty | Normal - LLM will process |
| "Error connecting to Redis" | Redis not running | Start Redis: redis-server |
| "Invalid API key" | Wrong OpenAI key | Check .env file and OpenAI dashboard |
| "Table not found" (SQL error) | Wrong table name | Verify table exists in schema |
| "MCP server not initialized" (Web Client) | MCP server not running | Start MCP: python3 main.py in another terminal |
| "Cannot connect to Flask" (http://localhost:5000 fails) | Web client not running | Start client: python3 client.py |
| "Port already in use" | Another service using port 5000 | Use different port: python3 client.py --port 8080 or kill process: lsof -i :5000 |
| "Module import error" in client.py | Missing dependencies | Run: pip install -r requirements.txt |
Create .env.example for team reference:
# Copy this file to .env and fill in your values
OPENAI_API_KEY=sk-your-key
DB_HOST=localhost
DB_USER=root
DB_PASSWORD=your_password
DB_NAME=your_database
REDIS_HOST=localhost
REDIS_PORT=6379
CACHE_TTL=3600

To add visual documentation of the web interface, place a screenshot at static/image.png.
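The modules read these values from the environment (python-dotenv is in the dependency list). A stdlib-only sketch of the lookup, using the defaults shown in this README:

```python
import os

def env(name: str, default: str = "") -> str:
    """Read a configuration value, falling back to a default."""
    return os.environ.get(name, default)

# Defaults mirror the .env template above.
DB_HOST = env("DB_HOST", "localhost")
REDIS_PORT = int(env("REDIS_PORT", "6379"))
CACHE_TTL = int(env("CACHE_TTL", "3600"))
```

In the real modules, load_dotenv() would populate os.environ from the .env file before these lookups run.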
- mcp: Model Context Protocol SDK
- mysql-connector-python: MySQL database driver
- redis: Redis client library
- python-dotenv: Environment variable management
- openai: OpenAI API client (async support)
- Flask: Web framework
- Flask-CORS: CORS support for Flask
See requirements.txt for all versions.
- Never commit the .env file - add it to .gitignore
- Use environment variables for all credentials
- Validate user queries - Sanitize inputs
- Restrict database user - Use least privilege
- Enable Redis authentication in production
- Use HTTPS for API calls
This project is licensed under the MIT License - see the LICENSE file for details.
Permissions:
- ✅ Commercial use
- ✅ Modification
- ✅ Distribution
- ✅ Private use
Conditions:
- 📋 License and copyright notice required
Guidelines for contributors:
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
For issues or questions:
- Check the troubleshooting section
- Review existing templates in templates.json
- Test database connectivity
- Verify OpenAI API credits
Here are examples of queries you can ask in the chat interface. The system will automatically detect if a visualization is needed and display charts where appropriate.
These queries will automatically generate charts for data visualization:
Show me orders every month for the last year
Response: Line chart showing order count by month + text analysis
- Chart Type: Line Chart
- Use Case: Track trends over time, identify seasonal patterns
- Visualization: True, chart_type: "line"
Show order distribution by category
Response: Pie chart breaking down orders by product category
- Chart Type: Pie Chart
- Use Case: Understand composition, see category breakdown
- Visualization: True, chart_type: "pie"
Show me the top 5 products ordered most
Response: Bar chart comparing product sales
- Chart Type: Bar Chart
- Use Case: Identify best-selling products, benchmark performance
- Visualization: True, chart_type: "bar"
Plot total revenue per month with trends
Response: Line chart showing revenue progression
- Chart Type: Line Chart
- Use Case: Monitor revenue performance, forecast trends
- Visualization: True, chart_type: "line"
Break down sales percentage by product type
Response: Pie chart with percentage distribution
- Chart Type: Pie Chart
- Use Case: Understand market share, optimize inventory
- Visualization: True, chart_type: "pie"
Compare sales for each week this month
Response: Bar chart with weekly performance
- Chart Type: Bar Chart
- Use Case: Weekly performance tracking, consistency analysis
- Visualization: True, chart_type: "bar"
These queries return textual analysis without charts:
How many orders are there in total?
Response: "There are 15,847 total orders in the system..."
How many users are registered?
Response: "You have 2,341 registered users..."
What is the average order value?
Response: "The average order value is $125.43..."
How many orders are in processing status?
Response: "There are 342 orders currently in processing status..."
Which product has been ordered the most?
Response: "Product 'USB-C Cable' has been ordered 1,247 times..."
What is the total revenue from all orders?
Response: "Total revenue is $1,987,654.32..."
How many orders does the average customer place?
Response: "The average customer places 6.7 orders..."
Calculate the total refunded amount from refunds
Response: "Total refunded amount is $87,932.45..."
Find 5 most recent products with product_id and title
Response: List of 5 recent products with details
List orders from customer with ID 123
Response: Complete order history for that customer
How many orders were placed in March?
Response: Order count for March with breakdown
Find products priced between $50 and $100
Response: List of products in that price range with count
Count orders by status
Response: Breakdown of orders across all statuses
"What was our best selling month?"
"Compare last month's sales to this month"
"Show me revenue trend for the last quarter"
"Which customers spent the most?"
"List our top 10 best sellers"
"Show products with low stock"
"Which products have never been ordered?"
"Compare sales performance by product category"
"How many new customers this month?"
"What is customer retention rate?"
"Find customers who haven't ordered in 90 days"
"List VIP customers (high spending)"
"What is the average order processing time?"
"Show order completion rate by status"
"Compare Q1 vs Q2 performance"
"Analyze refund rate trends"
- Be Specific: "Top 5 products" works better than "products"
- Use Time References: "last month", "this quarter", "past 30 days"
- Request Visualizations: Say "show me a chart of..." for automatic visualization
- Include Metrics: "revenue", "orders", "sales", "customers"
- Natural Language: Write as you would speak to a colleague
{
"success": true,
"message": "Show me orders every month",
"answer": "Based on database analysis, order volume shows steady growth with seasonal peaks in Q4...",
"cached": false,
"timestamp": "2026-04-02T12:30:00.000000",
"source": "llm",
"visualise": true,
"chart_type": "line",
"chart_data": {
"labels": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
"datasets": [{
"label": "Orders",
"data": [1250, 1340, 1180, 1450, 1620, 1890],
"borderColor": "rgb(75, 192, 192)",
"backgroundColor": "rgba(75, 192, 192, 0.1)"
}]
}
}

{
"success": true,
"message": "How many orders are there?",
"answer": "There are 15,847 total orders in the system. Of these, 12,450 are completed, 2,150 are pending, and 1,247 are cancelled.",
"cached": false,
"timestamp": "2026-04-02T12:32:00.000000",
"source": "llm",
"visualise": false
}

- Open http://localhost:3000
- Type any question about your database
- Sit back and watch AI-powered answers with charts appear instantly!
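A client only needs the visualise flag and chart_data to decide how to display a response. A minimal sketch, with field names taken from the response examples above (the function name is illustrative):

```python
def render_summary(resp: dict) -> str:
    """Describe how a response should be displayed."""
    if resp.get("visualise") and resp.get("chart_data"):
        chart_type = resp.get("chart_type", "bar")
        points = len(resp["chart_data"].get("labels", []))
        return f"{chart_type} chart ({points} points) plus answer text"
    return resp.get("answer", "")
```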
Example First Question:
Show me orders every month for the last year
Last Updated: April 2, 2026
Version: 1.1
Status: Production Ready with Visualization Support
