🛡️ AI-Powered Web Activity Firewall: Real-Time Bot Detection

✨ Project Overview

Bot Detector is a real-time, scalable AI-powered firewall designed to detect and mitigate automated threats across web and network environments. It combines behavioral analytics with deep network-flow profiling to identify bot-driven attacks and anomalous user behavior—without compromising user experience.

🚨 Problem Statement

Modern web platforms face a surge in automated attacks such as credential stuffing, scraping, and reconnaissance. Traditional rule-based firewalls often:

Frustrate legitimate users
Fail to detect novel or low-volume threats
Require constant manual updates
Raise privacy and accessibility concerns

🎯 Solution Highlights

Bot Detector addresses these limitations through:

Real-time detection using streaming data pipelines
Ensemble modeling combining login behavior and network flow
Privacy-conscious design with full control over data
Cloud-native deployment for scalability and resilience

🛠️ Core Features

Streaming Architecture: Kafka-based ingestion and real-time feature extraction
Machine Learning Models: LightGBM classifiers trained on synthetic and real-world datasets
Ensemble Intelligence: Combines login behavior and network-flow metrics for robust scoring
FastAPI Service: Issues JWTs for verified users, blocks bots, and logs activity
Admin Dashboard: Visualizes attack patterns and supports real-time monitoring
Feedback Loop: Continuous model retraining using Optuna for hyperparameter tuning
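
The README does not say how the login and network-flow models are combined. One common option, shown here purely as an assumption, is a weighted average of the two predicted bot probabilities followed by a block/allow threshold. The names `ensemble_score`, `decide`, `login_weight`, and `block_threshold` are hypothetical and are not taken from the project code.

```python
def ensemble_score(login_prob: float, flow_prob: float, login_weight: float = 0.5) -> float:
    """Weighted average of two bot probabilities, each expected in [0, 1].

    Assumption: the real project may use a different combiner
    (for example a stacked model); this is only an illustration.
    """
    if not 0.0 <= login_weight <= 1.0:
        raise ValueError("login_weight must be in [0, 1]")
    return login_weight * login_prob + (1.0 - login_weight) * flow_prob


def decide(score: float, block_threshold: float = 0.5) -> str:
    """Map a combined score to an action; the threshold is a hypothetical default."""
    return "block" if score >= block_threshold else "allow"
```

In practice the weight and threshold would be tuned on held-out data rather than fixed at 0.5.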

📊 Datasets Used

Login Behavior Dataset (Synthetic, 400K rows)
- Balanced: 200K benign vs. 200K attack samples
- Features: time_to_submit, failed_login_count_last_10min, user_agent, login_hour, client_ip, password_length, is_username_email
Network Flow Dataset (CIC-IDS2018, ~1.1M flows)
- 78 flow metrics per TCP session
- Attack classes: DoS, brute-force, web attacks, etc.
Ensemble Test Set (10K rows)
- Combines login and flow features to evaluate joint model performance
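
A feature such as failed_login_count_last_10min implies a sliding time window over recent events. The sketch below shows one way such a counter could work, assuming failures are keyed by client IP and arrive in non-decreasing timestamp order. The class `FailedLoginCounter` and its methods are hypothetical, not the project's actual feature extractor.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 600  # 10 minutes, matching the feature name


class FailedLoginCounter:
    """Counts failed logins per client IP within a sliding time window."""

    def __init__(self, window_seconds: float = WINDOW_SECONDS) -> None:
        self.window = window_seconds
        self._failures = defaultdict(deque)  # client_ip -> timestamps (seconds)

    def record_failure(self, client_ip: str, timestamp: float) -> None:
        # Assumes timestamps for a given IP are appended in time order.
        self._failures[client_ip].append(timestamp)

    def count(self, client_ip: str, now: float) -> int:
        """Return the number of failures in the interval (now - window, now]."""
        timestamps = self._failures[client_ip]
        # Evict entries that have fallen out of the window.
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()
        return len(timestamps)
```

A deque keeps eviction cheap because expired entries are always at the left end.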

🏗️ Architecture Overview

Microservices: Containerized components for ingestion, scoring, API, and UI
Deployment: Docker + docker-compose (or Kubernetes), AWS-ready (EC2, ECS, RDS, MSK)
Fallback Logic: Simple rule-based detection if ML model is unavailable
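
The fallback behavior can be pictured as a scoring wrapper that tries the ML model first and drops to simple rules if the model is missing or fails. This is a minimal sketch under stated assumptions: the thresholds `MAX_FAILED_LOGINS` and `MIN_SUBMIT_SECONDS` and the functions `rule_based_score` and `score_event` are hypothetical, since the README does not publish the actual rules.

```python
from typing import Callable, Mapping, Optional

# Hypothetical thresholds, chosen only for illustration.
MAX_FAILED_LOGINS = 5
MIN_SUBMIT_SECONDS = 0.5


def rule_based_score(event: Mapping[str, float]) -> float:
    """Return 1.0 (likely bot) or 0.0 (likely benign) from two simple rules."""
    too_many_failures = event.get("failed_login_count_last_10min", 0) >= MAX_FAILED_LOGINS
    too_fast = event.get("time_to_submit", float("inf")) < MIN_SUBMIT_SECONDS
    return 1.0 if (too_many_failures or too_fast) else 0.0


def score_event(
    event: Mapping[str, float],
    model: Optional[Callable[[Mapping[str, float]], float]] = None,
) -> float:
    """Use the model when available; otherwise fall back to the rules."""
    if model is not None:
        try:
            return float(model(event))
        except Exception:
            # Model failed at inference time; fall through to the rules.
            pass
    return rule_based_score(event)
```

Returning a score in the same range as the model keeps downstream thresholding unchanged when the fallback is active.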

📈 Experimental Results

Model	AUC	FPR @ 95% TPR
Login-only	0.93	8%
Ensemble	0.98	2%

AUC improves by 0.05 (0.93 to 0.98)
False-positive rate at 95% TPR drops from 8% to 2%, a 75% relative reduction
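
For readers unfamiliar with the metric, "FPR @ 95% TPR" is the lowest false-positive rate achievable at any score threshold that still catches at least 95% of attacks. The sketch below computes it in plain Python and also reproduces the relative-reduction arithmetic from the table. The function names are hypothetical, and the quadratic loop is chosen for readability rather than speed; it is not the project's evaluation code.

```python
from typing import Sequence


def fpr_at_tpr(labels: Sequence[int], scores: Sequence[float], target_tpr: float = 0.95) -> float:
    """Lowest FPR over all thresholds whose TPR is at least target_tpr.

    labels: 1 for attack, 0 for benign. A sample is flagged when score >= threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    best = 1.0  # flagging everything always reaches TPR = 1.0 with FPR = 1.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        if tp / positives >= target_tpr:
            best = min(best, fp / negatives)
    return best


def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from before to after, e.g. 0.08 -> 0.02 gives about 0.75."""
    return (before - after) / before
```

With the table's values, `relative_reduction(0.08, 0.02)` evaluates to approximately 0.75, which is the 75% figure quoted above.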

🔧 Technologies Used

Python
FastAPI
LightGBM
Kafka
Docker
PostgreSQL
React
Optuna
CIC-IDS2018 dataset

👤 My Role

Led feature engineering and exploratory data analysis
Developed and optimized LightGBM models
Integrated ML components into the backend API
Validated and tuned ensemble models
Collaborated on deployment and system architecture

🚀 Future Improvements

Early integration testing and CI/CD setup
Enhanced user feedback and usability testing
Visual project tracking (e.g., Trello, Gantt charts)
Risk assessment planning at project kickoff

🛠️ Setup Instructions

Clone the Repository:

git clone <repository-url>
cd bot-detector

Install Dependencies:

pip install -r requirements.txt

Set Up Environment:

Configure Kafka, PostgreSQL, and environment variables in .env.

Example .env:

KAFKA_BOOTSTRAP_SERVERS=localhost:9092
POSTGRES_URL=postgresql://user:password@localhost:5432/bot_detector
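
A service might read these two variables at startup and fail fast if either is missing. The sketch below assumes the values have already been exported into the process environment (for example by the container runtime); the `Settings` class and `load_settings` function are hypothetical and are not taken from the repository.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_KEYS = ("KAFKA_BOOTSTRAP_SERVERS", "POSTGRES_URL")


@dataclass(frozen=True)
class Settings:
    kafka_bootstrap_servers: str
    postgres_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, raising if any are unset or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        kafka_bootstrap_servers=env["KAFKA_BOOTSTRAP_SERVERS"],
        postgres_url=env["POSTGRES_URL"],
    )
```

Passing the mapping as a parameter makes the loader easy to test without touching the real environment.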

Run Services:

docker-compose up -d

Access the Dashboard:

Navigate to http://localhost:3000 for the React-based admin dashboard
