FraudGuard AI is a high-fidelity, production-grade fraud intelligence platform that fuses classical machine learning with generative AI forensics, wrapped in a cyberpunk-grade terminal interface. Detect fraud in milliseconds. Understand it in seconds.
| Capability | Technology | Performance |
|---|---|---|
| Real-time transaction scoring | XGBoost Classifier | 99.98% AUC-ROC |
| Natural language explanations | Groq · Llama-3.3-70b | < 800ms response |
| Bank statement forensics | Two-call AI pipeline | PDF + Image support |
| Forensic PDF export | fpdf2 | Full styled report |
| Interactive 3D risk graph | 3d-force-graph | Live node/edge mapping |
graph TB
subgraph CLIENT["🖥️ BROWSER CLIENT"]
UI["Cyberpunk Dashboard\n index.html + app.js"]
CHARTS["Chart.js Visualizations"]
GRAPH["3D Force Graph\nRelationship Map"]
end
subgraph API["⚡ FASTAPI BACKEND"]
PREDICT["/predict\nTransaction Scoring"]
ANALYSE["/analyse-statement\nStatement Forensics"]
REPORT["/report/{txn_id}\nPDF Generation"]
EXPORT["/export-statement-report\nStatement PDF"]
STORE["In-Memory\nReport Store\n(OrderedDict, max 50)"]
end
subgraph ML["🤖 ML LAYER"]
XGB["XGBoost Classifier\nxboost_model.pkl"]
FEAT["Feature Engineering\nlog_amount, recency, etc."]
end
subgraph AI["🧠 AI FORENSICS LAYER"]
GROQ["Groq API"]
LLM["Llama-3.3-70b-versatile\nText Analysis"]
VISION["Vision Model\nImage Statements"]
end
subgraph PDF["REPORT ENGINE"]
FPDF["fpdf2\nStyled PDF Builder"]
PYPDF["pypdf\nPDF Text Extraction"]
end
UI -->|"POST /predict"| PREDICT
UI -->|"POST /analyse-statement"| ANALYSE
UI -->|"GET /report/{id}"| REPORT
UI -->|"POST /export-statement-report"| EXPORT
PREDICT --> FEAT --> XGB
XGB -->|"fraud_probability"| PREDICT
PREDICT -->|"+ summary"| GROQ
GROQ --> LLM
PREDICT --> STORE
ANALYSE --> PYPDF
ANALYSE --> VISION
PYPDF --> LLM
VISION --> LLM
LLM -->|"Call 1: Extract"| ANALYSE
LLM -->|"Call 2: Risk Score"| ANALYSE
REPORT --> STORE
STORE --> FPDF
FPDF --> REPORT
FPDF --> EXPORT
PREDICT -->|"JSON Response\ntxn_id + scores + summary"| UI
ANALYSE -->|"raw_analysis\nrisk_summary"| UI
UI --> CHARTS
UI --> GRAPH
style CLIENT fill:#0d0d18,stroke:#6366f1,color:#e2e8f0
style API fill:#0d0d18,stroke:#6366f1,color:#e2e8f0
style ML fill:#0d0d18,stroke:#10b981,color:#e2e8f0
style AI fill:#0d0d18,stroke:#f59e0b,color:#e2e8f0
style PDF fill:#0d0d18,stroke:#f43f5e,color:#e2e8f0
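The diagram labels the report store as an in-memory `OrderedDict` capped at 50 entries. A minimal sketch of such a bounded store follows. The class name `ReportStore` and the oldest-first eviction policy are assumptions for illustration; the actual code in `app/main.py` may differ.

```python
from collections import OrderedDict
from typing import Any, Dict, Optional

MAX_REPORTS = 50  # matches the "max 50" label in the architecture diagram


class ReportStore:
    """Bounded in-memory store that evicts the oldest entry once full.

    Hypothetical helper: oldest-first eviction is an assumption.
    """

    def __init__(self, max_size: int = MAX_REPORTS) -> None:
        self._items: "OrderedDict[str, Dict[str, Any]]" = OrderedDict()
        self._max_size = max_size

    def put(self, txn_id: str, result: Dict[str, Any]) -> None:
        # Re-inserting an existing id moves it to the newest position.
        self._items.pop(txn_id, None)
        self._items[txn_id] = result
        while len(self._items) > self._max_size:
            self._items.popitem(last=False)  # drop the oldest entry

    def get(self, txn_id: str) -> Optional[Dict[str, Any]]:
        # None signals "not found"; an endpoint could map this to a 404.
        return self._items.get(txn_id)

    def __len__(self) -> int:
        return len(self._items)
```

Because the store lives in process memory, its contents would be lost on restart and not shared between worker processes.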
sequenceDiagram
participant U as 🖥️ User
participant FE as Frontend
participant API as FastAPI /predict
participant ML as XGBoost Model
participant G as Groq LLM
participant S as Report Store
U->>FE: Fill form & click INITIATE SCAN
FE->>FE: Assemble JSON payload<br/>(14 features + metadata)
FE->>API: POST /predict
API->>API: Feature engineering<br/>(log_amount, one-hot types)
API->>ML: predict_proba(features)
ML-->>API: fraud_probability: 0.923
API->>API: Compute risk_level<br/>(HIGH / MEDIUM / LOW)
API->>G: Build analyst prompt + invoke
G-->>API: 2–3 sentence summary
API->>S: Store result with txn_id
API-->>FE: {txn_id, fraud_probability,<br/>is_fraud, risk_level, summary}
FE->>FE: Animate gauge arc
FE->>FE: Render verdict card
FE->>FE: Typewriter AI summary
FE->>FE: Update 3D graph
FE-->>U: Full visual dashboard
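The steps between `predict_proba` and the JSON response can be sketched as below. The 0.8 cutoff is the fraud decision threshold listed in the model table; the 0.5 cutoff for the MEDIUM tier, the `"LEGIT"` label, and the UUID-based `txn_id` are assumptions made for illustration, not confirmed details of the backend.

```python
import uuid
from typing import Dict, Union

FRAUD_THRESHOLD = 0.8   # decision threshold stated in this README
MEDIUM_THRESHOLD = 0.5  # hypothetical cutoff for the MEDIUM tier


def build_verdict(
    fraud_probability: float, threshold: float = FRAUD_THRESHOLD
) -> Dict[str, Union[str, float, bool]]:
    """Turn a model probability into the fields of the /predict response."""
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be within [0, 1]")

    is_fraud = fraud_probability >= threshold
    if is_fraud:
        risk_level = "HIGH"
    elif fraud_probability >= MEDIUM_THRESHOLD:
        risk_level = "MEDIUM"
    else:
        risk_level = "LOW"

    return {
        "txn_id": uuid.uuid4().hex[:12],  # assumed id scheme: 12 hex characters
        "fraud_probability": round(fraud_probability, 4),
        "is_fraud": is_fraud,
        "decision": "FRAUD" if is_fraud else "LEGIT",  # "LEGIT" is a placeholder
        "risk_level": risk_level,
    }
```

The LLM summary is left out here; in the flow above it is generated after the risk level is known and added to the same response.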
sequenceDiagram
participant U as 🖥️ User
participant FE as Frontend
participant API as FastAPI /analyse-statement
participant P as pypdf
participant G as Groq LLM (Call 1)
participant G2 as Groq LLM (Call 2)
U->>FE: Drop PDF or Image file
FE->>API: POST /analyse-statement (multipart)
alt PDF File
API->>P: Extract raw text from all pages
P-->>API: raw_text (string)
else Image File
API->>API: Read bytes, construct prompt
end
API->>G: CALL 1 - Transaction Extraction<br/>"Format all transactions as<br/>DATE | DESCRIPTION | AMOUNT | BALANCE"
G-->>API: raw_analysis (structured list)
API->>API: Count transactions via regex
API->>G2: CALL 2 - Risk Assessment<br/>"Score: Velocity / Destination /<br/>Amount / Pattern / Overall"
G2-->>API: 5-line structured risk_summary
API-->>FE: {file_type, transaction_count,<br/>raw_analysis, risk_summary}
FE->>FE: Parse transactions client-side
FE->>FE: Render verdict strip
FE->>FE: Draw volume bar chart
FE->>FE: Draw balance line chart
FE->>FE: Animate risk DNA bars
FE->>FE: Populate transaction table
FE->>FE: Typewriter risk summary
FE-->>U: Full forensic dashboard
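The "count transactions via regex" step can be illustrated with a short parser for the `DATE | DESCRIPTION | AMOUNT | BALANCE` format requested in Call 1. The pattern below is an assumption about what the LLM output looks like (plain numbers with optional sign, thousands separators, and decimals); the actual regex in the backend is not shown in this README.

```python
import re
from typing import Dict, List, Union

# One transaction per line: DATE | DESCRIPTION | AMOUNT | BALANCE.
# The numeric shapes accepted here are assumptions about the LLM output.
NUMBER = r"-?\d[\d,]*(?:\.\d+)?"
ROW_PATTERN = re.compile(
    r"^[ \t]*(?P<date>[^|\n]+?)[ \t]*\|"
    r"[ \t]*(?P<description>[^|\n]+?)[ \t]*\|"
    rf"[ \t]*(?P<amount>{NUMBER})[ \t]*\|"
    rf"[ \t]*(?P<balance>{NUMBER})[ \t]*$",
    re.MULTILINE,
)


def parse_transactions(raw_analysis: str) -> List[Dict[str, Union[str, float]]]:
    """Extract well-formed transaction rows; header and prose lines are skipped."""
    rows: List[Dict[str, Union[str, float]]] = []
    for match in ROW_PATTERN.finditer(raw_analysis):
        rows.append(
            {
                "date": match.group("date"),
                "description": match.group("description"),
                "amount": float(match.group("amount").replace(",", "")),
                "balance": float(match.group("balance").replace(",", "")),
            }
        )
    return rows


def count_transactions(raw_analysis: str) -> int:
    return len(parse_transactions(raw_analysis))
```

Rows with currency symbols or other number formats would not match this pattern and would need a looser expression.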
flowchart LR
A["User clicks\nEXPORT FORENSIC REPORT"] --> B["Frontend sends\nGET /report/{txn_id}"]
B --> C{{"txn_id in\nreport_store?"}}
C -->|No| D["404 - Report Not Found"]
C -->|Yes| E["Retrieve stored\nresult dict"]
E --> F["fpdf2 builds PDF"]
F --> G["Page background\n#050508"]
G --> H["Indigo header bar\n+ Report ID"]
H --> I["Verdict banner\nROSE or EMERALD"]
I --> J["Score + Risk tier\ncells"]
J --> K["Feature table\n18 rows alternating"]
K --> L["AI summary block\nwith word wrap"]
L --> M["Watermark +\nFooter"]
M --> N["StreamingResponse\napplication/pdf"]
N --> O["Browser downloads\nFRAUDGUARD_{txn_id}.pdf"]
style A fill:#0d0d18,stroke:#6366f1,color:#e2e8f0
style F fill:#0d0d18,stroke:#f43f5e,color:#e2e8f0
style O fill:#0d0d18,stroke:#10b981,color:#e2e8f0
fraudguard/
│
├── app/
│   ├── main.py               ← FastAPI core: routing, ML inference, LLM forensics, PDF engine
│   └── static/
│       ├── index.html        ← Cyberpunk dashboard: structure, styling, all UI components
│       └── app.js            ← Frontend logic: API wiring, charts, 3D graph, animations
│
├── models/
│   └── xboost_model.pkl      ← Pre-trained XGBoost classifier (99.98% AUC-ROC)
│
├── preprocessing/
│   ├── prepare_data.py       ← Retraining Step 1: CSV → Parquet conversion
│   ├── features.py           ← Retraining Step 2: Feature engineering
│   └── split_by_step.py      ← Retraining Step 3: Train/val/test split by time step
│
├── train.py                  ← Retrain XGBoost from scratch
├── requirements.txt          ← All Python dependencies
├── .env.example              ← Environment variable template (add your Groq key)
└── README.md                 ← This file
- Python 3.9+
- A free Groq API key: console.groq.com/keys

Install the dependencies:

`pip install -r requirements.txt`

Or install directly:

`pip install fastapi uvicorn pydantic joblib pandas numpy scikit-learn xgboost langchain-groq python-dotenv fpdf2 pypdf`

Copy the environment template:

`cp .env.example .env`

Open `.env` and set your key:

`GROQ_API_KEY=gsk_your_key_here`

Start the server:

`uvicorn app.main:app --reload`

Dashboard: http://localhost:8000

API docs (Swagger UI): http://localhost:8000/docs
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/predict` | Run XGBoost inference + Groq summary on a transaction |
| `GET` | `/report/{txn_id}` | Download styled forensic PDF for a scanned transaction |
| `POST` | `/analyse-statement` | Upload PDF/image bank statement for AI forensic analysis |
| `POST` | `/export-statement-report` | Generate and download a styled PDF of the statement analysis |
| `GET` | `/` | Serve the main dashboard UI |
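A minimal client sketch for the `/predict` endpoint, using only the Python standard library. It assumes the server is running at the default `http://localhost:8000` address and that the endpoint accepts a JSON body; the function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:8000"  # default uvicorn address used in this README


def build_predict_request(
    payload: Dict[str, Any], base_url: str = BASE_URL
) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request for /predict."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score_transaction(
    payload: Dict[str, Any], base_url: str = BASE_URL, timeout: float = 10.0
) -> Dict[str, Any]:
    """Send the request and decode the JSON response (requires a running server)."""
    request = build_predict_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `score_transaction(payload)` with a full transaction payload should return a dict with `txn_id`, `fraud_probability`, `is_fraud`, `risk_level`, and `summary`, provided the backend is up and a Groq key is configured.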
Example `POST /predict` request body:

{
"amount": 9800.00,
"recency_hours": 0.3,
"txn_count_24h": 4,
"is_dest_new": 1,
"hours_day": 2,
"oldbalanceOrg": 21400.00,
"newbalanceOrig": 1800.00,
"oldbalanceDest": 0.00,
"newbalanceDest": 9800.00,
"type_TRANSFER": 1,
"type_CASH_OUT": 0,
"type_CASH_IN": 0,
"type_DEBIT": 0,
"type_PAYMENT": 0,
"currency": "USD",
"user_id": "U-294857",
"transaction_id": "T-9XY8CAQ8"
}

Example response:

{
"txn_id": "f53ebcf10ffd",
"fraud_probability": 0.9821,
"is_fraud": true,
"decision": "FRAUD",
"risk_level": "HIGH",
"summary": "This TRANSFER drains 91.6% of the sender's balance to a first-time recipient at 2AM, a pattern highly consistent with account takeover fraud..."
}

| Attribute | Value |
|---|---|
| Algorithm | XGBoost Classifier |
| Training Dataset | PaySim (6.3M synthetic transactions) |
| AUC-ROC | 99.98% |
| Precision (@ 0.8 threshold) | 80.34% |
| Recall | 99.20% |
| Fraud decision threshold | 0.8 (configurable) |
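To make the precision and recall figures concrete, the following sketch converts them into expected alert counts for a given number of true fraud cases. It assumes both metrics were measured at the same 0.8 threshold; the table states this explicitly only for precision.

```python
from typing import Dict


def expected_alerts(true_frauds: int, precision: float, recall: float) -> Dict[str, float]:
    """Estimate alert volumes from precision and recall.

    recall    = caught / true_frauds
    precision = caught / flagged
    """
    caught = true_frauds * recall   # true positives
    flagged = caught / precision    # every transaction flagged as fraud
    return {
        "caught": caught,
        "missed": true_frauds - caught,
        "false_alarms": flagged - caught,
        "flagged": flagged,
    }


# Figures from the table above, applied to 1,000 true fraud cases.
stats = expected_alerts(true_frauds=1000, precision=0.8034, recall=0.9920)
print({name: round(value, 1) for name, value in stats.items()})
```

With these numbers, roughly one in five flagged transactions is a false alarm, while only about 8 in 1,000 true fraud cases go undetected.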
The model takes 15 input features:

`amount`, `log_amount`, `recency_hours`, `txn_count_24h`, `is_dest_new`, `hours_day`, `oldbalanceOrg`, `newbalanceOrig`, `oldbalanceDest`, `newbalanceDest`, `type_CASH_IN`, `type_CASH_OUT`, `type_DEBIT`, `type_PAYMENT`, `type_TRANSFER`
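A hedged sketch of how a request payload could be turned into the 15-value feature row. Two points are assumptions: that `log_amount` is computed as `log(1 + amount)`, and that the model expects the columns in the order listed above. Metadata fields such as `currency` and `user_id` are ignored.

```python
import math
from typing import Any, Dict, List

# Column order as listed in this README (assumed to match the trained model).
FEATURE_ORDER = [
    "amount", "log_amount", "recency_hours", "txn_count_24h",
    "is_dest_new", "hours_day", "oldbalanceOrg", "newbalanceOrig",
    "oldbalanceDest", "newbalanceDest", "type_CASH_IN", "type_CASH_OUT",
    "type_DEBIT", "type_PAYMENT", "type_TRANSFER",
]


def build_feature_row(payload: Dict[str, Any]) -> List[float]:
    """Derive log_amount and return the features in model column order."""
    features = dict(payload)
    # Assumption: log_amount = log(1 + amount); the real transform may differ.
    features["log_amount"] = math.log1p(float(payload["amount"]))
    return [float(features[name]) for name in FEATURE_ORDER]
```

A missing feature raises `KeyError`, which is a reasonable failure mode for a sketch; a real endpoint would more likely validate the payload with a Pydantic model first.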
If you have the PaySim dataset:
# Step 1: Prepare raw CSV
python preprocessing/prepare_data.py
# Step 2: Engineer features
python preprocessing/features.py
# Step 3: Split by time step
python preprocessing/split_by_step.py
# Train
python train.py
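Step 3 splits the data by time step rather than at random, so the model is always evaluated on transactions that occur after the ones it was trained on. The idea can be sketched as follows; the function signature and the cutoff values are hypothetical, and the real `split_by_step.py` may work differently. PaySim rows carry a `step` column that encodes simulated time.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, float]


def split_by_step(
    rows: Iterable[Row], train_end: int, val_end: int
) -> Tuple[List[Row], List[Row], List[Row]]:
    """Chronological split on the `step` column.

    step <= train_end             -> train
    train_end < step <= val_end   -> validation
    step > val_end                -> test
    """
    if train_end >= val_end:
        raise ValueError("train_end must be smaller than val_end")

    train: List[Row] = []
    val: List[Row] = []
    test: List[Row] = []
    for row in rows:
        step = row["step"]
        if step <= train_end:
            train.append(row)
        elif step <= val_end:
            val.append(row)
        else:
            test.append(row)
    return train, val, test
```

A random split would let future transactions leak into the training set, which tends to inflate validation metrics on time-ordered fraud data.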