🔍 FinCrime SQL Analytics

SQL-driven financial crime analysis on a simulated fintech transactional database. Ten progressive queries — from foundational aggregations to advanced window functions — answering real compliance and analytics questions.

Built as a portfolio project to demonstrate SQL proficiency for Data Analyst and Financial Crime Analyst roles in fintech.

📋 Overview

This project takes a relational database with 4 tables (users, merchants, transactions, alerts) and answers 10 business questions that a compliance or analytics team at a neobank would face daily. Each query teaches a SQL concept while solving a real problem.

Database schema:

users (3,000 rows)          merchants (500 rows)
├── user_id (PK)            ├── merchant_id (PK)
├── signup_date             ├── merchant_name
├── country                 ├── category
├── age                     ├── country
├── risk_tier               └── is_high_risk
└── is_active

transactions (200,000 rows)     alerts (4,000 rows)
├── transaction_id (PK)         ├── alert_id (PK)
├── user_id (FK → users)        ├── transaction_id (FK → transactions)
├── merchant_id (FK → merchants)├── alert_date
├── timestamp                   ├── status
├── amount_usd                  ├── analyst_id
├── country                     └── resolution_days
├── channel
└── is_fraud

🎯 Queries and business questions

Level 1 — Aggregations & GROUP BY

#	Business question	SQL concepts
Q1	Top 5 countries by transaction volume	GROUP BY, COUNT, SUM, ORDER BY, LIMIT
Q2	Fraud rate by merchant category	JOIN, AVG on binary column
Q3	High-volume users with suspicious activity	GROUP BY, HAVING

Level 2 — CASE WHEN, Subqueries, Multiple JOINs

#	Business question	SQL concepts
Q4	Classify transactions into risk bands	CASE WHEN, conditional categorization
Q5	Countries with above-average fraud rates	Subquery in HAVING
Q6	Full alert investigation view (3-table join)	Multiple JOINs, table aliases
Q7	Analyst productivity and detection accuracy	CASE WHEN inside SUM, NULLIF

Level 3 — CTEs & Window Functions

#	Business question	SQL concepts
Q8	Top users by spending with global % share	CTE (WITH), ROW_NUMBER, CROSS JOIN
Q9	Time between consecutive user transactions	LAG, PARTITION BY, velocity detection
Q10	Transaction amount vs user's rolling average	Window frames (ROWS BETWEEN), anomaly scoring

🔬 Key technical highlights

Velocity detection (Q9): uses LAG() OVER (PARTITION BY user_id ORDER BY timestamp) to calculate time gaps between consecutive transactions per user. Transactions with gaps of seconds or minutes indicate card testing or account takeover — the same logic behind real-time fraud monitoring systems.

Anomaly scoring (Q10): computes a rolling average of each user's last 10 transactions using ROWS BETWEEN 10 PRECEDING AND 1 PRECEDING (excluding the current row to prevent data leakage), then flags transactions where the amount exceeds 5x the user's baseline. This is the SQL equivalent of the amount_vs_user_avg feature in the FinCrime Transaction Monitor ML project.

Analyst KPIs (Q7): calculates precision (confirmed fraud / total closed alerts) per analyst using conditional aggregation (CASE WHEN inside SUM), demonstrating how to build operational dashboards from raw alert data.

🔗 Connection to FinCrime Transaction Monitor

This project complements the FinCrime Transaction Monitor (XGBoost + Streamlit). The ML project engineers features in Python; this project shows the same logic can be implemented in pure SQL:

ML feature (Python)	SQL equivalent (this project)
`tx_count_1h` (velocity)	Q9: LAG + time delta
`amount_vs_user_avg` (anomaly)	Q10: rolling AVG with ROWS BETWEEN
`is_high_risk_mcc` (risk flag)	Q4: CASE WHEN risk bands

In production, feature engineering lives in SQL (executed in BigQuery, Snowflake, or a feature store), and the ML model consumes the results. This project demonstrates fluency in both approaches.

🚀 How to run

Prerequisites

Python 3.10+
~50 MB free disk space

Setup

git clone https://github.com/Rivalry11/fincrime-sql-analytics.git
cd fincrime-sql-analytics

python -m venv venv
source venv/bin/activate          # Windows: venv\Scripts\activate
pip install -r requirements.txt

Run

# Generate database (if not present)
python data/generate_db.py

# Open notebook
jupyter notebook notebooks/fincrime_sql_analytics.ipynb

The database file data/fincrime.db is included in the repo, so you can open the notebook directly without generating data.

🛠️ Tech stack

DuckDB — in-process analytical database (no server setup required)
SQL — all analysis in pure SQL, no Python data manipulation
Python — only for database generation and notebook execution
Jupyter — interactive notebook with documented queries

📚 SQL concepts covered

Fundamentals: SELECT, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT, JOIN, aggregate functions (COUNT, SUM, AVG, ROUND)

Intermediate: CASE WHEN, subqueries, multiple JOINs (3+ tables), conditional aggregation (CASE inside SUM), NULLIF, IN operator

Advanced: Common Table Expressions (WITH/CTE), window functions (ROW_NUMBER, LAG), PARTITION BY, window frames (ROWS BETWEEN), CROSS JOIN, EXTRACT

👤 Author

Camila Rubio Cuellar Data Analyst transitioning to Financial Crime Analytics LinkedIn · GitHub

Built April 2026 as part of an active job search portfolio.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🔍 FinCrime SQL Analytics

📋 Overview

🎯 Queries and business questions

Level 1 — Aggregations & GROUP BY

Level 2 — CASE WHEN, Subqueries, Multiple JOINs

Level 3 — CTEs & Window Functions

🔬 Key technical highlights

🔗 Connection to FinCrime Transaction Monitor

🚀 How to run

Prerequisites

Setup

Run

🛠️ Tech stack

📚 SQL concepts covered

👤 Author

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
data		data
notebooks		notebooks
venv		venv
README.md		README.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

🔍 FinCrime SQL Analytics

📋 Overview

🎯 Queries and business questions

Level 1 — Aggregations & GROUP BY

Level 2 — CASE WHEN, Subqueries, Multiple JOINs

Level 3 — CTEs & Window Functions

🔬 Key technical highlights

🔗 Connection to FinCrime Transaction Monitor

🚀 How to run

Prerequisites

Setup

Run

🛠️ Tech stack

📚 SQL concepts covered

👤 Author

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages