An end-to-end machine learning project that predicts whether a loan will be approved or rejected based on applicant demographics, financial profile, and loan details. It is built as a full ML pipeline with hyperparameter tuning, fairness analysis, and model explainability, and is deployed as a Streamlit web app.
Predict a loan applicant's approval status given inputs like age, income, employment experience, credit score, loan amount, education level, and loan purpose.
Target variable: loan_status (binary classification problem: 0 = Rejected, 1 = Approved)
Raw Data → EDA → Outlier Removal → Data Preprocessing → SMOTE Balancing
↓
User Input → Streamlit Web App → Voting Classifier Pipeline → Approved / Rejected
↑
voting_classifier.pkl
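The SMOTE balancing step in the diagram can be illustrated with a small pure-Python sketch. It only shows the core idea (a synthetic minority row is an interpolation between a minority row and one of its nearest minority neighbours). The function name `smote_oversample` and the toy rows are hypothetical, and a library implementation such as imbalanced-learn's `SMOTE` would normally be used instead.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row (the row itself excluded)
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        # The new point lies on the segment between base and neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical, already-scaled feature rows: 6 majority rows vs 3 minority rows
majority = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.2], [0.1, 0.5]]
minority = [[0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]

new_rows = smote_oversample(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))
```

Because every synthetic row sits between two real minority rows, oversampling never produces values outside the range already seen for that class.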
Loan-Approval-Predictor/
│
├── .vscode/
├── artifacts/
│ └── voting_classifier.pkl
├── .gitignore
├── README.md
├── app.py
├── app_screenshot.png
├── loan-approval-predictor.ipynb
└── requirements.txt
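How `app.py` uses the pickled model under `artifacts/` is not shown here, so the following is only a minimal sketch of the inference path (load the artifact, predict, map the class to a label). The helper names `load_model` and `predict_labels` are hypothetical, and the sketch assumes the unpickled object exposes a scikit-learn style `predict` method.

```python
import pickle
from pathlib import Path

# Matches the loan_status encoding: 0 = Rejected, 1 = Approved
LABELS = {0: "Rejected", 1: "Approved"}


def load_model(path):
    """Unpickle a fitted model or pipeline from disk (trusted files only)."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


def predict_labels(model, batch):
    """Return a human-readable label for every row in the batch.

    The batch format is left to the caller; a fitted preprocessing pipeline
    typically expects the same columns it was trained on.
    """
    return [LABELS[int(prediction)] for prediction in model.predict(batch)]
```

A Streamlit app would typically call `load_model` once at startup and `predict_labels` each time the form is submitted. Only unpickle files you trust, since `pickle.load` can execute arbitrary code.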
| Feature | Type | Example |
|---|---|---|
| Age | Numeric | 28 |
| Annual Income | Numeric | 55000 |
| Employment Experience | Numeric | 4 |
| Credit Score | Numeric | 700 |
| Loan Amount | Numeric | 12000 |
| Interest Rate (%) | Numeric | 10.5 |
| Loan Percent Income | Numeric | 0.22 |
| Credit History Length | Numeric | 6 |
| Education | Categorical | Bachelor |
| Gender | Categorical | male / female |
| Home Ownership | Categorical | RENT / OWN / MORTGAGE / OTHER |
| Loan Purpose | Categorical | EDUCATION / MEDICAL / VENTURE / PERSONAL / HOMEIMPROVEMENT / DEBTCONSOLIDATION |
| Previous Loan Default | Categorical | Yes / No |
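The feature table can be mirrored in code as a small validated record. This is a sketch, not the project's actual schema: the field names are hypothetical, and it assumes Loan Percent Income is the loan amount divided by annual income, which is consistent with the example values above (12000 / 55000 ≈ 0.22).

```python
from dataclasses import dataclass, asdict

# Allowed values taken from the feature table
ALLOWED = {
    "gender": {"male", "female"},
    "home_ownership": {"RENT", "OWN", "MORTGAGE", "OTHER"},
    "loan_purpose": {
        "EDUCATION", "MEDICAL", "VENTURE",
        "PERSONAL", "HOMEIMPROVEMENT", "DEBTCONSOLIDATION",
    },
    "previous_loan_default": {"Yes", "No"},
}


@dataclass
class Applicant:
    age: int
    annual_income: float
    employment_experience: int
    credit_score: int
    loan_amount: float
    interest_rate: float
    credit_history_length: int
    education: str
    gender: str
    home_ownership: str
    loan_purpose: str
    previous_loan_default: str

    def __post_init__(self):
        # Reject categorical values the model has never seen
        for name, allowed in ALLOWED.items():
            value = getattr(self, name)
            if value not in allowed:
                raise ValueError(f"{name}={value!r} is not one of {sorted(allowed)}")
        if self.annual_income <= 0:
            raise ValueError("annual_income must be positive")

    @property
    def loan_percent_income(self):
        # Assumption: derived as loan amount / annual income
        return round(self.loan_amount / self.annual_income, 2)

    def to_row(self):
        """Flatten to a dict with all 13 features from the table."""
        row = asdict(self)
        row["loan_percent_income"] = self.loan_percent_income
        return row


applicant = Applicant(28, 55000, 4, 700, 12000, 10.5, 6,
                      "Bachelor", "female", "RENT", "EDUCATION", "No")
print(applicant.to_row())
```

Validating categories before prediction avoids passing a value such as a misspelled loan purpose to an encoder that was fitted without it.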
The following models were tuned with Optuna and compared on AUC-ROC, with SMOTE applied to handle class imbalance:
- Logistic Regression
- Decision Tree
- Random Forest
- Gradient Boosting
- XGBoost
- LightGBM
- CatBoost
- AdaBoost
- Voting Classifier (Ensemble)
- Stacking Classifier
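To make the ensemble idea concrete, here is a minimal sketch of how a voting ensemble combines its base models for one applicant. Whether the project uses soft or hard voting is not stated, so both are shown. The function names and probabilities are hypothetical, and scikit-learn's `VotingClassifier` (with `voting="soft"` or `voting="hard"`) provides the real implementation.

```python
def soft_vote(probabilities, weights=None):
    """Weighted average of each model's P(approved) for one applicant."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    return sum(p * w for p, w in zip(probabilities, weights)) / sum(weights)


def hard_vote(probabilities, threshold=0.5):
    """Majority vote over each model's thresholded decision."""
    votes = [int(p >= threshold) for p in probabilities]
    return int(sum(votes) * 2 > len(votes))


# Hypothetical P(approved) from three base models for the same applicant
model_probs = [0.90, 0.45, 0.40]

soft_probability = soft_vote(model_probs)
soft_decision = int(soft_probability >= 0.5)
hard_decision = hard_vote(model_probs)
print(round(soft_probability, 4), soft_decision, hard_decision)
```

In this example one confident model outweighs two mildly negative ones under soft voting, while hard voting counts only the three decisions and rejects. Soft voting also yields a probability, which is what a ranking metric like AUC-ROC needs.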
| Model | Accuracy | Precision | Recall | F1 | AUC-ROC |
|---|---|---|---|---|---|
| Voting Classifier | 0.9258 | 0.8884 | 0.8789 | 0.8835 | 0.9732 |
| Gradient Boosting | 0.9259 | 0.8910 | 0.8752 | 0.8828 | 0.9730 |
| LightGBM | 0.9250 | 0.8901 | 0.8731 | 0.8812 | 0.9729 |
| XGBoost | 0.9244 | 0.8875 | 0.8746 | 0.8808 | 0.9729 |
| Stacking Classifier | 0.9220 | 0.8902 | 0.8610 | 0.8745 | 0.9684 |
| Random Forest | 0.9023 | 0.8405 | 0.8761 | 0.8563 | 0.9656 |
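For readers unfamiliar with the columns, the sketch below computes the same kinds of metrics on a tiny hypothetical sample. It assumes precision, recall, and F1 refer to the positive class (Approved = 1), which the table does not state. AUC-ROC is computed as the probability that a randomly chosen approved case is scored above a randomly chosen rejected one.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc_roc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted P(approved) for eight applicants
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
y_pred = [int(s >= 0.5) for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the 0.5 decision threshold, whereas AUC-ROC depends only on how the scores rank applicants. That is one reason a threshold-free metric is convenient as a tuning objective.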
- Fairness Analysis performed using `fairlearn`, measuring Demographic Parity Difference (DPD) and Equalized Odds Difference (EOD) across gender and education groups
- Bias Mitigation applied using `ThresholdOptimizer` to reduce DPD while preserving accuracy
- Model Explainability via `SHAP` summary plots, bar plots, and waterfall charts to interpret predictions
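The two fairness metrics can be written out in a few lines of plain Python. This sketch follows the definitions used by Fairlearn's `demographic_parity_difference` and `equalized_odds_difference` (largest gap between groups in selection rate, and the larger of the true-positive-rate and false-positive-rate gaps). The helper names and the toy data are hypothetical, and the sketch assumes every group contains both classes.

```python
from collections import defaultdict


def split_by_group(y_true, y_pred, groups):
    """Map each group to its (true labels, predicted labels) lists."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return buckets


def positive_rate(y_true, y_pred, label):
    """Share of rows with true class `label` that were predicted positive."""
    preds = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(preds) / len(preds)


def demographic_parity_difference(y_true, y_pred, groups):
    """Largest gap in selection rate (share predicted approved) between groups."""
    rates = [sum(p) / len(p) for _, p in split_by_group(y_true, y_pred, groups).values()]
    return max(rates) - min(rates)


def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the between-group gaps in TPR and FPR."""
    tprs, fprs = [], []
    for t, p in split_by_group(y_true, y_pred, groups).values():
        tprs.append(positive_rate(t, p, 1))  # true positive rate
        fprs.append(positive_rate(t, p, 0))  # false positive rate
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


# Hypothetical predictions for four male and four female applicants
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
gender = ["male"] * 4 + ["female"] * 4

dpd = demographic_parity_difference(y_true, y_pred, gender)
eod = equalized_odds_difference(y_true, y_pred, gender)
print(dpd, eod)
```

A value of 0 for either metric means the groups are treated identically by that criterion. Larger values indicate a bigger disparity, which is what a post-processing step such as `ThresholdOptimizer` tries to shrink by choosing group-specific decision thresholds.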
git clone https://github.com/Priyesh-DS-Code/loan-approval-predictor.git
cd loan-approval-predictor
python -m venv venv && source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
streamlit run app.py # starts app on http://localhost:8501
Python · Scikit-learn · XGBoost · LightGBM · CatBoost · Optuna · SMOTE · SHAP · Fairlearn · Streamlit
Priyesh · GitHub
