# Empowering Financial Inclusion with AI-Driven Credit Risk Assessment
- Introduction
- ✨ Features
- 📂 Project Structure
- ⚙️ Installation
- 🚀 Usage
- 🔍 Exploratory Data Analysis (EDA)
- 🛠️ Feature Engineering
- 🤖 Model Training and Evaluation
- 🔮 Model Explainability
- 🌐 API Development
- 💻 Frontend Interface
- 🚀 Deployment
- 🎯 Challenges and Solutions
- 🔮 Future Work
- 🤝 Contributing
## Introduction

The Credit Scoring Model for Bati Bank is an AI-powered platform that assesses credit risk from eCommerce transaction data. It supports financial inclusion through:
- 📈 Accurate Predictions: Random Forest model achieves ROC-AUC: 0.9998
- 🔍 Transparent Decisions: SHAP explanations and feature importance visualizations
- ⚡ Real-Time Processing: FastAPI backend with <100ms response times
- 📱 Mobile-First Interface: Responsive design accessible on all devices
## ✨ Features

- Automated Data Pipelines
  - RFMS scoring (Recency, Frequency, Monetary, Score)
  - WoE encoding for categorical features
- Advanced Modeling
  - Hyperparameter-tuned Random Forest & Logistic Regression
  - Cross-validation with stratified sampling
- Production-Ready Deployment
  - Dockerized environment
  - CI/CD pipeline with GitHub Actions
- User-Centric Interface
  - Dual form system (Quick/Detailed assessment)
  - Interactive risk visualization dashboard
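
The WoE (Weight of Evidence) encoding named in the feature list can be sketched without any dependencies. The version below is only an illustration under stated assumptions: the target is binary with 1 marking a good customer, WoE is taken as ln(share of goods / share of bads) (some references invert the ratio), and a small smoothing constant guards against empty cells. The function `woe_table`, its `smoothing` parameter, and the sample values are hypothetical and not taken from the project code.

```python
import math
from collections import defaultdict


def woe_table(categories, labels, smoothing=0.5):
    """Return {category: WoE} for a binary target (1 = good, 0 = bad).

    WoE = ln(share of goods in the category / share of bads in the category).
    `smoothing` is added to every count so empty cells never produce log(0).
    """
    good = defaultdict(float)
    bad = defaultdict(float)
    for cat, label in zip(categories, labels):
        if label == 1:
            good[cat] += 1
        else:
            bad[cat] += 1

    cats = set(good) | set(bad)
    total_good = sum(good.values()) + smoothing * len(cats)
    total_bad = sum(bad.values()) + smoothing * len(cats)

    return {
        cat: math.log(
            ((good[cat] + smoothing) / total_good)
            / ((bad[cat] + smoothing) / total_bad)
        )
        for cat in cats
    }


# Hypothetical channel values and good/bad labels, for illustration only.
channels = ["web", "web", "web", "app", "app", "app"]
is_good = [1, 1, 0, 1, 0, 0]
table = woe_table(channels, is_good)
```

A positive value means the category holds relatively more good than bad customers, and a negative value means the opposite. Each categorical value would then be replaced by its entry in `table` before model training.
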
## 📂 Project Structure

```
dagiteferi-credit-scoring-model/
├── 📁 credit_scoring_app/   # FastAPI backend
├── 📁 models/               # Serialized ML models
├── 📁 notebooks/            # Jupyter analysis notebooks
├── 📁 scripts/              # Data processing scripts
├── 📁 static/               # CSS/JS assets
└── 📁 tests/                # Unit/integration tests
```

## ⚙️ Installation

```bash
git clone https://github.com/your-repo/dagiteferi-credit-scoring-model.git
cd dagiteferi-credit-scoring-model
python3 -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows
pip install -r requirements.txt
```
## 🚀 Usage
### Running the Backend
```bash
cd credit_scoring_app
uvicorn main:app --host 0.0.0.0 --port 8000
```

Access the app at http://localhost:8000/static/index.html.

To request a prediction, send a POST request to the `/predict/good` endpoint:

```bash
curl -X POST "http://localhost:8000/predict/good" \
  -H "Content-Type: application/json" \
  -d '{
    "TransactionId": 1,
    "Amount": 0.05,
    "FraudResult": 0
  }'
```

## 🔍 Exploratory Data Analysis (EDA)

Key Insights:
- 🎯 Class Imbalance: Only 0.2% of transactions are fraud cases
- 📉 Skewed Distributions: Transaction amounts follow a power law
- 🔗 Strong Correlations:
  - `RFMS_score` ↔ `Total_Transaction_Amount` (ρ = 0.89)
  - `Transaction_Count` ↔ `Product_Variety` (ρ = 0.76)
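
The two kinds of numbers reported above, a class rate and a correlation coefficient, can be reproduced with a few lines of standard-library Python. This is a minimal sketch that assumes ρ denotes the Pearson coefficient (the source does not say which coefficient was used). The helper names and the toy values are hypothetical and are not drawn from the project data.

```python
import math


def positive_rate(labels):
    """Fraction of rows whose label is 1 (for example FraudResult == 1)."""
    return sum(labels) / len(labels)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy values, not taken from the project data.
fraud_flags = [0] * 499 + [1]
rfms_score = [1.0, 2.0, 3.0, 4.0, 5.0]
total_amount = [12.0, 19.0, 33.0, 41.0, 48.0]

fraud_rate = positive_rate(fraud_flags)
rho = pearson(rfms_score, total_amount)
print(f"fraud rate: {fraud_rate:.1%}")
print(f"rho: {rho:.2f}")
```

With heavy-tailed amounts, a rank-based coefficient such as Spearman's may be a more robust choice than Pearson; which one fits depends on how the original analysis was done.
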
## 🛠️ Feature Engineering

Transformations Applied:
- Temporal Features
  - Transaction hour/day/month
  - Time since last transaction
- Aggregate Features
  - 30-day rolling transaction count
  - Customer lifetime value
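
The temporal features and the 30-day rolling count listed above could be derived along the following lines. This is a dependency-free sketch for a single customer, assuming the transaction timestamps are already sorted in ascending order; the function `engineer_features`, the output field names, and the sample history are hypothetical. Customer lifetime value is left out because the source does not define how it is computed.

```python
from datetime import datetime, timedelta


def engineer_features(timestamps):
    """Derive per-transaction features for one customer.

    `timestamps` must be sorted in ascending order.
    """
    window = timedelta(days=30)
    rows = []
    start = 0  # index of the oldest transaction still inside the window
    for i, ts in enumerate(timestamps):
        # Slide the window start past transactions older than 30 days.
        while ts - timestamps[start] > window:
            start += 1
        previous = timestamps[i - 1] if i > 0 else None
        rows.append({
            "hour": ts.hour,
            "day": ts.day,
            "month": ts.month,
            "seconds_since_last": (
                (ts - previous).total_seconds() if previous is not None else None
            ),
            "count_30d": i - start + 1,  # includes the current transaction
        })
    return rows


# Hypothetical transaction times for a single customer.
history = [
    datetime(2019, 1, 1, 9, 30),
    datetime(2019, 1, 15, 18, 0),
    datetime(2019, 2, 20, 7, 45),
]
features = engineer_features(history)
```

The first transaction has no predecessor, so its `seconds_since_last` is `None`; how that gap should be imputed before modeling is a separate design decision.
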
## 🤖 Model Training and Evaluation

| Model | ROC-AUC | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Random Forest | 0.9998 | 0.997 | 0.998 | 0.997 |
| Logistic Regression | 0.9962 | 0.982 | 0.961 | 0.971 |
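
For readers who want to see how the four metrics in the table are defined, here is a small standard-library sketch. It assumes label 1 is the positive class and a 0.5 decision threshold, neither of which the source states. The function names and the toy labels and scores are hypothetical and do not reproduce the reported results.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """ROC-AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n^2) and is meant
    for illustration, not for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, not project results.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall, f1 = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

ROC-AUC is threshold-independent, while precision, recall, and F1 depend on the chosen cut-off, so the two groups of numbers answer different questions about the same model.
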
## 🔮 Model Explainability

SHAP Analysis:
- Top Predictive Features:
  - `Total_Transaction_Amount` (SHAP value: 1.42)
  - `RFMS_score` (SHAP value: 1.18)
  - `Transaction_Recency` (SHAP value: 0.76)
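## 🌐 API Development

Endpoints:

```python
@app.post("/predict/good")
async def predict_good_risk(data: CustomerData):
    # Score the submitted customer data with the saved Random Forest model.
    return predict(data, model_path="models/RandomForest_best_model.pkl")
```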
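
A Python client for this endpoint might look like the sketch below, which mirrors the `curl` call from the Usage section using only the standard library. It assumes the backend runs at `http://localhost:8000` and accepts the three fields shown in that example. The helper `build_predict_request` is hypothetical, and the response schema is not documented in the source, so the sketch only prints whatever JSON comes back.

```python
import json
import urllib.request


def build_predict_request(payload, base_url="http://localhost:8000"):
    """Build (but do not send) a JSON POST request for /predict/good."""
    return urllib.request.Request(
        url=f"{base_url}/predict/good",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(
    {"TransactionId": 1, "Amount": 0.05, "FraudResult": 0}
)

# Sending the request requires the backend to be running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Separating request construction from sending keeps the payload easy to inspect and test without a live server.
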