Predictive Modeling Plan for Delinquency Risk
Step 1: Predictive Model Logic
Pipeline Overview:
1. Data Preprocessing: Impute missing Income and Loan_Balance, encode categorical features, normalize numeric variables.
2. Feature Selection: Use domain knowledge and correlation analysis to select relevant variables.
3. Model Options:
   - Simple: Logistic Regression (interpretable, widely used in finance).
   - Complex: Random Forest (handles non-linearity & feature interactions).
4. Chosen Model: Logistic Regression (for interpretability and compliance).
5. Top 5 Features: Missed_Payments, Credit_Utilization, Credit_Score, Debt_to_Income_Ratio, Payment_History_Pattern.
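The preprocessing and modeling steps above could be wired together roughly as follows. This is a minimal sketch using scikit-learn, not a finalized implementation. The numeric column names come from this plan; Employment_Status as the categorical feature, median imputation, and the synthetic data in demo() are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Numeric columns are named in the plan; the categorical column is an assumption.
NUMERIC = ["Income", "Loan_Balance", "Missed_Payments",
           "Credit_Utilization", "Credit_Score", "Debt_to_Income_Ratio"]
CATEGORICAL = ["Employment_Status"]


def build_pipeline():
    """Impute and scale numeric features, one-hot encode categoricals, fit logistic regression."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # assumed imputation strategy
        ("scale", StandardScaler()),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("model", LogisticRegression(max_iter=1000)),
    ])


def demo(n=200, seed=0):
    """Fit the pipeline on synthetic data (not real customer data) and return risk scores."""
    rng = np.random.default_rng(seed)
    X = pd.DataFrame({
        "Income": rng.normal(50_000, 15_000, n),
        "Loan_Balance": rng.normal(20_000, 8_000, n),
        "Missed_Payments": rng.integers(0, 6, n),
        "Credit_Utilization": rng.uniform(0, 1, n),
        "Credit_Score": rng.integers(300, 850, n),
        "Debt_to_Income_Ratio": rng.uniform(0, 0.6, n),
        "Employment_Status": rng.choice(["Employed", "Self-employed", "Unemployed"], n),
    })
    X.loc[X.index[::10], "Income"] = np.nan       # simulate missing Income values
    y = (X["Missed_Payments"] >= 3).astype(int)   # toy label, for the demo only
    model = build_pipeline().fit(X, y)
    return model.predict_proba(X)[:, 1]           # estimated probability of delinquency
```

Keeping imputation and scaling inside the pipeline means they are fitted on the training data only, which avoids leaking information from a held-out set.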
Step 2: Model Justification
Logistic Regression is widely adopted in financial services because its outputs are clear and
interpretable: each coefficient shows the direction and size of a feature's effect on the
log-odds of delinquency, which supports regulatory compliance (e.g., Basel guidelines). While
tree-based models can capture complex patterns, they are harder to explain, which could
create challenges during audits. Logistic Regression balances accuracy, simplicity, and
transparency, all of which are crucial for building trust with regulators and customers.
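To make the interpretability argument concrete, the sketch below shows how a logistic regression coefficient can be read as an odds ratio and how a score is computed from the coefficients. The function names and the coefficient value are hypothetical, chosen only for illustration; they are not fitted results from this project.

```python
import math


def odds_ratio(coefficient, delta=1.0):
    """Multiplicative change in the odds of delinquency when a feature rises by `delta`."""
    return math.exp(coefficient * delta)


def predicted_probability(intercept, coefficients, features):
    """Logistic model: sigmoid of the intercept plus the weighted sum of features."""
    z = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficient for Missed_Payments (illustration only, not a fitted value).
example_ratio = odds_ratio(0.7)  # odds multiplier for one additional missed payment
```

An odds ratio above 1 means the feature raises the odds of delinquency, below 1 means it lowers them, and exactly 1 means no effect. This is the kind of statement that can be given directly to an auditor.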
Step 3: Evaluation Strategy
Metrics:
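- Accuracy: Overall correctness; can be misleading when delinquent accounts are rare.
- Precision & Recall: Share of flagged customers who are actually delinquent, and share of delinquent customers who are flagged; more informative than accuracy under class imbalance.
- F1-score: Harmonic mean of precision and recall.
- AUC-ROC: Ability to rank risky customers above safe ones across all decision thresholds.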
- Fairness Checks: Compare error rates across demographics (e.g., location, employment
status).
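The metrics and fairness check listed above can be computed from labels, predictions, and scores as sketched below. This is a plain-Python illustration with hypothetical function names. In practice a library such as scikit-learn would likely be used, and the group labels here stand in for whichever demographic attribute is being audited.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = delinquent."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc_roc(y_true, scores):
    """Probability that a random delinquent case outscores a random safe one (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def error_rates_by_group(y_true, y_pred, groups):
    """False positive and false negative rates per group, for side-by-side comparison."""
    rates = {}
    for g in sorted(set(groups)):
        idx = [i for i, label in enumerate(groups) if label == g]
        tp, fp, fn, tn = confusion_counts([y_true[i] for i in idx],
                                          [y_pred[i] for i in idx])
        rates[g] = {
            "fpr": fp / (fp + tn) if fp + tn else None,
            "fnr": fn / (fn + tp) if fn + tp else None,
        }
    return rates
```

Large gaps in false positive or false negative rates between groups are the signal the fairness check looks for. What counts as an acceptable gap is a policy decision and is not set here.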
Bias Mitigation:
- Re-sampling techniques (e.g., SMOTE for class imbalance), applied to the training data only.
- Fairness-aware algorithms (e.g., post-processing adjustments to model predictions).
- Continuous monitoring for drift & bias.
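As an illustration of the re-sampling step, the sketch below balances the classes by random oversampling. This is a simpler stand-in for SMOTE, not SMOTE itself: it duplicates existing minority-class rows, whereas SMOTE synthesizes new points by interpolating between neighbors. The function name is hypothetical, and the sketch assumes binary labels coded 0 and 1.

```python
import random


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    minority = min(by_class, key=lambda c: len(by_class[c]))
    majority = 1 - minority
    if not by_class[minority]:
        raise ValueError("need at least one example of each class")

    # Number of extra minority rows required to match the majority class.
    deficit = len(by_class[majority]) - len(by_class[minority])
    extra = [rng.choice(by_class[minority]) for _ in range(deficit)]

    # Original rows are kept in order; the duplicates are appended at the end.
    return list(rows) + extra, list(labels) + [minority] * deficit
```

Whichever re-sampling method is used, it should be applied only to the training split. Re-sampling the validation or test data would distort the evaluation metrics.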