This project focuses on predicting loan eligibility for applicants based on their personal and financial information. It analyzes applicant data from various Indian states to identify patterns and determine whether an applicant is likely to be flagged as a risky candidate for a loan.
Financial institutions often face challenges in assessing the creditworthiness of loan applicants. Using machine learning techniques and applicant data (such as income, employment history, credit history, and demographic information), we aim to build a predictive model that can classify applicants as either eligible or risky.
- Analyze and preprocess loan applicant data from diverse Indian states.
- Develop a machine learning model to predict loan eligibility.
- Evaluate the performance of the model using metrics like accuracy, precision, recall, and F1-score.
- Provide insights into the factors influencing loan eligibility.
The dataset consists of anonymized information about loan applicants, including:
- Demographic Features: Age, Gender, Marital Status, Number of Dependents.
- Employment and Loan Features: Employment Type, Income, Loan Amount, Loan Term.
- Credit History: Past loan repayment records, credit scores, etc.
- Geographic Features: State and region of residence.
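
To make the feature groups above concrete, here is a minimal sketch of what such a table might look like in pandas, followed by a first exploratory pass (missing values, correlations, outliers). The column names, value ranges, and state labels are all hypothetical; the data is randomly generated and does not reflect the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical schema: every column name and value here is made up for illustration.
rng = np.random.default_rng(0)
n = 200
applicants = pd.DataFrame({
    "age": rng.integers(21, 65, size=n),
    "gender": rng.choice(["F", "M"], size=n),
    "marital_status": rng.choice(["single", "married"], size=n),
    "dependents": rng.integers(0, 4, size=n),
    "employment_type": rng.choice(["salaried", "self_employed"], size=n),
    "income": rng.lognormal(mean=10.5, sigma=0.5, size=n).round(0),
    "loan_amount": rng.lognormal(mean=12.0, sigma=0.6, size=n).round(0),
    "loan_term_months": rng.choice([12, 24, 36, 60], size=n),
    "credit_score": rng.integers(300, 900, size=n).astype(float),
    "state": rng.choice(["Kerala", "Punjab", "Gujarat"], size=n),
})
# Blank out a few credit scores to mimic incomplete records.
applicants.loc[rng.choice(n, size=10, replace=False), "credit_score"] = np.nan

# Count missing values per column.
missing_counts = applicants.isna().sum()

# Pairwise Pearson correlations between the numeric columns.
numeric = applicants.select_dtypes(include="number")
correlations = numeric.corr()

# Flag income outliers with the 1.5 * IQR rule.
q1, q3 = applicants["income"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (applicants["income"] < q1 - 1.5 * iqr) | (applicants["income"] > q3 + 1.5 * iqr)
income_outliers = applicants[is_outlier]

print(missing_counts[missing_counts > 0])
print(correlations.round(2))
print(f"{len(income_outliers)} income outliers out of {n} rows")
```

On real data, the same three summaries would typically guide which columns need imputation and which features deserve a closer look.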
- Data Preprocessing:
  - Handling missing values.
  - Encoding categorical variables.
  - Scaling numerical features.
- Exploratory Data Analysis (EDA):
  - Analyzing distributions, correlations, and outliers.
  - Identifying significant predictors of loan eligibility.
- Model Development:
  - Comparing algorithms such as Logistic Regression, Decision Trees, Random Forests, Artificial Neural Networks, Naive Bayes, and Support Vector Machines.
- Model Evaluation:
  - Using a train-test split or cross-validation.
  - Assessing metrics like:
    - Accuracy
    - Precision
    - Recall
    - F1-score
    - ROC-AUC
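
The preprocessing, model comparison, and evaluation steps above could be wired together roughly as follows with Scikit-learn. This is a sketch under stated assumptions, not the project's actual code: the data is synthetic, the column names and the rule that generates the `risky` label are invented, and only two of the listed algorithms are compared to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; column names and the label rule are assumptions.
rng = np.random.default_rng(42)
n = 600
X = pd.DataFrame({
    "income": rng.lognormal(10.5, 0.5, n),
    "loan_amount": rng.lognormal(12.0, 0.6, n),
    "credit_score": rng.integers(300, 900, n).astype(float),
    "employment_type": rng.choice(["salaried", "self_employed"], n),
    "state": rng.choice(["Kerala", "Punjab", "Gujarat"], n),
})
# 1 = risky, 0 = eligible; a made-up rule tying risk to a noisy credit score.
y = (X["credit_score"] + rng.normal(0, 100, n) < 550).astype(int)
# Inject missing values so the imputation step has something to do.
X.loc[rng.choice(n, 30, replace=False), "credit_score"] = np.nan

numeric_cols = ["income", "loan_amount", "credit_score"]
categorical_cols = ["employment_type", "state"]


def make_preprocessor():
    """Impute and scale numeric columns; one-hot encode categorical ones."""
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    return ColumnTransformer([
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])


models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Hold out a stratified test set; cross-validate on the training portion only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

results = {}
for name, model in models.items():
    clf = Pipeline([("prep", make_preprocessor()), ("model", model)])
    cv_f1 = cross_val_score(clf, X_train, y_train, cv=5, scoring="f1").mean()
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    proba = clf.predict_proba(X_test)[:, 1]  # probability of the risky class
    results[name] = {
        "cv_f1": cv_f1,
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
    }

print(pd.DataFrame(results).T.round(3))
```

Keeping the preprocessing inside the `Pipeline` means imputation and scaling statistics are fit on each training fold only, which avoids leaking test information into cross-validation scores.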
- Programming Language: Python
- Libraries and Frameworks: Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn
- Version Control: Git