A machine learning project focused on building an interpretable and accurate credit scoring system, developed for the Engineering Foundations in FinTech (IEDA4500) course at HKUST.
- Classify individuals into Good, Standard, or Poor credit categories
- Maintain high accuracy while improving transparency using SHAP explainability
- Languages & Libraries: Python, scikit-learn, XGBoost, SHAP, pandas, seaborn, matplotlib
- Feature Engineering:
  - Debt-to-Income (DTI)
  - EMI-to-Salary ratio
- Models Compared:
  - Logistic Regression
  - Random Forest (selected)
  - XGBoost
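
The two engineered ratios and the model comparison listed above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual pipeline: the column names (`annual_income`, `outstanding_debt`, `monthly_salary`, `monthly_emi`), the exact ratio definitions, and the toy labelling rule are all assumptions. XGBoost is left out here only to keep the sketch dependent on scikit-learn alone.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; these column names are hypothetical,
# not the project's real schema.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "annual_income": rng.uniform(20_000, 120_000, n),
    "outstanding_debt": rng.uniform(0, 50_000, n),
    "monthly_salary": rng.uniform(1_500, 10_000, n),
    "monthly_emi": rng.uniform(0, 3_000, n),
})

# Engineered ratios (assumed definitions).
df["dti"] = df["outstanding_debt"] / df["annual_income"]
df["emi_to_salary"] = df["monthly_emi"] / df["monthly_salary"]

# Toy three-class label derived from the ratios, for illustration only:
# the lowest-risk third is "Good", the highest-risk third is "Poor".
risk = df["dti"] + df["emi_to_salary"]
df["credit_class"] = pd.qcut(risk, q=3, labels=["Good", "Standard", "Poor"]).astype(str)

X = df[["dti", "emi_to_salary"]]
y = df["credit_class"]

# Candidate models; logistic regression gets feature scaling.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Compare models by mean macro F1 over 5-fold cross-validation.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="f1_macro").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: macro F1 = {score:.3f}")
```

Scoring with macro F1 weights the three credit classes equally, which matters when one class is much rarer than the others.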
- Used SHAP values to interpret both global and local model behavior
- Identified top drivers of credit risk: Monthly Balance, Payment Behaviour, Age
- Accuracy: ~80%
- Macro F1-Score: 0.79
- Generated CSV with credit score probabilities for downstream integration
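
For reference, the metrics and the probability export above could be produced along these lines. This is a sketch under stated assumptions: the data is synthetic, the column names (`predicted_class`, `prob_<class>`) are hypothetical, and the numbers it prints are unrelated to the results reported above.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic three-class data mapped onto the credit categories.
X, y_idx = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=0,
)
labels = np.array(["Good", "Standard", "Poor"])
y = labels[y_idx]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Held-out evaluation: overall accuracy and macro-averaged F1.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
macro_f1 = f1_score(y_test, y_pred, average="macro")
print(f"accuracy={accuracy:.3f}, macro F1={macro_f1:.3f}")

# One probability column per class, in the order given by model.classes_.
proba = pd.DataFrame(
    model.predict_proba(X_test),
    columns=[f"prob_{c}" for c in model.classes_],
)
proba.insert(0, "predicted_class", y_pred)

# to_csv returns the CSV text when no path is given;
# pass a file path instead to write it to disk.
csv_text = proba.to_csv(index=False)
```

Keeping one probability column per class lets a downstream consumer apply its own decision thresholds rather than relying only on the predicted label.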
- notebooks/: Model training and SHAP explanation scripts
- report.pdf: Final documentation
- slides.pdf: Presentation deck
Explainable AI (XAI) · Credit Risk Modeling · Model Evaluation · Feature Engineering · FinTech