Credit Card Default Risk Analysis

This project predicts credit card default risk from historical customer data. It trains a machine learning classification model that estimates whether a customer will default on their payment in the next month.

Project Description

The dataset used in this project is sourced from the UCI Machine Learning Repository. It contains information about credit card clients, such as credit limit, age, gender, payment history, and bill amounts. The goal is to predict the "default payment next month" variable, which indicates whether a customer will default (1) or not (0).

The project follows these steps:

Data Import and Exploration: Loading and initial analysis of the dataset.
Data Cleaning and Preprocessing: Handling missing values, encoding categorical variables, and normalizing numeric features.
Modeling: Training a RandomForestClassifier to predict default risk.
Model Evaluation: Using precision, recall, F1-score, and the confusion matrix to evaluate model performance.
Visualization: Generating charts to interpret feature importance and the confusion matrix.
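
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it uses a synthetic dataset from `make_classification` as a stand-in for the UCI file, and the sample size, class balance, number of trees, and 5-fold setup are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the credit card data (assumption: the real
# features and class balance will differ).
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.78, 0.22],  # minority class plays the role of "default"
    random_state=42,
)

# Hold out a stratified test set so both classes keep their proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling lives inside the pipeline so each cross-validation fold
# fits the scaler on its own training portion only.
model = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=100, random_state=42),
)

# Cross-validated accuracy on the training split.
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
print(f"Mean CV accuracy: {cv_scores.mean():.3f}")

# Fit on the full training split and evaluate on the held-out data.
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred, digits=3))

# Feature importances come from the forest step of the pipeline.
importances = model.named_steps["randomforestclassifier"].feature_importances_
print(importances.round(3))
```

Keeping the scaler inside the pipeline is a design choice for the sketch: a random forest does not need normalized inputs, but the pipeline form keeps preprocessing and model together if the classifier is later swapped.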

Technologies Used

Python: Primary programming language.
Pandas: Data manipulation and analysis.
NumPy: Numerical computations.
Scikit-learn: Machine learning (modeling, cross-validation, metrics).
Matplotlib and Seaborn: Data visualization.
Joblib: Saving and loading the trained model.
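
As an illustration of the Joblib step, the sketch below saves a small model and loads it back. The tiny synthetic dataset and forest are assumptions made to keep the example fast; only the file name matches the one listed in this repository, and it is written to a temporary directory here rather than the project folder.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Small synthetic model, used only to demonstrate the save/load round trip.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "credit_card_default_model.pkl"
    joblib.dump(model, model_path)     # serialize the fitted model
    restored = joblib.load(model_path)  # load it back from disk

# The restored model should reproduce the original predictions.
same_predictions = bool((restored.predict(X) == model.predict(X)).all())
print(same_predictions)
```

A model saved this way should generally be loaded with the same scikit-learn version that produced it.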

Project Structure

credit_card_analysis/
├── credit_card_analysis.py       # Main analysis script
├── default of credit card clients.xls  # Dataset
├── credit_card_default_model.pkl # Saved trained model
├── README.md                     # Project documentation
├── requirements.txt              # Project dependencies
├── log.txt                       # Log file for tracking script execution
├── images/                       # Folder containing visualizations
│   ├── feature_importance.png    # Feature importance plot
│   ├── confusion_matrix.png      # Confusion matrix plot
│   └── others                    # Other images
└── venv/                         # Virtual environment folder (optional)

Results

The model achieved a mean cross-validated accuracy of 81.7%. However, recall for class 1 (defaulters) was only 36%, so the model misses roughly two out of three customers who actually default. High accuracy alongside low recall suggests the score is driven mainly by correctly classified non-defaulters. The generated visualizations are listed below:

Feature Importance:
Confusion Matrix:
Others:
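
To show how high accuracy and low recall can coexist, the sketch below computes the metrics from confusion-matrix counts. The counts are hypothetical, picked only to give figures of a similar magnitude to those reported above; they are not this project's actual confusion matrix, and `binary_metrics` is a helper written for this example.

```python
def binary_metrics(tn, fp, fn, tp):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 1000 customers, 220 of whom default.
metrics = binary_metrics(tn=740, fp=40, fn=140, tp=80)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy is 0.82 while recall is about 0.364: most of the correct predictions are non-defaulters, and 140 of the 220 defaulters go undetected.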

Developer Info

Developer: Edson Copque
Website: https://linktr.ee/edsoncopque
GitHub: https://github.com/ecopque
Signal Messenger: ecop.01
