Just completed another predictive analytics project using AI and Machine Learning to predict banking customer churn.

Skills:

Python | Google Colab | NumPy | pandas | Matplotlib & Seaborn | Exploratory Data Analysis (EDA) | Cross-Validation | Hyperparameter Tuning | Classification Metrics

Project Overview:

Using anonymized bank customer data—including features like credit score, tenure, balance, products held, and activity status—this project built a binary classification model to predict churn likelihood.

Objectives:

✅ Identify customers at high risk of churn (binary classification)

✅ Prioritize Recall to reduce false negatives (missed churners)

✅ Balance the dataset using SMOTE to handle class imbalance

✅ Compare optimizer effectiveness (SGD vs. Adam)

✅ Evaluate model robustness with and without Dropout regularization
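
For readers new to Dropout, here is a minimal NumPy sketch of the "inverted dropout" rule that Keras-style Dropout layers apply during training. The function name, the 0.3 rate, and the toy activations are illustrative assumptions, not values from this project.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of units during
    training and rescale the survivors so the expected value is unchanged."""
    if not training or rate == 0.0:
        return activations  # inference: pass activations through untouched
    keep = 1.0 - rate
    mask = rng.random(activations.shape) < keep  # True for units that survive
    return activations * mask / keep

rng = np.random.default_rng(0)
hidden = np.ones((1000, 64))  # toy hidden-layer activations (hypothetical)

out_train = dropout(hidden, rate=0.3, rng=rng)                 # training mode
out_eval = dropout(hidden, rate=0.3, rng=rng, training=False)  # evaluation mode

dropped_fraction = float((out_train == 0).mean())
print(f"dropped: {dropped_fraction:.3f}, mean activation: {out_train.mean():.3f}")
```

Because the surviving units are scaled up by 1 / (1 - rate), no extra rescaling is needed at prediction time, which is why the evaluation call returns the input unchanged.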

Key Analytical Approaches:

Data Cleaning & Preprocessing → Checked for nulls, converted categorical features via one-hot encoding, normalized features, and split into train/validation/test.
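
The preprocessing steps above can be sketched roughly as follows. The column names, the synthetic rows, the 60/20/20 split, and the use of standardization as the normalization step are all assumptions for illustration; the project's actual schema and settings may differ.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the bank data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "credit_score": rng.integers(350, 851, size=n),
    "geography": rng.choice(["France", "Spain", "Germany"], size=n),
    "tenure": rng.integers(0, 11, size=n),
    "balance": rng.uniform(0, 200_000, size=n).round(2),
    "is_active": rng.integers(0, 2, size=n),
    "churned": rng.choice([0, 1], size=n, p=[0.8, 0.2]),
})

# 1. Check for nulls.
n_missing = int(df.isnull().sum().sum())

# 2. One-hot encode the categorical feature.
X = pd.get_dummies(df.drop(columns="churned"), columns=["geography"], dtype=float)
y = df["churned"]

# 3. Split into train/validation/test (60/20/20), stratified on the target
#    so each split keeps roughly the same churn rate.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.25, stratify=y_tmp, random_state=42
)

# 4. Fit the scaler on the training split only, then reuse it for the
#    other splits to avoid leaking validation/test statistics.
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_val_s = scaler.transform(X_val)
X_test_s = scaler.transform(X_test)

print(X_train_s.shape, X_val_s.shape, X_test_s.shape)
```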

Exploratory Data Analysis (EDA) → Analyzed churn patterns across customer demographics.

Class Imbalance Correction → Applied SMOTE (Synthetic Minority Over-sampling Technique) to improve minority class representation.
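
To show the idea behind SMOTE, here is a simplified NumPy sketch: each synthetic row is a random point on the line between a minority sample and one of its nearest minority neighbours. This is not the imblearn implementation the project used; the function name, the toy data, and k=5 are assumptions for illustration.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between a
    sampled row and one of its k nearest minority-class neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)

    # Pairwise Euclidean distances within the minority class.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)                      # rows to grow from
    picked = neighbours[base, rng.integers(0, k, size=n_new)]  # one neighbour each
    gap = rng.random((n_new, 1))                               # position along the segment
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Toy imbalanced training set: 80 non-churners vs. 20 churners (made-up data).
rng = np.random.default_rng(1)
X_major = rng.normal(0.0, 1.0, size=(80, 2))
X_minor = rng.normal(3.0, 1.0, size=(20, 2))

X_new = smote_sketch(X_minor, n_new=len(X_major) - len(X_minor))
X_bal = np.vstack([X_major, X_minor, X_new])
y_bal = np.concatenate([np.zeros(80), np.ones(20), np.ones(len(X_new))])

print(X_bal.shape, y_bal.mean())
```

Resampling of this kind is normally applied to the training split only, so the validation and test sets keep the real-world class balance.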

Model Development → Trained and compared:

• Neural Network with SGD Optimizer

• Neural Network with Adam Optimizer

• Neural Network with Dropout Regularization

• Neural Network with SMOTE-enhanced training
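
As a rough illustration of how the two optimizers differ, the sketch below applies the plain SGD update and the Adam update to the same toy logistic-regression problem in NumPy. It is not the project's Keras code: the data, learning rates, step count, and full-batch gradients are assumptions chosen to keep the example small.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(w, X, y):
    """Binary cross-entropy and its gradient for a linear logistic model."""
    p = sigmoid(X @ w)
    p_c = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    loss = -np.mean(y * np.log(p_c) + (1 - y) * np.log(1 - p_c))
    grad = X.T @ (p - y) / len(y)
    return loss, grad

def sgd_step(w, grad, lr=0.1):
    """SGD: step against the gradient with one fixed learning rate."""
    return w - lr * grad

def adam_step(w, grad, state, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: running averages of the gradient and squared gradient give
    each weight its own bias-corrected, adaptive step size."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2
    m_hat = state["m"] / (1 - b1 ** state["t"])
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)

# Toy binary-classification data (made up for this example).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = (sigmoid(X @ true_w) > rng.random(200)).astype(float)

w_sgd = np.zeros(3)
w_adam = np.zeros(3)
adam_state = {"t": 0, "m": np.zeros(3), "v": np.zeros(3)}
loss_start, _ = loss_and_grad(w_sgd, X, y)

for step in range(200):
    _, g = loss_and_grad(w_sgd, X, y)
    w_sgd = sgd_step(w_sgd, g)
    _, g = loss_and_grad(w_adam, X, y)
    w_adam = adam_step(w_adam, g, adam_state)

loss_sgd, _ = loss_and_grad(w_sgd, X, y)
loss_adam, _ = loss_and_grad(w_adam, X, y)
print(f"start={loss_start:.3f}  sgd={loss_sgd:.3f}  adam={loss_adam:.3f}")
```

Which optimizer ends up lower depends on the learning rates and the problem, which is why the comparison has to be run empirically on the real model and data.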

Evaluation Metrics → Focused on Recall, AUC, F1-Score, and Confusion Matrix.
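
To make the metrics concrete, here is a small NumPy sketch that computes the confusion-matrix counts, Recall, Precision, F1, and AUC by hand, and shows how lowering the decision threshold trades precision for recall. The labels, scores, and thresholds are made up for illustration; in practice scikit-learn's metric functions do the same job.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels where 1 = churned."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return tn, fp, fn, tp

def recall_precision_f1(y_true, y_pred):
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    recall = tp / (tp + fn) if tp + fn else 0.0      # share of churners caught
    precision = tp / (tp + fp) if tp + fp else 0.0   # share of alerts that are real
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return recall, precision, f1

def roc_auc(y_true, scores):
    """AUC as the probability that a random churner is scored above a
    random non-churner (ties count as half)."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Hypothetical labels and predicted churn probabilities.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
scores = np.array([0.9, 0.6, 0.4, 0.2, 0.7, 0.35, 0.25, 0.2, 0.1, 0.05])

# Default 0.5 threshold vs. a lower, recall-oriented 0.3 threshold.
recall_05, precision_05, f1_05 = recall_precision_f1(y_true, (scores >= 0.5).astype(int))
recall_03, precision_03, f1_03 = recall_precision_f1(y_true, (scores >= 0.3).astype(int))
auc = roc_auc(y_true, scores)

print(f"threshold 0.5: recall={recall_05:.2f} precision={precision_05:.2f} f1={f1_05:.2f}")
print(f"threshold 0.3: recall={recall_03:.2f} precision={precision_03:.2f} f1={f1_03:.2f}")
print(f"AUC={auc:.3f}")
```

Lowering the threshold catches more churners (fewer false negatives) at the cost of more false alarms, which is the trade-off a Recall-first objective accepts.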

Hyperparameter Tuning → Experimented with layer sizes, learning rates, optimizers, batch sizes, and epochs to optimize performance.

Results & Business Impact:

✅ Final model (NN with Adam, SMOTE, and Dropout) achieved Recall: 0.80 and AUC: 0.83 on the test set

✅ Significant improvement in detecting churners vs. baseline models

✅ Model prioritizes capturing as many at-risk customers as possible for proactive outreach

✅ Confusion matrix and ROC curves validate robust generalization to unseen data

Tools & Techniques Used:

✅ Python → Core implementation language

✅ Google Colab → Notebook-based development environment

✅ NumPy, pandas → Data wrangling and manipulation

✅ Matplotlib, Seaborn → Visualizations

✅ Scikit-learn → Model evaluation and metrics

✅ TensorFlow / Keras → Neural network development

✅ SMOTE (imblearn) → Synthetic resampling to correct imbalance

Business & Problem-Solving Skills Demonstrated:

✅ Churn Risk Modeling → Built a risk prediction framework tailored to customer retention

✅ Strategic Metric Selection → Prioritized Recall to avoid losing valuable customers

✅ Deep Learning Proficiency → Deployed tuned neural networks with dropout for generalization

✅ Model Interpretability → Delivered confusion matrices and ROC curves for stakeholder review

Project Portfolio Page:

https://datascienceportfol.io/ThomasHall/projects/6

#AI #MachineLearning #PredictiveAnalytics
