Just completed another predictive analytics project using AI and Machine Learning to predict banking customer churn.
Skills:
Python | Google Colab | NumPy | pandas | Matplotlib & Seaborn | Exploratory Data Analysis (EDA) | Cross Validation | Hyperparameter Tuning | Classification Metrics
Project Overview:
Using anonymized bank customer data—including features like credit score, tenure, balance, products held, and activity status—this project built a binary classification model to predict churn likelihood.
Objectives:
✅ Identify customers at high risk of churn (binary classification)
✅ Prioritize Recall to reduce false negatives (missed churners)
✅ Balance the dataset using SMOTE to handle class imbalance
✅ Compare optimizer effectiveness (SGD vs. Adam)
✅ Evaluate model robustness with and without Dropout regularization
Key Analytical Approaches:
Data Cleaning & Preprocessing → Checked for nulls, converted categorical features via one-hot encoding, normalized features, and split the data into train/validation/test sets (see the first code sketch after this list).
Exploratory Data Analysis (EDA) → Analyzed churn patterns across customer demographics.
Class Imbalance Correction → Applied SMOTE (Synthetic Minority Over-sampling Technique) to the training data to improve minority class representation (included in the preprocessing sketch below).
Model Development → Trained and compared four neural network variants (architecture sketch below):
• Neural Network with SGD Optimizer
• Neural Network with Adam Optimizer
• Neural Network with Dropout Regularization
• Neural Network with SMOTE-enhanced training
Evaluation Metrics → Focused on Recall, AUC, F1-Score, and the Confusion Matrix (evaluation sketch below).
Hyperparameter Tuning → Experimented with layer sizes, learning rates, optimizers, batch sizes, and epochs to optimize performance (tuning sketch below).
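For anyone curious how these steps fit together, here is a minimal preprocessing + SMOTE sketch. The file name, column names (Geography, Gender, Exited), and split ratios are illustrative assumptions, not the project's exact schema or settings.

```python
# Sketch: cleaning, encoding, scaling, splitting, and SMOTE resampling.
# File/column names and split ratios are placeholders, not the exact setup used.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import SMOTE

df = pd.read_csv("bank_churn.csv")           # hypothetical file name
print(df.isnull().sum())                     # null check

# One-hot encode categorical features
df = pd.get_dummies(df, columns=["Geography", "Gender"], drop_first=True)

X = df.drop(columns=["Exited"])              # "Exited" assumed to be the churn label
y = df["Exited"]

# Stratified train/validation/test split (60/20/20)
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=42)

# Normalize features; fit the scaler on the training split only to avoid leakage
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val, X_test = scaler.transform(X_val), scaler.transform(X_test)

# Oversample the minority (churn) class in the training data only
X_train_bal, y_train_bal = SMOTE(random_state=42).fit_resample(X_train, y_train)
```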
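Next, a minimal Keras sketch of the final variant (Adam optimizer plus Dropout), continuing from the preprocessing sketch above. Layer sizes, dropout rate, learning rate, epochs, and batch size are illustrative, not the tuned values.

```python
# Sketch: neural network with Adam optimizer and Dropout regularization.
# Architecture and training settings are illustrative, not the tuned values.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(X_train_bal.shape[1],)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.3),                     # Dropout to improve generalization
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(1, activation="sigmoid"),   # churn probability
])

model.compile(
    # swap in keras.optimizers.SGD(...) here to reproduce the SGD comparison
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss="binary_crossentropy",
    metrics=[keras.metrics.Recall(name="recall"), keras.metrics.AUC(name="auc")],
)

history = model.fit(
    X_train_bal, y_train_bal,
    validation_data=(X_val, y_val),
    epochs=50, batch_size=32, verbose=0,
)
```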
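The evaluation sketch below scores the trained model on the held-out test set; the 0.5 decision threshold is an assumption.

```python
# Sketch: Recall, F1, AUC, and confusion matrix on the held-out test set.
# The 0.5 decision threshold is an assumption.
from sklearn.metrics import (recall_score, f1_score, roc_auc_score,
                             confusion_matrix, RocCurveDisplay)

y_prob = model.predict(X_test).ravel()
y_pred = (y_prob >= 0.5).astype(int)

print("Recall:", recall_score(y_test, y_pred))
print("F1:    ", f1_score(y_test, y_pred))
print("AUC:   ", roc_auc_score(y_test, y_prob))
print(confusion_matrix(y_test, y_pred))

RocCurveDisplay.from_predictions(y_test, y_prob)   # ROC curve for stakeholder review
```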
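Finally, a simple tuning sketch: a small grid over learning rate and batch size, keeping the configuration with the best validation recall. The post doesn't specify the exact grid or search strategy, so the values and the build_model helper are illustrative.

```python
# Sketch: small grid search over learning rate and batch size, keeping the
# configuration with the best validation recall. Grid values are illustrative.
import itertools
from tensorflow import keras

def build_model(lr):
    # Rebuilds the Dropout network above, parameterized by learning rate
    m = keras.Sequential([
        keras.layers.Input(shape=(X_train_bal.shape[1],)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dropout(0.3),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    m.compile(optimizer=keras.optimizers.Adam(learning_rate=lr),
              loss="binary_crossentropy",
              metrics=[keras.metrics.Recall(name="recall")])
    return m

best = {"recall": 0.0}
for lr, batch_size in itertools.product([1e-2, 1e-3, 1e-4], [32, 64]):
    m = build_model(lr)
    m.fit(X_train_bal, y_train_bal, epochs=30, batch_size=batch_size, verbose=0)
    _, val_recall = m.evaluate(X_val, y_val, verbose=0)
    if val_recall > best["recall"]:
        best = {"recall": val_recall, "lr": lr, "batch_size": batch_size}

print(best)
```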
Results & Business Impact:
✅ Final model (NN with Adam, SMOTE, and Dropout) achieved Recall: 0.80 and AUC: 0.83 on the test set
✅ Significant improvement in detecting churners vs. baseline models
✅ Model prioritizes capturing as many at-risk customers as possible for proactive outreach
✅ Confusion matrix and ROC curves validate robust generalization to unseen data
Tools & Techniques Used:
✅ Python → Core implementation language
✅ Google Colab → Notebook-based development environment
✅ NumPy, pandas → Data wrangling and manipulation
✅ Matplotlib, Seaborn → Visualizations
✅ Scikit-learn → Model evaluation and metrics
✅ TensorFlow / Keras → Neural network development
✅ SMOTE (imblearn) → Synthetic resampling to correct imbalance
Business & Problem-Solving Skills Demonstrated:
✅ Churn Risk Modeling → Built a risk prediction framework tailored to customer retention
✅ Strategic Metric Selection → Prioritized Recall to avoid losing valuable customers
✅ Deep Learning Proficiency → Deployed tuned neural networks with dropout for generalization
✅ Model Interpretability → Delivered confusion matrices and ROC curves for stakeholder review
Project Portfolio Page:
https://datascienceportfol.io/ThomasHall/projects/6
#AI #MachineLearning #PredictiveAnalytics
