Customer Churn Prediction for SmartBank
Built a churn prediction model for SmartBank, a fictional Lloyds subsidiary, using EDA, feature engineering, and supervised learning to identify at-risk customers and support targeted retention strategies.
Project summary
This project focuses on predicting customer churn for SmartBank, a fictional Lloyds subsidiary, using historical customer behaviour, demographics, and product usage data. The goal was to give the business an early warning system to identify customers likely to leave, enabling proactive retention actions.
Business problem & objectives
Banks lose significant revenue when high-value customers close their accounts. SmartBank wanted to:
- Identify customers most likely to churn in the near future.
- Understand which factors drive churn behaviour (e.g., complaints, low activity, fees).
- Prioritise retention campaigns for high-risk and high-value segments.
Data, modelling & workflow
- Performed exploratory data analysis (EDA) to understand distributions, correlations, and data quality.
- Engineered features from transaction history, tenure, complaints, and product holdings.
- Handled class imbalance using techniques such as class weighting and resampling.
- Trained baseline models (Logistic Regression) and compared with tree-based methods.
- Evaluated models using AUC, precision/recall, and business-friendly metrics.
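The modelling and evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn, uses a synthetic dataset from `make_classification` as a stand-in for the SmartBank data (roughly 10% churners is an assumption), picks a random forest as one example of a tree-based method, and uses a 0.5 decision threshold purely for demonstration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer table: ~10% positive (churn) class.
# The real features (tenure, complaints, product holdings) are not reproduced here.
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.9, 0.1],
    random_state=42,
)

# Stratify so the churn rate is similar in the train and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# class_weight="balanced" reweights classes inversely to their frequency,
# which is one way to handle class imbalance without resampling.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    ),
    "random_forest": RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=42
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]  # predicted churn probability
    pred = (proba >= 0.5).astype(int)          # illustrative threshold only
    results[name] = {
        "auc": roc_auc_score(y_test, proba),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

In practice the threshold would likely be tuned to the retention budget rather than fixed at 0.5, since lowering it trades precision for recall on churners.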
Key insights & outcomes
- Model successfully differentiated churners from non-churners with strong AUC and recall on high-risk customers.
- Top drivers of churn included reduced account activity, increased complaints, and low product engagement.
- Suggested a prioritisation strategy: focus retention offers on high-risk, high-LTV customers first.
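One way to make the "high-risk, high-LTV first" idea concrete is to rank at-risk customers by expected value at risk, that is, churn probability multiplied by lifetime value. The sketch below is an assumption about how such a ranking could work, not the scoring rule used in the project; the `Customer` class, the `prioritise` function, the 0.5 risk threshold, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    churn_probability: float  # model output in [0, 1]
    lifetime_value: float     # estimated LTV in currency units


def prioritise(customers, risk_threshold=0.5, top_n=None):
    """Return at-risk customers ordered by expected value at risk.

    Expected value at risk = churn_probability * lifetime_value
    (a hypothetical scoring rule chosen for illustration).
    """
    at_risk = [c for c in customers if c.churn_probability >= risk_threshold]
    ranked = sorted(
        at_risk,
        key=lambda c: c.churn_probability * c.lifetime_value,
        reverse=True,
    )
    return ranked if top_n is None else ranked[:top_n]


# Made-up customers for demonstration only.
customers = [
    Customer("A", 0.90, 500.0),   # score 450: very likely to leave, low value
    Customer("B", 0.60, 4000.0),  # score 2400: moderate risk, high value
    Customer("C", 0.20, 9000.0),  # below the risk threshold, so excluded
    Customer("D", 0.75, 1200.0),  # score 900
]

queue = prioritise(customers)
print([c.customer_id for c in queue])  # → ['B', 'D', 'A']
```

Ranking on the product rather than on risk alone means a moderately risky but valuable customer (B) is contacted before a near-certain churner with little value at stake (A).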
What I learned
This project strengthened my end-to-end workflow skills: from cleaning and exploring realistic banking data, through feature engineering and model evaluation, to framing results so that they are useful to business stakeholders rather than just technically interesting.