Sales Playbook Optimization with Machine Learning
Dynamic B2B Sales Strategy Powered by Predictive Analytics
XGBoost · KMeans Clustering · Streamlit · Docker
DS-5640 Machine Learning | Vanderbilt University
Traditional sales playbooks are static and reactive. This project builds a dynamic alternative: classification models predict deal outcomes and KMeans clustering segments accounts, giving sales teams a guide that updates with the data and helps improve B2B deal closures. The final deliverable is an interactive Streamlit dashboard backed by a trained XGBoost model and deployed in a Docker container.
Data Scale
- 19,851 company records
- 593 deal records
- 6 models compared
Machine Learning Approach
Trained and compared 6 algorithms to find the best deal outcome predictor:
- XGBoost (winner)
- Random Forest
- AdaBoost
- KNN
- Decision Tree
- Logistic Regression
XGBoost was selected for its best test-set performance and robust generalization.
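A comparison of this kind can be sketched as follows. This is a minimal illustration, not the project's actual training code: the data is synthetic (sized to mirror the 593 deal records), the hyperparameters are arbitrary, and the F1 score is an assumed metric since the source does not name the one used. XGBoost is added only if the `xgboost` package is installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the deal table (assumption: binary won/lost target).
X, y = make_classification(
    n_samples=593, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Distance- and gradient-based models get a scaler; tree models do not need one.
candidates = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}
try:
    from xgboost import XGBClassifier  # optional dependency

    candidates["XGBoost"] = XGBClassifier(
        n_estimators=100, max_depth=4, learning_rate=0.1, random_state=0
    )
except ImportError:
    pass  # the comparison still runs with the scikit-learn models


def compare_models(models, X_train, y_train, X_test, y_test):
    """Fit each model on the training split and score it on the held-out split."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = f1_score(y_test, model.predict(X_test))
    # Highest score first, so the first key is the best candidate.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


scores = compare_models(candidates, X_train, y_train, X_test, y_test)
for name, score in scores.items():
    print(f"{name:20s} F1 = {score:.3f}")
```

On synthetic data the ranking carries no meaning; the point is that every candidate is fit and scored on the same split so the results are comparable.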
Customer Segmentation (KMeans)
Grouped companies by revenue, engagement, and age into actionable segments:
- High-Value: Top priority accounts
- Active Clients: Engaged, growing potential
- Low-Value: Deprioritize or nurture
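The segmentation step might look roughly like the sketch below. The column names (`annual_revenue`, `engagement_score`, `company_age_years`), the synthetic data, the log transform, and the rule that names clusters by their mean revenue are all assumptions made for illustration; the source only states that KMeans grouped companies by revenue, engagement, and age.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical company table; real column names are not given in the source.
rng = np.random.default_rng(0)
companies = pd.DataFrame({
    "annual_revenue": rng.lognormal(mean=14, sigma=1.0, size=300),
    "engagement_score": rng.uniform(0, 100, size=300),
    "company_age_years": rng.integers(1, 60, size=300),
})

SEGMENT_NAMES = ["Low-Value", "Active Clients", "High-Value"]


def segment_companies(df, seed=0):
    """Cluster companies into three groups and name them by mean revenue."""
    features = df[["annual_revenue", "engagement_score", "company_age_years"]].copy()
    # Revenue is heavily skewed, so compress it before scaling (assumed choice).
    features["annual_revenue"] = np.log1p(features["annual_revenue"])
    scaled = StandardScaler().fit_transform(features)

    kmeans = KMeans(n_clusters=len(SEGMENT_NAMES), n_init=10, random_state=seed)
    out = df.copy()
    out["cluster"] = kmeans.fit_predict(scaled)

    # KMeans cluster ids are arbitrary; order them by mean revenue to attach names.
    order = out.groupby("cluster")["annual_revenue"].mean().sort_values().index
    out["segment"] = out["cluster"].map(dict(zip(order, SEGMENT_NAMES)))
    return out


segmented = segment_companies(companies)
print(segmented.groupby("segment")["annual_revenue"].agg(["count", "mean"]))
```

Scaling before clustering matters here because KMeans uses Euclidean distance, and unscaled revenue would otherwise dominate the other two features.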
Streamlit Dashboard Features
- Deal Outcome Predictor: Predicts win/loss for new deals based on entered parameters
- Dataset Filter Tool: Drill down into data by any column and value
- Sales Summary: Visualize win rates and lead scoring
- High-Value Segments: Identify top-converting industries and company sizes
- Cross-Segment Comparison: Compare win rates across customer attributes
Feature Engineering
- Custom fields for revenue buckets, tech stack indicators, deal size categories
- Categorical encoding, scaling, and imputation strategies
- Leak-proof feature pipeline ensuring clean train/test split
- Handled high-missing columns and outliers in company data
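One common way to keep such a pipeline leak-proof is to split first and let a scikit-learn `Pipeline` fit every imputer, scaler, and encoder on the training rows only. The sketch below assumes hypothetical columns (`deal_amount`, `employee_count`, `industry`), fixed deal-size thresholds, and a logistic regression as a placeholder model; none of these come from the project itself.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical deal data with missing values injected.
rng = np.random.default_rng(0)
n = 400
deals = pd.DataFrame({
    "deal_amount": rng.lognormal(10, 1, n),
    "employee_count": rng.integers(5, 5000, n).astype(float),
    "industry": rng.choice(["SaaS", "Retail", "Finance"], n).astype(object),
    "won": rng.integers(0, 2, n),
})
deals.loc[rng.random(n) < 0.15, "employee_count"] = np.nan
deals.loc[rng.random(n) < 0.10, "industry"] = np.nan

# Fixed-threshold bucketing uses no dataset statistics, so it is safe pre-split.
deals["deal_size"] = pd.cut(
    deals["deal_amount"],
    bins=[0, 10_000, 50_000, np.inf],
    labels=["small", "medium", "large"],
).astype(str)

numeric_cols = ["deal_amount", "employee_count"]
categorical_cols = ["industry", "deal_size"]

preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="constant", fill_value="missing")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Split BEFORE fitting: medians, scaling stats, and categories come from train only.
X = deals[numeric_cols + categorical_cols]
y = deals["won"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.3f}")
```

Because the target here is random, the accuracy is not meaningful; the sketch only demonstrates that the test rows never influence the fitted preprocessing statistics.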
Team
- Roshan Siddartha Sivakumar
- Xiaochen Liu
- Anna Lorenz
- Najma Thomas-Akpanoko