A collection of machine learning projects in finance, retail, and customer analytics, covering predictive modelling, classification, forecasting, and interactive app deployment.
Featured Projects
Diamond Price Prediction
An end-to-end machine learning project predicting diamond prices using the industry's 4C model (Cut, Color, Clarity, Carat) with advanced feature engineering.
What makes this project different?
I benchmarked linear vs. ensemble models, engineered PCA + polynomial features, and deployed the best model (XGBoost, R² = 0.982) as a Streamlit web app.
Featured Content:
- EDA & Feature Engineering: PCA, polynomial features, encoding
- Model Comparison: Linear models, Random Forest, Gradient Boosting, XGBoost
- Deployment: Interactive Streamlit web app for real-time prediction
- Insights: Carat weight is the strongest predictor (0.92 correlation with price); Ideal cut commands a premium
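The encoding and polynomial-feature steps listed above can be sketched without any dependencies. This is a minimal illustration, not the project's own code: the project lists scikit-learn, whereas this sketch uses only the standard library. The grade orderings follow the convention of the public ggplot2 `diamonds` dataset (that the project uses this dataset is an assumption), and `encode_ordinal` and `polynomial_features` are hypothetical helper names.

```python
from itertools import combinations_with_replacement

# Ordinal scales, worst to best. Assumed to match the ggplot2 `diamonds` dataset.
CUT_ORDER = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
CLARITY_ORDER = ["I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"]


def encode_ordinal(value, order):
    """Map a grade to its rank in `order` (0 = lowest grade)."""
    return order.index(value)


def polynomial_features(row, degree=2):
    """Return every monomial of the inputs up to `degree`, without a bias term."""
    features = []
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(row)), d):
            term = 1.0
            for i in combo:
                term *= row[i]
            features.append(term)
    return features


# Hypothetical diamond: 0.5 carat, Ideal cut, VS2 clarity.
base = [0.5, encode_ordinal("Ideal", CUT_ORDER), encode_ordinal("VS2", CLARITY_ORDER)]
expanded = polynomial_features(base, degree=2)

print(base)
print(expanded)
```

Three inputs expand to three linear terms plus six degree-2 terms (squares and pairwise products), which is where interaction effects such as carat × cut enter the model.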
 
Key Insights:
- Carat weight dominates pricing (correlation of 0.92 with price).
- Cut quality (Ideal & Premium) drives premium valuation.
- Color + Clarity add moderate but complementary effects.
- Business Strategy: Ideal cut with VS2 clarity balances brilliance and affordability.
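For a single predictor, the squared Pearson correlation equals the R² of a simple least-squares line, so a correlation of 0.92 between carat and price would correspond to roughly 0.85 of price variance explained by carat alone. The sketch below checks that identity on made-up (carat, price) pairs; the data and the function names `pearson_r` and `simple_fit_r2` are illustrative assumptions, not taken from the project.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def simple_fit_r2(xs, ys):
    """R² of the ordinary least-squares line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up data for illustration only.
carat = [0.3, 0.5, 0.7, 1.0, 1.5, 2.0]
price = [400, 1200, 2500, 5000, 9500, 15000]

r = pearson_r(carat, price)
r2 = simple_fit_r2(carat, price)
print(round(r, 3), round(r * r, 3), round(r2, 3))
```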
 
Outcome:
Delivered a production-ready predictive model with a Streamlit app that provides real-time price estimates and business insights, supporting jewelers, sellers, and buyers in making smarter pricing and inventory decisions.
Tools & Tech:
- Python (pandas, NumPy, seaborn, matplotlib)
- scikit-learn · XGBoost
- Streamlit · Pickle for model storage
 
Explore the full project on GitHub | Live Dashboard
SuperStore Sales Analysis & Profit Forecasting
An end-to-end financial analytics & machine learning project applying predictive and prescriptive analytics to the SuperStore dataset.
What makes this project different?
I combined profit classification (85%+ accuracy), sales forecasting, and Power BI dashboards to deliver ROI-driven strategies for management.
Featured Content
- EDA & Feature Engineering: Profit margins, CLV, discount effectiveness
- Predictive Models: Regression, classification (profitable vs. loss-making), sales forecasting
- Business Intelligence: Power BI dashboards with drill-down by region, segment, category
- Strategic Insights: Discount thresholds, customer segmentation, regional performance metrics
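As a rough illustration of the feature engineering and discount-threshold analysis listed above, the sketch below derives a profit margin and a profitable/loss-making label per order, then compares loss rates on either side of a discount cut-off. The rows are made up, the field names only mirror the public SuperStore dataset (an assumption), and the 0.3 threshold is a hypothetical value, not a finding from the project.

```python
# Made-up order rows; field names are assumed to mirror the SuperStore dataset.
orders = [
    {"sales": 200.0, "profit": 40.0, "discount": 0.0},
    {"sales": 150.0, "profit": 12.0, "discount": 0.2},
    {"sales": 300.0, "profit": -45.0, "discount": 0.4},
    {"sales": 100.0, "profit": -30.0, "discount": 0.5},
]


def add_features(order):
    """Attach profit margin and a binary profitable/loss-making label."""
    margin = order["profit"] / order["sales"]
    return {**order, "margin": margin, "profitable": int(order["profit"] > 0)}


def loss_rate(group):
    """Share of loss-making orders in a group, or None for an empty group."""
    if not group:
        return None
    return sum(1 - row["profitable"] for row in group) / len(group)


def loss_rate_by_discount(rows, threshold):
    """Loss rate below the discount threshold vs. at or above it."""
    low = [row for row in rows if row["discount"] < threshold]
    high = [row for row in rows if row["discount"] >= threshold]
    return loss_rate(low), loss_rate(high)


rows = [add_features(order) for order in orders]
low_rate, high_rate = loss_rate_by_discount(rows, threshold=0.3)  # hypothetical threshold
print(low_rate, high_rate)
```

The `profitable` column is the kind of target a classifier would be trained on; sweeping `threshold` over a range is one simple way to look for the point where discounts start to cost more than they earn.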
 
Key Insights
- Discounting erodes margins: aggressive discounts cut profitability
- Customer segmentation pays off: top customers increase CLV by 15–25%
- Regional performance varies: tailored strategies improve profitability by 20–30%
- Seasonal trends matter: forecasting improves inventory & cash flow
 
Outcome
Delivered executive-ready dashboards and a profit classification model (85%+ accuracy), enabling strategies for:
- Margin optimisation
 - Customer retention
 - Regional growth
 - ROI-focused decision-making
 
Tools & Tech
- Python (pandas, NumPy, matplotlib, seaborn, scikit-learn)
- Regression · Classification · Forecasting
- Power BI (interactive dashboards, variance analysis, ROI scenarios)
 
Explore the full project on GitHub | Live Dashboard
ML Zoomcamp: Applying Machine Learning in Financial Data Analysis
An end-to-end learning project following the DataTalks.Club ML Zoomcamp, designed to deepen my machine learning skills in a finance-focused context.
What makes this journey different?
I'm learning in public, documenting each module with financial case studies, code samples, and real-world insights, bridging theory with practical application.
Featured Content:
- Regression: Predicting asset prices, loan performance, financial KPIs
- Classification: Customer churn, fraud detection, credit risk
- Evaluation Metrics: Precision, recall, and F1 for evaluating finance models
- Model Deployment: Pipelines for production-ready financial models
- Decision Trees & Ensembles: Portfolio risk & investment strategies
- Neural Networks: Capturing complex financial patterns
- Serverless Deep Learning: Real-time transaction analysis at scale
- Kubernetes: Deploying ML for high-volume financial data
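The evaluation metrics in the list above matter most on imbalanced problems such as fraud detection, where plain accuracy can look fine while many positives are missed. The sketch below computes precision, recall, and F1 from scratch on made-up labels; the data and function names are illustrative assumptions, and in practice scikit-learn's metrics module offers equivalent functions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, fp, fn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1), using 0.0 when a denominator is zero."""
    tp, fp, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up fraud labels: 1 = fraud, 0 = legitimate.
y_true = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

On this toy data the classifier is right on 7 of 10 cases yet catches only half of the fraud cases, which is the gap recall is meant to expose.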
 
Key Insights:
- Regression & Classification underpin financial forecasting and risk analysis.
 - Evaluation metrics ensure models balance precision, recall, and ROI.
 - Deployment pipelines are essential for production-ready finance models.
 - Ensemble & neural models improve robustness for complex financial behaviour.
 
Outcome:
Building a comprehensive portfolio of finance-focused ML projects, including:
- Forecasting & profitability analysis
 - Risk management models
 - Customer churn prediction
 - Real-time analytics with scalable deployments
 
Tools & Tech:
- Python (pandas, NumPy, scikit-learn, matplotlib, seaborn)
- Regression · Classification · Ensemble Learning · Neural Networks
- Streamlit · Kubernetes · Serverless ML
 
Community:
This journey is powered by DataTalks.Club.
Join the Slack channel (#course-ml-zoomcamp) for collaboration and support.
Let's make an impact in financial data analysis together!
Explore the full project on GitHub
Portfolio Stock Market Analysis of Tech Giants
An end-to-end stock market & portfolio optimization project analyzing Apple (AAPL), Microsoft (MSFT), Google (GOOGL), and Amazon (AMZN) against the S&P 500 benchmark, using return analytics, risk evaluation, and mean-variance optimization.
What makes this project different?
I combined performance metrics (daily returns, volatility, Sharpe ratio) with portfolio optimization (Markowitz Efficient Frontier) to build risk-adjusted strategies investors can use for growth and downside protection.
Featured Content
- Return Analytics: mean, median, geometric & harmonic means, cumulative returns
- Risk & Distribution: volatility, skewness, kurtosis, coefficient of variation
- Benchmarking vs. S&P 500: daily return plots, correlation heatmaps, Beta & Alpha regression
- Portfolio Optimization: equal-weight portfolio vs. optimized portfolio (Sharpe-maximizing) with Efficient Frontier
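The return and risk metrics listed above can be sketched with the standard library alone. This is a minimal illustration on a made-up price series, not the project's code (the project lists yfinance and QuantStats). The 252-trading-day annualization and the zero default risk-free rate are common conventions assumed here.

```python
from math import prod, sqrt
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization convention


def daily_returns(prices):
    """Simple returns between consecutive closing prices."""
    return [b / a - 1 for a, b in zip(prices, prices[1:])]


def cumulative_return(returns):
    """Total compounded return over the whole period."""
    return prod(1 + r for r in returns) - 1


def geometric_mean_return(returns):
    """Constant per-period return that compounds to the same total."""
    return prod(1 + r for r in returns) ** (1 / len(returns)) - 1


def annualized_volatility(returns):
    """Sample standard deviation of daily returns, scaled to one year."""
    return stdev(returns) * sqrt(TRADING_DAYS)


def sharpe_ratio(returns, risk_free_daily=0.0):
    """Annualized mean excess return divided by its volatility."""
    excess = [r - risk_free_daily for r in returns]
    return mean(excess) / stdev(excess) * sqrt(TRADING_DAYS)


# Made-up closing prices for illustration only.
prices = [100.0, 102.0, 101.0, 104.0, 103.0, 106.0]
rets = daily_returns(prices)

print(cumulative_return(rets))
print(mean(rets), geometric_mean_return(rets))
print(annualized_volatility(rets), sharpe_ratio(rets))
```

The geometric mean never exceeds the arithmetic mean of the same returns, which is why the two are reported separately when returns are volatile.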
 
Key Insights
- Apple dominates growth (512% cumulative return) but with higher volatility (std: 0.0204).
 - Amazon shows high upside but heavy tail risk (kurtosis: 14.8, skew: 0.85).
 - Google & Microsoft deliver steadier returns with lower risk, making them stabilizers.
- Sharpe Ratios: Apple leads (1.11), Microsoft lags (0.73), highlighting differences in risk-adjusted performance.
- Optimized portfolio outperforms equal-weight:
  - CAGR: up from 18.2% to 20.5%
  - Sharpe Ratio: up from 1.20 to 1.24
  - Cumulative return: up from 1,735% to 2,469%
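To show how a Sharpe-maximizing allocation can beat equal weighting, the sketch below searches the weights of a two-asset portfolio on a grid. It is a simplified stand-in, not the project's method: the project lists PyPortfolioOpt, while this uses a brute-force search with made-up expected returns, volatilities, and correlation, and the function names are hypothetical.

```python
from math import sqrt


def portfolio_stats(w, mu, sigma, rho, risk_free=0.0):
    """Return, volatility, and Sharpe ratio for weight w in asset 0 and 1 - w in asset 1."""
    ret = w * mu[0] + (1 - w) * mu[1]
    var = (
        (w * sigma[0]) ** 2
        + ((1 - w) * sigma[1]) ** 2
        + 2 * w * (1 - w) * rho * sigma[0] * sigma[1]
    )
    vol = sqrt(var)
    return ret, vol, (ret - risk_free) / vol


def max_sharpe_weight(mu, sigma, rho, steps=1000):
    """Grid-search the long-only weight in asset 0 that maximizes the Sharpe ratio."""
    grid = (i / steps for i in range(steps + 1))
    return max(grid, key=lambda w: portfolio_stats(w, mu, sigma, rho)[2])


# Hypothetical annual inputs, for illustration only.
mu = (0.25, 0.15)     # expected returns
sigma = (0.30, 0.20)  # volatilities
rho = 0.5             # correlation between the two assets

best_w = max_sharpe_weight(mu, sigma, rho)
equal_sharpe = portfolio_stats(0.5, mu, sigma, rho)[2]
best_sharpe = portfolio_stats(best_w, mu, sigma, rho)[2]
print(best_w, equal_sharpe, best_sharpe)
```

Because the equal-weight portfolio is one point on the grid, the searched portfolio can never have a lower Sharpe ratio on the same inputs. With four assets, a solver over the full covariance matrix replaces the grid.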
 
Outcome
Delivered investment-ready insights and optimized portfolios that enable:
- Superior risk-adjusted growth (higher Sharpe & Sortino ratios)
 - Capital allocation that balances volatility with return
 - Robust downside protection during market stress
 - Data-driven investment decision support
 
Tools & Tech
- Python (pandas, NumPy, matplotlib, seaborn, plotly)
- yfinance · QuantStats · PyPortfolioOpt
- Regression · Risk Models · Efficient Frontier
 
Explore the full project on GitHub