A collection of machine learning projects in finance, retail, and customer analytics — covering predictive modelling, classification, forecasting, and interactive app deployment.
✨ Featured Projects
💎 Diamond Price Prediction
An end-to-end machine learning project predicting diamond prices using the industry's 4C model (Cut, Color, Clarity, Carat) with advanced feature engineering.
💡 What makes this project different?
I benchmarked linear vs. ensemble models, engineered PCA + polynomial features, and deployed the best model (XGBoost, R² = 0.982) as a Streamlit web app.
Featured Content:
- 📊 EDA & Feature Engineering — PCA, polynomial features, encoding
- 🤖 Model Comparison — Linear, RF, Gradient Boosting, XGBoost
- 🚀 Deployment — Interactive Streamlit web app for real-time prediction
- 💡 Insights — Carat weight strongest predictor (92% correlation), Ideal cut commands premium
Key Insights:
- Carat weight dominates pricing (correlation of 0.92 with price).
- Cut quality (Ideal & Premium) drives premium valuation.
- Color + Clarity add moderate but complementary effects.
- Business Strategy: Ideal cut + VS2 clarity balance brilliance and affordability.
Outcome:
Delivered a production-ready predictive model with a Streamlit app that provides real-time price estimates and business insights — supporting jewelers, sellers, and buyers in making smarter pricing and inventory decisions.
🛠️ Tools & Tech:
- 📊 Python (pandas, NumPy, seaborn, matplotlib)
- 🤖 scikit-learn · XGBoost
- 🌐 Streamlit · pickle for model storage
⚖️ SuperStore Sales Analysis & Profit Forecasting
An end-to-end financial analytics & machine learning project applying predictive and prescriptive analytics to the SuperStore dataset.
💡 What makes this project different?
I combined profit classification (85%+ accuracy), sales forecasting, and Power BI dashboards to deliver ROI-driven strategies for management.
Featured Content
- 📊 EDA & Feature Engineering — Profit margins, CLV, discount effectiveness
- 🤖 Predictive Models — Regression, classification (profitable vs. loss-making), sales forecasting
- 📈 Business Intelligence — Power BI dashboards with drill-down by region, segment, category
- 💡 Strategic Insights — Discount thresholds, customer segmentation, regional performance metrics
Key Insights
- Discounting erodes margins: aggressive discounts cut profitability
- Customer segmentation pays off: targeting top customers increases CLV by 15–25%
- Regional performance varies: tailored strategies improve profitability by 20–30%
- Seasonal trends matter: forecasting improves inventory & cash flow
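As a rough illustration of the margin and discount analysis, the sketch below computes a per-order profit margin, derives the binary target used for a profitable vs. loss-making classifier, and compares average margin above and below a discount threshold. The rows, the field names (`sales`, `profit`, `discount`), and the 20% threshold are all assumptions made for the example, not values from the project.

```python
from collections import defaultdict

# Hypothetical order rows; field names and values are assumed for illustration.
orders = [
    {"sales": 200.0, "profit": 50.0, "discount": 0.0},
    {"sales": 150.0, "profit": 30.0, "discount": 0.1},
    {"sales": 300.0, "profit": 15.0, "discount": 0.2},
    {"sales": 120.0, "profit": -24.0, "discount": 0.4},
    {"sales": 80.0, "profit": -40.0, "discount": 0.5},
]


def profit_margin(order):
    """Profit as a fraction of sales for one order."""
    return order["profit"] / order["sales"]


def is_profitable(order):
    """Binary target: 1 for a profitable order, 0 for a loss-making one."""
    return 1 if order["profit"] > 0 else 0


def discount_band(order, threshold=0.2):
    """Split orders at an assumed discount threshold (20% here)."""
    return "high" if order["discount"] > threshold else "low"


# Group margins by discount band, then average within each band.
margins_by_band = defaultdict(list)
for order in orders:
    margins_by_band[discount_band(order)].append(profit_margin(order))

avg_margin = {band: sum(m) / len(m) for band, m in margins_by_band.items()}
labels = [is_profitable(o) for o in orders]

print(avg_margin)
print(labels)
```

On real data the same grouping would typically be done with a pandas `groupby`; plain dictionaries are used here only to keep the example dependency-free.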
Outcome
Delivered executive-ready dashboards and a profit classification model (85%+ accuracy), enabling strategies for:
- Margin optimisation
- Customer retention
- Regional growth
- ROI-focused decision-making
🛠️ Tools & Tech
- 📊 Python (pandas, NumPy, matplotlib, seaborn, scikit-learn)
- 🤖 Regression · Classification · Forecasting
- 📈 Power BI (interactive dashboards, variance analysis, ROI scenarios)
📊 ML Zoomcamp — Applying Machine Learning in Financial Data Analysis
An end-to-end learning project following the DataTalks.Club ML Zoomcamp, designed to deepen my machine learning skills in a finance-focused context.
💡 What makes this journey different?
I'm learning in public — documenting each module with financial case studies, code samples, and real-world insights, bridging theory with practical application.
Featured Content:
- 🗒️ Regression — Predicting asset prices, loan performance, financial KPIs
- 🗓️ Classification — Customer churn, fraud detection, credit risk
- 🧮 Evaluation Metrics — Precision, recall, F1 for finance model accuracy
- ♻️ Model Deployment — Pipelines for production-ready financial models
- 🌳 Decision Trees & Ensembles — Portfolio risk & investment strategies
- 🤖 Neural Networks — Capturing complex financial patterns
- ☁️ Serverless Deep Learning — Real-time transaction analysis at scale
- 🛠️ Kubernetes — Deploying ML for high-volume financial data
Key Insights:
- Regression & Classification underpin financial forecasting and risk analysis.
- Evaluation metrics show whether a model balances precision and recall against business ROI.
- Deployment pipelines are essential for production-ready finance models.
- Ensemble & neural models improve robustness for complex financial behaviour.
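To make the evaluation metrics concrete, here is a minimal sketch that computes precision, recall, and F1 from scratch for a binary classifier. The labels are hypothetical (1 could stand for a fraudulent transaction), and `classification_metrics` is a helper name invented for this example; in practice scikit-learn's metrics module provides equivalents.

```python
def classification_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Hypothetical ground truth and predictions (1 = fraud, 0 = legitimate).
y_true = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]

precision, recall, f1 = classification_metrics(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In a fraud or credit-risk setting a missed positive (lower recall) and a false alarm (lower precision) usually carry different costs, which is why accuracy alone is rarely enough.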
Outcome:
Building a comprehensive portfolio of finance-focused ML projects, including:
- Forecasting & profitability analysis
- Risk management models
- Customer churn prediction
- Real-time analytics with scalable deployments
🛠️ Tools & Tech:
- 📊 Python (pandas, NumPy, scikit-learn, matplotlib, seaborn)
- 🤖 Regression · Classification · Ensemble Learning · Neural Networks
- ☁️ Streamlit · Kubernetes · Serverless ML
🌐 Community:
This journey is powered by DataTalks.Club.
Join the Slack channel (#course-ml-zoomcamp) for collaboration and support.
💥 Let's make an impact in financial data analysis together!
📈 Portfolio Stock Market Analysis of Tech Giants
An end-to-end stock market & portfolio optimization project analyzing Apple (AAPL), Microsoft (MSFT), Google (GOOGL), and Amazon (AMZN) against the S&P 500 benchmark, using return analytics, risk evaluation, and mean-variance optimization.
💡 What makes this project different?
I combined performance metrics (daily returns, volatility, Sharpe ratio) with portfolio optimization (Markowitz Efficient Frontier) to build risk-adjusted strategies investors can use for growth and downside protection.
Featured Content
- 📊 Return Analytics — mean, median, geometric & harmonic means, cumulative returns
- ⚖️ Risk & Distribution — volatility, skewness, kurtosis, coefficient of variation
- 📈 Benchmarking vs. S&P 500 — daily return plots, correlation heatmaps, Beta & Alpha regression
- 🧮 Portfolio Optimization — equal-weight portfolio vs. optimized portfolio (Sharpe-maximizing) with Efficient Frontier
Key Insights
- Apple dominates growth (512% cumulative return) but with higher volatility (std: 0.0204).
- Amazon shows high upside but heavy tail risk (kurtosis: 14.8, skew: 0.85).
- Google & Microsoft deliver steadier returns with lower risk, making them stabilizers.
- Sharpe Ratios: Apple leads (1.11), Microsoft lags (0.73) — highlighting differences in risk-adjusted performance.
- Optimized portfolio outperforms equal-weight:
  - CAGR ↑ from 18.2% → 20.5%
  - Sharpe Ratio ↑ from 1.20 → 1.24
  - Cumulative return ↑ from 1,735% → 2,469%
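For readers who want to see how these headline figures relate, the sketch below shows one common way to compute CAGR and an annualized Sharpe ratio, plus a quick consistency check: if a CAGR and a cumulative return describe the same period, together they imply the length of that period. Applied to the figures above, both portfolios imply a horizon of roughly 17 years. The 252 trading days per year, the zero risk-free rate, the sample daily returns, and the helper names are assumptions for illustration, not the project's actual settings.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization convention


def cagr(cumulative_return, years):
    """Compound annual growth rate from a total return (17.35 means 1,735%)."""
    return (1 + cumulative_return) ** (1 / years) - 1


def implied_years(cumulative_return, annual_rate):
    """Holding period implied by a cumulative return and a CAGR."""
    return math.log(1 + cumulative_return) / math.log(1 + annual_rate)


def annualized_sharpe(daily_returns, daily_risk_free=0.0):
    """Mean excess daily return over its sample std dev, scaled to a year."""
    excess = [r - daily_risk_free for r in daily_returns]
    return (
        statistics.fmean(excess) / statistics.stdev(excess)
        * math.sqrt(TRADING_DAYS)
    )


# Consistency check on the reported figures.
years_equal_weight = implied_years(17.35, 0.182)  # 1,735% at 18.2% CAGR
years_optimized = implied_years(24.69, 0.205)     # 2,469% at 20.5% CAGR
print(f"implied horizon: {years_equal_weight:.1f} vs {years_optimized:.1f} years")

# Hypothetical daily returns, only to exercise the Sharpe helper.
sample_daily_returns = [0.004, -0.002, 0.003, 0.001, -0.001, 0.002]
print(f"annualized Sharpe (sample data): {annualized_sharpe(sample_daily_returns):.2f}")
```

Libraries such as QuantStats wrap these calculations; writing them out mainly helps when checking that reported numbers agree with each other.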
Outcome
Delivered investment-ready insights and optimized portfolios that enable:
- Superior risk-adjusted growth (higher Sharpe & Sortino ratios)
- Capital allocation that balances volatility with return
- Robust downside protection during market stress
- Data-driven investment decision support
🛠️ Tools & Tech
- 📊 Python (pandas, NumPy, matplotlib, seaborn, plotly)
- 📈 yfinance · QuantStats · PyPortfolioOpt
- ⚖️ Regression · Risk Models · Efficient Frontier