This project is a comprehensive journey through SQL and MySQL, where I’ve built and documented core querying skills such as aliases, joins, aggregations, and filtering techniques. To enhance interactivity and data handling,
I integrated Python using mysql.connector and pandas, enabling seamless connection to MySQL databases and transformation of raw query outputs into clean, insightful data frames.
This practical approach bridges traditional SQL querying with modern Python workflows—ideal for real-time data analysis and reporting.
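As a rough illustration of the querying skills listed above (aliases, a join, an aggregation, and filtering), here is a minimal sketch. It is not the project's actual code: the table names, columns, and sample rows are invented, and an in-memory SQLite database stands in for MySQL so the example runs without a server. With mysql.connector the connection would instead come from `mysql.connector.connect(...)`, and the placeholders would be `%s` rather than `?`.

```python
import sqlite3

# Assumption: SQLite replaces MySQL here purely so the sketch is runnable anywhere.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict by column name

# Hypothetical schema and sample data, invented for illustration.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ada', 'North'), (2, 'Ben', 'South'), (3, 'Cy', 'North');
INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 40.0), (4, 3, 10.0);
""")

# Aliases (AS), a join, aggregation (COUNT/SUM), and filtering (WHERE/HAVING).
query = """
SELECT c.name AS customer,
       COUNT(o.id) AS order_count,
       SUM(o.amount) AS total_spent
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
WHERE c.region = ?
GROUP BY c.id, c.name
HAVING SUM(o.amount) > ?
ORDER BY total_spent DESC
"""

rows = [dict(r) for r in conn.execute(query, ("North", 50.0))]
conn.close()
print(rows)
```

If pandas is available, `pandas.read_sql_query(query, conn, params=...)` is one way to load the same result straight into a data frame.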
As part of the DataTalks.Club ML Zoomcamp, I spent three months applying machine learning concepts directly to financial datasets. This included hands-on work with regression, classification, evaluation metrics, model deployment, decision trees, ensemble models, and neural networks.
I used Python to build, train, and deploy models, drawing actionable insights from real financial data. This project demonstrates my ability to connect machine learning theory with high-impact finance applications, such as risk analysis and investment prediction.
This project explores how the duration of student stays relates to their well-being, using three key psychological metrics:
The analysis focuses on:
By uncovering patterns between stay length and well-being, this study provides actionable insights to support student welfare and improve program design for international students.
As part of the SuperDataScience Community Project, I contributed to building a machine learning model that predicts the price of diamonds based on key characteristics—carat, cut, color, clarity—as well as additional features like depth and dimensions.
This end-to-end project involved:
Working collaboratively with other data professionals, I helped turn raw data into a functional pricing tool, bridging data science with real-world application in the gemstone industry.
In this project, I applied data science techniques to analyze and forecast sales performance for the SuperStore retail dataset. Using Python, I explored sales trends, customer behavior, regional performance, and profitability drivers.
I also built predictive models to forecast future sales using regression techniques. The insights generated support business decision-making around inventory, marketing, and resource allocation—demonstrating the power of data-driven strategy in retail operations.
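The text above does not say which regression techniques were used, so the following is only a sketch of the simplest case: an ordinary least-squares trend line fitted to a sales series and extended forward. The function names and the monthly figures are hypothetical, and the real project may have used richer features and a library such as scikit-learn.

```python
def fit_linear_trend(values):
    """Least-squares fit of values against their index 0..n-1.

    Returns (slope, intercept). Needs at least two observations.
    """
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, steps):
    """Extend the fitted trend line `steps` periods past the end of the series."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + i) for i in range(steps)]


# Made-up monthly sales totals, not taken from the SuperStore dataset.
monthly_sales = [100.0, 110.0, 120.0, 130.0]
print(forecast(monthly_sales, 2))
```

A straight trend line ignores seasonality, which retail sales usually show, so it is best read as a baseline to compare other models against.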
This project focuses on analyzing stock performance of leading tech companies—Apple (AAPL), Microsoft (MSFT), Google (GOOGL), and Amazon (AMZN).
Using Python, I processed historical stock data to uncover market trends, calculate returns, and evaluate investment performance over time. The goal was to understand stock volatility and correlation, enabling data-driven portfolio strategy decisions.
The project showcases data manipulation, visualization, and financial metric computation using libraries such as pandas, matplotlib, and yfinance.
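The metrics mentioned above (returns, volatility, and correlation) can be sketched in a few lines. This version uses only the standard library and invented prices for two hypothetical tickers, so it runs without downloading data. The helper names and the 252-trading-day annualization factor are assumptions, not details taken from the project. In pandas, `pct_change()` and `corr()` cover the same ground.

```python
import math
from statistics import mean, stdev


def daily_returns(prices):
    """Simple period-over-period returns: p[t] / p[t-1] - 1."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def annualized_volatility(returns, periods_per_year=252):
    """Sample standard deviation of returns, scaled by sqrt(periods per year)."""
    return stdev(returns) * math.sqrt(periods_per_year)


def correlation(xs, ys):
    """Pearson correlation of two equal-length return series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented closing prices for two hypothetical tickers.
prices = {
    "AAA": [100.0, 102.0, 101.0, 105.0],
    "BBB": [50.0, 51.0, 50.5, 52.5],
}

returns = {ticker: daily_returns(series) for ticker, series in prices.items()}
total_return = {t: s[-1] / s[0] - 1.0 for t, s in prices.items()}

print(total_return)
print({t: annualized_volatility(r) for t, r in returns.items()})
print(correlation(returns["AAA"], returns["BBB"]))
```

Correlation is computed on returns rather than raw prices because two price series that both trend upward can look strongly correlated even when their day-to-day moves are unrelated.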
This project explores public sentiment on immigration in the UK by analyzing YouTube comments from politically charged videos. Using the YouTube Data API, comments are extracted and processed through an automated ETL pipeline powered by Apache Airflow.
A sentiment analysis model built with Hugging Face Transformers classifies each comment as positive, negative, or neutral. The final model is deployed on Streamlit or Hugging Face Spaces, enabling real-time analysis through an interactive web interface.
This project applies data science and business intelligence techniques to a large-scale financial transactions dataset sourced from Kaggle. With over 1 million records, it covers customer behavior, card usage, and financial activity throughout the 2010s. The project demonstrates a full-stack approach—combining SQL, Python, Machine Learning, and Power BI—to deliver insights across four key areas:
This end-to-end pipeline showcases a data-driven approach to financial planning and decision-making.