IPL Data Analysis (Seasons 2024 and 2025)
A data-driven project exploring player and team performances across the 2024 and 2025 IPL seasons, combining SQL, Python, Power BI, and Generative AI to break down everything from boundaries and catches to toss impact and review success.
Tools & Technologies Used
- SQL (MySQL): Used for slicing the main IPL dataset into filtered views for the 2024 and 2025 seasons.
- Python (pandas, matplotlib, seaborn): Exploratory Data Analysis, player performance trends, season-wise insights.
- Power BI: Built a multi-page interactive dashboard with KPIs, slicers, trendlines, and maps.
- Generative AI (ChatGPT): Assisted in ideation, error handling, SQL optimisation, notebook documentation, visual design, and storytelling.
Links
Dataset
The project started with a large IPL Dataset covering all IPL seasons. Using SQL, I derived three filtered datasets specifically for the 2024 and 2025 seasons:
- ipl_filtered_24_25.csv: Ball-by-ball data for the 2024-25 seasons only
- ipl_player_stats_24_25.csv: Simplified version to track batter/bowler/wicket actions
- ipl_summary_24_25.csv: Match-level summary (toss winner, result, player of match)
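The season-filtering step can be sketched as follows. This is a minimal illustration run against an in-memory SQLite database through Python's `sqlite3` module so that it is self-contained; the project itself used MySQL. The source table name `deliveries`, its columns, and the sample rows are hypothetical, and the real schema may differ.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database used in the project.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deliveries ("
    "match_id INTEGER, season INTEGER, batter TEXT, runs_batter INTEGER)"
)

# Illustrative rows only, not real IPL data.
conn.executemany(
    "INSERT INTO deliveries VALUES (?, ?, ?, ?)",
    [
        (1, 2023, "Batter A", 4),
        (2, 2024, "Batter B", 6),
        (3, 2025, "Batter C", 1),
    ],
)

# Keep only the two seasons of interest in a separate table.
conn.execute(
    "CREATE TABLE ipl_filtered_24_25 AS "
    "SELECT * FROM deliveries WHERE season IN (2024, 2025)"
)

# Check which seasons survived the filter.
filtered_seasons = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT season FROM ipl_filtered_24_25 ORDER BY season"
    )
]
print(filtered_seasons)
```

The same `WHERE season IN (...)` filter should carry over to MySQL unchanged; only the connection setup would differ.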
Original Source:
Problem Statements / Goals
This project wasn't just about which team won more matches; it was about uncovering patterns that aren't obvious at a glance, such as:
- How important is the toss: does winning it actually increase win percentage?
- Does batting position matter in scoring runs?
- Which players are consistent but overlooked?
- How successful are DRS reviews per team?
- What's the typical target score range teams face in chases?
- Which teams are tactically better at bowling dot balls?
- Do some teams dominate in catches or commit fewer fielding errors?
- And lastly, in which stadium were most IPL matches played in those seasons?
Project Workflow
SQL Phase
- Used filtering queries to create three tables from the master IPL dataset.
- Removed cancelled matches and playoff outliers where needed.
- Wrote queries to calculate:
Python + Jupyter Notebook Phase
- Loaded the 3 pre-filtered datasets exported from MySQL Workbench into pandas.
- Validated datatypes and fixed inconsistencies (e.g., nulls in 'runs_target', unknown fields).
- Split the stats by season and ran grouped analysis (2024 vs 2025).
- Built tables and visualisations for:
  - Team win% (excluding cancelled matches).
  - Catches per team (bar charts).
  - Dot balls bowled by each team (bar charts).
  - Total boundaries, split into 4s and 6s (side-by-side bar charts).
  - Toss-winner impact on match result (pie charts).
- Added comments and markdown cells to support the storytelling.
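A minimal pandas sketch of two of the calculations above: team win% with cancelled matches excluded, and how often the toss winner also won the match. The column names (`team1`, `team2`, `toss_winner`, `match_winner`) and the rows are assumptions for illustration, not the actual schema of ipl_summary_24_25.csv, and a cancelled match is assumed to be recorded with no winner.

```python
import pandas as pd

# Illustrative match-level rows; column names are assumptions, not the real schema.
summary = pd.DataFrame(
    {
        "season": [2024, 2024, 2025, 2025],
        "team1": ["Team A", "Team B", "Team A", "Team C"],
        "team2": ["Team B", "Team C", "Team C", "Team B"],
        "toss_winner": ["Team A", "Team C", "Team A", "Team B"],
        "match_winner": ["Team A", "Team B", None, "Team B"],
    }
)

# Assumption: a cancelled match has no recorded winner, so drop those rows.
played = summary.dropna(subset=["match_winner"])

# Matches played per team: count appearances as either team1 or team2.
matches_played = pd.concat([played["team1"], played["team2"]]).value_counts()
wins = played["match_winner"].value_counts()

# Teams with no wins get 0 instead of NaN before dividing.
win_pct = (
    wins.reindex(matches_played.index, fill_value=0) / matches_played * 100
).round(1)

# Share of played matches in which the toss winner also won the match.
toss_win_rate = (played["toss_winner"] == played["match_winner"]).mean() * 100

print(win_pct)
print(f"Toss winner also won the match in {toss_win_rate:.1f}% of games")
```

Filtering `summary` by `season` before these steps would give the 2024 vs 2025 comparison.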
Power BI Dashboard
Built a 4-page dashboard with interactivity and visual elements:
Page 1: IPL Overview
- Total boundaries, matches, runs, deliveries (KPI cards).
- Team-wise venue map.
- Toss winner vs match winner (donut chart).
- Slicers for seasons and teams.
Page 2: Boundary & Dot Ball Insights
- Total Runs scored by Teams.
- Total 4s and 6s by team.
- Dot balls bowled per team (per season)
- Total Catches taken by each team.
- Slicers for seasons.
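The per-team figures on this page can be cross-checked outside Power BI. The sketch below counts dot balls per bowling team and 4s and 6s per batting team from ball-by-ball data. The column names, the sample rows, and the definition of a dot ball (a delivery conceding no runs at all) are assumptions; the project may define these differently.

```python
import pandas as pd

# Illustrative ball-by-ball rows; column names are assumptions.
balls = pd.DataFrame(
    {
        "season": [2024, 2024, 2024, 2025, 2025, 2025],
        "batting_team": ["Team A", "Team A", "Team B", "Team A", "Team B", "Team B"],
        "bowling_team": ["Team B", "Team B", "Team A", "Team B", "Team A", "Team A"],
        "runs_batter": [4, 0, 6, 0, 4, 1],
        "runs_total": [4, 0, 6, 1, 4, 1],
    }
)

# Assumed definition: a dot ball is a delivery that concedes no runs at all,
# so a ball with only extras (runs_total > 0) does not count.
dot_balls = (
    balls[balls["runs_total"] == 0]
    .groupby(["season", "bowling_team"])
    .size()
    .rename("dot_balls")
)

# Count 4s and 6s per batting team from the runs credited to the batter.
# Simplification: treats every 4 or 6 off the bat as a boundary.
boundaries = (
    balls.assign(
        fours=balls["runs_batter"].eq(4),
        sixes=balls["runs_batter"].eq(6),
    )
    .groupby("batting_team")[["fours", "sixes"]]
    .sum()
)

print(dot_balls)
print(boundaries)
```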
Page 3: Batting Performance
- Trendline: Batting position vs runs scored.
- Donut: Top 10 boundary hitters
- Bar charts: top run scorers and wicket takers (Orange and Purple Cap leaderboards).
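As a rough illustration of the aggregates behind this page, the sketch below computes average runs per batting position and a run-scorer leaderboard in pandas. It assumes a per-innings table with hypothetical `batter`, `batting_position`, and `runs` columns; deriving batting position from the ball-by-ball data is not shown.

```python
import pandas as pd

# Illustrative per-innings batting rows; names and columns are assumptions.
innings = pd.DataFrame(
    {
        "batter": ["Batter A", "Batter B", "Batter C", "Batter A", "Batter B"],
        "batting_position": [1, 2, 5, 1, 3],
        "runs": [60, 35, 12, 45, 20],
    }
)

# Average runs per innings at each batting position (input for a trendline).
runs_by_position = innings.groupby("batting_position")["runs"].mean()

# Leaderboard: total runs per batter, highest first, capped at ten entries.
top_scorers = innings.groupby("batter")["runs"].sum().nlargest(10)

print(runs_by_position)
print(top_scorers)
```

A wicket-taker leaderboard would follow the same pattern, grouping by bowler and summing wickets.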
Page 4: Match Behaviour
- Review system summary
- Target range distribution while chasing
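One way to build a target range distribution is to bucket chase targets with `pandas.cut`. In this sketch the `runs_target` column name comes from the notebook notes above, while the values and bin edges are made up for illustration.

```python
import pandas as pd

# Illustrative chase targets; the values are made up.
chases = pd.DataFrame({"runs_target": [145, 172, 201, 236, None, 188]})

# Drop rows with a missing target before bucketing.
targets = chases["runs_target"].dropna()

# Bin edges are an arbitrary illustrative choice; intervals are right-inclusive,
# so 150 falls in "<=150" and 151 in "151-175". Targets above 300 would be NaN.
bins = [0, 150, 175, 200, 225, 300]
labels = ["<=150", "151-175", "176-200", "201-225", "226+"]
target_ranges = (
    pd.cut(targets, bins=bins, labels=labels).value_counts().sort_index()
)

print(target_ranges)
```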
Page navigation via buttons (no bookmarks/DAX required)
Key Insights
- Punjab Kings led in total catches in 2025.
- Most teams had to chase targets of more than 200 runs.
- V. Kohli and B. Sai Sudharsan had dominant batting performances.
- The toss winner didn't always win the match; strategy mattered more.
- Several matches had successful DRS overturns, and DRS efficiency varies by team.
- Dot ball count helped teams control run flow, especially in 2024.
Things I Learned
This project taught me a lot, not just about cricket data but also about organising a full-stack data project from scratch:
- How to set up and clean real-world data in SQL.
- Structuring exploratory analysis in Python before designing visuals.
- Creating clean, slicer-friendly dashboards in Power BI.
- Avoiding visual clutter and focusing on what insights actually matter.
- Balancing aesthetics with interactivity (tooltips, maps, dynamic filtering).
- Using AI tools for efficiency and handling complex tasks easily.
How to Explore
If you're checking this project out:
- You can review the .sql file to recreate the database
- Use the Jupyter Notebook to experiment or modify visualisations
- View the Power BI .pbix file directly, or try the online published version