MySQL Window Functions Syntax

Status

Done

Text

Advanced analytical functions like and aggregate window functions with OVER() clause.

‣

Introduction

‣

Query Logical Processing

‣

Window function syntax

‣

Supplementary NOTE

Complete Window Functions Reference - Categorized

Overview

Window functions can be grouped into several categories based on their purpose and behavior. All can use the OVER()clause with PARTITION BY and ORDER BY.

Category 1: Aggregate Window Functions

Purpose: Perform calculations across rows and return aggregated values while preserving individual rows.

Basic Aggregates

SUM()

Calculates sum of values
Example: Running totals, cumulative sums
SUM(amount) OVER (PARTITION BY customer_id ORDER BY date)

AVG()

Calculates average of values
Example: Moving averages, mean calculations
AVG(salary) OVER (PARTITION BY department ORDER BY hire_date)

MIN()

Finds minimum value
Example: Lowest price in category, earliest date
MIN(price) OVER (PARTITION BY category)

MAX()

Finds maximum value
Example: Highest score, latest date, peak values
MAX(revenue) OVER (PARTITION BY region ORDER BY month)

COUNT()

Counts rows (or non-NULL values)
Example: Running count, rows in partition
COUNT(*) OVER (PARTITION BY status ORDER BY created_date)

Statistical Aggregates

STDDEV() / STDDEV_POP() / STDDEV_SAMP()

Standard deviation (population or sample)
Example: Measure variability within groups
STDDEV(test_score) OVER (PARTITION BY class_id)

VARIANCE() / VAR_POP() / VAR_SAMP()

Variance (population or sample)
Example: Data spread analysis
VARIANCE(response_time) OVER (PARTITION BY server_id)

Advanced Aggregates

GROUP_CONCAT() (MySQL specific)

Concatenates values into a single string
Example: Combine multiple values with separator
GROUP_CONCAT(product_name ORDER BY price) OVER (PARTITION BY category_id)

STRING_AGG() (PostgreSQL, SQL Server)

Similar to GROUP_CONCAT, concatenates strings
Example: Build comma-separated lists
STRING_AGG(name, ', ') OVER (PARTITION BY department)

JSON_ARRAYAGG() / JSON_OBJECTAGG() (MySQL 8.0+)

Aggregates values into JSON arrays or objects
Example: Create JSON structures from rows
JSON_ARRAYAGG(product_id) OVER (PARTITION BY category)

Category 2: Ranking Window Functions

Purpose: Assign ranks or row numbers based on ordering, with different tie-handling strategies.

ROW_NUMBER()

Assigns unique sequential integers
Ties get different numbers (arbitrary order)
Example: Number rows 1, 2, 3, 4...
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC)
Result: 1, 2, 3, 4, 5...

RANK()

Assigns ranking with gaps after ties
Ties get same rank, next rank skips numbers
Example: Olympic medals (1st, 1st, 3rd, 4th)
RANK() OVER (PARTITION BY category ORDER BY score DESC)
Result: 1, 2, 2, 4, 5... (notice gap from 2 to 4)

DENSE_RANK()

Assigns ranking without gaps
Ties get same rank, next rank continues sequentially
Example: Grade rankings (A, A, B, C)
DENSE_RANK() OVER (PARTITION BY region ORDER BY revenue DESC)
Result: 1, 2, 2, 3, 4... (no gap, continues from 2 to 3)

NTILE(n)

Divides rows into n equal buckets/groups
Example: Quartiles (4 groups), deciles (10 groups)
NTILE(4) OVER (ORDER BY salary) creates quartiles
Use case: Segment customers into equal-sized groups

PERCENT_RANK()

Calculates relative rank as percentage (0 to 1)
Formula: (rank - 1) / (total rows - 1)
Example: Percentile rankings
PERCENT_RANK() OVER (ORDER BY exam_score)
Result: 0.0, 0.25, 0.5, 0.75, 1.0

CUME_DIST()

Cumulative distribution (percentage of rows ≤ current row)
Formula: (rows ≤ current) / (total rows)
Example: "This score beats X% of all scores"
CUME_DIST() OVER (ORDER BY salary)
Result: 0.2, 0.4, 0.6, 0.8, 1.0

Category 3: Value/Offset Window Functions

Purpose: Access values from other rows relative to the current row.

LAG(column, offset, default)

Accesses value from previous row(s)
offset: how many rows back (default = 1)
default: value if no previous row exists
Example: Compare to previous month
LAG(revenue, 1, 0) OVER (ORDER BY month) gets previous month's revenue

LEAD(column, offset, default)

Accesses value from next row(s)
offset: how many rows forward (default = 1)
default: value if no next row exists
Example: Compare to next quarter
LEAD(sales, 1) OVER (ORDER BY quarter) gets next quarter's sales

FIRST_VALUE(column)

Returns first value in the window frame
Example: Compare each row to the first/earliest value
FIRST_VALUE(price) OVER (PARTITION BY product ORDER BY date)
Gets the initial/opening price

LAST_VALUE(column)

Returns last value in the window frame
CAUTION: Default frame is up to current row, not entire partition
Must specify: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
Example: Compare to most recent/final value
LAST_VALUE(stock_price) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

NTH_VALUE(column, n)

Returns the nth value in the window frame
Example: Get 2nd place, 3rd highest, etc.
NTH_VALUE(salary, 2) OVER (PARTITION BY dept ORDER BY salary DESC)
Gets the 2nd highest salary

Category 4: Distribution Functions

Purpose: Analyze data distribution and percentiles.

PERCENTILE_CONT(percentile) (PostgreSQL, SQL Server, Oracle)

Continuous percentile (interpolates between values)
Returns exact percentile even if value doesn't exist
Example: Median (50th percentile), 95th percentile
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) calculates median

PERCENTILE_DISC(percentile) (PostgreSQL, SQL Server, Oracle)

Discrete percentile (returns actual value from dataset)
Returns closest actual value at or above percentile
Example: True median from dataset
PERCENTILE_DISC(0.75) WITHIN GROUP (ORDER BY price) finds 75th percentile value

Comparison Table: Ranking Functions

Function	Handles Ties	Gaps After Ties	Sequential	Use Case
ROW_NUMBER()	No (arbitrary order)	N/A	Yes (1,2,3,4)	Unique numbering, pagination
RANK()	Yes (same rank)	Yes	No (1,2,2,4)	Competition ranking, top-N with ties
DENSE_RANK()	Yes (same rank)	No	Yes (1,2,2,3)	Grade levels, continuous rankings
NTILE(n)	Distributes evenly	N/A	Yes	Equal-sized buckets, quartiles

Visual Example:

Scores: 95, 90, 90, 85, 80

ROW_NUMBER():  1,  2,  3,  4,  5  (arbitrary for ties)
RANK():        1,  2,  2,  4,  5  (gap after tie)
DENSE_RANK():  1,  2,  2,  3,  4  (no gap)
NTILE(2):      1,  1,  1,  2,  2  (split into 2 groups)

Comparison Table: Value Functions

Function	Direction	Offset	Use Case
LAG()	Backward (previous)	Configurable	Period-over-period change, previous value
LEAD()	Forward (next)	Configurable	Future comparison, next value
FIRST_VALUE()	Beginning of frame	Fixed	Compare to baseline, initial value
LAST_VALUE()	End of frame	Fixed	Compare to final, most recent value
NTH_VALUE()	Specific position	Specified by n	Get 2nd place, 3rd item, etc.

Window Frame Specification

All window functions can use window frames to control which rows are included:

Syntax:

function() OVER (
    PARTITION BY column
    ORDER BY column
    {ROWS | RANGE} BETWEEN frame_start AND frame_end
)

Frame Boundaries:

UNBOUNDED PRECEDING - From the start of partition
n PRECEDING - n rows/values before current
CURRENT ROW - The current row
n FOLLOWING - n rows/values after current
UNBOUNDED FOLLOWING - To the end of partition

Frame Types:

ROWS: Physical row count
RANGE: Logical value-based (considers ORDER BY values)

Common Frames:

-- Running total (default with ORDER BY)
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

-- Entire partition (default without ORDER BY)
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

-- Last 3 rows (moving window)
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW

-- Centered window
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING

-- Last 7 days of data
RANGE BETWEEN INTERVAL 6 DAY PRECEDING AND CURRENT ROW

Function Availability by Database

Function	MySQL 8.0+	PostgreSQL	SQL Server	Oracle	SQLite 3.25+
SUM/AVG/MIN/MAX/COUNT	✅	✅	✅	✅	✅
ROW_NUMBER/RANK/DENSE_RANK	✅	✅	✅	✅	✅
NTILE	✅	✅	✅	✅	✅
LAG/LEAD	✅	✅	✅	✅	✅
FIRST_VALUE/LAST_VALUE	✅	✅	✅	✅	✅
NTH_VALUE	✅	✅	✅	✅	✅
PERCENT_RANK/CUME_DIST	✅	✅	✅	✅	✅
PERCENTILE_CONT/DISC	❌	✅	✅	✅	❌
STDDEV/VARIANCE	✅	✅	✅	✅	❌
GROUP_CONCAT	✅ (MySQL)	❌	❌	❌	✅
STRING_AGG	❌	✅	✅	❌	✅

Quick Selection Guide

When to Use Each Category:

Aggregate Functions - When you need:

Running totals or cumulative sums
Moving averages or rolling calculations
Comparisons to group totals/averages
Statistical analysis within groups

Ranking Functions - When you need:

Top N per category
Percentile rankings
Position within ordered groups
Equal-sized buckets/segments

Value Functions - When you need:

Period-over-period comparisons (LAG/LEAD)
Growth rates (current vs previous)
First/last values in a sequence
Access to specific positioned values

Distribution Functions - When you need:

Median calculations
Percentile thresholds
Statistical distribution analysis
Outlier detection

Common Patterns Summary

Pattern 1: Running Total

SUM(amount) OVER (PARTITION BY customer_id ORDER BY date)

Pattern 2: Moving Average

AVG(sales) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)

Pattern 3: Rank Within Category

RANK() OVER (PARTITION BY category ORDER BY sales DESC)

Pattern 4: Top N per Group

WHERE ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) <= 5

Pattern 5: Period-over-Period Change

value - LAG(value) OVER (ORDER BY date) AS change

Pattern 6: Percentage of Total

value / SUM(value) OVER (PARTITION BY category) * 100 AS pct_of_total

Pattern 7: First and Last Comparison

SELECT
    FIRST_VALUE(price) OVER (ORDER BY date) AS opening_price,
    LAST_VALUE(price) OVER (ORDER BY date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS closing_price

Key Takeaways

Aggregate functions compute across rows while preserving detail
Ranking functions assign positions with different tie-handling
Value functions access other rows relative to current position
All functions use OVER() clause with optional PARTITION BY and ORDER BY
Window frames control which rows are included in calculations
Multiple functions can be combined in a single query
Performance varies; use appropriate indexes on PARTITION BY and ORDER BY columns