SQL Window Functions: A Quick Review With Examples

Introduction

Window functions are a powerful feature in SQL used to perform calculations across a set of rows related to the current row. Unlike aggregate functions, window functions do not group rows into a single output; they return a result for each row while maintaining the context of the dataset.


In this article, we’ll explore some commonly used SQL window functions (ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), LEAD(), and LAG()) with examples.

Sample Table: Sales Data

We’ll use the following Sales table to demonstrate window functions:

| SalesID | CustomerID | Product | Region | Amount | SaleDate |
|—-|—-|—-|—-|—-|—-|
| 1 | 101 | Laptop | North | 1200 | 2023-01-05 |
| 2 | 102 | Tablet | North | 800 | 2023-02-15 |
| 3 | 103 | Phone | North | 800 | 2023-03-10 |
| 4 | 104 | Tablet | North | 500 | 2023-04-01 |
| 5 | 105 | Laptop | South | 1300 | 2023-05-05 |
| 6 | 106 | Tablet | South | 700 | 2023-06-20 |
| 7 | 107 | Phone | West | 900 | 2023-07-15 |
| 8 | 108 | Laptop | East | 1300 | 2023-08-10 |

1. ROW_NUMBER()

The ROW_NUMBER() function assigns a unique number to each row within a partition, ordered by a specified column.


Task: Assign a unique row number to each sale within a region based on the sale amount (highest to lowest).

SELECT SalesID, Region, Amount,
       ROW_NUMBER() OVER (PARTITION BY Region ORDER BY Amount DESC) AS RowNum
FROM Sales;

Result:

| SalesID | Region | Amount | RowNum |
|—-|—-|—-|—-|
| 1 | North | 1200 | 1 |
| 2 | North | 800 | 2 |
| 3 | North | 800 | 3 |
| 4 | North | 500 | 4 |
| 5 | South | 1300 | 1 |
| 6 | South | 700 | 2 |
| 7 | West | 900 | 1 |
| 8 | East | 1300 | 1 |

2. RANK()

The RANK() function assigns a rank to each row within a partition. Rows with the same values receive the same rank, and the next rank is skipped.


Task: Rank sales within each region by amount (highest to lowest).

SELECT SalesID, Region, Amount,
       RANK() OVER (PARTITION BY Region ORDER BY Amount DESC) AS Rank
FROM Sales;

Result:

| SalesID | Region | Amount | Rank |
|—-|—-|—-|—-|
| 1 | North | 1200 | 1 |
| 2 | North | 800 | 2 |
| 3 | North | 800 | 2 |
| 4 | North | 500 | 4 |
| 5 | South | 1300 | 1 |
| 6 | South | 700 | 2 |
| 7 | West | 900 | 1 |
| 8 | East | 1300 | 1 |

Key Feature:

  • For the North region, both Amount = 800 rows share rank 2.

  • The next rank is skipped (i.e., rank 3 is missing) and jumps to 4.

3. DENSE_RANK()

The DENSE_RANK() function assigns ranks like RANK(), but it doesn’t skip ranks after ties.


Task: Assign dense ranks to sales within each region by amount (highest to lowest).

SELECT SalesID, Region, Amount,
       DENSE_RANK() OVER (PARTITION BY Region ORDER BY Amount DESC) AS DenseRank
FROM Sales;

Result:

| SalesID | Region | Amount | DenseRank |
|—-|—-|—-|—-|
| 1 | North | 1200 | 1 |
| 2 | North | 800 | 2 |
| 3 | North | 800 | 2 |
| 4 | North | 500 | 3 |
| 5 | South | 1300 | 1 |
| 6 | South | 700 | 2 |
| 7 | West | 900 | 1 |
| 8 | East | 1300 | 1 |

Key Feature:

  • For the North region, both Amount = 800 rows share rank 2.

  • The next rank is 3, with no skipping of ranks.

4. NTILE()

NTILE() divides rows into a specified number of approximately equal groups.


Task: Divide all sales into 4 groups based on Amount in descending order.

SELECT SalesID, Amount,
       NTILE(4) OVER (ORDER BY Amount DESC) AS Quartile
FROM Sales;

Result:

| SalesID | Amount | Quartile |
|—-|—-|—-|
| 5 | 1300 | 1 |
| 8 | 1300 | 1 |
| 1 | 1200 | 2 |
| 7 | 900 | 2 |
| 2 | 800 | 3 |
| 3 | 800 | 3 |
| 4 | 500 | 4 |
| 6 | 700 | 4 |

5. LEAD()

LEAD() retrieves the value from the next row within the same partition.


Task: Compare each sale amount to the next sale amount, ordered by SaleDate.

SELECT SalesID, Amount, 
       LEAD(Amount) OVER (ORDER BY SaleDate) AS NextAmount
FROM Sales;

Result:

| SalesID | Amount | NextAmount |
|—-|—-|—-|
| 1 | 1200 | 800 |
| 2 | 800 | 800 |
| 3 | 800 | 500 |
| 4 | 500 | 1300 |
| 5 | 1300 | 700 |
| 6 | 700 | 900 |
| 7 | 900 | 1300 |
| 8 | 1300 | NULL |

6. LAG()

LAG() retrieves the value from the previous row within the same partition.


Task: Compare each sale amount to the previous sale amount, ordered by SaleDate.

SELECT SalesID, Amount, 
       LAG(Amount) OVER (ORDER BY SaleDate) AS PrevAmount
FROM Sales;

Result:

| SalesID | Amount | PrevAmount |
|—-|—-|—-|
| 1 | 1200 | NULL |
| 2 | 800 | 1200 |
| 3 | 800 | 800 |
| 4 | 500 | 800 |
| 5 | 1300 | 500 |
| 6 | 700 | 1300 |
| 7 | 900 | 700 |
| 8 | 1300 | 900 |

Conclusion

SQL window functions like ROWNUMBER(), RANK(), DENSERANK(), NTILE(), LEAD(), and LAG() provide powerful ways to analyze data within partitions.


Key Takeaways:

  • ROW_NUMBER() assigns a unique identifier for each row.
  • RANK() and DENSE_RANK() differ in how they handle ties (skipping vs. no skipping).
  • NTILE() is useful for dividing rows into statistic groups.
  • LEAD() and LAG() allow comparisons with adjacent rows.


By mastering these functions, you can handle complex analytics and ranking tasks effectively!


Thank you for taking the time to explore data-related insights with me. I appreciate your engagement. If you find this information helpful, I invite you to follow me or connect with me on LinkedIn. Happy exploring!👋

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.