PostgreSQL recently added several great new features like Common Table Expressions (CTE), Recursive CTEs, Windowing Functions (my favorite) and Parallel restoring of dumps. The goal of this article is to provide an overview of the new PostgreSQL Windowing Functions. Previously limited to enterprise databases such as Oracle and DB2, they open up a completely new world of functionality to SQL queries.
What are Window Functions (and why should I care)?
These are also referred to as basic analytical functions. SQL is a very capable language and there are very few questions that it cannot answer. However, the performance of some of these queries is not what it should be – nor is the query itself easy to write in the first place. Some of the things that are hard to do in straight SQL are actually very commonly requested operations, including:
Calculate a running total
Show the cumulative salary within a department row by row, with each row including a summation of the prior rows’ salary.
Find percentages within a group
Show the percentage of the total salary paid to an individual in a certain department. Take their salary and divide it by the sum of the salary in the department.
Top-N queries
Find the top N highest-paid people or the top N sales by region.
Compute a moving average
Average the current row’s value and the previous N rows values together.
Perform ranking queries
Show the relative rank of an individual’s salary within their department.
Analytical functions are designed to address these issues, greatly enhancing the expressiveness of SQL.
PostgreSQL window functions
A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row — the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result: it is able to scan all the rows that would be part of the current row’s group according to the grouping specification (PARTITION BY list) of the function call. The syntax of a window function call is one of the following:
function_name ([expression [, expression ... ]]) OVER ( window_definition ) function_name ([expression [, expression ... ]]) OVER window_name function_name ( * ) OVER ( window_definition ) function_name ( * ) OVER window_name
where window_definition has the syntax
[ existing_window_name ] [ PARTITION BY expression [, ...] ] [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ] [ frame_clause ]
and the optional frame_clause can be one of
RANGE UNBOUNDED PRECEDING RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ROWS UNBOUNDED PRECEDING ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
List of PostgreSQL built-in Windowing Functions:
• row_number() • rank() • dense_rank() • percent_rank() • cume_dist() • ntile(num_buckets integer) • lag(value any [, offset integer [, DEFAULT any ]]) • lead(value any [, offset integer [, DEFAULT any ]]) • first_value(value any) • last_value(value any) • nth_value(value any, nth integer)
In addition to these functions, any built-in or user-defined aggregate function can be used as a window function.
Here is a very simple example that shows how to compare each employee’s salary with the average salary in his or her department:
SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary; depname | empno | salary | avg -------------------------------------------------- develop | 11 | 5200 | 5020.00 develop | 7 | 4200 | 5020.00 develop | 9 | 4500 | 5020.00 develop | 8 | 6000 | 5020.00 develop | 10 | 5200 | 5020.00 personnel | 5 | 3500 | 3700.00 personnel | 2 | 3900 | 3700.00 sales | 3 | 4800 | 4866.66 sales | 1 | 5000 | 4866.66 sales | 4 | 4800 | 4866.66
There is another important concept associated with window functions: for each row, there is a set of rows within its partition called its window frame. Many window functions act only on the rows of the window frame, rather than of the whole partition. By default, if ORDER BY is supplied then the frame consists of all rows from the start of the partition up through the current row, plus any following rows that are equal to the current row according to the ORDER BY clause. When ORDER BY is omitted, the default frame consists of all rows in the partition. Here is an example using sum:
SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary; salary | sum --------------------- 3500 | 3500 3900 | 7400 4200 | 11600 4500 | 16100 4800 | 25700 4800 | 25700 5000 | 30700 5200 | 41100 5200 | 41100 6000 | 47100
Since there is no PARTITION BY in the OVER clause, the partition is the whole table. The sum is taken from the first (lowest) salary up through the current one, including any duplicates of the current one (notice the results for the duplicated salaries).
In conclusion, window functions provide a great data analysis tool, which PostgreSQL makes available to everyone at no cost.

