Overview
The SUM()
window function returns the sum of the input column or expression values. It can be used with a RANGE
clause, that allows you to define a logical frame of rows based on the values of the current row, rather than a fixed number of rows.
Syntax
The syntax for this function is as follows:
SUM(expression) OVER (
[PARTITION BY partition_expression]
ORDER BY sort_expression
[ROWS | RANGE BETWEEN start_value AND end_value]
)
The expression’s argument types supported by the SUM
window function are INTEGER
, BIGINT
, REAL
and DOUBLE PRECISION
. The return types of the SUM
function are: BIGINT
for integer and DOUBLE PRECISION
for floating-point arguments.
The SUM()
window function works with numeric values and ignores NULL ones
Parameters
expression
: input’s column or expression values to be summed
PARTITION BY
: optional clause, which divides the result set into partitions to which the function is applied
ROWS | RANGE BETWEEN
: range-based window frame relative to the current row
Examples
For the needs of this section, we will create the winsales
table that stores details of some sales transactions:
CREATE TABLE winsales(
salesid int,
dateid date,
sellerid int,
buyerid text,
qty int,
qty_shipped int);
INSERT INTO winsales VALUES
(30001, '8/2/2003', 3, 'b', 10, 10),
(10001, '12/24/2003', 1, 'c', 10, 10),
(10005, '12/24/2003', 1, 'a', 30, null),
(40001, '1/9/2004', 4, 'a', 40, null),
(10006, '1/18/2004', 1, 'c', 10, null),
(20001, '2/12/2004', 2, 'b', 20, 20),
(40005, '2/12/2004', 4, 'a', 10, 10),
(20002, '2/16/2004', 2, 'c', 20, 20),
(30003, '4/18/2004', 3, 'b', 15, null),
(30004, '4/18/2004', 3, 'b', 20, null),
(30007, '9/7/2004', 3, 'c', 30, null);
SUM() with ORDER BY
In this example, we will focus on executing the SUM()
window function with ORDER BY
keyword:
SELECT salesid, dateid, sellerid, qty
SUM(qty) OVER (ORDER BY dateid, salesid ROWS UNBOUNDED PRECEDING)
FROM winsales
ORDER BY 2,1;
The output from the above query includes the sales ID, date ID, seller ID, quantity and quantity sum:
salesid | dateid | sellerid | qty | sum
---------+------------+----------+-----+-----
30001 | 2003-08-02 | 3 | 10 | 10
10001 | 2003-12-24 | 1 | 10 | 20
10005 | 2003-12-24 | 1 | 30 | 50
40001 | 2004-01-09 | 4 | 40 | 90
10006 | 2004-01-18 | 1 | 10 | 100
20001 | 2004-02-12 | 2 | 20 | 120
40005 | 2004-02-12 | 4 | 10 | 130
20002 | 2004-02-16 | 2 | 20 | 150
30003 | 2004-04-18 | 3 | 15 | 165
30004 | 2004-04-18 | 3 | 20 | 185
30007 | 2004-09-07 | 3 | 30 | 215
(11 rows)
SUM() with ORDER BY and ROWS Frame
In this example we will calculate the running total of qty
ordered by dateid and salesid using a ROWS UNBOUNDED PRECEDING
frame,
which sums all rows from the start up to the current row:
SELECT salesid, dateid, sellerid, qty,
SUM(qty) OVER (ORDER BY dateid, salesid ROWS UNBOUNDED PRECEDING) AS running_qty_sum
FROM winsales
ORDER BY dateid, salesid;
After executing the query above, we get the following output:
salesid | dateid | qty | running_qty_sum
---------+------------+-----+-----------------
30001 | 2003-08-02 | 10 | 10
10001 | 2003-12-24 | 10 | 20
10005 | 2003-12-24 | 30 | 50
40001 | 2004-01-09 | 40 | 90
10006 | 2004-01-18 | 10 | 100
20001 | 2004-02-12 | 20 | 120
40005 | 2004-02-12 | 10 | 130
20002 | 2004-02-16 | 20 | 150
30003 | 2004-04-18 | 15 | 165
30004 | 2004-04-18 | 20 | 185
30007 | 2004-09-07 | 30 | 215
(11 rows)
The running_qty_sum
column shows the cumulative sum of qty
ordered by dateid
and salesid
.
For each row, it sums all qty
values from the first row up to the current row in that order.
SUM() with ORDER BY and PARTITION BY
In this example we will focus on executing the SUM()
function with ORDER BY
keyword and PARTITION BY
clause:
SELECT salesid, dateid, sellerid, qty
SUM(qty) OVER (PARTITION BY sellerid ORDER BY dateid, sellerid ROWS UNBOUNDED PRECEDING)
FROM winsales
ORDER BY 3,2,1;
After executing the query above, we get the following output:
salesid | dateid | sellerid | qty | sum
---------+------------+----------+-----+-----
10001 | 2003-12-24 | 1 | 10 | 10
10005 | 2003-12-24 | 1 | 30 | 40
10006 | 2004-01-18 | 1 | 10 | 50
20001 | 2004-02-12 | 2 | 20 | 20
20002 | 2004-02-16 | 2 | 20 | 40
30001 | 2003-08-02 | 3 | 10 | 10
30003 | 2004-04-18 | 3 | 15 | 25
30004 | 2004-04-18 | 3 | 20 | 45
30007 | 2004-09-07 | 3 | 30 | 75
40001 | 2004-01-09 | 4 | 40 | 40
40005 | 2004-02-12 | 4 | 10 | 50
(11 rows)
Time Series: SUM() with RANGE BETWEEN for Last 30 Days
In this example, we will demonstrate a common time series use case.
Calculating the rolling sum of sales quantity over the last 30 days for each row,
using the RANGE BETWEEN INTERVAL ‘30 days’ PRECEDING AND CURRENT ROW frame:
SELECT salesid, dateid, qty,
SUM(qty) OVER (
ORDER BY dateid
RANGE BETWEEN INTERVAL '30 days' PRECEDING AND CURRENT ROW
) AS rolling_30d_qty_sum
FROM winsales
ORDER BY dateid;
The output from the above query sums the qty
of all sales within the 30-day window ending at the current row’s dateid
:
salesid | dateid | qty | rolling_30d_qty_sum
---------+------------+-----+---------------------
30001 | 2003-08-02 | 10 | 10
10001 | 2003-12-24 | 10 | 40
10005 | 2003-12-24 | 30 | 40
40001 | 2004-01-09 | 40 | 80
10006 | 2004-01-18 | 10 | 90
20001 | 2004-02-12 | 20 | 40
40005 | 2004-02-12 | 10 | 40
20002 | 2004-02-16 | 20 | 60
30003 | 2004-04-18 | 15 | 35
30004 | 2004-04-18 | 20 | 35
30007 | 2004-09-07 | 30 | 30
(11 rows)