tencent cloud

Feedback

Window Function Use Instructions

Last updated: 2024-12-17 16:43:47
    Window functions, also known as OLAP (Online Analytical Processing) functions, can perform real-time data analysis and processing. They are very important for an analytical database.
    The usage of window functions in the read-only analysis engine is basically consistent with MySQL 8.0. However, the read-only analysis engine enabled in MySQL 5.7 can still use the window functions and windowing syntax.
    The specific windowing syntax is as follows:
    SELECT
    <WindowFunction> OVER (PARTITION BY <expr_list> ORDER BY <expr_list> [ASC / DESC] [<WindowFrame>])
    FROM
    tb_test_window;
    SQL sample:
    select studentid,departmentid,classid,math,
    row_number() over(partition by departmentid,classid order by math) as row_num
    from student_scores;
    This sample sorts the data within the groups departmentid and classid by math.

    WindowFrame

    Frame
    ROWS Scenario
    RANGE Scenario
    CURRENT ROW
    Current row.
    All rows identical to the current row.
    UNBOUNDED PRECEDING
    To the first row.
    To the first row.
    UNBOUNDED
    FOLLOWING
    To the last row.
    To the last row.
    <N> PRECEDING
    Previous N rows.
    To the rows with values greater than or equal to the OrderBy column value minus <N>.
    <N> FOLLOWING
    Subsequent N rows.
    To the rows with values less than or equal to the OrderBy column value plus <N>.
    ROWS | RANGE <Frame>.
    ROWS | RANGE BETWEEN <Frame> AND <Frame>.

    WindowFunction

    Window functions currently supported in the windowing syntax are shown in the table below:
    Function Name
    Description
    Function Parameter
    Supported Type
    ROW_NUMBER()
    Marks the row number for data within each partition.
    -
    -
    RANK()
    Performs non-dense ranking for data within each partition.
    -
    -
    DENSE_RANK()
    Performs dense ranking for data within each partition.
    -
    -
    LEAD(<expr>,<offset>,<default>)
    Calculates the value of the row that follows the current row by <offset> rows, and returns <default> if no row meets the condition.
    [Required] <expr>: Calculation column.
    All types (except Time types for three parameters).
    [Optional] <offset>: Number of rows to offset from the current row, 1 by default.
    Numeric data type.
    [Optional] <default>: Default return value if no calculation row meets the condition, NULL by default.
    Consistent with the <expr> type.
    LAG(<expr>,<offset>,<default>)
    Calculates the value of the row that precedes the current row by <offset> rows, and returns <default> if no row meets the condition.
    [Required] <expr>: Calculation column.
    All types (except Time types for three parameters).
    [Optional] <offset>: Number of rows to offset from the current row, 1 by default.
    Numeric data type.
    [Optional] <default>: Default return value if no calculation row meets the condition, NULL by default.
    Consistent with the <expr> type.
    FIRST_VALUE(<expr>)
    Calculates the first value in the partition window.
    [Required] <expr>: Calculation column.
    All types.
    LAST_VALUE(<expr>)
    Calculates the last value in the partition window.
    [Required] <expr>: Calculation column.
    All types.
    MIN(<expr>)
    Calculates the <expr> value of the row corresponding to the minimum value in the OrderBy column of the partition window.
    [Required] <expr>: Calculation column.
    All types.
    MAX(<expr>)
    Calculates the <expr> value of the row corresponding to the maximum value in the OrderBy column of the partition window.
    [Required] <expr>: Calculation column.
    All types.
    COUNT(<expr>)
    Calculates the total number of rows in the partition window.
    [Required] <expr>: Calculation column.
    All types.
    SUM(<expr>)
    Calculates the total sum of data in the partition window.
    [Required] <expr>: Calculation column.
    Numeric data type.
    AVG(<expr>)
    Calculates the average value of data in the partition window.
    [Required] <expr>: Calculation column.
    Numeric data type.
    Numeric data type: int, bigint, float, double, decimal.
    Character type: char, varchar.
    Time type: date, time, datetime, timestamp.

    Detailed Samples

    Sample statements for creating tables:
    drop table if exists test.tb_window;
    create table test.tb_window (c1 int not null primary key, c2 int, c3 int);
    create table test.tb_window (c1 Int32, c2 Nullable(Int32), c3 Nullable(Int32)) engine = LibraTree order by (c1);
    insert into test.tb_window values (1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 1, 4), (5, 1, 6), (6, 1, 6);

    ROWS Keyword

    Note:
    This keyword makes statistics of the window size by row and performs computation on the data within the window.
    -- Sample statements
    mysql> select c2, c3, COUNT(c1) over (partition by c2 order by c3 ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) cn from test.tb_window;
    +----+----+----+
    | c2 | c3 | cn |
    +----+----+----+
    | 1 | 1 | 3 | -- Window row index range: current -> next 2 rows [0-2]
    | 1 | 1 | 3 | -- Window row index range: current -> next 2 rows [1-3]
    | 1 | 2 | 3 | -- Window row index range: current -> next 2 rows [2-4]
    | 1 | 4 | 3 | -- Window row index range: current -> next 2 rows [3-5]
    | 1 | 6 | 2 | -- Window row index range: current -> next 1 row [4-5] (only 1 row after it)
    | 1 | 6 | 1 | -- Window row index range: current (no data after it)
    +----+----+----+
    6 rows in set (0.06 sec)

    RANGE Keyword

    Note:
    This keyword makes statistics of the window size by value and performs computation on the data within the window. For example, the following SQL sample finds a corresponding row by adding 2 to the current row value in the C3 column, and specifies the range from the corresponding row to the current row as a window.
    -- Sample statements
    mysql> select c2, c3, COUNT(c1) over (partition by c2 order by c3 RANGE BETWEEN CURRENT ROW AND 2 FOLLOWING) cn from test.tb_window;
    +----+----+----+
    | c2 | c3 | cn |
    +----+----+----+
    | 1 | 1 | 3 | --
    | 1 | 1 | 3 |
    | 1 | 2 | 2 | --
    | 1 | 4 | 3 | --
    | 1 | 6 | 2 | --
    | 1 | 6 | 2 |
    +----+----+----+
    6 rows in set (0.06 sec)

    ROW_NUMBER

    Note:
    This function assigns numbers to data within the partition and is not restricted by <WindowFrame>.
    -- Sample statements
    mysql> select c2, c3, ROW_NUMBER() over (partition by c2 order by c3) rn from test.tb_window;
    +----+----+------+
    | c2 | c3 | rn |
    +----+----+------+
    | 1 | 1 | 1 |
    | 1 | 1 | 2 |
    | 1 | 2 | 3 |
    | 1 | 4 | 4 |
    | 1 | 6 | 5 |
    | 1 | 6 | 6 |
    +----+----+------+
    6 rows in set (0.04 sec)

    RANK & DENSE_RANK

    RANK function: Performs non-dense ranking on data within the partition. This function is not restricted by <WindowFrame>.
    DESC_RANK function: Performs dense ranking on data within the partition. This function is not restricted by <WindowFrame>.
    -- Sample statements
    select
    c2, c3,
    RANK() over (partition by c2 order by c3) rk,
    DENSE_RANK() over (partition by c2 order by c3) drk
    from test.tb_window;
    +------+------+------+------+
    | c2 | c3 | rk | drk |
    +------+------+------+------+
    | 1 | 1 | 1 | 1 |
    | 1 | 1 | 1 | 1 |
    | 1 | 2 | 3 | 2 |
    | 1 | 4 | 4 | 3 |
    | 1 | 6 | 5 | 4 |
    | 1 | 6 | 5 | 4 |
    +------+------+------+------+
    6 rows in set (0.05 sec)

    LEAD & LAG

    1. One-Parameter Scenarios

    LEAD(<expr>) function: Calculates the data of a row immediately following the current row of the partition. If there is no immediately following row, NULL is returned by default. This function is not restricted by <WindowFrame>.
    LAG(<expr>) function: Calculates the data of a row immediately preceding the current row of the partition. If there is no immediately preceding row, NULL is returned by default. This function is not restricted by <WindowFrame>.
    -- Sample statements
    mysql> select
    c2, c3,
    LEAD(c3) over (partition by c2 order by c3) ld,
    LAG(c3) over (partition by c2 order by c3) lg
    from test.tb_window;
    +------+------+------+------+
    | c2 | c3 | ld | lg |
    +------+------+------+------+
    | 1 | 1 | 1 | NULL |
    | 1 | 1 | 2 | 1 |
    | 1 | 2 | 4 | 1 |
    | 1 | 4 | 6 | 2 |
    | 1 | 6 | 6 | 4 |
    | 1 | 6 | NULL | 6 |
    +------+------+------+------+
    6 rows in set (0.11 sec)

    2. Two-Parameter Scenarios

    LEAD(<expr>, <offset>) function: Calculates the data of a row that precedes the current row of the partition by <offset> rows. If there is no row at the specified <offset>, NULL is returned by default. This function is not restricted by <WindowFrame>.
    LAG(<expr>, <offset>) function: Calculates the data of a row that follows the current row of the partition by <offset> rows. If there is no row at the specified <offset>, NULL is returned by default. This function is not restricted by <WindowFrame>.
    -- Sample statements
    mysql> select
    c2, c3,
    LEAD(c3, 2) over (partition by c2 order by c3) ld,
    LAG(c3, 2) over (partition by c2 order by c3) lg
    from test.tb_window;
    +------+------+------+------+
    | c2 | c3 | ld | lg |
    +------+------+------+------+
    | 1 | 1 | 2 | NULL |
    | 1 | 1 | 4 | NULL |
    | 1 | 2 | 6 | 1 |
    | 1 | 4 | 6 | 1 |
    | 1 | 6 | NULL | 2 |
    | 1 | 6 | NULL | 4 |
    +------+------+------+------+
    6 rows in set (0.07 sec)

    3. Three-Parameter Scenarios

    LEAD(<expr>, <offset>, <default>) function: Calculates the data of a row that precedes the current row of the partition by <offset> rows. If there is no row at the specified <offset>, <default> is returned by default. This function is not restricted by <WindowFrame>.
    LAG(<expr>, <offset>, <default>) function: Calculates the data of a row that follows the current row of the partition by <offset> rows. If there is no row at the specified <offset>, <default> is returned by default. This function is not restricted by <WindowFrame>.
    -- Sample statements
    mysql> select
    c2, c3,
    LEAD(c3, 2, 1000) over (partition by c2 order by c3) ld,
    LAG(c3, 2, 1000) over (partition by c2 order by c3) lg
    from test.tb_window;
    +------+------+------+------+
    | c2 | c3 | ld | lg |
    +------+------+------+------+
    | 1 | 1 | 2 | 1000 |
    | 1 | 1 | 4 | 1000 |
    | 1 | 2 | 6 | 1 |
    | 1 | 4 | 6 | 1 |
    | 1 | 6 | 1000 | 2 |
    | 1 | 6 | 1000 | 4 |
    +------+------+------+------+
    6 rows in set (0.10 sec)

    FIRST_VALUE & LAST_VALUE

    FIRST_VALUE(<expr>) function: Calculates the first value within the partition window (if OrderBy c3 is executed and the first value of the c3 column is duplicated, the result of first_value(c4) may be unstable).
    LAST_VALUE(<expr>) function: Calculates the last value within the partition window (if OrderBy c3 is executed and the last value of the c3 column is duplicated, the result of first_value(c4) may be unstable).
    -- Sample statements
    mysql> select
    c2, c3,
    FIRST_VALUE(c3) over (partition by c2 order by c3 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) fv,
    LAST_VALUE(c3) over (partition by c2 order by c3 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) lv
    from test.tb_window;
    +------+------+------+------+
    | c2 | c3 | fv | lv |
    +------+------+------+------+
    | 1 | 1 | 1 | 6 |
    | 1 | 1 | 1 | 6 |
    | 1 | 2 | 1 | 6 |
    | 1 | 4 | 1 | 6 |
    | 1 | 6 | 1 | 6 |
    | 1 | 6 | 1 | 6 |
    +------+------+------+------+
    6 rows in set (0.07 sec)

    MIN & MAX

    MIN(<expr>) function: Calculates the minimum value within the partition window.
    MAX(<expr>) function: Calculates the maximum value within the partition window.
    -- Sample statements
    mysql> select
    c2, c3,
    MIN(c3) over (partition by c2 order by c3 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mi,
    MAX(c3) over (partition by c2 order by c3 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) ma
    from test.tb_window;
    +------+------+------+------+
    | c2 | c3 | mi | ma |
    +------+------+------+------+
    | 1 | 1 | 1 | 6 |
    | 1 | 1 | 1 | 6 |
    | 1 | 2 | 1 | 6 |
    | 1 | 4 | 1 | 6 |
    | 1 | 6 | 1 | 6 |
    | 1 | 6 | 1 | 6 |
    +------+------+------+------+
    6 rows in set (0.07 sec)

    COUNT

    Note:
    It calculates the total number of data rows within the partition window.
    -- Sample statements
    mysql> select c2, c3, COUNT(c3) over (partition by c2 order by c3 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) c from test.tb_window;
    +------+------+----+
    | c2 | c3 | c |
    +------+------+----+
    | 1 | 1 | 6 |
    | 1 | 1 | 6 |
    | 1 | 2 | 6 |
    | 1 | 4 | 6 |
    | 1 | 6 | 6 |
    | 1 | 6 | 6 |
    +------+------+----+
    6 rows in set (0.04 sec)

    SUM

    Note:
    It calculates the total sum of data within the partition window.
    -- Sample statements
    mysql> select c2, c3, SUM(c3) over (partition by c2 order by c3 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) s from test.tb_window;
    +------+------+------+
    | c2 | c3 | s |
    +------+------+------+
    | 1 | 1 | 20 |
    | 1 | 1 | 20 |
    | 1 | 2 | 20 |
    | 1 | 4 | 20 |
    | 1 | 6 | 20 |
    | 1 | 6 | 20 |
    +------+------+------+
    6 rows in set (0.06 sec)

    AVG

    Note:
    It calculates the average value of data within the partition window.
    -- Sample statements
    mysql> select c2, c3, AVG(c3) over (partition by c2 order by c3 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) a from test.tb_window;
    +------+------+--------+
    | c2 | c3 | a |
    +------+------+--------+
    | 1 | 1 | 3.3333 |
    | 1 | 1 | 3.3333 |
    | 1 | 2 | 3.3333 |
    | 1 | 4 | 3.3333 |
    | 1 | 6 | 3.3333 |
    | 1 | 6 | 3.3333 |
    +------+------+--------+
    6 rows in set (0.06 sec)
    
    
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support