tencent cloud

Feedback

CTE Syntax Use Instructions

Last updated: 2024-12-17 16:41:56
    TDSQL-C for MySQL does not support CTE syntax in version 5.7. The row storage engine only in the versions later than 8.0 supports the CTE syntax, while the read-only analysis engine in both the versions 5.7 and 8.0 can support the CTE syntax normally.

    Support Status

    Currently, the read-only analysis engine only supports non-recursive CTEs. When using them, you need to use Hint /*+ MERGE() */, otherwise the correct execution plan may not be generated. In subsequent versions, the read-only analysis engine will gradually support recursive CTEs and optimize the CTE execution performance.

    CTE Introduction

    Common Table Expressions (CTEs) are a part of the SQL standard, commonly referred to as WITH clauses. CTE was first introduced in the SQL:1999 standard, providing a concise and powerful way to define temporary result sets. These result sets can be referenced multiple times in a single SQL statement, greatly improving the readability and maintainability of queries.
    WITH clause use example:
    -- Start defining CTE. WITH CustomerCTE AS ( SELECT customer_id, first_name, last_name, email_address FROM customer ) -- End defining CTE. SELECT * FROM CustomerCTE; -- Reference the CTE.

    CTE Strengths

    CTEs provide numerous strengths in complex SQL queries.
    Simplifying queries: CTEs can organize and simplify complex SQL statements, improving the maintainability. For example, when the same subquery needs to be referenced multiple times, CTEs can avoid redundant code, making the query clearer.
    Improving code readability: Using meaningful names for intermediate results makes SQL easier to understand.
    Avoiding duplicate queries: CTEs allow the definition of temporary result sets, which can be referenced multiple times in a single SQL statement, thereby avoiding the repeated execution of the same operation.
    Recursive queries: CTEs support recursive queries, which can satisfy the query needs for hierarchical data (such as staff organizational structures). It is very useful for dealing with tree structure data.

    Syntax Structure

    The syntax structure of CTEs is as follows.
    with_clause:
    WITH [RECURSIVE]
    cte_name [(col_name [, col_name] ...)] AS (subquery)
    [, cte_name [(col_name [, col_name] ...)] AS (subquery)] ...
    Parameter
    Description
    WITH keyword
    Indicates the beginning of the CTE definition.
    [RECURSIVE]
    Optional keyword. If RECURSIVE is contained, it indicates that self-referencing within the CTE is allowed. It is used to create recursive queries.
    cte_name
    The name specified for the CTE, which can be referenced in subsequent queries.
    [(col_name [,col_name] ...)]
    Optional column name list, which specifies the column names for the CTE result sets. If omitted, the column names in the subquery will be used.
    AS (subquery)
    Internal subquery of the CTE, which defines the CTE content.
    Comma and additional CTEs
    In a WITH clause, you can define multiple CTEs separated by commas. Each additional CTE follows the same structure cte_name [(col_name ...)] AS (subquery).

    Non-Recursive CTE

    In a non-recursive CTE, the CTE only references other tables or previously defined CTEs, but does not reference itself. It is suitable for decomposing multi-step queries, to build the final query result step by step through intermediate calculations.
    WITH cte1 AS (SELECT * FROM t1, t2), cte2 AS (SELECT i1, i2 FROM cte1 WHERE i3 > 10) cte3 AS (SELECT * FROM cte2, t3 WHERE cte2.i1 = t3.i1) SELECT * FROM cte3;

    Recursive CTE

    In a recursive CTE, the CTE references itself. Recursive CTEs are often used for querying the tree structure or graph structure data, such as calculating factorials, generating sequences, or traversing hierarchical relationships.
    A recursive CTE consists of three parts, namely Seed Part Subquery, Union Type, and Recursive Part Subquery. The Seed Part Subquery does not reference itself, while the Recursive Part Subquery must reference itself.
    WITH RECURSIVE cte(n, fact) AS (
    SELECT 0, 1 -- Seed Part Subquery
    UNION ALL -- Union Type
    SELECT n + 1, (n + 1) * fact FROM cte WHERE n < 5 -- Recursive Part Subquery
    )
    SELECT n, fact FROM cte;

    Examples

    Calculating factorials
    WITH RECURSIVE cte(n, fact) AS (
    SELECT 0, 1
    UNION ALL
    SELECT n + 1, (n + 1) * fact FROM cte WHERE n < 5
    )
    SELECT n, fact FROM cte;
    In this example, the recursive part UNION ALL SELECT n + 1, (n + 1) * fact FROM cte WHERE n < 5 will repeatedly call itself until n reaches 5. The recursion ends when the recursive part outputs an empty row.
    Traversing tree structures
    Assuming that we have a table named employees, it contains the employee hierarchy, where id is the unique identifier of an employee, name is the employee name, and manager_id is the superior ID of the employee.
    CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    manager_id INT
    );
    
    INSERT INTO employees (id, name, manager_id) VALUES
    (1, 'CEO', NULL),
    (2, 'Manager 1', 1),
    (3, 'Manager 2', 1),
    (4, 'Employee 1', 2),
    (5, 'Employee 2', 2),
    (6, 'Employee 3', 3);
    A recursive CTE is used to traverse the employee hierarchy, retrieving all subordinates from top to bottom:
    
    WITH RECURSIVE employee_hierarchy AS (
    -- Basic situation: starting from CEO
    SELECT
    id,
    name,
    manager_id,
    1 AS level
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Recursive situation: finding each employee's subordinates
    SELECT
    e.id,
    e.name,
    e.manager_id,
    eh.level + 1
    FROM employees e
    INNER JOIN employee_hierarchy eh ON eh.id = e.manager_id
    )
    SELECT id, name, manager_id, level
    FROM employee_hierarchy
    ORDER BY level, manager_id;
    
    -- Result
    ┌───────┬────────────┬────────────┬───────┐
    id │ name │ manager_id │ level │
    │ int32 │ varchar │ int32 │ int32 │
    ├───────┼────────────┼────────────┼───────┤
    1 │ CEO │ │ 1
    2 │ Manager 112
    3 │ Manager 212
    4 │ Employee 123
    5 │ Employee 223
    6 │ Employee 333
    └───────┴────────────┴────────────┴───────┘

    Basic CTE

    WITH CustomerCTE AS (
    SELECT customer_id, first_name, last_name, email_address
    FROM customer
    )
    SELECT /*+ MERGE() */ *
    FROM CustomerCTE;

    Multiple CTEs

    WITH
    CTE1 AS (
    SELECT customer_id, first_name, last_name, email_address
    FROM customer
    ),
    CTE2 AS (
    SELECT ss_item_sk, ss_customer_sk, ss_sold_date_sk, ss_sales_price
    FROM store_sales
    )
    SELECT /*+ MERGE() */ CTE1.first_name, CTE1.last_name, CTE2.ss_sales_price
    FROM CTE1
    JOIN CTE2 ON CTE1.customer_id = CTE2.ss_customer_sk;
    Two CTEs are defined, namely CTE1 and CTE2.
    The result sets of these two CTEs are used to perform a Join operation in the final query.
    Execution result:
    +------------+-----------+----------------+
    | first_name | last_name | ss_sales_price |
    +------------+-----------+----------------+
    | John | Doe | 45.99 |
    | Jane | Smith | 32.50 |
    | Michael | Johnson | 78.25 |
    | Emily | Brown | 19.99 |
    | David | Wilson | 55.00 |
    | John | Doe | 67.75 |
    | Jane | Smith | 22.99 |
    | Michael | Johnson | 41.50 |
    | Emily | Brown | 89.99 |
    | David | Wilson | 33.25 |
    +------------+-----------+----------------+
    10 rows in set (0.12 sec)

    Nested CTE

    WITH SalesSummary AS (
    SELECT ss_customer_sk, SUM(ss_net_paid) AS total_spent
    FROM store_sales
    GROUP BY ss_customer_sk
    ),
    TopCustomers AS (
    SELECT ss_customer_sk, total_spent
    FROM SalesSummary
    WHERE total_spent > 1000 -- Assume that a threshold is set, such as customer spending more than 1,000.
    ),
    CustomerDetails AS (
    SELECT c.customer_id, c.first_name, c.last_name, tc.total_spent
    FROM customer c
    JOIN TopCustomers tc ON c.customer_id = tc.ss_customer_sk
    )
    SELECT /*+ MERGE() */ *
    FROM CustomerDetails;
    SalesSummary calculates the total spending of each customer.
    TopCustomers filters out customers whose spending exceeds 1,000 from the SalesSummary result set.
    CustomerDetails joins the customer information in the customer table with the TopCustomers result set.
    The final SELECT query extracts all data from CustomerDetails.
    Execution result:
    +-------------+------------+-----------+--------------+
    | customer_id | first_name | last_name | total_spent |
    +-------------+------------+-----------+--------------+
    | 1001 | John | Doe | 1523.75 |
    | 1002 | Jane | Smith | 2105.50 |
    | 1003 | Michael | Johnson | 1789.99 |
    | 1004 | Emily | Brown | 1650.25 |
    | 1005 | David | Wilson | 1875.00 |
    | 1006 | Sarah | Davis | 2250.75 |
    | 1007 | Robert | Taylor | 1955.50 |
    | 1008 | Jennifer | Anderson | 1725.25 |
    | 1009 | William | Thomas | 2015.00 |
    | 1010 | Lisa | Jackson | 1890.75 |
    +-------------+------------+-----------+--------------+
    10 rows in set (0.15 sec)
    
    
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support