tencent cloud

Feedback

SELECT STATEMENT

Last updated: 2024-08-07 17:27:53
    SELECT statement: Retrieve rows from multiple tables.

    Statement

    [ WITH with_query [, ...] ]
    SELECT [ ALL | DISTINCT ] select_expression [, ...]
    [ FROM from_item [, ...] ]
    [ WHERE condition ]
    [ GROUP BY [ ALL | DISTINCT ] grouping_element [, ...] ]
    [ HAVING condition ]
    [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ]
    [ ORDER BY expression [ ASC | DESC ] [ NULLS FIRST | NULLS LAST] [, ...] ]
    [ LIMIT [ count | ALL ] ]

    Parameter

    [ WITH with_query [, ....] ]

    You can use WITH to flatten nested queries or simplify subqueries. The with_query statement is as follows:
    subquery_table_name [ ( column_name [, ...] ) ] AS (subquery)
    subquery_table_name indicates the unique name of a temporary table used to define the results of WITH subqueries. Each subquery must contain a table name that can be referenced in the FROM clause.
    column_name [ , ...] indicates an optional list of output column names. The number of column names must be equal to or less than that of columns defined in the subquery.
    subquery indicates a query statement.

    [ ALL | DISTINCT ] select_expr

    The ALL and DISTINCT options specify whether duplicate rows are returned. If none of these options are given, the default value ALL is used, which indicates that all matching rows are returned. DISTINCT specifies that duplicate rows are removed from the result set.

    FROM from_item [, ...]

    from_item can indicate a view, table, or subquery. If a multi-table join statement is used, supported JOIN statements are as follows:
    [ INNER ] JOIN
    LEFT [ OUTER ] JOIN
    RIGHT [ OUTER ] JOIN
    FULL [ OUTER ] JOIN
    CROSS JOIN
    ON join_condition: If join_condition is used, you can specify the column name for the join key of multiple tables. If join_column is used, the join column must exist in both tables.

    [ WHERE condition ]

    Results are filtered according to the condition you have specified, and the set of results meeting the condition is returned.

    [ GROUP BY [ ALL | DISTINCT ] grouping_expressions [, ...] ]

    The GROUP BY clause divides the output into groups based on the specified column names.

    [HAVING condition ]

    This clause is used in conjunction with aggregate functions and the GROUP BY clause. It controls which groups are selected so as to eliminate groups that do not satisfy the given conditions. It filters groups after groups and aggregates are computed.

    [{UNION | INTERSECT | EXCEPT} [ALL | DISTINCT] union_query]

    UNION, INTERSECT, and EXCEPT combine the results of more than one statement. UNION combines all the rows in the result set from the first query with those in the result set from the second query. To eliminate duplication, UNION builds a hash table, which consumes memory resources. UNION ALL is recommended for better performance.
    INTERSECT returns only the rows in the result sets of both the first and the second queries.
    EXCEPT returns the rows in the result set of the first query, but not the second.

    [ORDER BY expression [ ASC | DESC ] [ NULLS FIRST | NULLS LAST] [, ...] ]

    This statement sorts a result set by one or more output expressions. If the clause contains multiple expressions, sorting is performed using the first expression. Then, the second expression is used to sort rows matching the first expression, and so on.

    Example

    WITH Clause

    The WITH clause defines named relations for use within a query. It allows flattening nested queries or simplifying subqueries. For example, the following queries are equivalent:
    WITH x AS (SELECT a, MAX(b) AS b FROM t GROUP BY a)
    SELECT a, b FROM x;
    This also works with multiple subqueries:
    WITH
    t1 AS (SELECT a, MAX(b) AS b FROM x GROUP BY a),
    t2 AS (SELECT a, AVG(d) AS d FROM y GROUP BY a)
    SELECT t1.*, t2.*
    FROM t1
    JOIN t2 ON t1.a = t2.a;

    GROUP BY Clause

    The GROUP BY clause divides the output of a SELECT statement into groups of rows containing matching values. A simple GROUP BY clause may contain any expression composed of input columns, or it may be an ordinal number selecting an output column by position.
    SELECT count(*), nationkey FROM customer GROUP BY 2;
    SELECT count(*), nationkey FROM customer GROUP BY nationkey;
    SELECT count(*) FROM customer GROUP BY mktsegment;

    GROUPING SETS

    Grouping sets allow users to specify multiple lists of columns to group on. The columns not part of a given sublist of grouping columns are set to NULL.
    SELECT origin_state, origin_zip, destination_state, sum(package_weight)
    FROM shipping
    GROUP BY GROUPING SETS (
    (origin_state),
    (origin_state, origin_zip),
    (destination_state));
    SELECT origin_state, NULL, NULL, sum(package_weight)
    FROM shipping GROUP BY origin_state
    UNION ALL
    SELECT origin_state, origin_zip, NULL, sum(package_weight)
    FROM shipping GROUP BY origin_state, origin_zip
    UNION ALL
    SELECT NULL, NULL, destination_state, sum(package_weight)
    FROM shipping GROUP BY destination_state;

    HAVING Clause

    The HAVING clause is used in conjunction with aggregate functions and the GROUP BY clause to control which groups are selected. A HAVING clause eliminates groups that do not satisfy the given conditions.
    SELECT count(*), mktsegment, nationkey,
    CAST(sum(acctbal) AS bigint) AS totalbal
    FROM customer
    GROUP BY mktsegment, nationkey
    HAVING sum(acctbal) > 5700000
    ORDER BY totalbal DESC;

    IN

    The IN operator allows users to specify multiple values in the WHERE clause.
    SELECT name
    FROM nation
    WHERE regionkey IN (SELECT regionkey FROM region)

    EXISTS

    The EXISTS operator determines if a subquery returns any rows. If one or more rows are returned, this operator returns True. Otherwise, it returns False.
    SELECT column_name(s)
    FROM table_name
    WHERE EXISTS
    (SELECT column_name FROM table_name WHERE condition)

    USING

    The USING keyword is used to simplify queries.
    The query must be an equal join.
    Columns in an equal join must have the same name and data type.
    SELECT *
    FROM table_1
    JOIN table_2
    USING (key_A, key_B)
    SELECT *
    FROM (
    VALUES
    (1, 3, 10),
    (2, 4, 20)
    ) AS table_1 (key_A, key_B, y1)
    LEFT JOIN (
    VALUES
    (1, 3, 100),
    (2, 4, 200)
    ) AS table_2 (key_A, key_B, y2)
    USING (key_A, key_B);

    CROSS JOIN

    CROSS JOIN returns the Cartesian product (all combinations) of two relations.
    SELECT *
    FROM nation
    CROSS JOIN region

    LIMIT Clause

    The LIMIT clause restricts the number of rows in the result set.
    SELECT orderdate FROM orders LIMIT 5

    ORDER BY Clause

    The ORDER BY clause sorts a result set by one or more output expressions
    Statement: ORDER BY expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...]
    SELECT name, age FROM person ORDER BY age
    SELECT * FROM student
    ORDER BY student_id
    SELECT * FROM student
    ORDER BY student_id,student_name

    EXCEPT

    The EXCEPT clause/operator combines two SELECT statements and returns the rows that are in the result set of the first SELECT statement, but not the second SELECT statement. This means that EXCEPT only returns rows and does not apply to the second SELECT statement.
    As with the UNION operator, the EXCEPT operator is better when the same rules are applied.
    SELECT * FROM (VALUES 13, 42)
    EXCEPT
    SELECT 13

    INTERSECT

    INTERSET returns rows that exist in all result sets returned by two or more SELECT statements.
    SELECT * FROM (VALUES 13, 42)
    INTERSECT
    SELECT 1

    UNION

    UNION combines the result sets of two or more SELECT statements into one result set. To retain duplicate rows in the result sets, use the UNION ALL operator.
    SELECT 13
    UNION
    SELECT 42
    SELECT id FROM a
    UNION ALL
    SELECT id FROM b;

    TABLESAMPLE

    BERNOULLI: Each row is selected to be in the table sample with a probability of the sample percentage. When a table is sampled using the Bernoulli method, all physical blocks of the table are scanned and certain rows are skipped (based on a comparison between the sample percentage and a random value calculated at runtime). The probability of a row being included in the result is independent from any other row. This does not reduce the time required to read the sampled table from disk. It may have an impact on the total query time if the sampled output is processed further.
    SYSTEM: This sampling method divides the table into logical segments of data and samples the table at this granularity. This sampling method either selects all rows from a specific data segment or skips it, based on a comparison between the sampling percentage and a randomly computed value at runtime. The row selected in system sampling depends on the connector used. For example, when Hive is used, it depends on how the data is laid out on HDFS. This method does not guarantee independent sampling probabilities.
    SELECT *
    FROM users TABLESAMPLE BERNOULLI (50);
    SELECT *
    FROM users TABLESAMPLE SYSTEM (75);

    PIVOT Clause

    Returns aggregate values based on specific columns.
    SELECT * FROM person
    PIVOT (
    SUM(age) AS a, AVG(class) AS c
    FOR name IN ('John' AS john, 'Mike' AS mike)
    );

    Lateral View Clause

    LATERAL VIEW [ OUTER ] generator_function ( expression [ , ... ] ) [ table_alias ] AS column_alias [ , ... ]

    Escaping

    To escape a single quote, precede it with another single quote, as shown in the following example:
    Select 'dlc''test'
    Specify escape characters or escape sequences when creating a table, for example, using the following method to create:
    CREATE EXTERNAL TABLE IF NOT EXISTS `csv_test_2222` (
    `_c0` STRING,
    `_c1` INTEGER,
    `_c2` INTEGER,
    `_c3` INTEGER
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES (
    'separatorChar' = '''',
    'quoteChar' = ''''
    )
    STORED AS `textfile`
    LOCATION 'cosn://dlc-nj-1258469122/csv/100M/
    
    
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support