SELECT STATEMENT

SELECT statement: Retrieve rows from multiple tables.
Statement
[ WITH with_query [, ...] ]
SELECT [ ALL | DISTINCT ] select_expression [, ...]
[ FROM from_item [, ...] ]
[ WHERE condition ]
[ GROUP BY [ ALL | DISTINCT ] grouping_element [, ...] ]
[ HAVING condition ]
[ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ]
[ ORDER BY expression [ ASC | DESC ] [ NULLS FIRST | NULLS LAST] [, ...] ]
[ LIMIT [ count | ALL ] ]
Parameter
[ WITH with_query [, ....] ]
You can use WITH to flatten nested queries or simplify subqueries. The with_query statement is as follows:
subquery_table_name [ ( column_name [, ...] ) ] AS (subquery)
subquery_table_name indicates the unique name of a temporary table used to define the results of WITH subqueries. Each subquery must contain a table name that can be referenced in the FROM clause.
column_name [ , ...] indicates an optional list of output column names. The number of column names must be equal to or less than that of columns defined in the subquery.
subquery indicates a query statement.
[ ALL | DISTINCT ] select_expr
The ALL and DISTINCT options specify whether duplicate rows are returned. If none of these options are given, the default value ALL is used, which indicates that all matching rows are returned. DISTINCT specifies that duplicate rows are removed from the result set.
FROM from_item [, ...]
from_item can indicate a view, table, or subquery. If a multi-table join statement is used, supported JOIN statements are as follows:
[ INNER ] JOIN
LEFT [ OUTER ] JOIN
RIGHT [ OUTER ] JOIN
FULL [ OUTER ] JOIN
CROSS JOIN
ON join_condition: If join_condition is used, you can specify the column name for the join key of multiple tables. If join_column is used, the join column must exist in both tables.
[ WHERE condition ]
Results are filtered according to the condition you have specified, and the set of results meeting the condition is returned.
[ GROUP BY [ ALL | DISTINCT ] grouping_expressions [, ...] ]
The GROUP BY clause divides the output into groups based on the specified column names.
[HAVING condition ]
This clause is used in conjunction with aggregate functions and the GROUP BY clause. It controls which groups are selected so as to eliminate groups that do not satisfy the given conditions. It filters groups after groups and aggregates are computed.
[{UNION | INTERSECT | EXCEPT} [ALL | DISTINCT] union_query]
UNION, INTERSECT, and EXCEPT combine the results of more than one statement. UNION combines all the rows in the result set from the first query with those in the result set from the second query. To eliminate duplication, UNION builds a hash table, which consumes memory resources. UNION ALL is recommended for better performance.
INTERSECT returns only the rows in the result sets of both the first and the second queries.
EXCEPT returns the rows in the result set of the first query, but not the second.
[ORDER BY expression [ ASC | DESC ] [ NULLS FIRST | NULLS LAST] [, ...] ]
This statement sorts a result set by one or more output expressions. If the clause contains multiple expressions, sorting is performed using the first expression. Then, the second expression is used to sort rows matching the first expression, and so on.
Example
WITH Clause
The WITH clause defines named relations for use within a query. It allows flattening nested queries or simplifying subqueries. For example, the following queries are equivalent:
WITH x AS (SELECT a, MAX(b) AS b FROM t GROUP BY a)
SELECT a, b FROM x;
This also works with multiple subqueries:
WITH
  t1 AS (SELECT a, MAX(b) AS b FROM x GROUP BY a),
  t2 AS (SELECT a, AVG(d) AS d FROM y GROUP BY a)
SELECT t1.*, t2.*
FROM t1
JOIN t2 ON t1.a = t2.a;
GROUP BY Clause
The GROUP BY clause divides the output of a SELECT statement into groups of rows containing matching values. A simple GROUP BY clause may contain any expression composed of input columns, or it may be an ordinal number selecting an output column by position.
SELECT count(*), nationkey FROM customer GROUP BY 2;
SELECT count(*), nationkey FROM customer GROUP BY nationkey;
SELECT count(*) FROM customer GROUP BY mktsegment;
GROUPING SETS
Grouping sets allow users to specify multiple lists of columns to group on. The columns not part of a given sublist of grouping columns are set to NULL.
SELECT origin_state, origin_zip, destination_state, sum(package_weight)
FROM shipping
GROUP BY GROUPING SETS (
    (origin_state),
    (origin_state, origin_zip),
    (destination_state));
SELECT origin_state, NULL, NULL, sum(package_weight)
FROM shipping GROUP BY origin_state
UNION ALL
SELECT origin_state, origin_zip, NULL, sum(package_weight)
FROM shipping GROUP BY origin_state, origin_zip
UNION ALL
SELECT NULL, NULL, destination_state, sum(package_weight)
FROM shipping GROUP BY destination_state;
HAVING Clause
The HAVING clause is used in conjunction with aggregate functions and the GROUP BY clause to control which groups are selected. A HAVING clause eliminates groups that do not satisfy the given conditions.
SELECT count(*), mktsegment, nationkey,
       CAST(sum(acctbal) AS bigint) AS totalbal
FROM customer
GROUP BY mktsegment, nationkey
HAVING sum(acctbal) > 5700000
ORDER BY totalbal DESC;
IN
The IN operator allows users to specify multiple values in the WHERE clause.
SELECT name
FROM nation
WHERE regionkey IN (SELECT regionkey FROM region)
EXISTS
The EXISTS operator determines if a subquery returns any rows. If one or more rows are returned, this operator returns True. Otherwise, it returns False.
SELECT column_name(s)
FROM table_name
WHERE EXISTS
(SELECT column_name FROM table_name WHERE condition)
USING
The USING keyword is used to simplify queries.
The query must be an equal join.
Columns in an equal join must have the same name and data type.
SELECT *
FROM table_1
JOIN table_2
USING (key_A, key_B)
SELECT *
FROM (
    VALUES
        (1, 3, 10),
        (2, 4, 20)
) AS table_1 (key_A, key_B, y1)
LEFT JOIN (
    VALUES
        (1, 3, 100),
        (2, 4, 200)
) AS table_2 (key_A, key_B, y2)
USING (key_A, key_B);
CROSS JOIN
CROSS JOIN returns the Cartesian product (all combinations) of two relations.
SELECT *
FROM nation
CROSS JOIN region
LIMIT Clause
The LIMIT clause restricts the number of rows in the result set.
SELECT orderdate FROM orders LIMIT 5
ORDER BY Clause
The ORDER BY clause sorts a result set by one or more output expressions
Statement: ORDER BY expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...]
SELECT name, age FROM person ORDER BY age
SELECT * FROM student
ORDER BY student_id
SELECT * FROM student
ORDER BY student_id,student_name
EXCEPT
The EXCEPT clause/operator combines two SELECT statements and returns the rows that are in the result set of the first SELECT statement, but not the second SELECT statement. This means that EXCEPT only returns rows and does not apply to the second SELECT statement.
As with the UNION operator, the EXCEPT operator is better when the same rules are applied.
SELECT * FROM (VALUES 13, 42)
EXCEPT
SELECT 13
INTERSECT
INTERSET returns rows that exist in all result sets returned by two or more SELECT statements.
SELECT * FROM (VALUES 13, 42)
INTERSECT
SELECT 1
UNION
UNION combines the result sets of two or more SELECT statements into one result set. To retain duplicate rows in the result sets, use the UNION ALL operator.
SELECT 13
UNION
SELECT 42
SELECT id FROM a
UNION ALL
SELECT id FROM b;
TABLESAMPLE
BERNOULLI: Each row is selected to be in the table sample with a probability of the sample percentage. When a table is sampled using the Bernoulli method, all physical blocks of the table are scanned and certain rows are skipped (based on a comparison between the sample percentage and a random value calculated at runtime). The probability of a row being included in the result is independent from any other row. This does not reduce the time required to read the sampled table from disk. It may have an impact on the total query time if the sampled output is processed further.
SYSTEM: This sampling method divides the table into logical segments of data and samples the table at this granularity. This sampling method either selects all rows from a specific data segment or skips it, based on a comparison between the sampling percentage and a randomly computed value at runtime. The row selected in system sampling depends on the connector used. For example, when Hive is used, it depends on how the data is laid out on HDFS. This method does not guarantee independent sampling probabilities.
SELECT *
FROM users TABLESAMPLE BERNOULLI (50);
SELECT *
FROM users TABLESAMPLE SYSTEM (75);
PIVOT Clause
Returns aggregate values based on specific columns.
SELECT * FROM person
    PIVOT (
      SUM(age) AS a, AVG(class) AS c
      FOR name IN ('John' AS john, 'Mike' AS mike)
     );
Lateral View Clause
LATERAL VIEW [ OUTER ] generator_function ( expression [ , ... ] ) [ table_alias ] AS column_alias [ , ... ]
Escaping
To escape a single quote, precede it with another single quote, as shown in the following example:
Select 'dlc''test'
Specify escape characters or escape sequences when creating a table, for example, using the following method to create:
CREATE EXTERNAL TABLE IF NOT EXISTS `csv_test_2222` (
      `_c0`  STRING,
      `_c1`  INTEGER,
      `_c2`  INTEGER,
      `_c3`  INTEGER
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES (
      'separatorChar' = '''',
      'quoteChar' = ''''
    )
    STORED AS `textfile`
    LOCATION 'cosn://dlc-nj-1258469122/csv/100M/
﻿
﻿

Was this page helpful?

You can also Contact Sales or Submit a Ticket for help.

Yes

tencent cloud

Sign Up

Log in

Compute

Microservice

Data Migration

Database SaaS Tool

Data Security

Application Security

Big Data

Image Creation

Internet of Things

Stream Services

Cloud Real-time Rendering

Cloud Resource Management

More

Edge Computing

Serverless

Relational Database

Networking

Business Security

Domains & Websites

Face Recognition

AI Platform Service

Middleware

Media On-Demand

Game Services

Management and Audit Tools

Container

Essential Storage Service

Enterprise Distributed DBMS

CDN and Acceleration

Security Services

Enterprise Applications

Voice Technology

Natural Language Processing

Communication

Media Process Services

Education Sevices

Developer Tools

Distributed cloud

Data Process and Analysis

NoSQL Database

Network Security

Cloud Security

Office Collaboration

Tencent Big Model

Optical Character Recognition

Interactive Video Services

Media SDK

Medical Services

Monitor and Operation

Statement

Parameter

[ WITH with_query [, ....] ]

[ ALL | DISTINCT ] select_expr

FROM from_item [, ...]

[ WHERE condition ]

[ GROUP BY [ ALL | DISTINCT ] grouping_expressions [, ...] ]

[HAVING condition ]

[{UNION | INTERSECT | EXCEPT} [ALL | DISTINCT] union_query]

[ORDER BY expression [ ASC | DESC ] [ NULLS FIRST | NULLS LAST] [, ...] ]

Example

WITH Clause

GROUP BY Clause

GROUPING SETS

HAVING Clause

IN

EXISTS

USING

CROSS JOIN

LIMIT Clause

ORDER BY Clause

EXCEPT

INTERSECT

UNION

TABLESAMPLE

PIVOT Clause

Lateral View Clause

Escaping