SQL Editor

Last updated: 2024-07-17 17:36:45
    The SQL editor provided by Data Lake Compute (DLC) supports querying data with unified, SparkSQL-compatible SQL statements, so you can complete data query tasks using standard SQL. For detailed syntax instructions, refer to SQL Syntax. You can access the SQL editor through Data Explore, where you can perform basic data management, multi-session data queries, query record management, and download record management.

    Data Management

    Data management supports adding data sources, managing databases, and managing data tables.

    Creating a data catalog

    Currently, Data Lake Compute supports the management of COS and EMR Hive data catalogs. The directions are as follows:
    1. Log in to the Data Lake Compute console and select the service region. You need the admin permission.
    2. Select Data Explore on the left sidebar, hover over the icon on the Database & table tab, and click Create catalog.
    
    For detailed directions, see Querying Data from Other Sources.

    Managing a database

    You can create, delete, and view the details of a database in the SQL editor.

    Managing a data table

    You can create, query, and view the details of a data table in the SQL editor.

    Changing the default database

    You can use the SQL editor to specify the default database for query tasks. If no database is specified in a query statement, the query will be executed in the default database.
    1. Log in to the Data Lake Compute console and select the service region.
    2. Select Data Explore on the left sidebar, hover over the target database name, click the icon, and click Set as default database.
    
    
    3. You can also change the default database in the Default database selection box.
    
    
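    The default database behavior can be sketched as follows; the database name demo_db and table name orders are hypothetical, used here only for illustration:

```sql
-- With demo_db set as the default database, an unqualified table
-- name resolves to demo_db.orders:
SELECT COUNT(*) FROM orders;

-- A fully qualified name always overrides the default database:
SELECT COUNT(*) FROM other_db.orders;
```

    Explicit qualification is useful when one query page needs to touch tables in several databases at once.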

    Data Query

    Add Query Page

    The SQL editor supports adding multiple pages for data querying, with each query page having independent configurations (default database, computation engine used, query records, etc.). This facilitates users in running and managing multiple tasks.
    You can create a new query page by clicking the icon, and switch between editor pages by clicking the tab bar.
    
    
    
    For your convenience, you can save frequently used query pages by clicking the Save button, and quickly reopen saved pages by clicking the icon.
    
    
    
    For saved query pages, you can click the Refresh button to synchronize the saved information and ensure the query statements are up to date.
    
    
    
    The editor supports running multiple SQL statements at once. Clicking the Run button executes all SQL statements in the editor, splitting them into multiple SQL tasks.
    To run only some of the statements, select them and click Partial run.
    
    
    
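    For example, a query page might contain several statements separated by semicolons. Run submits all of them as separate tasks, while selecting a single statement and clicking Partial run submits only that one. The database and table names below are hypothetical:

```sql
-- Statement 1: inspect the table schema
DESCRIBE demo_db.orders;

-- Statement 2: preview a few rows
SELECT * FROM demo_db.orders LIMIT 10;

-- Statement 3: an aggregate query
SELECT dt, COUNT(*) AS order_cnt
FROM demo_db.orders
GROUP BY dt;
```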

    Engine Parameter Configuration

    After selecting the data engine, you can configure parameters for it by clicking Add in Advanced Settings.
    
    
    The currently supported configuration parameters are as follows:

    SparkSQL engine:
    spark.sql.files.maxRecordsPerFile (default: 0)
    The maximum number of records written to a single file. If this value is zero or negative, there is no limit.
    spark.sql.autoBroadcastJoinThreshold (default: 10MB)
    The maximum size in bytes for a table that is broadcast to all worker nodes when performing a join. Setting this value to -1 disables broadcasting.
    spark.sql.shuffle.partitions (default: 200)
    The default number of partitions used when shuffling data.
    spark.sql.sources.partitionOverwriteMode (default: static)
    When set to static, all matching partitions are deleted before the overwrite operation is executed. For instance, if a partitioned table has a partition "2022-01" and you use an INSERT OVERWRITE statement to write data to the "2022-02" partition, the data in the "2022-01" partition is also overwritten. When set to dynamic, partitions are not deleted in advance; only the partitions that data is written to at runtime are overwritten.
    spark.sql.files.maxPartitionBytes (default: 128MB)
    The maximum number of bytes packed into a single partition when reading files.

    Presto engine:
    use_mark_distinct (default: true)
    Determines whether the engine redistributes data when executing the distinct function. If distinct is called multiple times in a query, it is recommended to set this parameter to false.
    USEHIVEFUNCTION (default: true)
    Determines whether Hive functions are used when executing a query. To use Presto native functions, set this parameter to false.
    query_max_execution_time (default: none)
    Sets a query timeout. If a query's execution time exceeds this value, the query is terminated. Supported units are d (day), h (hour), m (minute), s (second), and ms (millisecond); for example, 1d is 1 day and 3m is 3 minutes.
    dlc.query.execution.mode (default: async)
    The engine's query execution mode. In the default async mode, a task performs the complete query computation, saves the results to COS, and then returns them to the user, so you can download the query results after the query completes. You can also change this value to sync. In sync mode, a query does not necessarily perform the full computation: once partial results are available, the engine returns them directly without saving them to COS, so query latency and duration are lower, but the results are kept in the system for only 30 seconds. sync mode is recommended when you do not need to download complete results from COS but want lower latency, such as during the query exploration phase or for BI result display.
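    The effect of spark.sql.sources.partitionOverwriteMode can be sketched as follows; the table names sales and staging_sales and the partition column dt are hypothetical:

```sql
-- Assume sales is partitioned by dt and already contains
-- partitions dt='2022-01' and dt='2022-02'.

-- With partitionOverwriteMode=static, this statement first drops
-- all existing partitions of sales (the partition value is not
-- pinned), then writes the new data:
INSERT OVERWRITE TABLE sales PARTITION (dt)
SELECT amount, dt FROM staging_sales WHERE dt = '2022-02';

-- With partitionOverwriteMode=dynamic, only the partitions that
-- actually receive data (here dt='2022-02') are overwritten;
-- dt='2022-01' is left untouched.
```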

    Presto Execution Mode

    When the Presto engine is selected, Data Explore lets you run queries in either Fast Mode or Full Mode.
    Fast Mode: offers faster speed, but the query results cannot be persistently saved. It is suitable for the exploration phase.
    Full Mode: executes the full query and saves the results to object storage.
    
    
    

Query Results

    Through the SQL editor, you can directly view the query results. You can expand or collapse the display height of the query results by clicking the icon.
    
    
    
    You can configure the query result storage directory through the configuration button in the upper right corner; both a COS path and built-in storage are supported.
    
    
    
    The console returns at most 1,000 results for a single task. If you need more results, use the API; for API-related operations, refer to the API Documentation.
    Query results can be downloaded locally when no COS storage path is specified. For detailed instructions, refer to Obtaining Task Results.

    Querying statistical data

    The Presto engine and the SparkSQL engine display different statistics and optimization details for query results.
    The SparkSQL engine supports viewing:
    1. Data Scanning Volume
    2. Cache Acceleration
    3. Adaptive Shuffle
    4. Materialized View Acceleration
    The Presto engine supports viewing:
    1. Data Scanning Volume
    2. Cache Acceleration
    3. Materialized View Acceleration
    Click on the Statistics column to review the statistical data and optimization suggestions for the query results.
    
    
    

    Historical Queries

    Each query page can save the running history of the past three months and supports viewing the query results of the past 24 hours. You can quickly find past task information through the running history. For detailed operations, refer to Task History Records.

    Download History Management

    The download task for each query result can be viewed in Download history, where you can check the status of the download task and its related parameters.
    
    
    
    