SQL Editor

Last updated: 2024-07-17 17:36:45
    The SQL editor provided by Data Lake Compute (DLC) supports querying data with unified, SparkSQL-compatible SQL statements, so you can complete data query tasks using standard SQL. For detailed syntax instructions, refer to SQL Syntax. You can access the SQL editor through Data Explore, where you can perform basic data management, multi-session data queries, query record management, and download record management.

    Data Management

    Data management supports adding data sources, managing databases, and managing data tables.

    Creating a data catalog

    Currently, Data Lake Compute supports the management of COS and EMR Hive data catalogs. The directions are as follows:
    1. Log in to the Data Lake Compute console and select the service region. You need the admin permission.
    2. Select Data Explore on the left sidebar, hover over the icon on the Database & table tab, and click Create catalog.
    
    For detailed directions, see Querying Data from Other Sources.

    Managing a database

    You can create, delete, and view the details of a database in the SQL editor.

    Managing a data table

    You can create, query, and view the details of a data table in the SQL editor.

    Changing the default database

    You can use the SQL editor to specify the default database for query tasks. If no database is specified in a query statement, the query will be executed in the default database.
    1. Log in to the Data Lake Compute console and select the service region.
    2. Select Data Explore on the left sidebar, hover over the target database name, click the icon, and click Set as default database.
    
    
    3. You can also change the default database in the Default database selection box.
    
    
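    The default database behavior can be sketched as follows; the database name demo_db and table name orders are hypothetical, used here only for illustration:

```sql
-- With demo_db set as the default database, an unqualified table
-- name resolves to demo_db.orders:
SELECT COUNT(*) FROM orders;

-- A fully qualified name always overrides the default database:
SELECT COUNT(*) FROM other_db.orders;
```

    Explicit qualification is useful when one query page needs to touch tables in several databases at once.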

    Data Query

    Add Query Page

    The SQL editor supports adding multiple pages for data querying, with each query page having independent configurations (default database, computation engine used, query records, etc.). This facilitates users in running and managing multiple tasks.
    You can create a new query page by clicking the icon, and switch between editor pages by clicking the tab bar.
    
    
    
    For your convenience, you can save frequently used query pages by clicking the Save button, and quickly reopen saved pages by clicking the icon.
    
    
    
    For saved query pages, you can click the Refresh button to synchronize the saved information and ensure the query statements are up to date.
    
    
    
    The editor supports running multiple SQL statements at once. Clicking the Run button executes all SQL statements in the editor, splitting them into multiple SQL tasks.
    To run only some of the statements, select them and click Partial run.
    
    
    
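    For example, a query page might contain several statements separated by semicolons. Run submits all of them as separate tasks, while selecting a single statement and clicking Partial run submits only that one. The database and table names below are hypothetical:

```sql
-- Statement 1: inspect the table schema
DESCRIBE demo_db.orders;

-- Statement 2: preview a few rows
SELECT * FROM demo_db.orders LIMIT 10;

-- Statement 3: an aggregate query
SELECT dt, COUNT(*) AS order_cnt
FROM demo_db.orders
GROUP BY dt;
```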

    Engine Parameter Configuration

    After selecting the data engine, you can configure parameters for it by clicking Add in Advanced Settings.
    
    
    The currently supported configuration parameters are as follows:

    SparkSQL engine:
    spark.sql.files.maxRecordsPerFile (default: 0)
    The maximum number of records written to a single file. If this value is zero or negative, there is no limit.
    spark.sql.autoBroadcastJoinThreshold (default: 10MB)
    The maximum size in bytes for a table that is broadcast to all worker nodes when performing a join. Setting this value to -1 disables broadcasting.
    spark.sql.shuffle.partitions (default: 200)
    The default number of partitions used when shuffling data.
    spark.sql.sources.partitionOverwriteMode (default: static)
    When set to static, all matching partitions are deleted before the overwrite operation is executed. For instance, if a partitioned table has a partition "2022-01" and you use an INSERT OVERWRITE statement to write data to the "2022-02" partition, the data in the "2022-01" partition is also overwritten. When set to dynamic, partitions are not deleted in advance; only the partitions that data is written to at runtime are overwritten.
    spark.sql.files.maxPartitionBytes (default: 128MB)
    The maximum number of bytes packed into a single partition when reading files.

    Presto engine:
    use_mark_distinct (default: true)
    Determines whether the engine redistributes data when executing the distinct function. If distinct is called multiple times in a query, it is recommended to set this parameter to false.
    USEHIVEFUNCTION (default: true)
    Determines whether Hive functions are used when executing a query. To use Presto native functions, set this parameter to false.
    query_max_execution_time (default: none)
    Sets a query timeout. If a query's execution time exceeds this value, the query is terminated. Supported units are d (day), h (hour), m (minute), s (second), and ms (millisecond); for example, 1d is 1 day and 3m is 3 minutes.
    dlc.query.execution.mode (default: async)
    The engine's query execution mode. In the default async mode, a task performs the complete query computation, saves the results to COS, and then returns them to the user, so you can download the query results after the query completes. You can also change this value to sync. In sync mode, a query does not necessarily perform the full computation: once partial results are available, the engine returns them directly without saving them to COS, so query latency and duration are lower, but the results are kept in the system for only 30 seconds. sync mode is recommended when you do not need to download complete results from COS but want lower latency, such as during the query exploration phase or for BI result display.
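    The effect of spark.sql.sources.partitionOverwriteMode can be sketched as follows; the table names sales and staging_sales and the partition column dt are hypothetical:

```sql
-- Assume sales is partitioned by dt and already contains
-- partitions dt='2022-01' and dt='2022-02'.

-- With partitionOverwriteMode=static, this statement first drops
-- all existing partitions of sales (the partition value is not
-- pinned), then writes the new data:
INSERT OVERWRITE TABLE sales PARTITION (dt)
SELECT amount, dt FROM staging_sales WHERE dt = '2022-02';

-- With partitionOverwriteMode=dynamic, only the partitions that
-- actually receive data (here dt='2022-02') are overwritten;
-- dt='2022-01' is left untouched.
```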

    Presto Execution Mode

    When the Presto engine is selected, Data Explore lets you run queries in either Fast Mode or Full Mode.
    Fast Mode: offers faster speed, but the query results cannot be persistently saved. It is suitable for the exploration phase.
    Full Mode: executes the full query and saves the results to object storage.
    
    
    

Query Results

    Through the SQL editor, you can directly view the query results. You can expand or collapse the display height of the query results by clicking the icon.
    
    
    
    You can configure the query result storage directory through the configuration button in the upper right corner; both a COS path and built-in storage are supported.
    
    
    
    The console returns at most 1,000 results for a single task. If you need more results, use the API; for API-related operations, refer to the API Documentation.
    Query results can be downloaded locally when no COS storage path is specified. For detailed instructions, refer to Obtaining Task Results.

    Querying statistical data

    The Presto engine and the SparkSQL engine display different statistics and optimization details for query results.
    The SparkSQL engine supports viewing:
    1. Data Scanning Volume
    2. Cache Acceleration
    3. Adaptive Shuffle
    4. Materialized View Acceleration
    The Presto engine supports viewing:
    1. Data Scanning Volume
    2. Cache Acceleration
    3. Materialized View Acceleration
    Click on the Statistics column to review the statistical data and optimization suggestions for the query results.
    
    
    

    Historical Queries

    Each query page can save the running history of the past three months and supports viewing the query results of the past 24 hours. You can quickly find past task information through the running history. For detailed operations, refer to Task History Records.

    Download History Management

    The download task for each query result can be viewed in Download history, where you can check the status of the download task and its related parameters.
    
    
    
    