Historical Task Instances
Last updated: 2025-03-21 12:32:40
Historical Task Instances records and manages the various types of tasks that users run in DLC so that they can be tracked, reviewed, and optimized later. Through this feature, users can quickly view the execution status of each task, including its start and end times, execution status (such as successful or failed), input and output details, and generated logs or error information. It makes auditing and retrieval convenient and helps users assess task health, identify potential issues, and optimize resource configuration.

Operation Steps

1. Log in to the DLC console.
2. Enter the Historical Task Instances page. Administrators can view all historical tasks from the past 45 days, and general users can query their own tasks from the past 45 days.
3. Tasks can be filtered and viewed by task type, task status, creator, task time range, task name, ID, content, sub-channel, and more.
4. Click a task ID/name to view task details, including modules such as basic information, running result, task insights, and task logs.
5. Users can click to modify the task configuration, which takes them to the job details to adjust the configuration for optimization.

Historical Task Instances List

Note:
Fields marked with * are available only after the insight feature is enabled. For the enablement method, please see How to Enable Insight Feature.
Each field below is listed as its name followed by its description.
Task ID
Unique identifier of the task.
Task name
Format: prefix_yyyymmddhhmmss_eight-character uuid, where yyyymmddhhmmss is the task execution time (see the parsing sketch after this field list).
Prefix rules:
1. Job tasks submitted from the console are prefixed with the job name. For example, if the user-created job is customer_segmentation_job and it is executed at 21:25:10 on November 26, 2024, the task name will be customer_segmentation_job_20241126212510_f2a65wk1. Due to the current data format restriction, the job name must be <= 100 characters.
2. SQL tasks submitted on the data exploration page are prefixed with sql_query. Example: sql_query_20241126212510_f2a65wk1.
3. Data optimization tasks are prefixed according to their sub-type:
3.1 Optimizer tasks: optimizer.
3.2 SQL-type optimization instances: optimizer_sql.
3.3 Batch-type optimization instances: optimizer_batch.
3.4 Configuration tasks created when configuring the data optimization policy: optimizer_config.
4. Data import tasks are prefixed with import, for example: import_20241126212510_f2a65wk1.
5. Data export tasks are prefixed with export, for example: export_20241126212510_f2a65wk1.
6. Wedata submissions are prefixed with wd, for example: wd_20241126212510_f2a65wk1.
7. Other API submissions are prefixed with customized, for example: customized_20241126212510_f2a65wk1.
8. Tasks created for metadata operations on the metadata management page are prefixed with metadata, for example: metadata_20241126212510_f2a65wk1.
Task status
Starting
Executing
Queuing up
Successful
Failed
Canceled
Expired
Task run timeout
Task content
Detailed content of the task. For job-type tasks, it is a hyperlink to the job details; for SQL-type tasks, it is the complete SQL statement.
Task type
Divided into Job type and SQL type.
Task source
The origin of the task. Supported sources include data exploration tasks, data job tasks, data optimization tasks, import tasks, export tasks, metadata management, Wedata tasks, and API submission tasks.
Sub-channel
Users can customize sub-channels when submitting tasks via the API.
Compute resource
The computing engine/resource group used to run the task.
Consumed CU*H
The CU*H consumed during task execution. Note that the final CU consumption is subject to the bill and may differ from this value. In the Spark scenario, it is approximately equal to the sum of Spark task execution durations (in seconds) divided by 3,600 (see the estimation sketch after this field list).
Compute time
1. If the task supports the insight feature, it is the execution time within the engine.
2. If the task does not support the insight feature:
2.1 For a Spark SQL task, it is the platform scheduling time + queuing time within the engine + execution time within the engine.
2.2 For a Spark job task, it is the platform scheduling time + engine startup duration + queuing time within the engine + execution time within the engine.
The execution time within the engine is the duration from when the first task of the Spark job starts executing to task completion.
Scanned data volume
The physical data volume read from storage by this task. In the Spark scenario, it is approximately equal to the sum of Stage Input Size in the Spark UI.
*Scanned data records
The number of physical data records read from storage by this task. In the Spark scenario, it is approximately equal to the sum of Stage Input Records in the Spark UI.
Creator
The creator of the task; for job-type tasks, this is the creator of the job.
Executor
The user running the task.
Submitted at
The time when the user submitted the task.
*Engine execution time
The time when the task first acquires CPU and starts executing, i.e., the start time of the first task within the Spark engine.
*Number of output files
Collecting this metric requires upgrading the Spark engine kernel to a version later than 2024.11.16.
The total number of files written by the task through statements such as INSERT, regardless of task type.
*Output small-sized files
Collecting this metric requires upgrading the Spark engine kernel to a version later than 2024.11.16.
Small file definition: an output file smaller than 4 MB is defined as a small file (controlled by the parameter spark.dlc.monitorFileSizeThreshold, default 4 MB, which can be configured globally or at the task level for the engine).
Metric definition: the total number of small files written by the task through statements such as INSERT, regardless of task type.
*Total output lines
The number of records output after this task processes data. In the Spark scenario, it is approximately equal to the sum of Stage Output Records in the Spark UI.
*Total output size
The size of the records output after this task processes data. In the Spark scenario, it is approximately equal to the sum of Stage Output Size in the Spark UI.
*Data shuffle lines
In the Spark scenario, approximately equal to the sum of Stage Shuffle Read Records in the Spark UI.
*Data shuffle size
In the Spark scenario, approximately equal to the sum of Stage Shuffle Read Size in the Spark UI.
*Health status
The task is analyzed to determine its health status and whether optimization is required. For details, please see Task Insight.
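
As a reference for the task name format described above, the following is a minimal Python sketch that splits a task name into its prefix, execution time, and eight-character suffix. The helper name parse_task_name and the regular expression are illustrative assumptions, not part of the DLC product or API.

import re
from datetime import datetime

# Illustrative pattern: <prefix>_<yyyymmddhhmmss>_<8-character suffix>.
# The prefix itself may contain underscores, so the timestamp and suffix
# are matched from the right-hand side of the name.
TASK_NAME_PATTERN = re.compile(r"^(?P<prefix>.+)_(?P<ts>\d{14})_(?P<uuid>\w{8})$")

def parse_task_name(task_name: str) -> dict:
    """Split a DLC task name into prefix, execution time, and uuid (hypothetical helper)."""
    match = TASK_NAME_PATTERN.match(task_name)
    if not match:
        raise ValueError(f"Unexpected task name format: {task_name}")
    return {
        "prefix": match.group("prefix"),
        "executed_at": datetime.strptime(match.group("ts"), "%Y%m%d%H%M%S"),
        "uuid": match.group("uuid"),
    }

# Example from the table above:
# parse_task_name("customer_segmentation_job_20241126212510_f2a65wk1")
# -> prefix "customer_segmentation_job", executed_at 2024-11-26 21:25:10, uuid "f2a65wk1"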
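The Consumed CU*H approximation described in the field list can be reproduced from Spark task durations. The sketch below is a rough illustration under the assumption that durations are given in seconds and that each Spark task occupies one CU core; the billed value is authoritative and may differ.

def estimate_cu_hours(task_durations_seconds):
    """Rough CU*H estimate: sum of Spark task execution durations divided by 3,600.

    Assumes durations in seconds and one CU core per Spark task; the billed
    value may differ, as stated in the field description above.
    """
    return sum(task_durations_seconds) / 3600.0

# Example: three Spark tasks running 1,200 s, 1,800 s, and 600 s
# consume roughly (1200 + 1800 + 600) / 3600 = 1.0 CU*H.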

Historical Task Instances Details

Basic Info

1. Users can view specific task content in execution content. For SQL tasks, view the complete SQL statement; for job tasks, view job details and job parameters.
2. Users can view information about task resources in resource consumption, including consumed CU*H, computational overhead, scanned data volume, compute resource, kernel version, Driver resource, Executor resource, and the number of Executors.
3. Users can view basic information of tasks in basic info, including task name, task ID, task type, task source, creator, executor, submission time, and engine execution time.
4. For tasks running on the SuperSQL SparkSQL or SuperSQL Presto engine, users can view the task running progress bar in query statistics, which includes the time taken for stages such as creating tasks, scheduling tasks, executing tasks, and obtaining results.

Running Result

After task completion, users can query the task result on the execution result page. There are two types of task results:
1. Write file information: for file-writing tasks running on the SuperSQL engine, standard engine, or Spark kernel engine, users can view the following write file information (see the sketch after this list):
Average file size
Minimum file size
Maximum file size
Total file size
2. Execution result: for SQL query tasks, the query result of the current task is displayed and can be downloaded.
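
To make the write file statistics and the small-file definition above concrete, here is a minimal sketch that computes the four statistics from a list of output file sizes and counts files below the 4 MB threshold. The function name and the byte-based threshold constant are illustrative assumptions; in DLC the threshold is controlled by the spark.dlc.monitorFileSizeThreshold parameter.

SMALL_FILE_THRESHOLD_BYTES = 4 * 1024 * 1024  # default 4 MB (spark.dlc.monitorFileSizeThreshold)

def summarize_output_files(file_sizes_bytes):
    """Compute the write file statistics shown on the running result page (hypothetical helper)."""
    total = sum(file_sizes_bytes)
    return {
        "average_file_size": total / len(file_sizes_bytes),
        "minimum_file_size": min(file_sizes_bytes),
        "maximum_file_size": max(file_sizes_bytes),
        "total_file_size": total,
        # Files smaller than the threshold count toward "Output small-sized files".
        "small_files": sum(1 for size in file_sizes_bytes if size < SMALL_FILE_THRESHOLD_BYTES),
    }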

Task Insight

After task completion, users can view task insight results on the task insight page. It analyzes the aggregate metrics of each executed task and surfaces issues that can be optimized. Based on the actual execution of the current task, DLC task insight combines data analysis and algorithm rules to provide corresponding optimization suggestions. For details, please see Task Insight.

Task Log

Users can view the logs of the current task on the task log page.
1. Logs of different cluster nodes, including Driver and Executor, can be switched via Pod Name.
2. Three log-level filters are supported: All, Error, and Warn.
3. This page displays only the last 1,000 log entries. To view all log entries, export the logs.
4. Log export records and the status of export tasks can be viewed; from the log export records, users can save log files locally.