DLC Source Table Operation Configuration

Overview
When using DLC Native Table (Iceberg), users can follow the process below to create native tables and complete the necessary configurations.
﻿
﻿
﻿
﻿
Step I: Enabling Managed Storage
Note:
Managed storage must be enabled by a DLC administrator.
Enabling managed storage requires operations in the console. For details, see Managed Storage Configuration. If you use a metadata acceleration bucket, pay attention to permission configurations. For details, see Binding of Metadata Acceleration Bucket. Note that shared engines cannot access metadata acceleration buckets.
Step II: Creating the DLC Native Table
There are two ways to create native tables.
1. Create a visual table through the console interface.
2. Create a table using SQL.
Note:
 A database must be created before a DLC Native Table is created.
Creating Tables through the Console Interface
DLC provides a data management module for table creation. For detailed operations, see Data Management.
Creating Tables through SQL
When creating tables through SQL, users write their own CREATE TABLE SQL statements. For DLC Native Table (Iceberg) creation, table descriptions, locations, and table formats do not need to be specified. However, some advanced parameters need to be included depending on the use case, and those parameters are added through TBLPROPERTIES.
If parameters were not included when you created the table or if certain attributes need to be modified, use the alter table set tblproperties command. After the alter table command is executed, restart the upstream import tasks to complete the attribute modification or addition.
Typical table creation statements for Append and Upsert scenarios are shown as follows. Users can adjust these statements based on their actual needs.
Append Scenario Table Creation
 CREATE TABLE IF NOT EXISTS `DataLakeCatalog`.`axitest`.`append_case` (`id` int, `name` string, `pt` string) 
 PARTITIONED BY (`pt`) 
 TBLPROPERTIES (
    'format-version' = '1',
    'write.upsert.enabled' = 'false',
    'write.distribution-mode' = 'hash',
    'write.metadata.delete-after-commit.enabled' = 'true',
    'write.metadata.previous-versions-max' = '100',
    'write.metadata.metrics.default' = 'full',
    'smart-optimizer.inherit' = 'default'
);
Upsert Scenario Table Creation
For Upsert scenario table creation, specify the version as 2, and set the write.upsert.enabled attribute to true, and configure bloom filters according to upsert key-values. If users have multiple primary keys, generally use the first two key-values for bloom filter configuration. If the upsert table is not partitioned and updates frequently with large data volumes, consider doing bucketing by primary key for distribution.
Examples for both partitioned and non-partitioned tables are provided as follows.
// Partitioned table
CREATE TABLE IF NOT EXISTS `DataLakeCatalog`.`axitest`.`upsert_case` (`id` int, `name` string, `pt` string)
PARTITIONED BY (bucket(4, `id`))
TBLPROPERTIES (
    'format-version' = '2',
    'write.upsert.enabled' = 'true',
    'write.update.mode' = 'merge-on-read',
    'write.merge.mode' = 'merge-on-read',
    'write.parquet.bloom-filter-enabled.column.id' = 'true',
    'dlc.ao.data.govern.sorted.keys' = 'id',
    'write.distribution-mode' = 'hash',
    'write.metadata.delete-after-commit.enabled' = 'true',
    'write.metadata.previous-versions-max' = '100',
    'write.metadata.metrics.default' = 'full',
    'smart-optimizer.inherit' = 'default'
);
// Non-partitioned table
CREATE TABLE IF NOT EXISTS `DataLakeCatalog`.`axitest`.`upsert_case` (`id` int, `name` string, `pt` string)
TBLPROPERTIES (
    'format-version' = '2',
    'write.upsert.enabled' = 'true',
    'write.update.mode' = 'merge-on-read',
    'write.merge.mode' = 'merge-on-read',
    'write.parquet.bloom-filter-enabled.column.id' = 'true',
    'dlc.ao.data.govern.sorted.keys' = 'id',
    'write.distribution-mode' = 'hash',
    'write.metadata.delete-after-commit.enabled' = 'true',
    'write.metadata.previous-versions-max' = '100',
    'write.metadata.metrics.default' = 'full',
    'smart-optimizer.inherit' = 'default'
);
Modifying Table Attributes
If related attribute values were not included when the user created the table, use the alter table to modify, add, or remove attribute values, as shown below. Any changes to table attribute values can be made this way. Note that the Iceberg format-version field cannot be modified. Additionally, if the table already has real-time imports from InLong/Oceanus/Flink, you need to restart the upstream import businesses after modifications.
// Modify conflict retry attempts to 10
﻿
ALTER TABLE `DataLakeCatalog`.`axitest`.`upsert_case` SET TBLPROPERTIES('commit.retry.num-retries' = '10');
// Cancel bloom filter setting for the name field
﻿
ALTER TABLE `DataLakeCatalog`.`axitest`.`upsert_case` UNSET TBLPROPERTIES('write.parquet.bloom-filter-enabled.column.id');
Step III: Data Optimization and Lifecycle Configuration
Data optimization and lifecycle configuration can be done in two ways.
1. Through the console interface for visual configuration
2. Through SQL for configuration
Through the Console Interface for Configuration
DLC provides a data management module for configuration. For detailed operations, see Enable data optimization.
Through SQL for Configuration
DLC defines detailed attributes for managing data optimization and lifecycle. You can flexibly configure data management and lifecycle based on business characteristics. For detailed data optimization and lifecycle configuration values, see Enable data optimization.
Configuring the Database
The data optimization and lifecycle of the database can be adjusted through DBPROPERTIES, as shown below.
// Enable write optimization for the my_database table and do not inherit the data management policy.
﻿
ALTER DATABASE DataLakeCatalog.my_database SET  DBPROPERTIES ('smart-optimizer.inherit'='none', 'smart-optimizer.written.enable'='enable');
// Set my_database to inherit the data management policy.
﻿
ALTER DATABASE DataLakeCatalog.my_database SET  DBPROPERTIES ('smart-optimizer.inherit'='default');
// Disable lifecycle for the my_database table and do not inherit the data management policy.
﻿
ALTER DATABASE DataLakeCatalog.my_database SET  DBPROPERTIES ('smart-optimizer.inherit'='none', 'smart-optimizer.lifecycle.enable'='disable');
Configuring the Data Table
Data optimization and lifecycle for data tables are configured through TBLPROPERTIES, as shown below.
// Disable write optimization for the upsert_cast table and do not inherit the database policy.
﻿
ALTER TABLE `DataLakeCatalog`.`axitest`.`upsert_case` SET TBLPROPERTIES('smart-optimizer.inherit'='none', 'smart-optimizer.written.enable'='disable');
// Set the upsert_cast table to inherit the database policy.
﻿
ALTER TABLE `DataLakeCatalog`.`axitest`.`upsert_case` SET TBLPROPERTIES('smart-optimizer.inherit'='default');
// Enable lifecycle for the upsert_cast table, set the lifecycle duration to 7 days, and do not inherit the database policy.
﻿
ALTER TABLE `DataLakeCatalog`.`axitest`.`upsert_case` SET TBLPROPERTIES('smart-optimizer.inherit'='none', 'smart-optimizer.lifecycle.enable'='enable', 'smart-optimizer.lifecycle.expiration'='7');
Step IV: Data Ingestion into Native Table
DLC Native Table supports multiple data ingestion methods. Depending on your data source, see DLC Native Table Lake Ingestion Practice.
Step V: Viewing Data Optimization Tasks
You can view data governance tasks in the DLC console under the Data Operation and Maintenance menu by navigating to the Historical Tasks page. You can query tasks using keywords such as CALL, Auto, database name, and table name.
Note:
To view system data optimization tasks, users need the permissions of DLC administrators.
Tasks with IDs starting with "Auto" are automatically generated data optimization tasks. As shown in the table below.
﻿
﻿
﻿
﻿
You can also click View Details to check the basic information and results of running the tasks.
﻿
﻿
﻿
﻿

Was this page helpful?

You can also Contact Sales or Submit a Ticket for help.

Yes

tencent cloud

Sign Up

Log in

Compute

Microservice

Data Migration

Database SaaS Tool

Data Security

Application Security

Big Data

Voice Technology

Internet of Things

Stream Services

Cloud Real-time Rendering

Management and Audit Tools

Edge Computing

Serverless

Relational Database

Networking

Business Security

Domains & Websites

Face Recognition

AI Platform Service

Middleware

Media On-Demand

Game Services

Developer Tools

Container

Essential Storage Service

Enterprise Distributed DBMS

CDN and Acceleration

Security Services

Enterprise Applications

Image Creation

Natural Language Processing

Communication

Media Process Services

Education Sevices

Monitor and Operation

Distributed cloud

Data Process and Analysis

NoSQL Database

Network Security

Cloud Security

Office Collaboration

Tencent Big Model

Optical Character Recognition

Interactive Video Services

Media SDK

Cloud Resource Management

More

Overview

Step I: Enabling Managed Storage

Step II: Creating the DLC Native Table

Creating Tables through the Console Interface

Creating Tables through SQL

Append Scenario Table Creation

Upsert Scenario Table Creation

Modifying Table Attributes

Step III: Data Optimization and Lifecycle Configuration

Through the Console Interface for Configuration

Through SQL for Configuration

Configuring the Database

Configuring the Data Table

Step IV: Data Ingestion into Native Table

Step V: Viewing Data Optimization Tasks