Users can use the DLC console or API to execute DDL statements to create a database.
Creating Table
Approach one: Create in Data Exploration
1. Log in to the DLC console, select the service region, log in to users need to have the permission to create tables. 2. Enter the Data Exploration module, in the left list, click on an existing database, hover over the table row, then click the icon, click Create Native Table or Create External Table. Note:
A native table refers to a table on the DLC managed storage. With a native table, you don't need to worry about the underlying Iceberg storage format, and it has capabilities like data optimization. To use a native table, you need to enable managed storage first, see Managed Storage Configuration for details. The underlying data of the external table resides on your own COS. Creating an external table requires specifying the data path.
3. After clicking Create Native Table/Create External Table, the system will automatically generate an SQL template for creating a data table. Users can modify the SQL template to create a data table. After clicking Run, the SQL statement to create the data table is executed, completing the creation.
Approach two: Create in Data Management
The Data Management module supports managing native tables and external tables stored in DLC.
1. Log in to the DLC console, select the service region, log in to users need to have the permission to create tables. 2. Through the left menu, enter Data Management, enter Database, click the name of the database where the data table is located, enter the DMC page.
3. Click Create Native Table or Create External Table button to enter the data table configuration page.
Native table data sources support three different types: empty table, local upload, and COS COS. Choosing different data sources corresponds to different creation processes. Native tables support capabilities such as data optimization and can choose to inherit database governance rules or individually turn them on/off. 3.1 Create Empty Table: Create an empty table with no records.
Data Table Name: Cannot start with a number, supports uppercase and lowercase letters, numbers, and underscores—, with a maximum of 128 characters.
Support for entering data table description information.
Manually add and enter column names and field types. Supports the configuration of three complex type fields: array/map/struct.
3.2 Local Upload: Upload local form files to DLC to create data tables, supports files up to 100MB.
CSV: Supports visual configuration of CSV parsing rules, including Compression Format, Column Splitting Symbol, Field Domain Symbol. Supports automatic inference of the data file's Schema and parsing the first row as Column Names.
Json: DLC only recognizes the first level of Json as columns, supports automatic inference of the Json file's Schema. The system will recognize the first level fields of Json as Column Names.
Supports common Big Data Format files like Parquet, ORC, AVRO, etc.
Manually add and enter Column Names and Field Types.
If the Automatic Structure Inference is selected, DLC will automatically fill in the detected columns, Column Names, and Field Types. If incorrect, please manually modify.
3.3 Create a data table through COS COS.
Create a data table by reading the COS data buckets under the current account.
CSV: Supports visual configuration of CSV parsing rules, including Compression Format, Column Splitting Symbol, Field Domain Symbol. Supports automatic inference of the data file's Schema and parsing the first row as Column Names.
Json: DLC only recognizes the first level of Json as columns, supports automatic inference of the Json file's Schema. The system will recognize the first level fields of Json as Column Names.
Supports common Big Data Format files like Parquet, ORC, AVRO, etc.
Manually add and enter Column Names and Field Types.
If the Automatic Structure Inference is selected, DLC will automatically fill in the detected columns, Column Names, and Field Types. If incorrect, please manually modify.
4. Data Partitioning is often used to enhance Query Performance and is applied to large volume tables. DLC supports data querying by Data Partitioning. Users need to add partition information at this step. By partitioning your data, you can limit the amount of data scanned with each query, thereby improving Query Performance and reducing usage costs. DLC adheres to Apache Hive's partitioning rules.
The partition column corresponds to a subdirectory under the COS path of the table, with the directory naming convention being Partition Column Name=Partition Column Value.
Example:
cosn://nanjin-bucket/CSV/year=2021/month=10/day=10/demo1.csv
cosn://nanjin-bucket/CSV/year=2021/month=10/day=11/demo2.csv
If there are multiple partition columns, they need to be nested in the order specified in the create table statement.
CREATE EXTERNAL TABLE IF NOT EXISTS `COSDataCatalog`.`dlc_demo`.`table_demo` (
`_c0` string,
`_c1` string,
`_c2` string,
`_c3` string
) PARTITIONED BY (`year` string, `month` string, `day` string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ('separatorChar' = ',', 'quoteChar' = '"')
STORED AS TEXTFILE
LOCATION 'cosn://bucket_name/folder_name/';
Query basic information of the data table
Approach one: Query in Data Exploration
In the Data Table Item, mouse hover over the Data Table Name row, then click the icon, in the Dropdown Menu click Basic info to view the basic information of the created data table. The basic information of the data table is as follows:
Approach two: View in Data Management
1. Log in to the DLC Console, select the service region. Users need to have the permission to view data tables. 2. Through the left menu, enter the Data Management page, click the name of the database where the data table is located, enter the DMC page. It supports querying information such as the number of rows, storage space, creator, fields, partitions, etc.
Preview Data Table Data
In the Data Table Item, hover the mouse over the Data Table Name row, then click the icon, in the Dropdown Menu click Preview Data. DLC will automatically generate a SQL Statements to preview the first 10 rows of data, executing the SQL Statements to query the top 10 rows of the data table.
Support for previewing data in Data Management > Database > Data Table > Data Table List.
The Data Preview Function by default displays the first 100 rows of data.
Editing Data Table Information
Support editing the Description information of the data table in the Data Management module.
1. Log in to the DLC Console, select the Service Region. Users need to have the permission to edit data tables. 2. Through the left menu, enter the Data Management > Database page, click the name of the database where the data table is located, enter the DMC page.
3. Find the data you need to edit, click the Edit button on the right to edit.
4. After modification, click the Confirm button to complete the editing.
Dropping a Table
Approach one: Delete in Data Exploration
In the Data Table Items, hover the mouse over the Data Table Name row, then click the icon, in the dropdown menu click Delete. DLC will automatically generate the SQL statement to drop the data table, execute the SQL statement to drop the table. Dropping an external table, dropping a data table only removes the metadata stored in DLC, it does not affect the data source file.
Deleting tables under the DataLakerCatalog directory will clear all data of that table, proceed with caution.
Approach two: Delete in Data Management
Currently, Data Management only supports the management of databases and tables hosted in DLC. For external tables, please use approach one for deletion.
1. log in to the DLC Console, select the service region, users need to have the permission to delete data tables. 2. Through the left menu, enter Data Management > Database, click the name of the database where the data table is located, to enter the DMC page.
3. Click the Delete button after the data table you wish to delete, after confirmation, the corresponding data table can be deleted and its data cleared.
Show create table statement
In the Data Table Item, hover the mouse over the Data Table Name row, then click the icon, in the dropdown menu click Show table creation statements. DLC will automatically generate the SQL statement to view the create table statement for that data table, execute the SQL statement to query the create table statement.
System constraints
DLC allows up to 4096 data tables under each database, supports a maximum of 100,000 partitions per data table, and the maximum number of attribute columns per data table is 4096.
DLC will recognize data files under the same COS path as data from the same table, please ensure data for separate tables is kept in separate folder hierarchies.
DLC does not support multi-version data in COS; it can only query the latest version of data in a COS bucket.
All tables created on DLC are external tables, and the SQL statement to create the table must include the EXTERNAL keyword.
Table names must be unique within the same database.
Table names are case-insensitive and only support letters, numbers, and underscores (_), with a maximum length of 128 characters.
If the table is a partitioned table, you must manually execute the ADD PARTITION statement or the MSCK statement to add partition information before you can query the partition data. For more details, see Query partitioned table. When creating a table with CSV, DLC will by default convert all field types to string, but this does not affect the computation and querying of raw data fields.
Was this page helpful?