
Impala Data Source
Last updated: 2024-11-01 17:52:37

Supported Versions

Impala version 4.1.0 is supported.

Impala Offline Single Table Read Node Configuration




| Parameter | Description |
| --- | --- |
| Data Source | Select a configured Impala data source to use as the source. |
| Database | Select the database to read from, or enter its name manually. The database bound to the data source is used by default; other databases must be entered manually. If the data source network is not connected and the database information cannot be fetched directly, you can enter the database name manually. Data synchronization can still run as long as the Data Integration network is connected. |
| Table | Select the table to read from, or enter its name manually. If the data source network is not connected and the table information cannot be fetched directly, you can enter the table name manually. Data synchronization can still run as long as the Data Integration network is connected. |
| Split Key | Specify the field used to shard the data. Once a split key is specified, concurrent tasks are launched to synchronize the data. Any column in the source table can serve as the split key; a primary key or an indexed column is recommended. |
| Filter Conditions (Optional) | Enter a filter statement that matches the data type of the filtered field. The statement is used as the filter condition for the data to be synchronized. |
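To make the Split Key and Filter Conditions parameters more concrete, the following is a minimal sketch of how a range-based split could divide a table among concurrent read tasks. It is an illustration only, not the Data Integration implementation: the function names, the table `sales.orders`, the columns `id` and `dt`, and the shape of the generated SQL are all hypothetical, and the split key is assumed to be an integer column whose minimum and maximum are already known.

```python
from typing import List, Optional, Tuple


def split_ranges(min_key: int, max_key: int, num_splits: int) -> List[Tuple[int, int]]:
    """Divide the closed interval [min_key, max_key] into contiguous closed ranges.

    Hypothetical helper: at most num_splits ranges are returned, and their
    sizes differ by at most one key.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    if min_key > max_key:
        raise ValueError("min_key must not exceed max_key")

    total = max_key - min_key + 1
    num_splits = min(num_splits, total)  # never create empty ranges
    base, extra = divmod(total, num_splits)

    ranges = []
    start = min_key
    for i in range(num_splits):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def build_shard_queries(
    table: str,
    split_key: str,
    ranges: List[Tuple[int, int]],
    filter_condition: Optional[str] = None,
) -> List[str]:
    """Build one SELECT per range, combining the optional filter with the range."""
    queries = []
    for low, high in ranges:
        where = f"{split_key} >= {low} AND {split_key} <= {high}"
        if filter_condition:
            where = f"({filter_condition}) AND {where}"
        queries.append(f"SELECT * FROM {table} WHERE {where}")
    return queries


if __name__ == "__main__":
    # Assumed values: ids 1..10 split across 3 concurrent tasks.
    for query in build_shard_queries(
        "sales.orders", "id", split_ranges(1, 10, 3), "dt = '2024-11-01'"
    ):
        print(query)
```

Each generated query covers a disjoint key range, which is why a primary key or indexed column tends to work well as the split key: the ranges are cheap to evaluate and do not overlap.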

Impala Offline Single Table Write Node Configuration




| Parameter | Description |
| --- | --- |
| Data Destination | Select a configured Impala data source to use as the target. |
| Database | Select the database to write to, or enter its name manually. The database bound to the data source is used by default; other databases must be entered manually. If the data source network is not connected and the database information cannot be fetched directly, you can enter the database name manually. Data synchronization can still run as long as the Data Integration network is connected. |
| Table | Select the table to write to, or enter its name manually. If the data source network is not connected and the table information cannot be fetched directly, you can enter the table name manually. Data synchronization can still run as long as the Data Integration network is connected. |
| Whether to Clear Table | Choose whether to clear the Impala table before writing to it. |
| Batch Submission Size | The number of records submitted in one batch. Submitting records in batches can greatly reduce the number of network interactions between the data synchronization system and Impala and improve overall throughput. Setting this value too high may cause out-of-memory (OOM) exceptions in the data synchronization process. |
| Pre-Executed SQL | SQL statements executed before the synchronization task. Write the SQL using the syntax of the data source type. |
| Post-Executed SQL | SQL statements executed after the synchronization task. Write the SQL using the syntax of the data source type. |
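The trade-off behind Batch Submission Size can be sketched as follows. This is a generic illustration of batching, not the writer's actual code: the function name `batched_records` and the sample sizes are assumptions. It shows that a larger batch means fewer submissions, while each batch has to be held in memory in full before it is sent.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched_records(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group records into lists of at most batch_size items (hypothetical helper).

    Only one batch is buffered at a time, so peak memory grows with batch_size.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


if __name__ == "__main__":
    rows = range(10_000)  # stand-in for the records to be written
    for size in (100, 1_000):
        submissions = sum(1 for _ in batched_records(rows, size))
        print(f"batch size {size}: {submissions} submissions")
```

A reasonable approach is to raise the batch size gradually while watching the memory use of the synchronization task, and to back off if OOM exceptions appear.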

Data Type Conversion Support

Read

| Impala Data Type | Internal Type |
| --- | --- |
| BIGINT, INT, SMALLINT, TINYINT | Long |
| DECIMAL, DOUBLE, FLOAT, REAL | Double |
| CHAR, VARCHAR, ARRAY, STRUCT | String |
| TIMESTAMP | Date |
| BOOLEAN | Boolean |
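As a quick way to check a table schema against the read mapping, the lookup below encodes the listed conversions as a dictionary. It is a convenience sketch rather than part of the product: the names `IMPALA_TO_INTERNAL` and `to_internal_type` are hypothetical, and stripping type parameters such as `(10,2)` or `<INT>` before the lookup is an assumption about how column types might be written.

```python
# Read mapping from Impala data types to internal types, as listed in this section.
IMPALA_TO_INTERNAL = {
    "BIGINT": "Long",
    "INT": "Long",
    "SMALLINT": "Long",
    "TINYINT": "Long",
    "DECIMAL": "Double",
    "DOUBLE": "Double",
    "FLOAT": "Double",
    "REAL": "Double",
    "CHAR": "String",
    "VARCHAR": "String",
    "ARRAY": "String",
    "STRUCT": "String",
    "TIMESTAMP": "Date",
    "BOOLEAN": "Boolean",
}


def to_internal_type(impala_type: str) -> str:
    """Return the internal type for an Impala column type (hypothetical helper).

    Parameters such as DECIMAL(10,2) or ARRAY<INT> are reduced to the base
    type name before the lookup. Types not listed raise ValueError.
    """
    base = impala_type.strip().upper()
    for delimiter in ("(", "<"):
        base = base.split(delimiter, 1)[0]
    base = base.strip()
    try:
        return IMPALA_TO_INTERNAL[base]
    except KeyError:
        raise ValueError(f"Unsupported Impala type: {impala_type!r}") from None


if __name__ == "__main__":
    for column_type in ("decimal(10,2)", "ARRAY<INT>", "timestamp"):
        print(column_type, "->", to_internal_type(column_type))
```

Raising an error for unlisted types, rather than falling back to String, makes unsupported columns visible before a synchronization task is configured.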

Write

| Internal Type | Impala Data Type |
| --- | --- |
| Long | BIGINT, INT, SMALLINT, TINYINT |
| Double | DECIMAL, DOUBLE, FLOAT, REAL |
| String | CHAR, VARCHAR, ARRAY, STRUCT |
| Date | TIMESTAMP |
| Boolean | BOOLEAN |