Overview
CKafka Connector provides data distribution capabilities. You can distribute CKafka data to the data warehouse ClickHouse for storage, query, and analysis.
Prerequisites
If you are using a Tencent Cloud managed ClickHouse product, you need to enable the relevant product feature first. Data distribution to self-built ClickHouse is also supported.
Create a table in ClickHouse and specify the column names and types during table creation.
A connection to the target ClickHouse for data distribution has been created.
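Because the connector does not create the target table for you, the table's columns must already line up with the keys of the JSON messages in the source topic. The sketch below illustrates that relationship; the table name, column set, and `MergeTree` engine are assumptions for illustration, not values from this document:

```python
import json

# Hypothetical target schema: columns must exist in ClickHouse before the
# task runs, since the connector does not create tables automatically.
COLUMNS = {"id": "UInt64", "name": "String", "ts": "DateTime"}

# DDL you might run in ClickHouse beforehand (engine choice is an assumption;
# pick whatever fits your workload).
ddl = (
    "CREATE TABLE demo_db.demo_table ("
    + ", ".join(f"`{col}` {typ}" for col, typ in COLUMNS.items())
    + ") ENGINE = MergeTree ORDER BY id"
)

def matches_schema(message: str) -> bool:
    """Check that a message is flat JSON whose keys are all table columns."""
    try:
        data = json.loads(message)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and set(data) <= set(COLUMNS)

print(ddl)
print(matches_schema('{"id": 1, "name": "a", "ts": "2024-01-01 00:00:00"}'))
```

Messages with keys that have no matching column would fail to deliver, which is where the failure-handling modes described later come in.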
Directions
Creating a Task
1. Log in to the CKafka console.
2. In the left sidebar, click Connectors > Task List, select the target region, and then click Create Task.
3. Fill in the task name, select Data Distribution as the task type and ClickHouse as the data target type, then click Next.
4. Configure data source information.
Topic Type: Select the type of the data source Topic.
Elastic Topic: Select the pre-created elastic Topic. For details, see Topic Management.
CKafka Instance Topic: Select the instance and Topic created in CKafka. If the instance has ACL policies configured, ensure the selected Topic has read and write permissions. For details, see Creating Topic.
Start Offset: Select how to handle historical messages during the dump by setting the Topic offset.
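The Start Offset setting determines where the task begins reading relative to messages already retained in the topic. The following is a simplified simulation of the usual "earliest" vs. "latest" semantics (the option names in the console may differ); it is not connector code:

```python
def start_position(existing_offsets, policy):
    """Illustrate Start Offset semantics: 'earliest' replays retained
    history, 'latest' reads only messages produced after the task starts."""
    if policy == "earliest":
        return min(existing_offsets) if existing_offsets else 0
    if policy == "latest":
        return (max(existing_offsets) + 1) if existing_offsets else 0
    raise ValueError(f"unknown policy: {policy}")

offsets = [5, 6, 7, 8]  # offsets already retained in a partition
print(start_position(offsets, "earliest"))  # 5: historical messages are dumped too
print(start_position(offsets, "latest"))    # 9: only new messages are dumped
```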
5. After setting the above information, click Next, then click Preview Topic Message; the first message from the source topic will be fetched and parsed.
Currently, message parsing must meet the following requirements:
The message is a JSON string.
Source data is in single-level JSON format. For nested JSON, you can use Data Processing to perform simple message format conversion.
6. (Optional) Enable the Processing Source Data switch. For detailed configuration, see Simple Data Processing.
7. Click Next to configure the data target information.
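The single-level JSON requirement above can be illustrated with a small flattening sketch. The console's Data Processing feature performs the real conversion; this is only a model of what "flattening" means, and the `.`-joined key style is an assumption:

```python
import json

def flatten(obj, parent_key="", sep="."):
    """Flatten nested JSON into a single level, joining keys with '.'.
    Illustrative only; use the console's Data Processing feature in practice."""
    items = {}
    for key, value in obj.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            items.update(flatten(value, new_key, sep))
        else:
            items[new_key] = value
    return items

raw = '{"id": 1, "user": {"name": "a", "city": "sz"}}'
flat = flatten(json.loads(raw))
print(flat)  # {'id': 1, 'user.name': 'a', 'user.city': 'sz'}
```

After flattening, each top-level key can map directly to a ClickHouse column.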
Data Target: Select the created ClickHouse connection.
Cluster: The ClickHouse cluster name (default: default_cluster).
Database: The ClickHouse database name.
Table: The name of the table created in the database. Currently, no table is created automatically during data distribution to ClickHouse; you need to create the ClickHouse target table manually.
Source Data: Click to pull data from the source topic. Source data should be in single-level JSON format. For nested JSON, you can use Data Processing to perform conversion.
Handle Failed Message: Select how to handle messages that fail to be delivered. Three methods are supported: Discard, Retain, and Ship to CLS (requires specifying the target logset and log topic and granting access to CLS).
Retain: Suitable for test environments. If the task fails to run, it terminates without retrying and records the failure reason in the Event Center.
Discard: Suitable for production environments. If the task fails to run, the current failed message is ignored. It is recommended to use Retain mode to test until there are no errors, then edit the task to use Discard mode.
Ship to CLS: Suitable for strict production environments. If the task fails to run, the failed messages, their metadata, and the failure reasons are uploaded to the specified CLS topic.
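The three failure-handling modes above differ in what happens to the rest of the batch after a bad message. This local simulation (not connector code; the delivery function and message values are made up) makes the contrast concrete:

```python
def handle_failures(messages, deliver, mode):
    """Simulate the three modes: Retain stops at the first failure,
    Discard skips failures, Ship to CLS collects failures separately."""
    delivered, failed = [], []
    for msg in messages:
        try:
            deliver(msg)
            delivered.append(msg)
        except Exception as err:
            if mode == "Retain":        # stop; reason goes to the Event Center
                return delivered, [(msg, str(err))]
            if mode == "Discard":       # ignore the failed message, continue
                continue
            if mode == "Ship to CLS":   # forward message + reason to CLS
                failed.append((msg, str(err)))
    return delivered, failed

def deliver(msg):                       # stand-in for delivery to ClickHouse
    if msg == "bad":
        raise ValueError("parse error")

print(handle_failures(["a", "bad", "b"], deliver, "Discard"))  # (['a', 'b'], [])
```

Note how Retain never reaches `"b"`, while Discard and Ship to CLS both deliver it; only Ship to CLS preserves the failed message for later inspection.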
8. Click Submit. You can see the created task in the task list and view the task creation progress in the status bar.