HDFS Data Import
Last updated: 2025-03-31 14:55:26
This document describes how to import data from HDFS to Tencent Cloud TCHouse-C.

Prerequisites

1. You need read permission on the HDFS data you want to import. See Access Control Overview for how to set permissions.
2. The HDFS instance and Tencent Cloud TCHouse-C cluster must be in the same VPC.

Directions

1. Log in to Tencent Cloud TCHouse-C and create an HDFS table.
CREATE TABLE hdfs_engine_table
(
`int_id` UInt32
)
ENGINE = HDFS('hdfs://hdfs1:9000/other_storage', 'TSV');
Reference
ENGINE = HDFS(URI, format)
URI is the full URI of the file in HDFS, and format specifies one of the available file formats. For more formats, see Formats for Input and Output Data. The path part of the URI may contain glob wildcards, in which case the table is read-only.
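As an illustration of the wildcard case, the following sketch defines a read-only HDFS table over several files at once. The table name hdfs_glob_table, the directory some_dir, and the file names file_1 to file_3 are hypothetical; only the host and port are taken from the example above.

```sql
-- Hypothetical layout: some_dir contains file_1, file_2 and file_3,
-- each holding a single UInt32 column in TSV format.
CREATE TABLE hdfs_glob_table
(
    `int_id` UInt32
)
-- {1..3} is a numeric-range glob; because the path contains a glob,
-- the table can be read but not written.
ENGINE = HDFS('hdfs://hdfs1:9000/some_dir/file_{1..3}', 'TSV');
```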
2. Create a ClickHouse target table.
If your cluster has one replica:
CREATE TABLE test.test on cluster default_cluster
(
`int_id` UInt32
)
engine = MergeTree()
order by int_id;
If your cluster has two replicas:
create table test.test on cluster default_cluster
(
`int_id` UInt32
)
engine = ReplicatedMergeTree('/clickhouse/tables/test/test/{shard}', '{replica}')
order by int_id;
Create a distributed table:
create table test.test_dis on cluster default_cluster
AS test.test
engine = Distributed('default_cluster', 'test', 'test', rand());
3. Write data to the target table.
INSERT INTO test.test SELECT * FROM hdfs_engine_table;
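If a permanent HDFS table is not needed, ClickHouse also offers the hdfs table function, which reads the file directly inside the query. The following is a minimal sketch, assuming the same URI and single-column structure as in step 1 and that the function is available in your cluster version.

```sql
-- Assumption: same file and schema as hdfs_engine_table in step 1.
-- The third argument declares the column structure of the file.
INSERT INTO test.test
SELECT * FROM hdfs('hdfs://hdfs1:9000/other_storage', 'TSV', 'int_id UInt32');
```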
4. Query the data.
select * from test.test;
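One way to check the import is to compare row counts. This sketch assumes the tables created in the steps above. On a cluster with several shards, test.test returns only the rows stored on the node you are connected to, whereas the distributed table test.test_dis reads from all shards.

```sql
-- Rows readable from the HDFS source file
SELECT count() FROM hdfs_engine_table;

-- Rows stored in the local target table on the current node
SELECT count() FROM test.test;

-- Rows across all shards, read through the distributed table
SELECT count() FROM test.test_dis;
```

If the import ran on a single node, the first two counts should match.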
