This document describes how to import data from HDFS to Cloud Data Warehouse.
Prerequisites
1. HDFS read permissions are required to access HDFS data. For how to set permissions, see Access Control Overview.
2. The HDFS instance and Cloud Data Warehouse cluster must be in the same VPC.
Directions
1. Log in to Cloud Data Warehouse and create an HDFS table.
CREATE TABLE hdfs_engine_table
(
`int_id` UInt32
)
ENGINE = HDFS('hdfs://hdfs1:9000/other_storage', 'TSV')
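Once the table is created, you can check that it reads the HDFS files by querying it directly (a minimal check, assuming the hdfs_engine_table defined above):
-- Quick check that the HDFS engine table can read the files
SELECT * FROM hdfs_engine_table LIMIT 10;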
Reference
ENGINE = HDFS(URI, format)
URI is the URI of the whole file in HDFS, and format specifies one of the available file formats. For more formats, see Formats for Input and Output Data. The path part of the URI may contain glob wildcards; in this case, the table will be read-only.
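For example, the following sketch (with a hypothetical wildcard path) creates a read-only table over all matching TSV files:
-- Hypothetical wildcard path; a table over a glob pattern is read-only
CREATE TABLE hdfs_glob_table
(
`int_id` UInt32
)
ENGINE = HDFS('hdfs://hdfs1:9000/other_storage/file_*.tsv', 'TSV')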
2. Create a ClickHouse target table.
If your cluster has one replica:
CREATE TABLE test.test ON CLUSTER default_cluster
(
`int_id` UInt32
)
ENGINE = MergeTree()
ORDER BY int_id;
If your cluster has two replicas:
CREATE TABLE test.test ON CLUSTER default_cluster
(
`int_id` UInt32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/test/test/{shard}', '{replica}')
ORDER BY int_id;
Create a distributed table:
CREATE TABLE test.test_dis ON CLUSTER default_cluster
AS test.test
ENGINE = Distributed('default_cluster', 'test', 'test', rand());
3. Write data to the target table.
INSERT INTO test.test SELECT * FROM hdfs_engine_table;
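If you created the distributed table in step 2, you can also write through it; ClickHouse then distributes the rows across the shards' local tables (a sketch assuming the test.test_dis table above):
-- Writing through the distributed table; rows are forwarded to the local tables on each shard
INSERT INTO test.test_dis SELECT * FROM hdfs_engine_table;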
4. Query the data.
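For example, a quick check of the imported data (assuming the tables created above):
-- Verify the row count and inspect a few imported rows
SELECT count() FROM test.test;
SELECT * FROM test.test LIMIT 10;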