Mounting CHDFS Instance

Last updated: 2022-03-30 09:30:26
    This document describes how to mount a CHDFS instance. After creating a CHDFS instance and a mount point, you can mount the instance on a server or container through the mount point.

    Prerequisites

    The target server or container has Java 1.8 installed.
    The target server or container and the mount point are in the same VPC.
    The VPC IP of the target server or container matches the address authorized by a permission rule in the specified permission group of the mount point.

    Directions

    1. Download the CHDFS for Hadoop JAR package.
    2. Place the JAR package in the corresponding directory. For an EMR cluster, it can be synced to the `/usr/local/service/hadoop/share/hadoop/common/lib/` directory of all nodes.
    3. Edit the core-site.xml file to add the following basic configuration:
    <!--Implementation classes of CHDFS-->
    <property>
        <name>fs.AbstractFileSystem.ofs.impl</name>
        <value>com.qcloud.chdfs.fs.CHDFSDelegateFSAdapter</value>
    </property>
    <property>
        <name>fs.ofs.impl</name>
        <value>com.qcloud.chdfs.fs.CHDFSHadoopFileSystemAdapter</value>
    </property>
    <!--Temporary directory of the local cache. During reads and writes, data is written to the local disk when the memory cache is insufficient. This path is created automatically if it does not exist-->
    <property>
        <name>fs.ofs.tmp.cache.dir</name>
        <value>/data/chdfs_tmp_cache</value>
    </property>
    <!--appId-->
    <property>
        <name>fs.ofs.user.appid</name>
        <value>1250000000</value>
    </property>
    4. Sync `core-site.xml` to all Hadoop nodes.
    Note:
    For an EMR cluster, steps 3 and 4 above can be completed by modifying the HDFS configuration in component management in the EMR console.
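    The four basic properties above can also be generated programmatically before being merged into `core-site.xml`. Below is a minimal sketch in Python; the `chdfs_properties` helper is hypothetical (not part of CHDFS or Hadoop), and the appid shown is the placeholder value from this document.

```python
# Hypothetical helper: build the four basic CHDFS properties from this
# guide as a <configuration> element using the standard library.
import xml.etree.ElementTree as ET

def chdfs_properties(appid: str, tmp_dir: str = "/data/chdfs_tmp_cache"):
    """Return a <configuration> element holding the basic CHDFS settings."""
    settings = {
        "fs.AbstractFileSystem.ofs.impl": "com.qcloud.chdfs.fs.CHDFSDelegateFSAdapter",
        "fs.ofs.impl": "com.qcloud.chdfs.fs.CHDFSHadoopFileSystemAdapter",
        "fs.ofs.tmp.cache.dir": tmp_dir,
        "fs.ofs.user.appid": appid,
    }
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root

# Placeholder appid, matching the example value used in this document.
snippet = ET.tostring(chdfs_properties("1250000000"), encoding="unicode")
print(snippet)
```

    Merging the printed fragment into an existing `core-site.xml` (rather than overwriting it) is left to your deployment tooling.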
    5. Use the `hadoop fs` command-line tool to run `hadoop fs -ls ofs://${mountpoint}/`, where `mountpoint` is the mount address. If the file list is output properly, the CHDFS instance has been mounted successfully.
    6. You can also use other configuration items of Hadoop or MR tasks to run data tasks on CHDFS. For an MR task, you can change the task's default input and output file systems to CHDFS through `-Dfs.defaultFS=ofs://${mountpoint}/`.
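    The verification and job-submission steps above can be sketched as the following commands. This assumes a configured Hadoop client on the mounted server; `${mountpoint}` is a placeholder for your actual mount address, and the examples JAR name varies by distribution, so treat it as an assumption.

```shell
# Verify the mount: a normal file listing means the instance is mounted.
hadoop fs -ls ofs://${mountpoint}/

# Run an MR task against CHDFS by overriding the default file system.
# The examples JAR path is an assumption; adjust it for your distribution.
hadoop jar hadoop-mapreduce-examples.jar wordcount \
    -Dfs.defaultFS=ofs://${mountpoint}/ \
    /input /output
```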

    Other Configuration Items

    | Configuration Item | Description | Default Value | Required |
    |---|---|---|---|
    | fs.ofs.tmp.cache.dir | Stores temporary data | None | Yes |
    | fs.ofs.map.block.size | Block size of the CHDFS file system in bytes. This item only affects map input splitting and is unrelated to the size of the underlying CHDFS storage block | 134217728 (128 MB) | No |
    | fs.ofs.data.transfer.thread.count | Number of parallel threads used when CHDFS transfers data | 32 | No |
    | fs.ofs.block.max.memory.cache.mb | Size of the memory buffer used by the CHDFS plugin in MB (accelerates both reads and writes) | 16 | No |
    | fs.ofs.block.max.file.cache.mb | Size of the disk buffer used by the CHDFS plugin in MB (accelerates writes) | 256 | No |
    | fs.ofs.prev.read.block.count | Number of CHDFS blocks read ahead during reads (the underlying CHDFS block is generally 4 MB) | 4 | No |
    | fs.ofs.plugin.info.log | Specifies whether to print plugin debugging logs at the info level. Valid values: true, false | false | No |
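    As an illustration, optional items from the table can be added to `core-site.xml` alongside the basic configuration. The fragment below is a sketch using the default values listed above; tune them for your workload.

```xml
<!--Optional tuning items; the values shown are the defaults from the table above-->
<property>
    <name>fs.ofs.data.transfer.thread.count</name>
    <value>32</value>
</property>
<property>
    <name>fs.ofs.block.max.memory.cache.mb</name>
    <value>16</value>
</property>
<property>
    <name>fs.ofs.prev.read.block.count</name>
    <value>4</value>
</property>
```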