HBase is an open-source, highly reliable, high-performance, column-oriented, scalable distributed storage system modeled after Google BigTable. It uses the Hadoop Distributed File System (HDFS) for file storage, Hadoop MapReduce to process the massive amounts of data in HBase, and ZooKeeper for coordination.
This development guide describes how to use an EMR cluster for development from a technical perspective. For data security reasons, EMR currently supports access over VPC only.
Before using HBase Shell, log in to a master node of the EMR cluster. For more information on how to log in to EMR, see Logging in to Linux Instance Using Standard Login Method. You can choose to log in with WebShell: click "Log in" on the right of the desired CVM instance to enter the login page. The default username is root, and the password is the one you set when creating the EMR cluster. Once the correct credentials are entered, you will be taken to the EMR command line interface.
Run the following commands on the EMR command line interface to switch to the Hadoop user and go to the /usr/local/service/hbase directory:
[root@172 ~]# su hadoop
[hadoop@10 root]$ cd /usr/local/service/hbase
You can enter HBase Shell by running the following command:
[hadoop@10 hbase]$ bin/hbase shell
Enter help in HBase Shell to see basic usage information and command examples. Next, create a table by running the following command:
hbase(main):001:0> create 'test', 'cf'
Once the table is created, you can run the list command to see whether it exists:
hbase(main):002:0> list 'test'
TABLE
test
1 row(s) in 0.0030 seconds
=> ["test"]
Run the put command to add elements to the table you created:
hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0850 seconds
hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0110 seconds
hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0100 seconds
Three values are added to the table: "value1" is inserted into row "row1", column "cf:a"; "value2" into row "row2", column "cf:b"; and "value3" into row "row3", column "cf:c".
Run the scan command to traverse the entire table:
hbase(main):006:0> scan 'test'
ROW                      COLUMN+CELL
 row1                    column=cf:a, timestamp=1530276759697, value=value1
 row2                    column=cf:b, timestamp=1530276777806, value=value2
 row3                    column=cf:c, timestamp=1530276792839, value=value3
3 row(s) in 0.2110 seconds
Run the get command to get the value of a specified row in the table:
hbase(main):007:0> get 'test', 'row1'
COLUMN                   CELL
 cf:a                    timestamp=1530276759697, value=value1
1 row(s) in 0.0790 seconds
Run the drop command to delete a table. The table must be disabled first by running the disable command:
hbase(main):010:0> disable 'test'
hbase(main):011:0> drop 'test'
Finally, run the quit command to close HBase Shell.
For more HBase Shell commands, please see the official documentation.
Download and install Maven first, and then configure its environment variables. If you are using an IDE, configure the Maven-related settings in your IDE.
Enter the directory of the Maven project, such as D://mavenWorkplace, and create the project by running the following command:
mvn archetype:generate -DgroupId=$yourgroupID -DartifactId=$yourartifactID -DarchetypeArtifactId=maven-archetype-quickstart
Here, $yourgroupID is your package name, $yourartifactID is your project name, and maven-archetype-quickstart indicates that a Maven Java project is to be created. Some files need to be downloaded during project creation, so please keep the network connected.
After successfully creating the project, you will see a folder named $yourartifactID in the D://mavenWorkplace directory. The files in the folder have the following structure:
simple
---pom.xml     Core configuration, under the project root directory
---src
------main
---------java          Java source code directory
---------resources     Java configuration file directory
------test
---------java          Test source code directory
---------resources     Test configuration directory
Among the files above, pay extra attention to the pom.xml file and the java folder under the main directory. The pom.xml file is primarily used to declare dependencies and packaging configuration; the java folder stores your source code.
First, add the Maven dependencies to the pom.xml file:
<dependencies>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>1.2.4</version>
    </dependency>
</dependencies>
Then, add the packaging and compiling plugins to the pom.xml file:
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
                <encoding>utf-8</encoding>
            </configuration>
        </plugin>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
Before adding the sample code, you need to get the ZooKeeper address of the HBase cluster. Log in to any master or core node in EMR, go to the /usr/local/service/hbase/conf directory, and in the hbase-site.xml file look up the hbase.zookeeper.quorum configuration for ZooKeeper's IP address $quorum, the hbase.zookeeper.property.clientPort configuration for the port number $clientPort, and the zookeeper.znode.parent configuration for the znode path $znodePath used by the sample code below.
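If your program runs on a cluster node, you can also read these values programmatically: HBaseConfiguration.create() loads hbase-site.xml automatically when the HBase configuration directory is on the classpath. The following is a minimal sketch under that assumption; the class name PrintZkConfig is an illustrative choice, not part of the sample project:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class PrintZkConfig {
    public static void main(String[] args) {
        // Assumes hbase-site.xml (e.g. under /usr/local/service/hbase/conf) is on the classpath.
        Configuration conf = HBaseConfiguration.create();
        System.out.println("hbase.zookeeper.quorum = " + conf.get("hbase.zookeeper.quorum"));
        System.out.println("hbase.zookeeper.property.clientPort = " + conf.get("hbase.zookeeper.property.clientPort"));
        System.out.println("zookeeper.znode.parent = " + conf.get("zookeeper.znode.parent"));
    }
}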
Then, add the sample code by creating a Java class named PutExample.java in the main > java folder and adding the following code to it:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.io.compress.Compression.Algorithm;
import java.io.IOException;

/**
 * Created by tencent on 2018/6/30.
 */
public class PutExample {
    public static void main(String[] args) throws IOException {
        // Point the client at the cluster's ZooKeeper quorum. Replace the
        // $quorum, $clientPort, and $znodePath placeholders with the values
        // from hbase-site.xml.
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "$quorum");
        conf.set("hbase.zookeeper.property.clientPort", "$clientPort");
        conf.set("zookeeper.znode.parent", "$znodePath");

        Connection connection = ConnectionFactory.createConnection(conf);
        Admin admin = connection.getAdmin();

        // Describe a table named "test1" with a single column family "cf".
        HTableDescriptor table = new HTableDescriptor(TableName.valueOf("test1"));
        table.addFamily(new HColumnDescriptor("cf").setCompressionType(Algorithm.NONE));

        System.out.print("Creating table. ");
        // If the table already exists, disable and delete it before recreating it.
        if (admin.tableExists(table.getTableName())) {
            admin.disableTable(table.getTableName());
            admin.deleteTable(table.getTableName());
        }
        admin.createTable(table);

        // Insert three cells, mirroring the put commands run in HBase Shell above.
        Table table1 = connection.getTable(TableName.valueOf("test1"));
        Put put1 = new Put(Bytes.toBytes("row1"));
        put1.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("a"), Bytes.toBytes("value1"));
        table1.put(put1);

        Put put2 = new Put(Bytes.toBytes("row2"));
        put2.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("b"), Bytes.toBytes("value2"));
        table1.put(put2);

        Put put3 = new Put(Bytes.toBytes("row3"));
        put3.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("c"), Bytes.toBytes("value3"));
        table1.put(put3);

        // Release client resources.
        table1.close();
        admin.close();
        connection.close();

        System.out.println(" Done.");
    }
}
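Once the data has been written, you can read it back through the same API instead of switching to HBase Shell. The following is a minimal sketch under the same assumptions as above ($quorum, $clientPort, and $znodePath are the values from hbase-site.xml; the class name GetExample is an illustrative choice, not part of the sample project):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import java.io.IOException;

public class GetExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "$quorum");
        conf.set("hbase.zookeeper.property.clientPort", "$clientPort");
        conf.set("zookeeper.znode.parent", "$znodePath");
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("test1"))) {
            // Point read of a single row, equivalent to `get 'test1', 'row1'` in HBase Shell.
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            System.out.println("row1 cf:a = "
                    + Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("a"))));
            // Full-table scan, equivalent to `scan 'test1'`.
            try (ResultScanner scanner = table.getScanner(new Scan())) {
                for (Result row : scanner) {
                    System.out.println(row);
                }
            }
        }
    }
}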
Use the local command prompt to enter the project directory and run the following command to compile and package the project:
mvn package
"Build success" indicates that package is successfully created. You can see the generated .jar package in the target folder under the project directory.
Upload the package file to the EMR cluster with the scp or sftp tool. Be sure to upload the .jar package that includes the dependencies. Run the following command in local command line mode:
scp $localfile root@public IP address:$remotefolder
Here, $localfile is the path and name of your local file, root is the CVM instance username, and $remotefolder is the path where you want to store the file in the CVM instance. You can look up the public IP address in the node information in the EMR or CVM console. After the upload is completed, you can check whether the file is in the target folder on the EMR command line.
Log in to a master node of the EMR cluster and switch to the Hadoop user. Run the following command to run the demo:
[hadoop@10 hadoop]$ java -jar $package.jar
If the console outputs "Done", all operations are completed. You can switch to HBase Shell and run the list command to see whether the HBase table was successfully created through the API; if it was, run the scan command to view the table's contents.
[hadoop@10 hbase]$ bin/hbase shell
hbase(main):002:0> list 'test1'
TABLE
test1
1 row(s) in 0.0030 seconds
=> ["test1"]
hbase(main):006:0> scan 'test1'
ROW                      COLUMN+CELL
 row1                    column=cf:a, timestamp=1530276759697, value=value1
 row2                    column=cf:b, timestamp=1530276777806, value=value2
 row3                    column=cf:c, timestamp=1530276792839, value=value3
3 row(s) in 0.2110 seconds
For more information on API usage, please see the official documentation.