Practices on Loading JSON Data to Hive

Last updated: 2025-02-12 16:16:58

1. Connect to Hive
Log in to a master node of the EMR cluster, switch to the "hadoop" user, and go to the Hive directory by running the following commands. The Hive client itself is started in step 3:
[root@10 ~]# su hadoop
[hadoop@10 root]$ cd /usr/local/service/hive
2. Prepare data
Create a data file in JSON format. Open the file in an editor, enter the following content, and save it. Each record must be a complete JSON object on a single line:
vim test.data
{"name":"Mary","age":12,"course":[{"name":"math","location":"b208"},{"name":"english","location":"b702"}],"grade":[99,98,95]}
{"name":"Bob","age":20,"course":[{"name":"music","location":"b108"},{"name":"history","location":"b711"}],"grade":[91,92,93]}
Store the data file in HDFS:
hadoop fs -put ./test.data /
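The data file can also be generated by a script instead of being typed by hand. The following is a minimal sketch, assuming Python 3 is available on the node; the helper names `to_json_lines` and `write_data_file` are hypothetical and not part of EMR or Hive. It writes the same two records, one compact JSON object per line.

```python
import json

# The two sample records from this step, written as Python dictionaries.
records = [
    {"name": "Mary", "age": 12,
     "course": [{"name": "math", "location": "b208"},
                {"name": "english", "location": "b702"}],
     "grade": [99, 98, 95]},
    {"name": "Bob", "age": 20,
     "course": [{"name": "music", "location": "b108"},
                {"name": "history", "location": "b711"}],
     "grade": [91, 92, 93]},
]


def to_json_lines(rows):
    """Serialize each record as one compact JSON object per line."""
    return "".join(json.dumps(row, separators=(",", ":")) + "\n" for row in rows)


def write_data_file(path, rows):
    """Write the records to a local file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_json_lines(rows))


if __name__ == "__main__":
    write_data_file("test.data", records)
```

The resulting test.data can then be uploaded with the same hadoop fs -put command.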
3. Create a table
Connect to Hive:
[hadoop@10 hive]$ hive
Create a table based on the mapping:
hive> CREATE TABLE test (name string, age int, course array<map<string,string>>, grade array<int>) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' STORED AS TEXTFILE;
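Each column in the table corresponds to a top-level key in the JSON records: name and age are scalars, course is an array of string-to-string maps, and grade is an array of integers. The sketch below checks a parsed record against that shape before the file is loaded. It is an illustration only: `matches_test_schema` is a hypothetical helper, and it is deliberately stricter than the SerDe, which does its own type handling.

```python
import json


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def matches_test_schema(record):
    """Return True if a parsed JSON record fits the columns of table `test`."""
    # name string
    if not isinstance(record.get("name"), str):
        return False
    # age int
    if not _is_int(record.get("age")):
        return False
    # course array<map<string,string>>
    course = record.get("course")
    if not isinstance(course, list):
        return False
    for item in course:
        if not isinstance(item, dict):
            return False
        if not all(isinstance(k, str) and isinstance(v, str) for k, v in item.items()):
            return False
    # grade array<int>
    grade = record.get("grade")
    if not isinstance(grade, list):
        return False
    return all(_is_int(g) for g in grade)


if __name__ == "__main__":
    line = ('{"name":"Mary","age":12,"course":[{"name":"math","location":"b208"},'
            '{"name":"english","location":"b702"}],"grade":[99,98,95]}')
    print(matches_test_schema(json.loads(line)))
```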
4. Import data
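Load the file from HDFS into the table. LOAD DATA INPATH moves the file from its HDFS source path into the table's storage directory, so /test.data no longer exists at the original path afterward:
hive> LOAD DATA INPATH '/test.data' INTO TABLE test;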
5. Check whether data import is successful
Query all data:
hive> select * from test;
OK
Mary	12	[{"name":"math","location":"b208"},{"name":"english","location":"b702"}]	[99,98,95]
Bob	20	[{"name":"music","location":"b108"},{"name":"history","location":"b711"}]	[91,92,93]
Time taken: 0.153 seconds, Fetched: 2 row(s)
Query the first score of each record:
hive> select grade[0] from test;
OK
99
91
Time taken: 0.374 seconds, Fetched: 2 row(s)
Query the name and location of the first course of each record:
hive> select course[0]['name'], course[0]['location'] from test;
OK
math	b208
music	b108
Time taken: 0.162 seconds, Fetched: 2 row(s)
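The two indexed queries above use zero-based array indexes and string map keys. As a rough illustration of what those expressions select, the following sketch applies the same indexing to the sample records in plain Python. It does not connect to Hive, and it covers only the case where every record has the indexed elements.

```python
import json

# The sample records from step 2, one JSON object per line.
DATA = """{"name":"Mary","age":12,"course":[{"name":"math","location":"b208"},{"name":"english","location":"b702"}],"grade":[99,98,95]}
{"name":"Bob","age":20,"course":[{"name":"music","location":"b108"},{"name":"history","location":"b711"}],"grade":[91,92,93]}"""

rows = [json.loads(line) for line in DATA.splitlines()]

# Equivalent of: select grade[0] from test;
first_grades = [row["grade"][0] for row in rows]

# Equivalent of: select course[0]['name'], course[0]['location'] from test;
first_courses = [
    (row["course"][0]["name"], row["course"][0]["location"]) for row in rows
]

print(first_grades)   # [99, 91]
print(first_courses)  # [('math', 'b208'), ('music', 'b108')]
```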