This document describes how to quickly create an EMR on CVM cluster through the EMR console, submit a job, and view the results.
Preparations
2. Grant the system default role EMR_QCSRole to the service account for EMR. For detailed directions, see Role Authorization.
3. Top up your account online. EMR on CVM offers two billing modes: pay-as-you-go and monthly subscription. Before creating a cluster, make sure your account balance, excluding promo vouchers, is greater than or equal to the configuration fees required for cluster creation. For detailed directions, see Top-up.
Creating Clusters
Log in to the EMR Console, click Create Cluster on the EMR on CVM cluster list page, and complete the relevant configuration on the purchase page. When the cluster status shows Running, the cluster has been successfully created.
| Step | Configuration item | Description | Example |
| --- | --- | --- | --- |
| Software configuration | Region | The physical data center where the cluster is deployed. Note: Once the cluster is created, the region cannot be changed, so choose carefully. | Beijing, Shanghai, Guangzhou, Nanjing, Chengdu, and Silicon Valley |
| | Cluster type | EMR on CVM supports multiple cluster types, with Hadoop being the default cluster type. | Hadoop and StarRocks |
| | Product version | The components and their versions bundled with different product versions vary. | EMR-V2.7.0 includes Hadoop 2.8.5 and Spark 3.2.1. |
| | Deployment components | Optional components that can be customized and combined based on your needs. | Hive-2.3.9 and Impala-3.4.1 |
| Region and hardware configuration | Billing mode | Billing mode for cluster deployment. | Pay-as-you-go |
| | Availability zone (AZ) and network configuration | AZ and cluster network settings. Note: Once the cluster is created, the AZ cannot be directly changed, so choose carefully. | Guangzhou Zone 7 |
| | Secure login | Network access control settings for nodes, with a security group firewall feature. | Create a security group. |
| | Node configuration | Select the appropriate model configuration for different node types based on business requirements. For more details, see Business Evaluation. | Enable high availability for node deployment. |
| Basic configuration | Associated project | Assign the current cluster to different project groups. | The associated project cannot be modified once the cluster is created. |
| | Cluster name | The name of the cluster, which is customizable. | EMR-7sx2aqmu |
| | Login method | Custom password setup and key association. SSH keys are used only for quick access through the EMR-UI. | Password |
| Confirm configuration | Configuration list | Confirm that the deployment information contains no errors. | Select the terms of service and click Buy Now. |
Note
You can view the information of each node in the CVM console. To ensure the normal operation of the EMR cluster, do not change the node configuration in the CVM console.
Submitting Jobs and Viewing Results
After the cluster is successfully created, you can create and submit jobs on it. This document uses a Spark job as an example. The steps are as follows.
Note
When creating an EMR cluster, you need to select the Spark component in the software configuration interface.
1. Use SSH to log in to the cluster (the local system is Linux or macOS). For more details, see Login to Cluster.
2. In the EMR command line, use the following commands to switch to the hadoop user and navigate to the Spark installation directory /usr/local/service/spark:
[root@172 ~]# su hadoop
[hadoop@172 root]$ cd /usr/local/service/spark
3. Submit and run the task using the following command:
/usr/local/service/spark/bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode cluster \
--proxy-user hadoop \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
/usr/local/service/spark/examples/jars/spark-examples*.jar \
10
4. After the job is submitted, on the EMR on CVM page, click Cluster Services in the row of the target cluster, and then click the WebUI link in the row of YARN UI. After logging in, you will enter the YARN UI page. Click the ID of the target job to view the job’s detailed running information.
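As an alternative to the YARN UI, the job result can usually also be checked from the cluster command line. The following is a minimal sketch, not part of the EMR tooling: the helper names extract_app_id and show_job_result are hypothetical, and the sketch assumes that the yarn client is on the PATH of the node, that YARN log aggregation is enabled, and that the spark-submit output contains an ID of the form application_<clusterTimestamp>_<sequence>.

```shell
# Hypothetical helpers for illustration; they are not shipped with EMR.

# Print the first YARN application ID found in the text on stdin.
# YARN application IDs have the form application_<clusterTimestamp>_<sequence>.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Print the application report and the SparkPi result line for an application ID.
# Assumption: the `yarn` client is available and log aggregation is enabled,
# so the driver output can be read back with `yarn logs`.
show_job_result() {
  yarn application -status "$1"
  yarn logs -applicationId "$1" | grep "Pi is roughly"
}
```

For example, if the output of the spark-submit command in step 3 was saved to a file such as submit.log (a hypothetical file name), running `show_job_result "$(extract_app_id < submit.log)"` on the cluster node would print the application report followed by the "Pi is roughly" line that SparkPi writes in the driver log.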
Terminating Clusters
When the created cluster is no longer needed, you can terminate the cluster and return the resources. Terminating the cluster will forcibly stop all services provided by the cluster and release the resources.
On the EMR on CVM page, select Terminate from the More options for the target cluster. In the pop-up dialog box, click Terminate Now.