EMR on CVM Quick Start

Last updated: 2024-10-30 09:59:58
    This document introduces the process of quickly creating an EMR on CVM cluster through the EMR console, submitting a job, and viewing the results.

    Preparations

    1. Before using an EMR cluster, you need to register a Tencent Cloud account and complete identity verification. For detailed directions, see Enterprise Identity Verification Guide.
    2. Grant the system default role EMR_QCSRole to the service account for EMR. For detailed directions, see Role Authorization.
    3. Top up your account. EMR on CVM offers two billing modes: pay-as-you-go and monthly subscription. Before creating a cluster, make sure your account balance, excluding promo vouchers, is greater than or equal to the configuration fees required for cluster creation. For detailed directions, see Top-up.

    Creating Clusters

    Log in to the EMR Console, click Create Cluster on the EMR on CVM cluster list page, and complete the relevant configuration on the purchase page. Once the cluster status shows Running, the cluster has been created successfully.
    | Purchase Step | Configuration Item | Description | Example |
    | --- | --- | --- | --- |
    | Software configuration | Region | The physical data center where the cluster is deployed. Note: Once the cluster is created, the region cannot be changed, so choose carefully. | Beijing, Shanghai, Guangzhou, Nanjing, Chengdu, and Silicon Valley |
    | Software configuration | Cluster type | EMR on CVM supports multiple cluster types, with Hadoop being the default cluster type. | Hadoop and StarRocks |
    | Software configuration | Product version | The bundled components and their versions vary by product version. | EMR-V2.7.0 includes Hadoop 2.8.5 and Spark 3.2.1. |
    | Software configuration | Deployment components | Optional components that can be customized and combined based on your needs. | Hive-2.3.9 and Impala-3.4.1 |
    | Region and hardware configuration | Billing mode | Billing mode for cluster deployment. | Pay-as-you-go |
    | Region and hardware configuration | Availability zone (AZ) and network configuration | AZ and cluster network settings. Note: Once the cluster is created, the AZ cannot be directly changed, so choose carefully. | Guangzhou Zone 7 |
    | Region and hardware configuration | Secure login | Network access control settings for nodes, with a security group firewall feature. | Create a security group. |
    | Region and hardware configuration | Node configuration | Select the appropriate model configuration for different node types based on business requirements. For more details, see Business Evaluation. | Enable high availability for node deployment. |
    | Basic configuration | Associated project | Assign the current cluster to different project groups. | The associated project cannot be modified once the cluster is created. |
    | Basic configuration | Cluster name | The name of the cluster, which is customizable. | EMR-7sx2aqmu |
    | Basic configuration | Login method | Custom password setup and key association. SSH keys are used only for quick access through the EMR-UI. | Password |
    | Confirm configuration | Configuration list | Confirm if there is any error in the deployment information. | Select the terms of service and click Buy Now. |
    Note
    You can view the information of each node in the CVM console. To ensure the normal operation of the EMR cluster, do not change the node configuration in the CVM console.

    Submitting Jobs and Viewing Results

    After the cluster is successfully created, you can create and submit jobs on it. This document uses a Spark job as an example. The steps are as follows.
    Note
    When creating an EMR cluster, you need to select the Spark component in the software configuration interface.
    1. Use SSH to log in to the cluster (this example assumes the local system is Linux or macOS). For more details, see Login to Cluster.
    2. In the EMR command line, use the following commands to switch to the hadoop user and navigate to the Spark installation directory /usr/local/service/spark:
    [root@172 ~]# su hadoop
    [hadoop@172 root]$ cd /usr/local/service/spark
    3. Submit and run the task using the following command:
    /usr/local/service/spark/bin/spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --deploy-mode cluster \
    --proxy-user hadoop \
    --driver-memory 1g \
    --executor-memory 1g \
    --executor-cores 1 \
    /usr/local/service/spark/examples/jars/spark-examples*.jar \
    10
    4. After the job is submitted, on the EMR on CVM page, click Cluster Services in the row of the target cluster, and then click the WebUI link in the row of YARN UI. After logging in, you will enter the YARN UI page. Click the ID of the target job to view the job’s detailed running information.
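Besides the YARN UI, the result can usually be read from the command line on the cluster. The following is a minimal sketch, not an official EMR procedure. It assumes that YARN log aggregation is enabled on the cluster, that the spark-submit client output contains the YARN application ID (in the form application_<timestamp>_<sequence>), and that the job has finished. The helper names extract_app_id and show_pi_result are hypothetical and are not part of EMR or Spark.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not part of EMR or Spark.

# Read spark-submit client output on stdin and print the first YARN
# application ID found (format: application_<timestamp>_<sequence>).
extract_app_id() {
    grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Print the SparkPi result line from the aggregated YARN logs of one application.
# Assumption: log aggregation is enabled and the application has finished.
show_pi_result() {
    yarn logs -applicationId "$1" | grep 'Pi is roughly'
}

# Example usage (run on the cluster as the hadoop user):
#   app_id=$(/usr/local/service/spark/bin/spark-submit ... 2>&1 | extract_app_id)
#   show_pi_result "$app_id"
```

In cluster deploy mode the driver runs inside a YARN container, so the SparkPi output line goes to the container logs rather than to the terminal where spark-submit was run. That is why the sketch reads it back through yarn logs.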

    Terminating Clusters

    When the created cluster is no longer needed, you can terminate the cluster and return the resources. Terminating the cluster will forcibly stop all services provided by the cluster and release the resources.
    On the EMR on CVM page, select Terminate from the More options for the target cluster. In the pop-up dialog box, click Terminate Now.