Using GPU Node

Last updated: 2022-06-10 16:48:44
This document is no longer valid. Please refer to the current documentation page of the product.

    Overview

    If your business involves scenarios such as deep learning or high-performance computing, you can use the GPU feature of TKE to quickly run GPU containers.
    There are multiple ways to create a GPU CVM instance; the directions below describe them.

    Usage Limits

    • You need to select the GPU model for the added node. You can have the GPU driver automatically installed as needed. For more information, see GPU Driver.
    • TKE supports GPU scheduling only if the Kubernetes version of the cluster is later than 1.8.*.
    • By default, GPUs are not shared among containers. A container can request one or more whole GPUs, but not part of a GPU (a request sketch follows this list).
    • The master node in a self-deployed cluster currently does not support the GPU model setting.
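
    GPU resources are requested through a container's resource limits. The following is a minimal sketch using the official Kubernetes Python client; it assumes the cluster exposes GPUs through the standard NVIDIA device plugin resource name nvidia.com/gpu, and the pod name and container image are placeholders rather than values from this document.

```python
# Minimal sketch: request one whole GPU for a container.
# Assumptions: GPUs are exposed as the "nvidia.com/gpu" resource; the pod name
# and container image below are placeholders.
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() when running inside the cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-test"),
    spec=client.V1PodSpec(
        restart_policy="OnFailure",
        containers=[
            client.V1Container(
                name="cuda-test",
                image="nvidia/cuda:11.4.2-base-ubuntu20.04",  # placeholder image
                command=["nvidia-smi"],
                resources=client.V1ResourceRequirements(
                    # GPUs are requested as whole units; fractional values are rejected.
                    limits={"nvidia.com/gpu": "1"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```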

    Directions

    Creating GPU CVM instance

    For more information, see Adding a Node. When creating a GPU instance, pay special attention to the following parameters:

    Model

    On the Select Model page, set Model in Node Model to GPU.

    GPU driver, CUDA version, and cuDNN version

    After setting the model, you can select the GPU driver version, CUDA version, and cuDNN version as needed.

    • If you select Automatically install GPU driver on the backend, the driver is installed automatically during system startup, which takes 15–25 minutes.
    • The supported driver versions are determined by both the operating system and the GPU model.
    • If you do not select Automatically install GPU driver on the backend, the GPU driver is still installed by default for some earlier operating system versions to ensure normal use. The default driver versions are listed in the table below; a verification sketch follows.
      Operating System                            Default Driver Version Installed
      CentOS 7.6, Ubuntu 18, Tencent Linux 2.4    450
      CentOS 7.2 (not recommended)                384.111
      Ubuntu 16 (not recommended)                 410.79
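
    After the node is initialized, whichever installation path you chose, you can confirm that the driver and the NVIDIA device plugin have registered the GPUs with the scheduler by checking each node's allocatable resources. This is a minimal sketch that again assumes the standard nvidia.com/gpu resource name.

```python
# Minimal sketch: list nodes that expose allocatable GPUs, assuming the
# standard "nvidia.com/gpu" resource name.
from kubernetes import client, config

config.load_kube_config()

for node in client.CoreV1Api().list_node().items:
    gpus = (node.status.allocatable or {}).get("nvidia.com/gpu", "0")
    if gpus != "0":
        print(f"{node.metadata.name}: {gpus} allocatable GPU(s)")
```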

    MIG

    With multi-instance GPU (MIG) enabled, an A100 GPU is divided into seven separate GPU instances, which helps improve GPU utilization when multiple jobs are running. For more information, see NVIDIA Multi-Instance GPU User Guide.

    To use the MIG feature, make sure the following conditions are met:

    • The GPU model is GT4.
    • You have selected Automatically install GPU driver on the backend in the console and configured the GPU driver, CUDA, and cuDNN versions (a request sketch follows this list).
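
    How a MIG instance is requested depends on how the device plugin is configured: with a "single" MIG strategy, MIG instances are still requested as nvidia.com/gpu, while a "mixed" strategy exposes each MIG profile as its own resource. The sketch below assumes the mixed strategy and the nvidia.com/mig-1g.5gb profile name, which may differ in your cluster; the pod name and image are placeholders.

```python
# Minimal sketch: request one MIG instance under an assumed "mixed" MIG
# strategy, where each profile (e.g. "nvidia.com/mig-1g.5gb") is exposed as a
# separate resource. Pod name, image, and profile name are placeholders.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="mig-test"),
    spec=client.V1PodSpec(
        restart_policy="OnFailure",
        containers=[
            client.V1Container(
                name="mig-test",
                image="nvidia/cuda:11.4.2-base-ubuntu20.04",  # placeholder image
                command=["nvidia-smi", "-L"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/mig-1g.5gb": "1"},  # assumed MIG profile resource
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```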

    Adding an existing GPU CVM instance

    For detailed directions, see Adding a Node. When adding a node, you should pay attention to the following:

    • On the Select Node page, select an existing GPU node.
    • Configure the automatic installation of the GPU driver and MIG as needed.