Tencent Kubernetes Engine
Using GPU Node
Last updated: 2022-06-10 16:48:44
This document is currently invalid. Please refer to the documentation page of the product.

Overview

If your business involves scenarios such as deep learning or high-performance computing, you can use TKE with GPU support to quickly run GPU containers.
There are several ways to create a GPU CVM instance; the directions below describe creating a new instance and adding an existing one.

Usage Limits

  • You need to select the GPU model for the added node. You can have the GPU driver automatically installed as needed. For more information, see GPU Driver.
  • TKE supports GPU scheduling only in clusters running a Kubernetes version later than 1.8.
  • By default, GPUs are not shared among containers. A container can request one or more GPUs, but not part of a GPU.
  • The master node in a self-deployed cluster currently does not support the GPU model setting.
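Because GPUs are not shared among containers, a pod requests whole GPUs through the extended resource `nvidia.com/gpu` (the resource name exposed by the NVIDIA device plugin). A minimal sketch, assuming the device plugin is running on the GPU node:

```yaml
# Minimal sketch of a pod that requests one whole GPU.
# nvidia.com/gpu is the extended resource name exposed by the
# NVIDIA device plugin; fractional values are not allowed.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-test
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:11.4.3-base-ubuntu20.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # whole GPUs only; requesting part of a GPU is rejected
```

Setting only the limit is sufficient for extended resources; the scheduler places the pod on a node with an unallocated GPU.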

Directions

Creating GPU CVM instance

For more information, see Adding a Node. When creating a GPU instance, pay special attention to the following parameters:

Model

On the Select Model page, set Model in Node Model to GPU.

GPU driver, CUDA version, and cuDNN version

After setting the model, you can select the GPU driver version, CUDA version, and cuDNN version as needed.

  • If you select Automatically install GPU driver on the backend, the driver is installed automatically during system startup, which takes 15–25 minutes.
  • The supported driver versions are determined by both the operating system and the GPU model.
  • If you do not select Automatically install GPU driver on the backend, a default GPU driver is still installed for some earlier operating system versions to ensure they work properly. The default driver versions are as follows:

    | Operating System | Default Driver Version Installed |
    | --- | --- |
    | CentOS 7.6, Ubuntu 18, Tencent Linux 2.4 | 450 |
    | CentOS 7.2 (not recommended) | 384.111 |
    | Ubuntu 16 (not recommended) | 410.79 |

MIG

With multi-instance GPU (MIG) enabled, an A100 GPU is divided into up to seven separate GPU instances, which helps improve GPU utilization when multiple jobs are running. For more information, see the NVIDIA Multi-Instance GPU User Guide.

To use the MIG feature, make sure the following conditions are met:

  • The GPU model is GT4.
  • You have selected Automatically install GPU driver on the backend in the console and configured the GPU driver, CUDA, and cuDNN versions.
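Once MIG is enabled, each GPU slice is exposed to Kubernetes as its own extended resource. As a hedged sketch: with the NVIDIA device plugin's MIG "single" strategy, a pod might request a slice as follows, though the exact resource name (here `nvidia.com/mig-1g.5gb`, a hypothetical example) depends on the MIG profile and the device plugin configuration:

```yaml
# Sketch of a pod requesting one MIG slice. The resource name
# nvidia.com/mig-1g.5gb is an assumed example; it varies with the
# MIG profile and the device plugin's MIG strategy.
apiVersion: v1
kind: Pod
metadata:
  name: mig-test
spec:
  containers:
    - name: mig-container
      image: nvidia/cuda:11.4.3-base-ubuntu20.04
      command: ["nvidia-smi", "-L"]
      resources:
        limits:
          nvidia.com/mig-1g.5gb: 1
```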
Adding an existing GPU CVM instance

For detailed directions, see Adding a Node. When adding a node, pay attention to the following:

  • On the Select Node page, select an existing GPU node.
  • Configure the automatic installation of the GPU driver and MIG as needed.
