Installing nvidia-fabricmanager Service on GPU Instance
Last updated: 2024-08-20 17:04:57

Overview

The Hyper Computing Cluster PNV4h instance is equipped with A100 GPUs and supports NVLink and NVSwitch. To enable interconnection between the GPUs, you must install the nvidia-fabricmanager service whose version matches the GPU driver version. If you use this instance type, follow this document to install the service. Otherwise, you may not be able to use the GPU instance properly.

Directions

This document uses driver version 470.103.01 as an example. To install a different version, set the version variable in the commands below to your driver version.
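Because the service version has to match the driver version, it can help to read the version from the installed driver instead of typing it by hand. The following is a minimal sketch, assuming the NVIDIA driver and its nvidia-smi tool are already installed on the instance. The helper name major_version is hypothetical and used only for illustration.

```shell
# Hypothetical helper: print the major part of a driver version string,
# for example "470" for "470.103.01".
major_version() {
    printf '%s\n' "${1%%.*}"
}

# Read the driver version from the installed driver when nvidia-smi is present.
# All GPUs in one instance share a driver, so the first line is enough.
if command -v nvidia-smi >/dev/null 2>&1; then
    version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
fi

# Assumption: fall back to this document's example version if nothing was detected.
version=${version:-470.103.01}

echo "driver version: ${version} (major: $(major_version "${version}"))"
```

The resulting version value can then be used in place of the hard-coded example in the installation commands.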

Installing nvidia-fabricmanager Service

1. Log in to the instance. For details, see Logging in to Linux Instance (Standard Method).
2. The installation varies by operating system. Run the corresponding command for installation.
CentOS 7.x Image:
version=470.103.01
yum -y install yum-utils
yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
yum install -y nvidia-fabric-manager-${version}-1

Ubuntu 18.04 Image:
version=470.103.01
main_version=$(echo $version | awk -F '.' '{print $1}')
apt-get update
apt-get -y install nvidia-fabricmanager-${main_version}=${version}-*

TencentOS 2.4 Image:
version=470.103.01
yum -y install yum-utils
yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
yum install -y nvidia-fabric-manager-${version}-1
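The package name differs between the yum-based images and the Ubuntu image, which is easy to get wrong when changing the version. The sketch below only prints the package spec that the installation commands build for a given version; it installs nothing. The function name fm_package_spec is hypothetical, and the naming patterns are taken from the commands in this section rather than from any package index.

```shell
# Hypothetical helper: print the package spec used by the installation
# commands for a given package manager ($1) and driver version ($2).
fm_package_spec() {
    case "$1" in
        # CentOS 7.x and TencentOS 2.4: full version plus release suffix "-1".
        yum)     echo "nvidia-fabric-manager-${2}-1" ;;
        # Ubuntu 18.04: major version in the name, full version pinned after "=".
        apt-get) echo "nvidia-fabricmanager-${2%%.*}=${2}-*" ;;
        *)       echo "unsupported package manager: $1" >&2; return 1 ;;
    esac
}

fm_package_spec yum 470.103.01
fm_package_spec apt-get 470.103.01
```

Printing the spec before running the install command is a cheap way to confirm that the version variable expanded as intended.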

Starting nvidia-fabricmanager Service

Run the following commands in sequence to start the service.
systemctl enable nvidia-fabricmanager
systemctl start nvidia-fabricmanager

Viewing nvidia-fabricmanager Service Status

Run the following command to view the service status.
systemctl status nvidia-fabricmanager
If the output shows the service as active (running), the service is installed and running.
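As a scripted alternative to reading the status output by eye, the service state can be checked with systemctl is-active, which prints a single state word. This is a minimal sketch, assuming a systemd-based image. The helper name fm_verdict and its messages are hypothetical.

```shell
# Hypothetical helper: turn a systemd unit state into a short verdict.
fm_verdict() {
    case "$1" in
        active)   echo "ok: service is running" ;;
        inactive) echo "stopped: run 'systemctl start nvidia-fabricmanager'" ;;
        failed)   echo "failed: check 'journalctl -u nvidia-fabricmanager'" ;;
        *)        echo "unknown state: $1" ;;
    esac
}

# Only query systemd when systemctl exists on this machine.
if command -v systemctl >/dev/null 2>&1; then
    # is-active exits non-zero for any state other than "active",
    # so "|| true" keeps the script going and we act on the printed word.
    state=$(systemctl is-active nvidia-fabricmanager || true)
    fm_verdict "${state}"

    # When the service is up, print the GPU topology matrix for a manual check.
    if [ "${state}" = "active" ] && command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi topo -m
    fi
fi
```

On an instance where the service is running, the topology matrix printed by nvidia-smi topo -m gives a quick view of how the GPUs are connected to each other.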


