ZStack Cloud Platform
Before using the GPU virtualization feature, ensure that all requirements in Preparations have been met completely and correctly. The detailed procedures for performing the virtualization are as follows:
Run the rpm -i ${GPUDRIVERPACKETNAME} command to install the driver on the host.

Run the nvidia-smi command to check the GPU information. If the host successfully detects the GPU device, the GPU driver has been successfully installed on the host.
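The driver check above can be wrapped in a small helper so that it reports a clear result in scripts. This is only a sketch: the function name `check_gpu_driver` and its messages are illustrative and are not part of ZStack Cloud or the NVIDIA tooling. It assumes that a zero exit status from nvidia-smi means the tool could talk to the driver.

```shell
#!/bin/sh
# Hypothetical helper: report whether a driver tool exists and runs cleanly.
# The tool name defaults to nvidia-smi; passing another name is for testing.
check_gpu_driver() (
    tool="${1:-nvidia-smi}"
    # command -v succeeds only if the tool can be resolved by the shell
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "missing: $tool not found in PATH"
        exit 1
    fi
    # Run the tool and discard its output; only the exit status matters here
    if "$tool" >/dev/null 2>&1; then
        echo "ok: $tool ran successfully"
        exit 0
    fi
    echo "failed: $tool exited with an error"
    exit 2
)
```

On the host, calling `check_gpu_driver` with no argument runs the check against nvidia-smi. A return code of 1 suggests the driver package is not installed, and 2 suggests it is installed but cannot reach the GPU.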
Note: After enabling the IOMMU setting on the host, also ensure that IOMMU Status on the same page is available. Otherwise, the GPU virtualization feature cannot work as expected. If IOMMU State is enabled yet IOMMU Status is unavailable, the reason could be as follows:

On the main menu of ZStack Cloud, go to the GPU Specification page and check the scanned pGPU specifications.
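For the IOMMU note above, two generic Linux-level checks can help narrow down why the status is unavailable. This is a sketch and not ZStack Cloud's own detection logic. It assumes an Intel host where IOMMU is turned on with the intel_iommu=on kernel parameter, and it assumes that a populated /sys/kernel/iommu_groups directory indicates an active IOMMU. Both function names are hypothetical.

```shell
#!/bin/sh
# Returns 0 if the given kernel command line contains intel_iommu=on.
cmdline_has_intel_iommu() (
    set -f                      # disable globbing while splitting words
    for word in $1; do
        [ "$word" = "intel_iommu=on" ] && exit 0
    done
    exit 1
)

# Returns 0 if the directory exists and has at least one entry.
# Defaults to the sysfs location where the kernel exposes IOMMU groups.
iommu_groups_present() (
    dir="${1:-/sys/kernel/iommu_groups}"
    [ -d "$dir" ] || exit 1
    [ -n "$(ls -A "$dir" 2>/dev/null)" ]
)
```

On the host, these could be called as `cmdline_has_intel_iommu "$(cat /proc/cmdline)"` and `iommu_groups_present`. If the parameter is present but no groups exist, virtualization support in the firmware settings is a likely place to look.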
On the host details page of an added host, select a virtualizable pGPU and virtualize it. The same procedure applies to virtualizing AMD pGPUs.
Note: Before restoring an NVIDIA vGPU to a pGPU, ensure that all vGPUs related to this pGPU have been detached from the VM instance.
Note: Before restoring an AMD vGPU to a pGPU, ensure that all AMD vGPUs related to the current AMD graphics card have been detached from the VM instance.

On the host details page of an added host, check the vGPU devices on the host.
From the main menu of ZStack Cloud, check the vGPU specifications.
Note: If the VM instance is stopped unexpectedly when Auto Detach is set to false, it cannot start automatically even though its HA mode is NeverStop.

On the main menu of ZStack Cloud, go to the VM Instance page. Click the name of an existing VM instance to enter its details page. Choose Configuration info on the top row. Find vGPU Device on this page and click Attach.
Select one or more stopped VM instances on the VM Instance management page and choose to attach a GPU. You then have two options: attach a GPU specification or attach a GPU device.
Obtain the driver and CUDA toolkit compatible with the GPU device.
Run the lsmod | grep nouveau command to check whether the Nouveau driver is loaded. If the output shows that the Nouveau driver is loaded, perform the following operations to disable it. If no output is displayed, skip this procedure.

Create a file and save the two lines below into it:

    # touch /etc/modprobe.d/nvidia-installer-disable-nouveau.conf

    blacklist nouveau
    options nouveau modeset=0

Install gcc and the kernel development packages that match the running kernel:

    # yum install gcc kernel-devel-$(uname -r) kernel-headers-$(uname -r)

Rebuild the initramfs image:

    # cp /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
    # dracut /boot/initramfs-$(uname -r).img $(uname -r) --force

Reboot the VM instance in text mode only:

    # systemctl set-default multi-user.target
    # init 3
    # reboot

After the VM instance is rebooted, check whether the Nouveau driver is still in use:

    # lsmod | grep nouveau

Make the driver script executable and run it:

    # chmod +x NVIDIA-Linux-x86_64-346.47.run
    # ./NVIDIA-Linux-x86_64-346.47.run

After you run the commands, the driver package begins to unpack, and you can follow the installation instructions. During the installation, some warnings may appear. Confirm these warnings in sequence, as they do not have any real impact. If errors occur, refer to the table below to check the environment.

| Error Message | Solution |
|---|---|
| ERROR: Unable to find the kernel source tree for the currently running kernel. Please make sure you have installed the kernel source files for your kernel and that they are properly configured; on Red Hat Linux systems, for example, be sure you have the 'kernel-source' or 'kernel-devel' RPM installed. If you know the correct kernel source files are installed, you may specify the kernel source path with the '--kernel-source-path' command line option. | You need to have all of the kernel source files (including kernel, kernel-headers, and kernel-devel) installed and ensure that they are of the same version |
| ERROR: The Nouveau kernel driver is currently in use by your system. This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver. | You have to disable the Nouveau kernel driver |
| ERROR: Failed to find dkms on the system! ERROR: Failed to install the kernel module through DKMS. No kernel module was installed; please try installing again without DKMS, or check the DKMS logs for more information. | You need to install DKMS, which helps maintain out-of-tree drivers by automatically regenerating new modules when the kernel version changes |
| ERROR: Unable to load the kernel module 'nvidia.ko'. This happens most frequently when this kernel module was built against the wrong or improperly configured kernel sources, with a version of gcc that differs from the one used to build the target kernel, or if a driver such as rivafb, nvidiafb, or nouveau is present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA graphics device(s), or no NVIDIA GPU installed in this system is supported by this NVIDIA Linux graphics driver release. | Run the command ./NVIDIA-Linux-x86_64-384.98.run --kernel-source-path=/usr/src/kernels/3.10.0-XXX.x86_64/ -k $(uname -r) |
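
The Nouveau check and the blacklist file from the procedure before the table can be scripted as sketched below. The function names are hypothetical. One design choice differs from the plain lsmod | grep nouveau command: the check matches only the module name in the first column, so a module that merely lists nouveau in its "Used by" column does not count as a match.

```shell
#!/bin/sh
# Reads lsmod-style output on stdin.
# Returns 0 if a module named exactly "nouveau" is listed in the first column.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Writes the two blacklist lines from the procedure to the file given as $1.
write_nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$1"
}
```

Inside the VM instance, a typical use would be `lsmod | nouveau_loaded && write_nouveau_blacklist /etc/modprobe.d/nvidia-installer-disable-nouveau.conf`, followed by the initramfs rebuild and reboot described in the procedure.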

Check that the GPU device is detected and that the driver responds:

    # lspci | grep NVIDIA
    # nvidia-smi

Make the CUDA installer executable and run it:

    # chmod +x cuda_8.0.61_375.26_linux.run
    # ./cuda_8.0.61_375.26_linux.run

During the installation, set the following parameters:

Run the vim /root/.bashrc command and save the content below to the file:

    #gpu driver
    export CUDA_HOME=/usr/local/cuda-8.0
    export PATH=/usr/local/cuda-8.0/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH="/usr/local/cuda-8.0/lib:${LD_LIBRARY_PATH}"

Apply the settings, then build and run the deviceQuery sample:

    # source ~/.bashrc
    # cd /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery
    # make
    # ./deviceQuery
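
The final verification can also be checked programmatically. The sketch below relies on the deviceQuery sample printing a line containing "Result = PASS" on success. The helper names `device_query_passed` and `path_contains` are illustrative only.

```shell
#!/bin/sh
# Reads deviceQuery output on stdin; returns 0 if it reports "Result = PASS".
device_query_passed() {
    grep -q 'Result = PASS'
}

# Returns 0 if directory $2 is an entry of the colon-separated list $1.
# Wrapping both sides in colons avoids matching a partial path component.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

After sourcing ~/.bashrc, `path_contains "$PATH" /usr/local/cuda-8.0/bin` confirms that the export took effect, and `./deviceQuery | device_query_passed` confirms that the CUDA runtime can reach the GPU.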