PolarSPARC
How-to Enable NVidia GPU for Docker
Bhaskar S | 06/24/2023
Overview
With all the buzz around AI/ML these days, it is inevitable that developers in an Enterprise will start integrating their business application(s) with AI/ML products. The majority of AI/ML products depend on GPU-enabled platforms to run efficiently, a space currently dominated by NVidia.
Most Enterprise business application(s) run in Docker containers these days. Hence, for AI/ML-enabled business application(s) to run efficiently in a container environment, one needs to give the Docker containers access to the GPU.
Enter the NVidia Container Toolkit - which enables the Enterprise developers to build and run GPU enabled Docker containers.
The following diagram illustrates the high-level architecture of the Docker and NVidia integration:
The NVidia Container Toolkit includes a runtime driver, which enables Docker containers to access the underlying NVidia GPUs. Under the hood, the toolkit leverages the Compute Unified Device Architecture (or CUDA) software framework to access the parallel computing power of the NVidia GPUs for faster data processing.
Installation and Setup
The installation and setup will be performed on a Linux desktop with an NVidia graphics card installed, running the Ubuntu 22.04 LTS operating system.
Open a Terminal window to perform the various steps.
To perform a system update and install the prerequisite software, execute the following command:
$ sudo apt update && sudo apt install apt-transport-https ca-certificates curl software-properties-common -y
The following would be a typical trimmed output:
...[ SNIP ]...
ca-certificates is already the newest version (20211016ubuntu0.22.04.1).
ca-certificates set to manually installed.
The following additional packages will be installed:
  python3-software-properties software-properties-gtk
The following NEW packages will be installed:
  apt-transport-https curl
The following packages will be upgraded:
  python3-software-properties software-properties-common software-properties-gtk
3 upgraded, 2 newly installed, 0 to remove and 14 not upgraded.
...[ SNIP ]...
To add the Docker package repository, execute the following commands:
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
$ echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu jammy stable" | sudo tee /etc/apt/sources.list.d/docker.list
The following would be a typical output:
deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu jammy stable
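The repository entry above hard-codes the architecture (amd64) and the Ubuntu codename (jammy). As a minimal sketch, assuming a Debian-style system that provides dpkg and /etc/os-release, the same entry could be generated from the running system instead. The helper name docker_repo_line is made up for this illustration and is not part of Docker:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Docker): prints the apt source entry
# for a given architecture and Ubuntu codename.
docker_repo_line() {
  local arch="$1" codename="$2"
  echo "deb [arch=${arch} signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu ${codename} stable"
}

# Detect the values from the running system, falling back to the ones
# used in this article when detection is not possible.
arch="$(dpkg --print-architecture 2>/dev/null || echo amd64)"
codename="$( . /etc/os-release 2>/dev/null && echo "${VERSION_CODENAME:-jammy}" )"
codename="${codename:-jammy}"

# Print the entry for review
docker_repo_line "${arch}" "${codename}"
```

Once the printed entry looks right, it can be piped to sudo tee /etc/apt/sources.list.d/docker.list exactly as in the command above.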
To install docker, execute the following command:
$ sudo apt update && sudo apt install docker-ce containerd.io docker-compose-plugin -y
The following would be a typical trimmed output:
...[ SNIP ]...
Get:5 https://download.docker.com/linux/ubuntu jammy InRelease [48.9 kB]
Get:6 https://download.docker.com/linux/ubuntu jammy/stable amd64 Packages [13.6 kB]
...[ SNIP ]...
To add the currently logged-in user (alice in this example) to the docker group, execute the following command:
$ sudo usermod -aG docker ${USER}
REBOOT the system for the changes to take effect.
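After the reboot, one way to confirm that the group change took effect is to look for docker in the output of id -nG. The following is a minimal sketch; the helper name in_group is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the group (2nd argument) appears as a
# whole word in a space-separated list of group names (1st argument).
in_group() {
  local groups_list="$1" group="$2" g
  for g in ${groups_list}; do
    [ "${g}" = "${group}" ] && return 0
  done
  return 1
}

# 'id -nG' prints the names of all the groups of the current user
if in_group "$(id -nG)" docker; then
  echo "current user is in the docker group"
else
  echo "current user is NOT in the docker group (yet)"
fi
```

Matching whole words avoids a false positive from a group whose name merely contains the text docker.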
To verify that the docker installation was successful, execute the following command:
$ docker version
The following would be a typical output:
Client: Docker Engine - Community
 Version:           24.0.2
 API version:       1.43
 Go version:        go1.20.4
 Git commit:        cb74dfc
 Built:             Thu May 25 21:51:00 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.2
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.4
  Git commit:       659604f
  Built:            Thu May 25 21:51:00 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.21
  GitCommit:        3dce8eb055cbb6872793272b4f20ed16117344f8
 runc:
  Version:          1.1.7
  GitCommit:        v1.1.7-0-g860f061
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
To verify that the appropriate NVidia drivers have been installed on the Linux desktop, execute the following command:
$ nvidia-smi
The following would be a typical output:
Sat Jun 24 09:23:28 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.116.04   Driver Version: 525.116.04   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:04:00.0  On |                  N/A |
|  0%   48C    P8    24W / 220W |    369MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1442      G   /usr/lib/xorg/Xorg                308MiB |
|    0   N/A  N/A      3508      G   ...RendererForSitePerProcess       57MiB |
+-----------------------------------------------------------------------------+
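For scripted checks, nvidia-smi also offers a query mode that prints selected fields as CSV, which is easier to parse than the table. The following is a minimal sketch that pulls out the driver version for each GPU; the helper name driver_version_of is hypothetical, and the sketch assumes the fields are separated by a comma followed by a space:

```shell
#!/usr/bin/env bash
# Hypothetical helper: extracts the 2nd field (driver version) from one
# CSV line of the form "name, driver_version, memory.total".
driver_version_of() {
  echo "$1" | awk -F', ' '{ print $2 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # One CSV line per GPU, without the header row
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader |
    while IFS= read -r line; do
      echo "driver version: $(driver_version_of "${line}")"
    done
else
  echo "nvidia-smi not found; the NVidia driver does not appear to be installed"
fi
```

The command -v guard keeps the script from failing outright on a machine where the driver has not been installed yet.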
To test access to the NVidia GPU from docker, we need a docker image. For this test, we will use the docker image nvidia/cuda:12.1.0-base-ubuntu22.04, which was the latest at the time of this article.
To download the above-mentioned docker image, execute the following command:
$ docker pull nvidia/cuda:12.1.0-base-ubuntu22.04
The following would be a typical output:
12.1.0-base-ubuntu22.04: Pulling from nvidia/cuda
6b851dcae6ca: Pull complete
532bc0192ccd: Pull complete
f9bcd94e513a: Pull complete
971bd89a1a36: Pull complete
a2855a2ef2e0: Pull complete
Digest: sha256:937bda11a3146c55374c9a201bef12945f2ba98394ff0c46bd04807dc949ab51
Status: Downloaded newer image for nvidia/cuda:12.1.0-base-ubuntu22.04
docker.io/nvidia/cuda:12.1.0-base-ubuntu22.04
To test the access of the NVidia GPU from docker, execute the following command:
$ docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
The following would be a typical output:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
From the error output above, it is evident that docker has no access to the underlying NVidia GPU in the system.
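One way to narrow down an error like this is to check which of the pieces involved are present on the host. The following is a minimal sketch; the helper name check_cmd is hypothetical, and it rests on the assumption that the NVidia Container Toolkit provides a binary named nvidia-container-runtime-hook on the PATH, which is expected to be missing at this point:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports whether a command is available on the PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Assumption: these are the binaries the GPU integration relies on;
# nvidia-container-runtime-hook comes from the NVidia Container Toolkit.
for cmd in docker nvidia-smi nvidia-container-runtime-hook; do
  check_cmd "${cmd}"
done
```

If docker and nvidia-smi are found but the hook is missing, the toolkit has most likely not been installed yet.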
To add the NVidia toolkit repository, execute the following commands:
$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
$ echo "deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/ubuntu22.04/amd64 /" | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
The following would be a typical output:
deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/ubuntu22.04/amd64 /
To perform a system update and to install the NVidia-Docker runtime integration, execute the following command:
$ sudo apt update && sudo apt install -y nvidia-docker2
The following would be a typical trimmed output:
...[ SNIP ]...
Selecting previously unselected package libnvidia-container1:amd64.
(Reading database ... 582595 files and directories currently installed.)
Preparing to unpack .../libnvidia-container1_1.13.2-1_amd64.deb ...
Unpacking libnvidia-container1:amd64 (1.13.2-1) ...
Selecting previously unselected package libnvidia-container-tools.
Preparing to unpack .../libnvidia-container-tools_1.13.2-1_amd64.deb ...
Unpacking libnvidia-container-tools (1.13.2-1) ...
Selecting previously unselected package nvidia-container-toolkit-base.
Preparing to unpack .../nvidia-container-toolkit-base_1.13.2-1_amd64.deb ...
Unpacking nvidia-container-toolkit-base (1.13.2-1) ...
Selecting previously unselected package nvidia-container-toolkit.
Preparing to unpack .../nvidia-container-toolkit_1.13.2-1_amd64.deb ...
Unpacking nvidia-container-toolkit (1.13.2-1) ...
Selecting previously unselected package nvidia-docker2.
Preparing to unpack .../nvidia-docker2_2.13.0-1_all.deb ...
Unpacking nvidia-docker2 (2.13.0-1) ...
Setting up nvidia-container-toolkit-base (1.13.2-1) ...
Setting up libnvidia-container1:amd64 (1.13.2-1) ...
Setting up libnvidia-container-tools (1.13.2-1) ...
Setting up nvidia-container-toolkit (1.13.2-1) ...
Setting up nvidia-docker2 (2.13.0-1) ...
Processing triggers for libc-bin (2.35-0ubuntu3.1) ...
...[ SNIP ]...
Once again, REBOOT the system for the changes to take effect.
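After the reboot, it can be useful to confirm that the Docker daemon picked up the new runtime before running a container. The following is a minimal sketch, assuming that the nvidia-docker2 package registers a runtime named nvidia with the daemon (through /etc/docker/daemon.json); the helper name has_nvidia_runtime is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the JSON text passed as the 1st argument
# contains a runtime entry named "nvidia".
has_nvidia_runtime() {
  echo "$1" | grep -q '"nvidia":'
}

if command -v docker >/dev/null 2>&1; then
  # Prints the runtimes known to the Docker daemon as a JSON object
  runtimes="$(docker info --format '{{json .Runtimes}}' 2>/dev/null)"
  if has_nvidia_runtime "${runtimes}"; then
    echo "nvidia runtime is registered with docker"
  else
    echo "nvidia runtime is NOT registered with docker"
  fi
fi
```

The check is a plain text match on the JSON, which is enough for a quick sanity test but is not a substitute for a proper JSON parser.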
Finally, to test the access of the NVidia GPU from docker, execute the following command:
$ docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
The following would be a typical output:
Sat Jun 24 09:31:19 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.116.04   Driver Version: 525.116.04   CUDA Version: 12.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:04:00.0  On |                  N/A |
|  0%   43C    P8    24W / 220W |    515MiB /  8192MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
VOILA !!! - we have successfully integrated the NVidia GPU runtime with the docker environment.
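The --gpus all option exposes every GPU in the system to the container. On a system with more than one GPU, the same option also accepts a device selector such as device=0. The following is a minimal sketch that only prints the command so it can be reviewed first; the helper name gpu_run_cmd is hypothetical:

```shell
#!/usr/bin/env bash
# The image used throughout this article
IMAGE="nvidia/cuda:12.1.0-base-ubuntu22.04"

# Hypothetical helper: prints the docker command that would expose only
# the GPU with the given index to the container.
gpu_run_cmd() {
  echo "docker run --rm --gpus device=$1 ${IMAGE} nvidia-smi"
}

# Print the command for the first GPU (index 0)
gpu_run_cmd 0
```

Printing the command rather than executing it keeps the sketch safe to try on a machine where docker or the GPU is not yet set up.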