If you’re working with machine learning, deep learning, or other GPU-intensive workloads, setting up Docker with NVIDIA GPU support is essential. This guide will walk you through the process of installing and configuring Docker, NVIDIA drivers, and the NVIDIA Container Toolkit on an AWS G4dn instance running Ubuntu.
Prerequisites
Ensure you have:
- An AWS G4dn instance with Ubuntu installed.
- sudo privileges to install and configure packages.
Step 1: Update System and Install Docker
First, update the system and install Docker:
sudo apt update && sudo apt upgrade -y && sudo apt install docker.io -y
This ensures you have the latest system updates and installs Docker.
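Optionally, enable the Docker service at boot and run a quick sanity check. Adding your user to the docker group lets you run docker without sudo after you log out and back in:
sudo systemctl enable --now docker
sudo usermod -aG docker $USER
sudo docker run --rm hello-world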
Step 2: Install Docker Compose
Download and set up Docker Compose:
wget https://github.com/docker/compose/releases/download/v2.32.4/docker-compose-linux-x86_64
sudo mv docker-compose-linux-x86_64 /usr/bin/docker-compose
sudo chmod +x /usr/bin/docker-compose
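You can confirm the binary is on your PATH and executable:
docker-compose version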
Step 3: Install NVIDIA Drivers
Install the NVIDIA utilities package, which provides nvidia-smi (adjust the 535 branch if the driver you install below is newer):
sudo apt install nvidia-utils-535 -y
Next, determine the latest NVIDIA driver version packaged for AWS instances and install it:
NVIDIA_DRIVER_VERSION=$(sudo apt-cache search 'linux-modules-nvidia-[0-9]+-aws$' | awk '{print $1}' | sort -V | tail -n 1 | awk -F"-" '{print $4}')
sudo apt install linux-modules-nvidia-${NVIDIA_DRIVER_VERSION}-aws nvidia-driver-${NVIDIA_DRIVER_VERSION} -y
Check the installation:
sudo nvidia-smi
If the drivers are installed correctly, you should see your GPU details in the output. If nvidia-smi cannot find the device, reboot the instance so the new kernel modules are loaded, then run it again.
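For a more compact check, nvidia-smi can also report just the fields you care about; on a G4dn instance this should list an NVIDIA T4:
nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv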
Step 4: Install NVIDIA Container Toolkit
To enable GPU support for Docker containers, install the NVIDIA Container Toolkit:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
Optionally, enable the experimental packages in the repository list:
sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
Update and install:
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
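You can confirm the toolkit is installed before wiring it into Docker:
nvidia-ctk --version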
Configure Docker to use the NVIDIA runtime. The first command updates the system-wide configuration in /etc/docker/daemon.json; the second is only needed if you run Docker in rootless mode and should be run as your own user:
sudo nvidia-ctk runtime configure --runtime=docker
nvidia-ctk runtime configure --runtime=docker --config=$HOME/.config/docker/daemon.json
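To see what was written for the system-wide daemon, inspect the file; it should now contain a runtimes entry pointing at nvidia-container-runtime:
cat /etc/docker/daemon.json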
Step 5: Restart Docker and Verify
Restart Docker to apply the changes:
sudo systemctl restart docker
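Before running a container, you can confirm Docker picked up the new runtime; the output should list nvidia alongside the default runc:
sudo docker info | grep -i runtimes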
Now, verify that Docker can see the GPU by running nvidia-smi inside a CUDA base image (the old nvidia/cuda:11.0-base tag is deprecated and may no longer be available, so use a current one):
sudo docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi
If everything is set up correctly, you should see the NVIDIA GPU details in the output.
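Since Docker Compose is installed, here is a minimal, illustrative compose file that requests all GPUs through Compose's device reservation syntax. The service name gpu-test is a placeholder and the image tag is only an example; swap in your own workload:
cat <<'EOF' > docker-compose.yml
services:
  gpu-test:
    image: nvidia/cuda:12.3.1-base-ubuntu22.04
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF
sudo docker-compose up
If the runtime is wired up correctly, the container prints the same nvidia-smi table and then exits.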
Conclusion
You have now successfully installed and configured Docker with NVIDIA GPU support on an AWS G4dn instance. This setup allows you to run GPU-accelerated applications inside Docker containers, making it easier to deploy machine learning and AI workloads.
Stay tuned for more guides on optimizing GPU performance for your workloads!