How to run Cluster Management Software K3s on NVIDIA Jetson?

Ever thought of learning how to run cluster management software in your IoT applications? This tutorial will take you through everything, from why we're choosing K3s to running a complete test! This original tutorial is by GPUSolution.

Without further ado, let's first talk about the purpose of using K3s:

We'll be using K3s, a lightweight Kubernetes management tool, to build a Docker container cluster out of 4 node devices. The reasons for this are as follows:

  1. Docker containers are trending in software development, including Artificial Intelligence (AI) applications
  2. Kubernetes is currently the most commonly used cluster management tool in the Docker ecosystem
  3. Since K3s is a lightweight management tool, it uses fewer resources and is much more convenient to install, making it suitable for embedded AIoT platform applications

Let’s Get Started!

This experiment will use a Xavier NX as the master node and 3 Jetson Nano 4GB boards as the worker nodes. All devices use JetPack 4.4.1 as the development environment, with Docker 19.03 and the nvidia-docker2 management tool pre-installed.

During the experiment, you'll also need to download the NVIDIA l4t-ml:r32.4.4-py3 image (compatible with JetPack 4.4.1) from NGC (ngc.nvidia.com). This image supports a variety of deep learning frameworks and provides a Jupyter interactive environment.
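Before starting, it can help to confirm that each node matches these prerequisites. A minimal check sketch, assuming a stock JetPack install:

# Show the L4T release on this node (r32.4.4 corresponds to JetPack 4.4.1)
head -n 1 /etc/nv_tegra_release

# Confirm Docker and the nvidia-docker2 runtime are installed
sudo docker version --format '{{.Server.Version}}'
dpkg -l | grep nvidia-docker2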


Building K3S cluster using 4 Jetson nodes

Description of the cluster environment:

Appoint one node as the Master and the other nodes as Workers. The table below shows the configuration of each node (set the IP portion according to your environment):

Role     IP            Host Name   Type of Device    JetPack version
Master   xx.xx.xx.30   node0       Xavier NX         4.4.1
Worker   xx.xx.xx.31   node1       Jetson Nano 4GB   4.4.1
Worker   xx.xx.xx.32   node2       Jetson Nano 4GB   4.4.1
Worker   xx.xx.xx.33   node3       Jetson Nano 4GB   4.4.1

Add all 4 IPs and hostnames to the /etc/hosts file on each of the 4 nodes:

127.0.0.1      localhost
127.0.1.1      node3      # <= set to this node's own hostname

# Add all the cluster nodes' IPs and hostnames below
xx.xx.xx.30     node0
xx.xx.xx.31     node1
xx.xx.xx.32     node2
xx.xx.xx.33     node3


By doing this, you will be able to use every node's hostname directly instead of memorising its IP when performing operations later.
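As a quick sanity check, you can ping each hostname from any node to confirm the new entries resolve:

# Each hostname should resolve to its IP and answer a single ping
for h in node0 node1 node2 node3; do
    ping -c 1 -W 2 "$h"
done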

Using K3s to build the Nano management cluster

  • Install K3S Server on the Master (node0):

Execute the following command:

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--docker" sh -s -

Check that the installation completed successfully:

docker images

sudo kubectl get node
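The installer also registers k3s as a systemd service on the master, so you can additionally confirm the server is up with:

# The k3s server runs as a systemd service; it should show "active (running)"
sudo systemctl status k3s --no-pager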

To test whether the cluster can perform GPU computation, run the third-party packaged CUDA deviceQuery container:

sudo kubectl run -it nvidia --image=jitteam/devicequery --restart=Never

If everything went well, the deviceQuery output will list the GPU and end with "Result = PASS".
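Because the test pod was created with --restart=Never, it will remain in the Completed state afterwards. You can re-read its output or remove it with standard kubectl commands:

# Re-read the deviceQuery output from the completed pod
sudo kubectl logs nvidia

# Delete the test pod once you are done with it
sudo kubectl delete pod nvidia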

  • Install the K3s agent on the 3 workers (node1/node2/node3):

1. First, locate the k3s server token on the Master (node0) by executing the following command:

sudo cat /var/lib/rancher/k3s/server/node-token

You should see a long token string (the value will be different on your system).

2. On every worker (node1/node2/node3), execute:

export k3s_token="<node-token string from the previous step>"
export k3s_url="https://<IP_OF_MASTER>:6443"

<IP_OF_MASTER> is node0's IP address (xx.xx.xx.30 in the table above).

Then execute the command shown below:

curl -sfL https://get.k3s.io | K3S_URL=${k3s_url} K3S_TOKEN=${k3s_token} sh -

* All of the steps above are executed on the worker nodes.
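On the workers, the installer registers a k3s-agent systemd service (instead of k3s), so each agent can also be verified locally:

# The k3s agent runs as a systemd service on each worker
sudo systemctl status k3s-agent --no-pager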

3. Execute the following command on the Master to check the agent installation:

sudo kubectl get nodes

This should show that the 3 worker nodes have joined the k3s cluster, although their roles haven't been set yet (the ROLES column shows <none>).

4. To set the role for each worker, execute the role-setting command on the Master node (node0):

sudo kubectl label node node1 node2 node3 node-role.kubernetes.io/worker=worker

Then check the status of the nodes:

sudo kubectl get nodes

With that, you’ve completed building the k3s cluster.
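If you want to double-check the label applied in step 4, kubectl can also list each node's labels:

# Worker nodes should now carry the node-role.kubernetes.io/worker=worker label
sudo kubectl get nodes --show-labels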

5. To check the cluster information, execute the following command:

sudo kubectl cluster-info
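For a broader view of what k3s has deployed, you can also list the system pods across all namespaces (coredns, metrics-server, and the other k3s components live in kube-system):

# All pods in the cluster, including the k3s system components
sudo kubectl get pods --all-namespaces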


Running TensorFlow in the NVIDIA l4t-ml container

  • Download the l4t-ml:r32.4.4-py3 image:

docker pull nvcr.io/nvidia/l4t-ml:r32.4.4-py3

  • Create a pod definition file named jetson-tf.yaml with the following content:
apiVersion: v1
kind: Pod
metadata:
  name: jetson-tf
spec:
  restartPolicy: OnFailure
  containers:
  - name: nvidia-l4t-ml
    image: "nvcr.io/nvidia/l4t-ml:r32.4.4-py3"
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]   # keep the container alive so we can exec into it
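Note that writing the YAML alone does not create the pod; apply the file with kubectl first (this assumes you saved it as jetson-tf.yaml in the current directory):

# Create the pod described in jetson-tf.yaml
sudo kubectl apply -f jetson-tf.yaml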
  • Check the pod status by executing:

sudo kubectl get pod

  • Once you've confirmed that the pod (jetson-tf) is Running, it's ready for use. If its status is "ContainerCreating", wait until it changes to Running
  • To enter this container, execute:

sudo kubectl exec -it jetson-tf -- python3

This enters the container's python3 interactive environment, where you can execute the following code:

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

This displays the GPUs available to TensorFlow inside the k3s container.
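If you prefer not to enter the interactive shell, the same check can be run in one shot from the master; a minimal sketch:

# Run the device listing inside the pod non-interactively
sudo kubectl exec jetson-tf -- python3 -c "from tensorflow.python.client import device_lib; print([d.name for d in device_lib.list_local_devices()])"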

For a complete test, you can further execute the following code in Python3:

from tensorflow.python.client import device_lib

def get_available_gpus():
    # List all local devices and keep only the GPU entries
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

get_available_gpus()

After executing, you should see output listing the GPU, e.g. ['/device:GPU:0'].
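When you're finished experimenting, you can remove the test pod to free resources on the node:

# Delete the TensorFlow test pod
sudo kubectl delete pod jetson-tf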


Summary

And that's all for our tutorial on running K3s on your Jetson devices! Do let us know if you enjoyed this article, and we hope it has helped you in some way!
