This is the second article of the “Building HashiCorp Nomad Cluster With Raspberry Pi” series. This time, we will discuss connecting Samba shares as a persistent network drive for your future workloads.

Before we can consider our cluster “ready”, there is still a lot to do, since we are building everything from scratch on bare metal. As a next step, I propose we think about how we will persist data for our future workloads.

Why SMB?

Samba shares are not what you would normally reach for as persistent volumes. A proper setup would use other network protocols and a slightly different configuration. However, we are building a model of a real-world cluster, a proof of concept, not something we would run in enterprise production. Since I already have SMB server infrastructure in my home network, it makes good educational material for Nomad without going too deep into setting up and configuring a network drive server.

Why not just use host volumes if we just want to learn Nomad?

If you use a host volume, your job is immediately bound to the specific cluster node that holds the folder you want to use as a volume. That defeats the whole idea behind clustering and significantly reduces fault tolerance for your workloads. If the node holding your volume folder crashes, the job cannot be rescheduled on another node, and it will simply stay down.

We want to build a proof of concept as close as possible to a real-world bare-metal cluster while balancing enterprise-grade solutions against learning materials, so we obviously have to use a network volume that is managed outside our cluster. Still, it should not require working with enterprise-grade protocols.

The hardware

As I said earlier, I already have SMB infrastructure in my home network: a TP-Link Archer AX5400 router with a USB port and a built-in Samba server. This is far from an ideal setup, and we should not consider it even close to a good solution, but keep in mind that we are focusing on the Nomad setup at this step.

As in the previous post, I decided to go the easiest way and buy the first available SSD I saw in my local store: a 1 TB Transcend ESD310S Portable SSD.

So my setup looks like this:

Configuring the SSD on my router is pretty simple:

Open the router webpage at http://192.168.0.1 and log in.

Go to Advanced -> USB -> USB Storage Device.

Check the Enable checkbox in the row with the “Samba For Windows” Access Method, and you are done. Now you have 1 TB of storage available on your network over the SMB protocol at the router's IP address, 192.168.0.1.
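Before touching Nomad, you can quickly check that the share is actually visible on the network. For example, from any Linux machine with the Samba client tools installed (smbclient here is just a convenient way to list the shares anonymously):

# List the shares exposed by the router without authentication
$ smbclient -L //192.168.0.1 -N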

Configure Nomad CLI on the local machine

I will leave a link to the official CLI installation guide here. I want to mention only one thing.

As I explained in my previous post, our Nomad API URL is http://nomad.service.homelab:4646

To configure the Nomad CLI to use our Nomad cluster, we should export the NOMAD_ADDR environment variable with our URL in your ~/.*rc file:

echo 'export NOMAD_ADDR="http://nomad.service.homelab:4646"' >> ~/.bashrc

We are ready to use the Nomad CLI tool on the local machine.
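As a quick sanity check (assuming the nomad.service.homelab name resolves on your machine as configured in the previous post), you can ask the cluster about its servers and clients:

# Should list the server nodes and which one is the leader
$ nomad server members

# Should list all client nodes in the "ready" state
$ nomad node status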

Install SMB CSI Plugin on Nomad cluster

CSI (Container Storage Interface) is a unified storage interface for container orchestration systems like Nomad or Kubernetes. If it is a unified interface, we can use CSI plugins from Kubernetes, right? Let’s test that.

Let’s review how CSI works.

In a few words, before mounting a volume into a container, you first have to mount it on the host. To make this work, we need at least these two components up and running on the cluster:

CSI node - a service for mounting, unmounting, and accounting of network drives on cluster nodes.

CSI controller - a service that is responsible for listening to job annotations (in the case of Kubernetes, it will be PersistentVolumeClaims; in the case of Nomad, it will be the volume stanza in job spec)

Let’s define job specs for both components:

SMB CSI plugin for Nomad: CSI Nodes service

We will take the smb.csi.k8s.io plugin and try to deploy it with Nomad.

First, let’s return to the homelab repo we created before and create a new folder called platform, where we will store IaC for platform services such as volume controllers, load balancers, monitoring, etc.

$ cd homelab
$ mkdir -p platform/volumes

Let’s create the platform namespace in Nomad.

nomad namespace apply platform
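You can verify the namespace was created with:

$ nomad namespace list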

Create a job spec file for the nodes service at ./platform/volumes/csi-smb-plugin-nodes.job.hcl:

job "csi-smb-plugin-nodes" {
  datacenters = ["pi-homelab-01"]
  type        = "system"
  namespace   = "platform"

  group "nodes" {

    task "plugin" {
      driver = "docker"

      config {
        image = "mcr.microsoft.com/k8s/csi/smb-csi:v1.7.0"
        args = [
          "--v=5",
          "--nodeid=${attr.unique.hostname}",
          "--endpoint=unix:///csi/csi.sock",
          "--drivername=smb.csi.k8s.io"
        ]
        # node plugins must run as privileged jobs because they
        # mount disks to the host
        privileged = true
      }

      csi_plugin {
        id        = "smb"
        type      = "node"
        mount_dir = "/csi"
      }

      resources {
        memory     = 50
        cpu        = 100
      }
    }
  }
}

As you can see, there are several things specific to this particular job:

job "csi-smb-plugin-nodes" {
  ...
  type        = "system"
  ...
}

The job type system means that the job we are about to run will be deployed on all cluster nodes. We want to deploy this job this way because, as explained earlier, it handles mounting network drives into the host system (node), and since we want jobs to be able to migrate between cluster nodes, we must be able to mount network volumes on every host.

job "csi-smb-plugin-nodes" {
  ...
  group "nodes" {
  ...
    task "plugin" {
      ...
      config {
        ...
        privileged = true
      }
      ...
    }
  }
}

This job, yet another Docker container, mounts network volumes to the host system. We must run the container in privileged mode to allow that.

job "csi-smb-plugin-nodes" {
  ...
  group "nodes" {
    task "plugin" {
      ...
      csi_plugin {
        id        = "smb"
        type      = "node"
        mount_dir = "/csi"
      }
      ...
    }
  }
}

This is the csi_plugin configuration, which tells Nomad what kind of CSI plugin the task provides and the directory inside the container where the plugin exposes its socket to communicate with Nomad.

Before applying this job to Nomad, we have to allow privileged containers on all our Nomad nodes. For that, we should return to our ansible/files/nomad.hcl file, add the Docker plugin configuration, and re-run the Nomad playbook:

ansible/files/nomad.hcl

# Define the Consul datacenter where our nodes are running
datacenter = "pi-homelab-01"
data_dir  = "/opt/nomad/data"
bind_addr = "0.0.0.0"

# Server is a Nomad API server
server {
  enabled          = true
  # Expect at least 3 nodes to be online to start cluster leader election
  bootstrap_expect = 3
}

# Client is a service that runs the jobs
client {
  enabled = true
}

# Allow privileged docker containers - needed for SMB CSI Plugin
plugin "docker" {
  config {
    allow_privileged = true
  }
}

ui {
  enabled = true
}

$ cd ansible
$ make Nomad

We are ready to apply the CSI plugin job.

$ nomad job run ./platform/volumes/csi-smb-plugin-nodes.job.hcl
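Because this is a system job, there should be one running allocation per client node. You can verify both the job and the plugin registration; the controller count will stay at zero until the next step:

# One running allocation per cluster node
$ nomad job status -namespace=platform csi-smb-plugin-nodes

# The "smb" CSI plugin should be registered with healthy nodes
$ nomad plugin status -type=csi smb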

SMB CSI plugin for Nomad: CSI Controller service

The Controller module is the same Docker image but deployed with a slightly different configuration.

./platform/volumes/csi-smb-plugin-controller.job.hcl

job "csi-smb-plugin-controller" {
  datacenters = ["pi-homelab-01"]
  namespace   = "platform"

  group "controller" {
    count = 2

    task "plugin" {
      driver = "docker"

      config {
        image = "mcr.microsoft.com/k8s/csi/smb-csi:v1.7.0"
        args = [
          "--v=5",
          "--nodeid=${attr.unique.hostname}",
          "--endpoint=unix:///csi/csi.sock",
          "--drivername=smb.csi.k8s.io"
        ]
      }

      csi_plugin {
        id        = "smb"
        type      = "controller"
        mount_dir = "/csi"
      }

      resources {
        memory = 50
        memory_max = 256
        cpu    = 100
      }
    } 
  } 
} 

We don’t have to deploy the Controller with type = "system" because this module only orchestrates communication between Nomad and the CSI Node module. It watches jobs with a volume stanza, tells the CSI Node module to mount the drive, and tells Nomad to mount the folder from the mounted drive into the container.

We will deploy two replicas of this module to preserve the minimum redundancy and ensure no locking happens when Nomad schedules multiple jobs that require volumes simultaneously.

Also, you can see we set type = "controller" in the csi_plugin stanza.

Let’s apply this job to the Nomad cluster.

$ nomad job run ./platform/volumes/csi-smb-plugin-controller.job.hcl
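Give it a minute and check the plugin status again; it should now report both healthy controllers and healthy nodes. If the numbers stay at zero, inspect the allocation logs of the plugin tasks.

$ nomad plugin status -type=csi smb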

Test the setup

For the test, we will register a volume through the CSI plugin and deploy a MySQL container that uses it.

First, we will create the namespace for test workloads.

$ nomad namespace apply test 

Define SMB volume on Nomad cluster

./test/mysql/mysql.volume.hcl

plugin_id   = "smb"
type        = "csi"
namespace   = "test"
id          = "mysql"
name        = "mysql"
external_id = "mysql"

capability {
  access_mode = "multi-node-multi-writer"
  attachment_mode = "file-system"
}

context {
  source = "//192.168.0.1/G/.nomad_data/mysql"
}

mount_options {
  mount_flags = [ "guest","rw","iocharset=utf8","vers=2.0", "dir_mode=0700", "file_mode=0700", "uid=999", "gid=999" ]
}

In the context stanza, you can see the source field, which defines the address of the folder on the SMB server to be mounted. Make sure you create this folder on the server in advance.
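One way to create it without plugging the drive into your computer is with smbclient from any machine on the network. This is just a sketch, assuming the share allows guest access as configured earlier:

# smbclient's mkdir is not recursive, so create the folders one by one
$ smbclient //192.168.0.1/G -N -c 'mkdir .nomad_data; mkdir .nomad_data/mysql'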

Define CIFS flags to run MySQL job in Nomad

Let’s take a closer look at the mount flags we define to mount the folder for the MySQL server:

  • guest - tells the mount utility to skip authentication.
  • rw - read/write mode
  • iocharset=utf8 - defines the charset
  • vers=2.0 - defines the CIFS/SMB protocol version; needed to support the flags below
  • dir_mode=0700 and file_mode=0700 - permission flags required by the MySQL server
  • uid=999 and gid=999 - the mysql user and group IDs. The MySQL server requires mysql:mysql ownership of the data folder.
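If you want to verify these flags before handing them to the CSI plugin, you can mount the share manually with the same options from any Linux box. A rough sketch, assuming cifs-utils is installed and the folder from the previous step exists:

$ sudo mkdir -p /mnt/smb-test
# Same flags as in mysql.volume.hcl
$ sudo mount -t cifs //192.168.0.1/G/.nomad_data/mysql /mnt/smb-test \
    -o guest,rw,iocharset=utf8,vers=2.0,dir_mode=0700,file_mode=0700,uid=999,gid=999
# The mount point should now show uid/gid 999 and 0700 permissions
$ ls -ldn /mnt/smb-test
$ sudo umount /mnt/smb-test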

Define MySQL Nomad job

To mount our freshly created volume into a Nomad job, we must define a volume stanza in the task group spec and a volume_mount stanza in the task spec.

Let’s define the resulting job spec.

./test/mysql/mysql.job.hcl

job "mysql" {
  datacenters = ["pi-homelab-01"]
  namespace   = "test"
  type        = "service"

  group "mysql" {
    network {
      mode = "bridge"
      port "db" {
        static = 3306
        to     = 3306
      }
    }

    volume "mysql_data" {
      type            = "csi"
      source          = "mysql"
      access_mode     = "multi-node-multi-writer"
      attachment_mode = "file-system"
    }

    task "instance" {
      driver = "docker"

      service {
        name = "mysql"
        port = "db"
        check {
          type     = "tcp"
          interval = "30s"
          timeout  = "2s"
        }
      }

      config {
        image = "mysql:8.2.0"
      }

      volume_mount {
        volume      = "mysql_data"
        destination = "/var/lib/mysql"
      }

      env {
        MYSQL_ROOT_PASSWORD = "root"
      }

      resources {
        cpu        = 1800
        memory     = 1024
      }
    }
  }
}

Let’s apply volume first and then the job.

$ nomad volume register ./test/mysql/mysql.volume.hcl
$ nomad job run ./test/mysql/mysql.job.hcl
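Once the deployment is healthy, you can check that the volume is attached to the allocation:

# Shows the volume's access mode and the allocation it is attached to
$ nomad volume status -namespace=test mysql

# The MySQL allocation should be in the "running" state
$ nomad job status -namespace=test mysql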

Now you can access your MySQL server at mysql.service.homelab:3306 with user root and password root. Create tables, add data, and restart the job to see that the MySQL data is persistent.
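For example, with the mysql client installed locally (the database and table names here are arbitrary, just for the test):

$ mysql -h mysql.service.homelab -P 3306 -u root -proot
mysql> CREATE DATABASE demo;
mysql> CREATE TABLE demo.notes (id INT PRIMARY KEY, body TEXT);
mysql> INSERT INTO demo.notes VALUES (1, 'survives a reschedule');
mysql> exit

After the restart below, reconnect the same way and run SELECT * FROM demo.notes; the row should still be there.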

$ nomad job restart -namespace test mysql

Conclusion

In this article, we took another step toward the production-ish, ready-ish Nomad cluster on bare metal. We installed a CSI plugin that allows us to mount network drives to the nodes and connect them to arbitrary jobs.

Even though an SMB share served from a router is far from a good solution, it is still much better than providing volumes from the host or installing an SSD into one of the cluster nodes. We achieved several crucial goals:

  • We separated the filesystem from the cluster: no single node failure will lead to the failure of the whole cluster.
  • Nomad can schedule jobs on any node, since volumes are served over the network.

The complete source code described in this article is available on GitHub.

Next Steps

As I mentioned earlier, an SSD attached to the main router and exposed as a Samba drive is not stable or good enough for our cluster, though it might be for a simple enthusiast homelab. Next time, we will build a Ceph cluster with redundancy and review how enterprise-grade network filesystems work.