This is the first post in a series about building a bare-metal homelab cluster using HashiCorp Consul and Nomad on a bunch of Raspberry Pi nodes.

Why Bare Metal?

Modern backend developers use cloud providers such as AWS, Azure, or GCP and often forget their roots. We lose sight of the underlying platforms and, as a result, make many mistakes that can have terrible consequences in production. We are all afraid of building things on bare metal because we fall back on our ten-year-old experience and dismiss the idea of cloud-agnostic systems before even considering an investigation of the field.

Having worked on dozens of projects as a technical consultant and technical lead over the last five years, I see several recurring issues in modern backend development:

  • The AWS (GCP, Azure) bill can grow significantly, and the business can do little about it because the project is vendor-locked
  • We wish to build cloud-agnostic systems, but sometimes the choice of underlying platform (AWS EKS, GCP Cloud Run, etc.) restricts development teams or pushes them to lock into cloud-provider technologies (such as AWS SQS, AWS API Gateway, AWS Lambda, etc.)
  • We never get the full potential of the underlying machine’s CPU, RAM, and OS kernel, which often leads to a massive waste of computing resources (Lambda cold starts, virtualization overhead, CPU/RAM overselling, lack of CPU/RAM instructions you can use, etc.)

This is not the complete list of issues I see, but it is enough for me to start building a bare-metal home lab and remember what actually runs our code in the cloud.

The setup

Hardware setup

Since I want to treat my homelab as if it were a real-world production cluster deployed on AWS, I want to put as many nodes under the hood as I can, so I decided to start with 4 Raspberry Pi nodes.

The next step is to plan their power supply and network connection.

I hate cords. I want as few cords as possible in any setup. Thankfully, the power consumption of a Raspberry Pi is extremely low, so I can use PoE (Power over Ethernet) for my setup.

Since my focus right now is a quick start with cluster configuration and deployment, I decided not to investigate much and go with whatever was available in my local stores.

Application setup

We will not reinvent the wheel here; we will go with the typical Consul + Nomad setup.

Although Nomad can run without Consul, we still want Consul for service discovery, service mesh, and node discovery later on.

From my point of view, this tandem provides an excellent separation of concerns when building and wiring up the cluster.

You can think of the Nomad and Consul tandem as a layered cake:

Layer 2 - Nomad client: runs jobs

Layer 1 - Nomad server: provides the API and orchestrates jobs

Layer 0 - Consul: the cluster network that joins all nodes and defines the cluster structure (node count, node health, how nodes are connected: WAN/LAN, etc.).

nomad-consul.png

So, Consul will find the nodes, join them into the cluster, and constantly check each node’s health. Nomad will use the node information provided by Consul, elect its own cluster leader, and deploy and orchestrate jobs on the available nodes. When Consul finds that a node is unhealthy or lost, it reports that back to Nomad, and Nomad reschedules the jobs onto the available nodes. In addition, Nomad can register jobs with Consul so that Consul checks their health and provides service discovery through the internal Consul DNS.

Configure the cluster

I chose HashiCorp Consul and Nomad as the cluster orchestration stack for several reasons:

  • I have used Kubernetes for too long and know it too well. I have never touched Nomad and Consul and wanted to learn something new.
  • I hear about Nomad more and more. It is a relatively young technology, and I am confident I will see it in more and more real-life products I work on in the future.
  • I want to suffer as I did ten years ago, when I knew nothing and each step produced pain and a massive gain of knowledge :)

I will skip the part about installing an OS on the Raspberry Pis and wiring up the UTP cables, and focus here on the cluster’s configuration.


Prepare working environment

First, let’s create the folder where we will put all the IaC code of our cluster.

$ cd ~
$ mkdir homelab
$ cd homelab

Prepare Ansible Playbook

Before creating your first playbook, let’s define the Ansible inventory with the information about your cluster nodes.

$ mkdir ansible
$ cd ansible
$ touch hosts

The hosts file is your inventory, where you define the addresses of your nodes and the information needed to connect to them:

[pi-homelab-01]
192.168.0.101
192.168.0.102
192.168.0.103
192.168.0.104

[pi-homelab-01:vars]
ansible_ssh_private_key_file=/Users/anatolii/.ssh/raspberry_cluster_id_rsa
ansible_user=raspberry

Now, you are good to go with your first playbook.
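
As a quick sanity check (assuming the inventory above), you can verify that Ansible reaches every node over SSH with an ad-hoc ping:

$ ansible -i hosts pi-homelab-01 -m ping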

Configure Docker with Ansible

We will use Docker as the primary container engine for our cluster.

The Ansible playbook for the Docker installation is straightforward, with nothing special about it:

---
- hosts: pi-homelab-01
  become: yes
  tasks:
    - name: 'Docker: Install docker dependencies'
      apt:
        pkg:
          - ca-certificates
          - curl
          - gnupg
          - lsb-release
          - uidmap
          - iptables
          - dbus-user-session
          - fuse-overlayfs
        state: present

    - name: 'Docker: Add Docker GPG apt Key'
      apt_key:
        url: https://download.docker.com/linux/ubuntu/gpg
        state: present

    - name: 'Docker: Add Docker Repository'
      apt_repository:
        repo: deb https://download.docker.com/linux/ubuntu focal stable
        state: present

    - name: 'Docker: Install docker'
      apt:
        pkg:
          - docker-ce
          - docker-ce-cli
          - containerd.io
        state: present

Let’s save this file as ~/homelab/ansible/docker.yml.

Now, we can run this playbook by:

$ ansible-playbook -i hosts ./docker.yml
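
To confirm that Docker landed on every node, a quick ad-hoc check will do (a sketch; any equivalent command works):

$ ansible -i hosts pi-homelab-01 -b -m command -a "docker --version"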

Install Hashicorp Consul with Ansible

Before writing the playbook, we must define the Consul configuration file and decide how to run the Consul daemon.

The whole Consul configuration is essentially just a declaration of:

  • The list of known IP addresses of the nodes
  • The network configuration for the Consul service: bind addr, client addr, advertise addr
  • The top-level domain for DNS service discovery
  • A bunch of other settings that are not important at this stage…

datacenter = "pi-homelab-01" # define the datacenter ID
data_dir = "/opt/consul/data"
retry_join = ["192.168.0.101", "192.168.0.102", "192.168.0.103", "192.168.0.104"] # the list of all nodes IP addresses
bind_addr = "0.0.0.0"
client_addr = "0.0.0.0"
advertise_addr = "{{ GetInterfaceIP \"eth0\" }}"
domain = "homelab" # Define the top level DNS domain for service discovery

ui_config {
  enabled = true
}

Let’s save this file as ~/homelab/ansible/files/consul.hcl.

The most sensible way to run any host daemon is as a systemd unit, so let’s create a systemd configuration for Consul as well.

[Unit]
Description=Consul Agent
#Make sure that Consul daemon starts when network interfaces are up and running
Requires=network-online.target
After=network-online.target

[Service]
Restart=on-failure
EnvironmentFile=/etc/consul.d/consul.conf
ExecStart=/opt/consul/bin/consul agent -config-dir /etc/consul.d $FLAGS
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGTERM
User=consul
Group=consul

[Install]
WantedBy=multi-user.target

Let’s also save this file as ~/homelab/ansible/files/consul.service.
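
The unit file above references /etc/consul.d/consul.conf as an EnvironmentFile, and the playbook below copies it from ./files/consul.conf. A minimal sketch of that file (assumed contents, adjust to your needs) only has to define the FLAGS variable used in ExecStart, for example to run every agent in server mode:

# ~/homelab/ansible/files/consul.conf (assumed contents)
# Extra command-line flags passed to the Consul agent via $FLAGS in consul.service.
# Leave FLAGS empty if you prefer to keep everything in consul.hcl instead.
FLAGS="-server -bootstrap-expect=3"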

Now, it’s time to install Consul on the nodes.

---
- hosts: pi-homelab-01
  become: yes
  tasks:
    - name: 'Consul: prepare install directory'
      file:
        path: /opt/consul/bin
        recurse: yes
        state: directory

    - name: 'Consul: Download Consul'
      get_url:
        url: https://releases.hashicorp.com/consul/1.17.0/consul_1.17.0_linux_arm64.zip
        dest: /opt/consul

    - name: 'Consul: Unzip Consul'
      unarchive:
        remote_src: yes
        src: /opt/consul/consul_1.17.0_linux_arm64.zip
        dest: /opt/consul/bin

    - name: 'Consul: create Consul group'
      group:
        name: consul
        state: present

    - name: 'Consul: create consul user'
      user:
        name: consul
        state: present
        groups:
         - consul
         - sudo
        shell: /bin/bash

    - name: 'Consul: prepare Consul config dir'
      file:
        path: /etc/consul.d
        state: directory

    - name: 'Consul: prepare Consul config file'
      copy:
        src: ./files/consul.conf
        dest: /etc/consul.d/consul.conf

    - name: 'Consul: Consul config'
      copy:
        src: ./files/consul.hcl
        dest: /etc/consul.d/consul.hcl

    - name: 'Consul: Setup consul data folder'
      file:
        path: /opt/consul/data
        owner: consul
        group: consul
        state: directory

    - name: 'Consul: Systemd config'
      copy:
        src: ./files/consul.service
        dest: /etc/systemd/system/consul.service

    - name: 'Consul: Enable and start Consul'
      systemd:
        name: consul
        enabled: yes
        state: started

Let’s save this file as ~/homelab/ansible/consul.yml.

Now, we can run this playbook by:

$ ansible-playbook -i hosts ./consul.yml
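
Once the playbook finishes, all agents should have joined the cluster. A quick ad-hoc check (a sketch, using the binary path from the playbook above) confirms it:

$ ansible -i hosts pi-homelab-01 -b -m command -a "/opt/consul/bin/consul members"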

Install Nomad with Ansible

Now that Docker and Consul are installed and running, the foundation of your cluster is ready: all nodes should be connected, and Consul should see all of them. It is time to install your workload orchestrator and prepare for the first app to be deployed on your cluster.

Like Consul, Nomad is configured with an HCL file.

At this stage, the only important settings are:

  • The datacenter ID.
  • The bind address for the Nomad API server.
  • The minimum number of server nodes needed to start the cluster leader election.

The Nomad documentation can be slightly misleading in describing the meaning of the bootstrap_expect setting. After speaking with many people, I realised it needs clarification.

So the documentation page with a step-by-step guide to Nomad cluster deployment says:

Note
Replace the bootstrap_expect value with the number of Nomad servers you are deploying; three or five is recommended.

The bootstrap_expect value has almost nothing to do with the number of Nomad servers you deploy. It is simply the number of servers the Nomad cluster expects to join before the cluster leader election starts. So this number can be lower than the actual number of servers you deploy; the only requirement is that it be odd. Since we have 4 Nomad servers, we can set it to 3 and move on. Further articles will explain this value in more detail, with an overview of fault-tolerance strategies.

# Define the Consul datacenter where our nodes are running
datacenter = "pi-homelab-01"
data_dir  = "/opt/nomad/data"
bind_addr = "0.0.0.0"

# server is a Nomad API server
server {
  enabled          = true
  # Expect at least 3 nodes to be online to start cluster leader election
  bootstrap_expect = 3
}

# client is the service that runs the jobs
client {
  enabled = true
}

ui {
  enabled = true
}

Let’s save this file as ~/homelab/ansible/files/nomad.hcl.

Let’s prepare the systemd configuration for the Nomad daemon.

[Unit]
Description=Nomad
Documentation=https://www.nomadproject.io/docs/
# We don't need to wait for the network because we wait for the Consul service to start before running Nomad
#Wants=network-online.target
#After=network-online.target

# Make sure we run Nomad only after Consul daemon is up and running
Wants=consul.service
After=consul.service

[Service]

# Nomad server should be run as the nomad user. Nomad clients should be run as root
User=root
Group=root

ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/bin/nomad agent -config /etc/nomad.d
KillMode=process
KillSignal=SIGINT
LimitNOFILE=65536
LimitNPROC=infinity
Restart=on-failure
RestartSec=2
TasksMax=infinity
OOMScoreAdjust=-1000

[Install]
WantedBy=multi-user.target

Let’s save this file as ~/homelab/ansible/files/nomad.service.

Now, we are almost done and ready to install Nomad. But before that, let’s discuss how we will run our Docker containers and access their ports.

We are all used to forwarding container ports to host ports by running something like this:

$ docker run -p 3000:80 nginx

This command forwards container port 80 to host port 3000, so the workload deployed with this command will be available at http://localhost:3000.
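
A quick check confirms the mapping (assuming the container started successfully):

$ curl -I http://localhost:3000
# nginx should answer with an HTTP 200 response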

Let’s have a closer look at how this port forwarding works.

When you run docker run ..., Docker runs your container inside a virtual network created on your machine. Each container’s IP address in this virtual network is never exposed to the outside world; here, the outside world means all your default network interfaces (eth*, lo*). To expose a port from this virtual network on an existing network interface, we need to accept traffic from that interface, declare that this traffic should be forwarded to the Docker virtual network, and route it to the desired host inside the Docker network.

On Linux systems, Docker uses iptables to forward traffic from your network interfaces to hosts on the Docker network.

So, if you want to route all external traffic from port 3000 on your machine to port 80 on the Docker container running your nginx, Docker will add an iptables rule that describes that route.

iptables.png
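
As a rough illustration (the exact chains and rule layout depend on your Docker and iptables versions), you can inspect the NAT rules Docker created for the example above:

$ sudo iptables -t nat -L DOCKER -n
# Expect a DNAT rule roughly like the following, forwarding host port 3000
# to port 80 on the container's address inside the Docker network:
# DNAT  tcp  --  0.0.0.0/0  0.0.0.0/0  tcp dpt:3000 to:172.17.0.2:80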

Since Nomad orchestrates the jobs and should provide the ability to define the containers’ exposed ports, it has to control the networking between the host network interfaces and the Docker network.

To achieve that, Nomad uses the Container Network Interface (CNI) and bridge network mode. To let Nomad do that, we have to install the CNI plugins and enable the bridge netfilter kernel module and the net.bridge sysctl settings on our hosts:

---
- hosts: pi-homelab-01
  become: yes
  tasks:
    - name: 'Enable bridge network kernel flag'
      modprobe:
        name: br_netfilter
        state: present

    - name: 'CNI: create install directory'
      file:
        path: /opt/cni
        state: directory

    - name: 'CNI: create bin directory'
      file:
        path: /opt/cni/bin
        state: directory

    - name: 'CNI: download CNI'
      get_url:
        url: https://github.com/containernetworking/plugins/releases/download/v1.0.0/cni-plugins-linux-arm64-v1.0.0.tgz
        dest: /opt/cni

    - name: 'CNI: install CNI'
      unarchive:
        remote_src: yes
        src: /opt/cni/cni-plugins-linux-arm64-v1.0.0.tgz
        dest: /opt/cni/bin

    - name: 'CNI: sysctl enable net.bridge.bridge-nf-call-arptables'
      sysctl:
        name: net.bridge.bridge-nf-call-arptables
        value: 1
        state: present

    - name: 'CNI: sysctl enable net.bridge.bridge-nf-call-ip6tables'
      sysctl:
        name: net.bridge.bridge-nf-call-ip6tables
        value: 1
        state: present

    - name: 'CNI: sysctl enable net.bridge.bridge-nf-call-iptables'
      sysctl:
        name: net.bridge.bridge-nf-call-iptables
        value: 1
        state: present

Let’s save this file as ~/homelab/ansible/cni.yml.

Now, we can run this playbook by:

$ ansible-playbook -i hosts ./cni.yml

Note that you might have to reboot your nodes to apply the kernel module change.
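
After the reboot, you can confirm that the module and sysctl settings are in place (a quick check in the same ad-hoc style as before):

$ ansible -i hosts pi-homelab-01 -b -m command -a "sysctl net.bridge.bridge-nf-call-iptables"
# Every node should report: net.bridge.bridge-nf-call-iptables = 1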

Now, we are ready to extend our playbook and install and enable Nomad on our hosts:

---
- hosts: pi-homelab-01
  become: yes
  tasks:
    - name: 'Nomad: install gpg'
      apt_key:
        url: https://apt.releases.hashicorp.com/gpg
        state: present

    - name: 'Nomad: add Nomad repository'
      apt_repository:
          repo: deb https://apt.releases.hashicorp.com bookworm main
          state: present

    - name: 'Nomad: install Nomad'
      apt:
          name: nomad
          state: present

    - name: 'Nomad: make Nomad user sudoer'
      user:
        name: 'nomad'
        groups: sudo
        append: yes

    - name: 'Nomad: Systemd config'
      copy:
        src: ./files/nomad.service
        dest: /etc/systemd/system/nomad.service

    - name: 'Nomad: Nomad config'
      copy:
        src: ./files/nomad.hcl
        dest: /etc/nomad.d/nomad.hcl

    - name: 'Nomad: Enable and start Nomad'
      systemd:
        name: nomad
        enabled: yes
        state: started

Let’s save this file as ~/homelab/ansible/nomad.yml.

Now, we can run this playbook by:

$ ansible-playbook -i hosts ./nomad.yml
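
Once the playbook has run on all nodes, you can verify that the Nomad servers found each other and elected a leader (a sketch; the apt package puts the nomad binary on the PATH):

$ ansible -i hosts pi-homelab-01 -b -m command -a "nomad server members"
$ ansible -i hosts pi-homelab-01 -b -m command -a "nomad node status"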

Why do we use rootful Docker and rootful Nomad?

If this question came to you before you read this line - good for you; it means you are a backend developer with a proper feel for potential security breaches :)

At the time of writing, there was limited or no support for a rootless Nomad and rootless Docker setup. Here are the GitHub issues that describe what stops us from running the whole thing rootless:

https://github.com/hashicorp/nomad/issues/18307

https://github.com/hashicorp/nomad/issues/13669

But don’t be too bothered by that. Production-ready rootless Docker was implemented relatively recently, and containers are still well protected at the application level by the restricted Linux capabilities and cgroups applied to your Docker containers. It may make you feel uncomfortable, but most Docker (and other) services on the internet run as root on their hosts :)

Putting it all together

For convenience, let’s define a Makefile inside the ansible folder with targets that run our playbooks:

$ touch Makefile

.PHONY: docker cni consul nomad ALL

docker:
	ansible-playbook -i hosts ./docker.yml

cni:
	ansible-playbook -i hosts ./cni.yml

consul:
	ansible-playbook -i hosts ./consul.yml

nomad:
	ansible-playbook -i hosts ./nomad.yml

ALL: docker cni consul nomad
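
With this in place, the cluster can be provisioned step by step or in one go:

$ make docker
$ make cni
$ make consul
$ make nomad
# or everything at once:
$ make ALL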

The full source code of this article is available on GitHub

Consul service discovery on your machine

You may consider using Consul’s service discovery DNS from your local machine for additional convenience when finding nodes and services inside your cluster.

Every node running Consul exposes a DNS server on port 8600 that serves dynamic records with the IP addresses of your nodes and of all registered Nomad jobs.

Since we set the top-level domain for Consul service discovery to homelab inside ./ansible/files/consul.hcl, we can define a resolver configuration for all *.homelab domains.

To do this on macOS, create a new file at /etc/resolver/homelab with the following content (the port directive points the resolver at Consul’s DNS port 8600):

domain homelab
nameserver 192.168.0.101
nameserver 192.168.0.102
nameserver 192.168.0.103
nameserver 192.168.0.104
port 8600

This way, we tell our local machine to ask any of our freshly installed Consul nodes for DNS records of domains inside the .homelab zone.

To get the IP address of any node inside the cluster, just use this name pattern:

<node-hostname>.node.homelab

Later, when we deploy our first job to Nomad, we will be able to access the deployed service using the following name:

<nomad-job-name>.service.homelab
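
You can verify the resolver setup with dig by querying one of the Consul nodes directly; the built-in consul service is always registered, so it makes a handy test under the homelab domain configured above:

$ dig @192.168.0.101 -p 8600 consul.service.homelab
# The ANSWER section should list the IP addresses of your Consul servers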

Conclusion

Now, you can open the Nomad dashboard in your browser at this URL:

http://nomad.service.homelab:4646

This should open the Nomad dashboard, where you should see all 4 nodes connected to the cluster.

nomad-dashboard.jpg
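
If you prefer the terminal, the same information is available from the Nomad HTTP API (assuming the DNS resolver configuration above is in place):

$ curl http://nomad.service.homelab:4646/v1/nodes
# Returns a JSON array with one entry per connected client node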

In the next articles, we will cover the essentials of building the rest of the platform needed before deploying any workloads:

  1. Configure persistent volumes in Nomad using SMB shares.
  2. Install a local Docker registry and a pull-through Docker cache.
  3. Install a load balancer and expose port 80 outside the local network.
  4. Monitor Nomad nodes and workloads using Prometheus and Grafana.
  5. Failure recovery and fault tolerance.