Back in 2014, before Kubernetes even existed, CoreOS included a simple cluster scheduler called Fleet. With Fleet you could aggregate your individual machines into a pool of resources and deploy systemd unit files to them. You could choose to either run your units globally on all machines at the same time or limit them to a set of hosts. The idea behind it was to treat your machines as if they shared a single init system.

In 2018 Fleet was removed from CoreOS in favor of Kubernetes and has not been maintained since. Nevertheless, the idea of defining a systemd unit and deploying it to a set of machines still seems useful. So after some tinkering I came up with a way to do exactly that using Nomad and my systemd-nspawn driver.

I am going to show you how to deploy a simple systemd unit for Consul, running inside a vanilla Debian image, to your Nomad cluster.

The unit file

Using the template stanza inside a Nomad job file, we will render a systemd unit for Consul into the local task directory.

template {
  data = <<EOH
[Unit]
Description="HashiCorp Consul - A service mesh solution"
Documentation=https://www.consul.io/
Requires=network-online.target
After=network-online.target

[Service]
ExecStart=[[ env "NOMAD_TASK_DIR" ]]/consul/consul agent -dev -bind '{{ GetInterfaceIP "host0" }}' -client '{{ GetInterfaceIP "host0" }}'
ExecReload=/bin/kill --signal HUP $MAINPID
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOH
  destination = "local/systemd/consul.service"
  left_delimiter = "[["
  right_delimiter = "]]"
}

The above template renders a simple unit file, which runs Consul in dev mode, into the local/systemd directory inside the started task. Consul is instructed to bind its addresses to the host0 interface, which is available inside a systemd-nspawn container running with private networking enabled. Because the -bind and -client arguments themselves contain go-sockaddr templates wrapped in {{ and }}, we tell Nomad to use [[ and ]] to delimit its own templating commands instead of the usual {{ and }}.
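
To make the two templating layers concrete: Nomad resolves the [[ env "NOMAD_TASK_DIR" ]] part when rendering the file, while the {{ GetInterfaceIP "host0" }} parts are passed through untouched for Consul to evaluate at startup. Assuming the task directory ends up at /local inside the container (as the status output later in this post suggests), the rendered ExecStart line should look roughly like this:

ExecStart=/local/consul/consul agent -dev -bind '{{ GetInterfaceIP "host0" }}' -client '{{ GetInterfaceIP "host0" }}'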

To download the Consul binary, we make use of the artifact stanza.

artifact {
  source = "https://releases.hashicorp.com/consul/1.9.0/consul_1.9.0_linux_amd64.zip"
  destination = "local/consul"
}
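
Nomad fetches artifacts with go-getter, which unpacks zip archives automatically, so the binary ends up at local/consul/consul, exactly where the ExecStart line above expects it. If you want to pin the download, go-getter can also verify a checksum. A sketch, with a placeholder value standing in for the real SHA256 from the HashiCorp releases page:

artifact {
  source = "https://releases.hashicorp.com/consul/1.9.0/consul_1.9.0_linux_amd64.zip"
  destination = "local/consul"
  options {
    # Placeholder value: substitute the real SHA256 from the releases page
    checksum = "sha256:0000000000000000000000000000000000000000000000000000000000000000"
  }
}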

With the unit file rendered and the necessary binary downloaded, we need a way to enable the unit on startup. We also need to figure out how to make systemd load a custom unit file from the local task directory instead of /etc/systemd/system.

Enabling the unit

The usual way to enable a systemd unit is to run systemctl enable <unit name>. This creates a symbolic link inside /etc/systemd/system/multi-user.target.wants/ pointing to your unit file. Another way to enable a unit file without running a command is to create a drop-in file for the multi-user target. In this file you define a Wants= directive in the [Unit] section naming your unit. This ensures that your unit is pulled in when multi-user.target is started, just as systemctl enable would.
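
Put differently, since our unit declares WantedBy=multi-user.target in its [Install] section, enabling it by hand would roughly amount to the following (a sketch, assuming the unit file lives in /etc/systemd/system):

# What `systemctl enable consul.service` effectively does:
ln -s /etc/systemd/system/consul.service \
      /etc/systemd/system/multi-user.target.wants/consul.service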

Using another template stanza in our job file, we can use the second method to enable the unit file we rendered to the local task directory.

template {
  data = <<EOH
[Unit]
Wants=consul.service
EOH
  destination = "local/systemd/multi-user.target.d/wants.conf"
}

Loading systemd units from a different path

Systemd includes a nice little feature which allows you to specify additional paths from which unit files are loaded on startup. All you need to do is set the environment variable SYSTEMD_UNIT_PATH to the directory containing your files. If the value ends with a :, the usual load paths will be appended to the content of the variable. This works much like setting the PATH variable inside your shell.
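
As a quick sketch of the semantics, assuming your units live in /local/systemd:

# Prepend a custom directory to systemd's unit search path. The trailing ":"
# makes systemd append its usual load paths, much like PATH lookup in a shell.
export SYSTEMD_UNIT_PATH=/local/systemd:
# On newer systemd versions you can inspect the resulting search order with:
systemd-analyze unit-paths

Note that exporting the variable in a shell only affects processes started from that shell; for it to reach the container's PID 1, it has to be present at boot.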

To set this variable, we simply need to make sure our systemd-nspawn container is started in boot mode. Then all environment variables we pass to it will be available to systemd on startup. Since boot mode is the default behavior of the systemd-nspawn task driver, we only need to define the image we want to use and the mentioned environment variable.

config {
  image = "consul"
  image_download {
    url = "https://cloud.debian.org/images/cloud/buster/20201214-484/debian-10-generic-amd64-20201214-484.qcow2"
    force = true
    type = "raw"
  }
  environment = {
    SYSTEMD_UNIT_PATH = "${NOMAD_TASK_DIR}/systemd:"
  }
}

The complete job

With all of the above in place, the complete job file now looks like this. Note that the Wants drop-in below additionally pulls in systemd-networkd and systemd-resolved, so that the host0 interface gets configured and name resolution works inside the container.

job "consul" {
  datacenters = ["dc1"]
  type = "service"
  group "linux" {
    count = 1

    task "consul" {
      driver = "nspawn"
      config {
        image = "consul"
        image_download {
          url = "https://cloud.debian.org/images/cloud/buster/20201214-484/debian-10-generic-amd64-20201214-484.qcow2"
          force = true
          type = "raw"
        }
        environment = {
          SYSTEMD_UNIT_PATH = "${NOMAD_TASK_DIR}/systemd:"
        }
      }

      artifact {
        source = "https://releases.hashicorp.com/consul/1.9.0/consul_1.9.0_linux_amd64.zip"
        destination = "local/consul"
      }

      template {
        data = <<EOH
[Unit]
Description="HashiCorp Consul - A service mesh solution"
Documentation=https://www.consul.io/
Requires=network-online.target
After=network-online.target

[Service]
ExecStart=[[ env "NOMAD_TASK_DIR" ]]/consul/consul agent -dev -bind '{{ GetInterfaceIP "host0" }}' -client '{{ GetInterfaceIP "host0" }}'
ExecReload=/bin/kill --signal HUP $MAINPID
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOH
        destination = "local/systemd/consul.service"
        left_delimiter = "[["
        right_delimiter = "]]"
      }

      template {
        data = <<EOH
[Unit]
Wants=systemd-networkd.service systemd-resolved.service consul.service
EOH
        destination = "local/systemd/multi-user.target.d/wants.conf"
      }
    }
  }
}
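
Assuming you saved the job file as consul.nomad (the name is up to you), deploying it is a one-liner:

nomad job run consul.nomad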

If you deploy the job to your Nomad cluster and spawn a shell inside the task, you can see that the unit file defined in the job file is properly loaded on startup.

➜  ~ nomad exec -job consul /bin/bash
root@buster:/# systemctl status consul
● consul.service - "HashiCorp Consul - A service mesh solution"
   Loaded: loaded (/local/systemd/consul.service; disabled; vendor preset: enabled)
   Active: active (running) since Sun 2021-01-17 19:20:59 CET; 2min 15s ago
     Docs: https://www.consul.io/
 Main PID: 36 (consul)
   CGroup: /system.slice/consul.service
           └─36 /local/consul/consul agent -dev -bind {{ GetInterfaceIP "host0" }} -client {{ GetInterfaceIP "host0" }}

Jan 17 19:20:59 buster consul[36]:     2021-01-17T19:20:59.750+0100 [INFO]  agent.server: member joined, marking health alive: member=buster
Jan 17 19:20:59 buster consul[36]:     2021-01-17T19:20:59.791+0100 [INFO]  agent.server: federation state anti-entropy synced
Jan 17 19:20:59 buster consul[36]:     2021-01-17T19:20:59.826+0100 [DEBUG] agent: Skipping remote check since it is managed automatically: check=serfHealth
Jan 17 19:20:59 buster consul[36]:     2021-01-17T19:20:59.826+0100 [INFO]  agent: Synced node info
Jan 17 19:21:01 buster consul[36]:     2021-01-17T19:21:01.771+0100 [DEBUG] agent: Skipping remote check since it is managed automatically: check=serfHealth
Jan 17 19:21:01 buster consul[36]:     2021-01-17T19:21:01.771+0100 [DEBUG] agent: Node info in sync
Jan 17 19:22:53 buster consul[36]:     2021-01-17T19:22:53.841+0100 [DEBUG] agent: Skipping remote check since it is managed automatically: check=serfHealth
Jan 17 19:22:53 buster consul[36]:     2021-01-17T19:22:53.841+0100 [DEBUG] agent: Node info in sync
Jan 17 19:22:59 buster consul[36]:     2021-01-17T19:22:59.699+0100 [DEBUG] agent.router.manager: Rebalanced servers, new active server: number_of_servers=1 active_server="buster.dc1 (Addr: tcp/192.168.74.222:8300) (DC: dc1)"
Jan 17 19:22:59 buster consul[36]:     2021-01-17T19:22:59.700+0100 [DEBUG] agent.router.manager: Rebalanced servers, new active server: number_of_servers=1 active_server="buster (Addr: tcp/192.168.74.222:8300) (DC: dc1)"
root@buster:/# 
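
Since the downloaded binary is available at /local/consul/consul inside the container, you can also point it at the client address from the log above (192.168.74.222 here; yours will differ) to confirm the agent is answering:

root@buster:/# /local/consul/consul members -http-addr=http://192.168.74.222:8500

If everything is wired up correctly, this should list the single dev agent as alive.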

And that’s all there is to it :-). I hope you find this as useful as I do.

Jan