---
layout: docs
page_title: Configure singleton deployments
description: |-
  Declare a job that guarantees only a single instance can run at a time, with
  minimal downtime.
---

# Configure singleton deployments

A singleton deployment is one where there is at most one instance of a given
allocation running on the cluster at one time. You might need this if the
workload needs exclusive access to a remote resource like a data store. Nomad
does not support singleton deployments as a built-in feature. Your workloads
continue to run even when the Nomad client agent has crashed, so ensuring
there's at most one allocation for a given workload requires some cooperation
from the job. This document describes how to implement singleton deployments.

## Design goals

The configuration described here meets these primary design goals:

- The design prevents a specific process within a task from running if there
  is another instance of that task running anywhere else on the Nomad cluster.
- Nomad should be able to recover from failure of the task or the node on
  which the task is running with minimal downtime, where "recovery" means that
  Nomad stops the original task and schedules a replacement task.
- Nomad should minimize false positive detection of failures to avoid
  unnecessary downtime during the cutover.

There's a tradeoff between recovery speed and false positives. The faster you
make Nomad attempt to recover from failure, the more likely that a transient
failure causes Nomad to schedule a replacement and a subsequent downtime.

Note that it's not possible to design a perfectly zero-downtime singleton
allocation in a distributed system. This design errs on the side of
correctness: having zero or one allocations running rather than the incorrect
one or two allocations running.

## Overview

There are several options available for some details of the implementation, but
all of them include the following:

- You must have a distributed lock with a TTL that's refreshed from the
  allocation. The process that sets and refreshes the lock must have its
  lifecycle tied to the main task. It can be either in-process, in-task with
  supervision, or run as a sidecar. If the allocation cannot obtain the lock,
  then it must not start whatever process or operation you intend to be a
  singleton. After a configurable window without obtaining the lock, the
  allocation must fail.
- You must set the [`group.disconnect.stop_on_client_after`][] field. This
  forces a Nomad client that's disconnected from the server to stop the
  singleton allocation, which in turn releases the lock or allows its TTL to
  expire.

Tune the three timer values (the lock TTL, the time it takes the alloc to give
up, and the `stop_on_client_after` duration) to reduce the maximum amount of
downtime the application can have.

The Nomad [Locks API][] can support the operations needed. In pseudo-code,
these operations are the following:
- To acquire the lock, `PUT /v1/var/:path?lock-acquire`
  - On success: start a heartbeat every 1/2 TTL.
  - On conflict or failure: retry with backoff and timeout.
    - Once out of attempts, exit the process with an error code.
- To heartbeat, `PUT /v1/var/:path?lock-renew`
  - On success: continue.
  - On conflict: exit the process with an error code.
  - On failure: retry with backoff up to the TTL.
    - If the TTL expires, attempt to revoke the lock, then exit the process
      with an error code.

The allocation can safely use the Nomad [Task API][] socket to write to the
locks API, rather than communicating with the server directly. This reduces load
on the server and speeds up detection of failed client nodes because the
disconnected client cannot forward the Task API requests to the leader.
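As an illustration, here is a minimal Python sketch of the acquire-and-heartbeat loop described above. The `request` callable and the 200/409 status codes are assumptions standing in for a real HTTP client pointed at the server or Task API; this is not the Nomad client library.

```python
import time


def acquire_lock(request, path, attempts=5, backoff=1.0):
    """Try to acquire the lock, retrying with exponential backoff.

    `request` is an injected callable (url -> HTTP status code), so this
    sketch stays independent of whichever HTTP client you use. Returns
    True once the lock is held, or False when all attempts are exhausted.
    """
    for attempt in range(attempts):
        if request(f"/v1/var/{path}?lock-acquire") == 200:
            return True
        time.sleep(backoff * (2 ** attempt))  # back off before retrying
    return False


def heartbeat_lock(request, path, ttl, sleep=time.sleep):
    """Renew the lock every ttl/2 seconds until the lock is lost.

    A conflict response means another holder owns the lock, so return
    immediately; the caller must then exit with an error code. Transient
    failures are retried until the TTL elapses.
    """
    while True:
        deadline = time.monotonic() + ttl
        while True:
            status = request(f"/v1/var/{path}?lock-renew")
            if status == 200:
                break  # renewed successfully; sleep half the TTL
            if status == 409 or time.monotonic() >= deadline:
                return False  # lock lost: stop the singleton workload
            sleep(1)  # transient failure: retry until the TTL expires
        sleep(ttl / 2)
```

A process exit with a non-zero code on `False` gives Nomad the failure signal it needs to schedule a replacement.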

The [`nomad var lock`][] command implements this logic, so you can use it to
shim the process being locked.

### ACLs

Allocations cannot write to Nomad variables by default. You must configure a
[workload-associated ACL policy][] that allows write access in the
[`namespace.variables`][] block. For example, the following ACL policy allows
access to write a lock on the path `nomad/jobs/example/lock` in the `prod`
namespace:

```hcl
namespace "prod" {
  variables {
    path "nomad/jobs/example/lock" {
      capabilities = ["write", "read", "list"]
    }
  }
}
```

You set this policy on the job with `nomad acl policy apply -namespace prod -job
example example-lock ./policy.hcl`.

## Implementation

### Use `nomad var lock`

We recommend implementing the locking logic with `nomad var lock` as a shim in
your task. This example jobspec assumes there's a Nomad binary in the
container image.
```hcl
job "example" {
  group "group" {

    disconnect {
      stop_on_client_after = "1m"
    }

    task "primary" {
      driver = "docker"

      config {
        image   = "example/app:1"
        command = "nomad"
        args = [
          "var", "lock", "nomad/jobs/example/lock", # lock
          "busybox", "httpd",                       # application
          "-vv", "-f", "-p", "8001", "-h", "/local" # application args
        ]
      }

      identity {
        env = true # make NOMAD_TOKEN available to lock command
      }
    }
  }
}
```

If you don't want to ship a Nomad binary in the container image, make a
read-only mount of the binary from a host volume. This only works in cases
where the Nomad binary has been statically linked or you have glibc in the
container image.

<CodeBlockConfig lineNumbers highlight="8-12,31-34">

```hcl
job "example" {
  group "group" {

    disconnect {
      stop_on_client_after = "1m"
    }

    volume "binaries" {
      type      = "host"
      source    = "binaries"
      read_only = true
    }

    task "primary" {
      driver = "docker"

      config {
        image   = "example/app:1"
        command = "/opt/bin/nomad"
        args = [
          "var", "lock", "nomad/jobs/example/lock", # lock
          "busybox", "httpd",                       # application
          "-vv", "-f", "-p", "8001", "-h", "/local" # application args
        ]
      }

      identity {
        env = true # make NOMAD_TOKEN available to lock command
      }

      volume_mount {
        volume      = "binaries"
        destination = "/opt/bin"
      }
    }
  }
}
```

</CodeBlockConfig>

### Sidecar lock

If you cannot implement the lock logic in your application or with a shim such
as `nomad var lock`, you need to implement it such that the task you are
locking is running as a sidecar of the locking task, which has
[`task.leader=true`][] set.

<CodeBlockConfig lineNumbers highlight="9">

```hcl
job "example" {
  group "group" {

    disconnect {
      stop_on_client_after = "1m"
    }

    task "lock" {
      leader = true
      driver = "raw_exec"

      config {
        command  = "/opt/lock-script.sh"
        pid_mode = "host"
      }

      identity {
        env = true # make NOMAD_TOKEN available to lock command
      }
    }

    task "application" {
      lifecycle {
        hook    = "poststart"
        sidecar = true
      }

      driver = "docker"

      config {
        image = "example/app:1"
      }
    }
  }
}
```

</CodeBlockConfig>


The locking task has the following requirements:

- Must be in the same group as the task being locked.
- Must be able to terminate the task being locked without the Nomad client
  being up. For example, they share the same PID namespace, or the locking
  task is privileged.
- Must have a way of signalling the task being locked that it is safe to
  start. For example, the locking task can write a sentinel file into the
  `/alloc` directory, which the locked task tries to read on startup and
  blocks until it exists.
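The sentinel-file handshake in the last requirement can be sketched as follows. The `.lock-held` filename is an illustrative assumption, not a Nomad convention: the locking task writes it once the lock is held, and the locked task blocks until it appears.

```python
import os
import time


def signal_safe_to_start(alloc_dir="/alloc"):
    """Locking task side: once the lock is held, write the sentinel file
    into the shared alloc directory."""
    with open(os.path.join(alloc_dir, ".lock-held"), "w") as f:
        f.write("ok")


def wait_until_safe_to_start(alloc_dir="/alloc", poll=1.0, sleep=time.sleep):
    """Locked task side: block until the sentinel exists before starting
    the singleton process."""
    sentinel = os.path.join(alloc_dir, ".lock-held")
    while not os.path.exists(sentinel):
        sleep(poll)
```

Because every task in a group shares the same `/alloc` directory, no network coordination is needed between the two tasks.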

If you cannot meet the third requirement, then you need to split the lock
acquisition and lock heartbeat into separate tasks:

<CodeBlockConfig lineNumbers highlight="8-23,25-37">

```hcl
job "example" {
  group "group" {

    disconnect {
      stop_on_client_after = "1m"
    }

    task "acquire" {
      lifecycle {
        hook    = "prestart"
        sidecar = false
      }

      driver = "raw_exec"

      config {
        command = "/opt/lock-acquire-script.sh"
      }

      identity {
        env = true # make NOMAD_TOKEN available to lock command
      }
    }

    task "heartbeat" {
      leader = true
      driver = "raw_exec"

      config {
        command  = "/opt/lock-heartbeat-script.sh"
        pid_mode = "host"
      }

      identity {
        env = true # make NOMAD_TOKEN available to lock command
      }
    }

    task "application" {
      lifecycle {
        hook    = "poststart"
        sidecar = true
      }

      driver = "docker"

      config {
        image = "example/app:1"
      }
    }
  }
}
```

</CodeBlockConfig>

If you configured the primary task to [`restart`][], the task should be able
to restart within the lock TTL in order to minimize flapping on restart. This
improves availability but isn't required for correctness.


[`group.disconnect.stop_on_client_after`]: /nomad/docs/job-specification/disconnect#stop_on_client_after
[Locks API]: /nomad/api-docs/variables/locks
[Task API]: /nomad/api-docs/task-api
[`nomad var lock`]: /nomad/commands/var/lock
[workload-associated ACL policy]: /nomad/docs/concepts/workload-identity#workload-associated-acl-policies
[`namespace.variables`]: /nomad/docs/other-specifications/acl-policy#variables
[`task.leader=true`]: /nomad/docs/job-specification/task#leader
[`restart`]: /nomad/docs/job-specification/restart