Multi-node Cluster
A single ahdapa instance is sufficient for most deployments. Run multiple nodes when you need horizontal redundancy — any node can serve all OAuth2 and OIDC endpoints, and gossip keeps their state in sync automatically.
How the cluster works
Each node holds its full cluster state in memory and in its local database. Every
gossip.interval_secs seconds (default 5 s) a node contacts each peer listed under
gossip.peers. By default the node sends only the CRDT entries that changed since
the last successful exchange with that peer (a delta), rather than the full state.
On first contact, or after any error, it falls back to a full-state push. The peer
merges the received state and replies (also as a delta or full state) with its own.
One round-trip brings both nodes to the same state. See
Gossip Protocol for full protocol details.
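The choice between a delta and a full-state push can be sketched as follows. This is only an illustration of the behaviour described above, not ahdapa's implementation: the `Entry` and `PeerLink` types, the per-entry `generation` counter, and the `last_synced_generation` field are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entry:
    value: str
    generation: int  # hypothetical: local generation at which this entry last changed


@dataclass
class PeerLink:
    # None means "no successful exchange yet" (first contact, or the last one failed).
    last_synced_generation: Optional[int] = None


def build_push(state: Dict[str, Entry], link: PeerLink) -> Dict[str, Entry]:
    """Return the entries to send: the full state on first contact, else a delta."""
    if link.last_synced_generation is None:
        return dict(state)
    return {
        key: entry
        for key, entry in state.items()
        if entry.generation > link.last_synced_generation
    }


def record_result(link: PeerLink, ok: bool, current_generation: int) -> None:
    """Remember the sync point on success; after an error force a full push next time."""
    link.last_synced_generation = current_generation if ok else None


state = {"client:a": Entry("v1", 1), "client:b": Entry("v2", 3)}
link = PeerLink()

first = build_push(state, link)             # first contact: full state
record_result(link, ok=True, current_generation=3)

state["client:c"] = Entry("v3", 4)
second = build_push(state, link)            # steady state: only the new entry
record_result(link, ok=False, current_generation=4)

third = build_push(state, link)             # after an error: full state again
```

The reply from the peer would be built the same way from the peer's side, which is why a single round-trip is enough to converge both nodes.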
The cluster wrapping key is the AEAD key for session cookies and auth codes. All nodes must share the same wrapping key for session cookies to be interchangeable across nodes. The wrapping key itself is stored node-locally (not gossiped); only a UUID identifier is gossiped so that nodes can detect key rotation. Gossip messages are authenticated and encrypted using CMS (see Gossip encryption below).
Distributed modes
The [cluster] configuration section controls how tightly the cluster coordinates
token issuance. Four modes are available, ordered from least to most coordination:
JTI cache: Each auth code contains a unique JWT ID (JTI). When an auth code is exchanged for tokens, the JTI is recorded in an in-memory cache on the receiving node so that a second exchange attempt is rejected. This prevents a stolen auth code from being replayed — but only on the node that holds the cache entry.
| Mode | Auth code single-use | Cross-node auth code exchange | Session revocation | Quorum |
|---|---|---|---|---|
| off (default) | Node-local JTI cache (issuing node only) | Node-pinned (must be exchanged on the issuing node) | Node-local DB only | No |
| eventual | Node-local JTI cache (issuing node only) | Any node (~gossip-interval replay window) | CRDT-replicated (all nodes) | No |
| forwarding | Forward to origin node | Zero replay window via forwarding | CRDT-replicated (all nodes) | No |
| strict | Forward to origin node | Zero replay window via forwarding | CRDT-replicated (all nodes) | k-of-n peer pre-approval |
Modes are strictly ordered: each higher mode is a superset of the features below it.
The default (off) is correct for single-node deployments and benchmarks. For
high-availability clusters, eventual is the recommended starting point.
off — no cross-node coordination
Auth codes are single-use enforced by a node-local JTI cache. They can only be exchanged on the node that issued them. Session revocation is node-local only: a logout on node A does not propagate to node B. Suitable for single-node deployments.
eventual — CRDT session revocation
Session revocations (logout, back-channel logout) are written into the
revoked_sessions LwwMap in the CRDT and propagate to all peers via gossip within
one gossip interval (default 5 s). Any cluster node rejects cookies whose iat is
before the stored revoked_at for that subject.
Auth codes may be exchanged on any cluster node with a replay window of approximately one gossip interval. This is acceptable for most HA deployments where occasional replay is tolerable.
Propagation latency: revocation is not instantaneous. A logout issued on one node reaches all peers within one gossip interval (default 5 s). Use forwarding or strict mode if a shorter window is required for auth codes; session cookie revocation remains gossip-propagated in all modes.
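The revocation check can be illustrated with a simplified last-writer-wins map. The sketch below assumes the map is keyed by subject and stores only a `revoked_at` timestamp; the function names are hypothetical and the real revoked_sessions structure may carry more metadata.

```python
from typing import Dict


def merge_revocations(local: Dict[str, int], remote: Dict[str, int]) -> Dict[str, int]:
    """Last-writer-wins merge: for each subject keep the later revoked_at."""
    merged = dict(local)
    for subject, revoked_at in remote.items():
        if revoked_at > merged.get(subject, -1):
            merged[subject] = revoked_at
    return merged


def cookie_is_revoked(revoked: Dict[str, int], subject: str, iat: int) -> bool:
    """A cookie is rejected when it was issued before the stored revoked_at."""
    revoked_at = revoked.get(subject)
    return revoked_at is not None and iat < revoked_at


# Node A saw a logout for alice at t=1000; node B holds older and unrelated entries.
node_a = {"alice": 1000}
node_b = {"alice": 900, "bob": 500}
merged = merge_revocations(node_a, node_b)
```

Because the merge keeps the maximum per subject, it gives the same result regardless of which node's state is applied first, which is what lets gossip deliver updates in any order.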
forwarding — zero-window auth code exchange
Adds auth code forwarding on top of eventual. When a node receives an auth code
exchange request for a code it did not issue, it uses the internal endpoint
POST /api/internal/token/auth-code to forward the request to the origin node.
The origin node performs single-use enforcement, eliminating the gossip-interval
replay window.
Forwarding supports hub-spoke topologies via a 1-hop relay: if node A and node B
are not directly peered (e.g. in a hub-spoke topology), A sends the request to a
relay C that has both A and B as RecipientInfos in the CMS envelope. C forwards
the original encrypted body to B. The X-Ahdapa-Final-Dest header carries the
ultimate destination. A hop-count guard returns 508 Loop Detected when the hop
count reaches 2, preventing forwarding loops.
strict — quorum pre-approval
Adds k-of-n quorum pre-approval on top of forwarding. Before signing any access
token, the issuer broadcasts a VoteRequest to all live peers via
POST /api/internal/quorum/vote and waits for k approvals within
quorum_timeout_ms milliseconds. A minority partition cannot unilaterally issue
tokens.
The effective quorum size k is controlled by cluster.quorum_k:
- 0 (default): majority, that is floor(live_peer_count / 2) + 1
- positive integer: exact count required
When quorum cannot be reached and cluster.quorum_fallback = true, a warning is
logged and the token is issued anyway. When quorum_fallback = false (default),
the token request is rejected with an error.
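The quorum rules above reduce to a small amount of arithmetic. The following sketch uses hypothetical function names and treats `approvals` as the number of votes that arrived within quorum_timeout_ms; vote collection and timeout handling are left out.

```python
def effective_quorum(quorum_k: int, live_peer_count: int) -> int:
    """Resolve cluster.quorum_k: 0 means majority, a positive value is used as-is."""
    if quorum_k < 0:
        raise ValueError("quorum_k must be 0 or a positive integer")
    if quorum_k == 0:
        return live_peer_count // 2 + 1
    return quorum_k


def may_issue_token(
    approvals: int,
    quorum_k: int,
    live_peer_count: int,
    quorum_fallback: bool = False,
) -> bool:
    """Issue when quorum is met; otherwise only if quorum_fallback is enabled."""
    if approvals >= effective_quorum(quorum_k, live_peer_count):
        return True
    # Quorum not reached: with fallback the token is issued anyway (and a
    # warning would be logged); without it the request is rejected.
    return quorum_fallback
```

For example, with four live peers and the default quorum_k = 0, three approvals are needed; a node that collects only one approval issues the token only when quorum_fallback is true.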
Internal node-to-node endpoints
Both forwarding and strict modes use the following internal endpoints. These
endpoints return 404 Not Found when distributed_mode = off or eventual.
| Endpoint | Purpose |
|---|---|
| POST /api/internal/token/auth-code | Auth code exchange forwarded from another cluster node. Body is SignedData(EnvelopedData) (CMS-authenticated). |
| POST /api/internal/quorum/vote | Quorum vote request from the issuing node. Body is SignedData(EnvelopedData). Returns a VoteResponse JSON with approved: true/false. |
Like /api/gossip/sync, both endpoints authenticate the request body against the
sender’s pinned gossip signing key (ECDSA P-256). Unsigned requests, and requests
from nodes whose signing key is not yet registered in the CRDT, are rejected with
401 Unauthorized.
Node configuration
Each node needs its own config file. The sections that differ between nodes are
[gossip].peers (which lists the other nodes) and whatever address-specific settings
your deployment uses ([server].issuer, TLS paths, etc.).
For FreeIPA-integrated deployments, you can replace the static peer list with automatic topology-based discovery — see IPA topology discovery.
Minimum gossip stanza (static peers)
# node1.toml
[gossip]
peers = ["https://node2.example.com:8080", "https://node3.example.com:8080"]
interval_secs = 5
# node2.toml
[gossip]
peers = ["https://node1.example.com:8080", "https://node3.example.com:8080"]
interval_secs = 5
# node3.toml
[gossip]
peers = ["https://node1.example.com:8080", "https://node2.example.com:8080"]
interval_secs = 5
Set a stable identity for each node. The preferred method is the [server] node_id
configuration key; ahdapa also falls back to the HOSTNAME environment variable,
then to the system hostname, then to a random UUID (which changes on every restart
and should be avoided in clustered deployments):
# node1.toml
[server]
node_id = "node1.example.com"
Full-mesh vs hub-and-spoke
ahdapa gossip is full-mesh by default: every node lists every other node as a peer. Full-mesh uses N×(N−1) pushes per interval. In steady state each push carries only the delta of entries that changed since the last successful sync with that peer; a full-state push (the worst case for bandwidth) occurs only on first contact or after an error.
The figures in the table below are theoretical maximums (they assume every gossip round pushes a full state). In practice, two optimisations suppress traffic in converged clusters:
- Generation-skip: if the local CRDT has not changed since the last successful sync with a peer (CRDT_GENERATION unchanged), the push is skipped entirely.
- Delta exchange: when a push does occur, only entries newer than the peer’s last known generation are included.
A live token-issuing cluster observed 93% of rounds skipped (7 actual pushes / 108 attempts in a 35 s window). When pushes do occur, deltas are typically a small fraction of the full state. Pushes happen only when a client is created/deleted, a key is rotated, or a node joins/leaves. Steady-state bandwidth in a converged cluster is near zero.
Worst-case (full-state) bandwidth at the default 5-second interval with a 3-node baseline push size of ~7.5 KB (empirically observed; includes CMS envelope overhead):
| Nodes | Full-mesh | Hub-and-spoke |
|---|---|---|
| 3 | ~65 MB/h | ~43 MB/h |
| 5 | ~300 MB/h | ~120 MB/h |
| 7 | ~820 MB/h | ~235 MB/h |
Per-push size growth:
- Each additional cluster node: +~1,530 B (ML-KEM-768 public key dominates)
- Each registered OAuth2 client: +~150 B per push body (compact CBOR, 2-char field names)
- Each active session (refresh family): +~60 B. Expired families are purged every gossip round, so growth is bounded by max_refresh_token_age
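The worst-case table can be reproduced with a few lines of arithmetic. The sketch below rests on assumptions that the text does not state explicitly but that match the table to within rounding: every push is answered by a reply of the same size (hence the factor of 2), hub-and-spoke performs 2 × (N − 1) pushes per interval, and the push size grows only by the per-node figure above the 3-node baseline.

```python
INTERVALS_PER_HOUR = 3600 // 5   # default interval_secs = 5
BASE_PUSH_BYTES = 7_500          # ~7.5 KB observed for a 3-node cluster
PER_NODE_BYTES = 1_530           # growth per additional cluster node


def push_bytes(nodes: int) -> int:
    """Approximate full-state push size for a cluster of the given size."""
    return BASE_PUSH_BYTES + (nodes - 3) * PER_NODE_BYTES


def pushes_per_interval(nodes: int, topology: str) -> int:
    if topology == "full-mesh":
        return nodes * (nodes - 1)      # every node pushes to every other node
    if topology == "hub-and-spoke":
        return 2 * (nodes - 1)          # assumed: each hub-spoke link, both directions
    raise ValueError(f"unknown topology: {topology}")


def worst_case_mb_per_hour(nodes: int, topology: str) -> float:
    """Worst case: every round is a full-state push plus an equally sized reply."""
    total = pushes_per_interval(nodes, topology) * 2 * push_bytes(nodes) * INTERVALS_PER_HOUR
    return total / 1_000_000


for n in (3, 5, 7):
    print(n, worst_case_mb_per_hour(n, "full-mesh"), worst_case_mb_per_hour(n, "hub-and-spoke"))
```

Under these assumptions the full-mesh results come out at roughly 64.8, 304.1, and 823.7 MB/h for 3, 5, and 7 nodes, and the hub-and-spoke results at roughly 43.2, 121.7, and 235.4 MB/h.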
Recommendation: Use full-mesh for up to 5 nodes. Switch to hub-and-spoke for 7+ nodes, or earlier if you have thousands of active sessions.
Hub-and-spoke configuration example (node1 is the hub):
# hub: node1.toml — peers with all spokes
[gossip]
peers = ["https://node2.example.com:8080", "https://node3.example.com:8080",
"https://node4.example.com:8080", "https://node5.example.com:8080"]
# spoke: node2.toml — peers with hub only
[gossip]
peers = ["https://node1.example.com:8080"]
With hub-and-spoke, state propagates from spoke to spoke in 2 gossip rounds (spoke →
hub → spoke) rather than 1, adding at most 2 × interval_secs of additional latency.
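For larger clusters it can be convenient to generate the per-node peer lists rather than write them by hand. The helper below is a hypothetical sketch (it is not part of ahdapa or ahdapactl) that produces the peers list for each node in either topology and renders a minimal [gossip] stanza.

```python
from typing import Dict, List


def full_mesh(urls: List[str]) -> Dict[str, List[str]]:
    """Every node peers with every other node."""
    return {url: [peer for peer in urls if peer != url] for url in urls}


def hub_and_spoke(hub: str, spokes: List[str]) -> Dict[str, List[str]]:
    """The hub peers with all spokes; each spoke peers with the hub only."""
    peers = {hub: list(spokes)}
    for spoke in spokes:
        peers[spoke] = [hub]
    return peers


def gossip_stanza(peers: List[str]) -> str:
    """Render a minimal [gossip] section for one node's config file."""
    quoted = ", ".join(f'"{peer}"' for peer in peers)
    return f"[gossip]\npeers = [{quoted}]\n"


nodes = [f"https://node{i}.example.com:8080" for i in range(1, 6)]
mesh = full_mesh(nodes)
star = hub_and_spoke(nodes[0], nodes[1:])
print(gossip_stanza(star[nodes[1]]))
```

Counting the entries in each result gives N × (N − 1) peer links for full-mesh and 2 × (N − 1) for hub-and-spoke, which is where the bandwidth difference between the two topologies comes from.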
Provisioning a fresh cluster
IPA-integrated deployments: If you are deploying ahdapa on a FreeIPA cluster, the Ansible playbooks in contrib/demo/ipa/ansible/ handle node provisioning, wrapping-key synchronisation, and IPA privilege grants automatically. Run ansible-playbook -i inventory.ini contrib/demo/ipa/ansible/site.yml and skip the manual steps below. See FreeIPA Co-deployment for details.
Using the admin CLI: The ahdapactl cluster bootstrap command automates steps 2–5 below. Run it with the list of node URLs and it handles login, key generation, key distribution, and re-authentication in a single invocation:
ahdapactl cluster bootstrap \
  https://node1.example.com:8080 \
  https://node2.example.com:8080 \
  https://node3.example.com:8080
After the command completes, verify gossip convergence with step 6 below. See Admin CLI for the full ahdapactl reference.
When nodes start with empty databases each generates its own random 32-byte wrapping key. Nodes with different wrapping keys cannot exchange session cookies. The bootstrap procedure aligns all nodes to a shared key without copying database files or restarting anything.
Step 1 — start all nodes
Start every node before attempting key synchronisation. Wait until each node’s health endpoint responds:
curl -sf https://node1.example.com:8080/api/auth/info >/dev/null && echo ready
Step 2 — generate a shared wrapping key
Generate 32 cryptographically random bytes and base64url-encode them (no padding).
python3 is available on most Linux systems, and the snippet needs only the standard library:
CLUSTER_KEY=$(python3 -c "
import secrets, base64
print(base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b'=').decode())
")
Keep this value in a shell variable for the remainder of the provisioning session. Do not write it to disk or log it — it is the root secret for the cluster.
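Before pushing the key anywhere, it can be worth confirming that it has the expected shape. The helper below is a hypothetical client-side check and only tests what this page specifies (32 bytes, base64url, no padding); the server's own validation rules may differ.

```python
import base64
import re
import secrets

# 32 bytes encode to exactly 43 base64url characters once padding is removed.
_KEY_PATTERN = re.compile(r"[A-Za-z0-9_-]{43}")


def generate_cluster_key() -> str:
    """Same recipe as the shell snippet: 32 random bytes, base64url, no padding."""
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()


def is_valid_cluster_key(key: str) -> bool:
    """Check alphabet, length, and that the value decodes to 32 bytes."""
    if not _KEY_PATTERN.fullmatch(key):
        return False
    try:
        raw = base64.urlsafe_b64decode(key + "=")  # restore the stripped padding
    except ValueError:
        return False
    return len(raw) == 32


key = generate_cluster_key()
```

The check runs entirely in memory, so it does not conflict with the advice not to write the key to disk or log it.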
Step 3 — log in to every node before rotating any key
Session cookies are AEAD-sealed with each node’s current wrapping key. Rotating the key immediately invalidates all cookies issued under the previous key. Log in to all nodes before pushing the new key to any of them:
# Obtain one session cookie per node.
curl -c /tmp/n1.cookie -sf -X POST https://node1.example.com:8080/api/auth/login \
-H 'Content-Type: application/json' \
-d '{"username":"admin","password":"..."}'
curl -c /tmp/n2.cookie -sf -X POST https://node2.example.com:8080/api/auth/login \
-H 'Content-Type: application/json' \
-d '{"username":"admin","password":"..."}'
curl -c /tmp/n3.cookie -sf -X POST https://node3.example.com:8080/api/auth/login \
-H 'Content-Type: application/json' \
-d '{"username":"admin","password":"..."}'
Step 4 — push the shared key to every node
Use each node’s own pre-rotation cookie. The PUT /api/admin/keys/cluster endpoint
requires keys:rotate permission and takes effect immediately — no restart needed:
for i in 1 2 3; do
curl -sf -b "/tmp/n${i}.cookie" \
-X PUT "https://node${i}.example.com:8080/api/admin/keys/cluster" \
-H 'Content-Type: application/json' \
-d "{\"key\":\"$CLUSTER_KEY\"}"
echo "node${i}: key set"
done
Step 5 — re-authenticate to obtain a cross-node cookie
After the key rotation, log in to any one node to get a fresh cookie sealed under the shared key. This cookie is accepted by every node in the cluster:
curl -c /tmp/admin.cookie -sf \
-X POST https://node1.example.com:8080/api/auth/login \
-H 'Content-Type: application/json' \
-d '{"username":"admin","password":"..."}'
Clean up the per-node bootstrap cookies:
rm /tmp/n1.cookie /tmp/n2.cookie /tmp/n3.cookie
Step 6 — verify gossip convergence
You can inspect the gossip health of any node at any time — no authentication required:
curl -sf https://node1.example.com:8080/api/gossip/stats | python3 -m json.tool
The response shows live CRDT counts, the number of successfully completed gossip rounds,
the last sync time per peer, and cumulative error counters (persist_errors,
wrapping_key_pull_errors). This endpoint is also displayed in the admin web UI on the
Cluster Nodes page alongside the list of registered nodes.
Register an OAuth2 client on one node and poll the others until it appears:
# Create on node1.
CLIENT_ID=$(curl -sf -b /tmp/admin.cookie \
-X POST https://node1.example.com:8080/api/admin/clients \
-H 'Content-Type: application/json' \
-d '{"client_name":"test","redirect_uris":[],"scopes":["openid"]}' \
| python3 -c "import sys,json; print(json.load(sys.stdin)['client_id'])")
# Poll node2 until it appears (≤ 2 × interval_secs).
until curl -sf -b /tmp/admin.cookie \
https://node2.example.com:8080/api/admin/clients \
| python3 -c "import sys,json; [exit(0) for c in json.load(sys.stdin) if c['client_id']=='$CLIENT_ID']; exit(1)" \
2>/dev/null; do
sleep 1; echo "waiting..."
done
echo "converged on node2"
Delete the test client when done:
curl -sf -b /tmp/admin.cookie \
-X DELETE "https://node1.example.com:8080/api/admin/clients/$CLIENT_ID"
Adding a node to an existing cluster
- Start the new node with an empty database. Let it fully boot.
- Log in to the new node with its freshly generated key (step 3 pattern above).
- Push the cluster’s existing shared wrapping key to the new node (step 4 pattern).
- Make the new node reachable by existing nodes:
  - Static peer list: add the new node’s address to the gossip.peers list on every existing node and restart each process. ahdapa does not reload configuration at runtime; a restart is required to pick up the updated peer list.
  - IPA topology (ipa_topology = true): no configuration change is needed. Once the new IPA replica is added to the replication topology, ahdapa automatically discovers it at the next topology refresh (default: within 5 minutes) and begins gossiping with it.
- The new node will receive the full cluster state on the first gossip round and converge within interval_secs seconds.
IPA topology discovery
When ahdapa runs on a FreeIPA replica, you can enable automatic peer discovery
instead of maintaining a static peers list:
[gossip]
ipa_topology = true
ipa_topology_interval_secs = 300 # re-query every 5 minutes (default)
interval_secs = 5
With ipa_topology = true:
- ahdapa queries cn=topology,cn=ipa,cn=etc,<suffix> at startup (before the first gossip round) and every ipa_topology_interval_secs seconds thereafter.
- Peer URLs are constructed as https://<hostname><issuer-path>, for example https://ipa2.example.com/idp when the issuer is https://ipa1.example.com/idp.
- All discovered replica hostnames are automatically added to the gossip admission allowlist (allowed_node_ids), so no manual allowlist configuration is needed.
- After each successful topology fetch, if gssapi.initiator_principal is set, ahdapa calls POST /api/gossip/register-kem on each newly discovered peer that does not yet have this node’s ML-KEM-768 key. The request is authenticated with a Kerberos AP-REQ for HTTP@<peer_host> using the local machine credential, and includes both the ML-KEM-768 public key and the ECDSA P-256 gossip signing key. This pre-seeds both keys before the first gossip push so that the signing key is pinned and a rogue node cannot claim the allowlisted identity. The gossip layer rejects messages from senders with no pinned signing key, so seeding must complete before gossip can proceed.
- If a topology peer returns HTTP 404 (replica exists in IPA but does not yet run ahdapa), the error is logged at debug level and gossip continues with the next peer.
- Gossip is not disabled when peers is empty, provided ipa_topology = true.
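The peer URL rule from the list above can be sketched with the standard library. The function names are hypothetical, and skipping the local replica by comparing against the issuer's hostname is an assumption made for the example rather than documented behaviour.

```python
from typing import List
from urllib.parse import urlsplit


def peer_url(issuer: str, hostname: str) -> str:
    """Build https://<hostname><issuer-path> from the local issuer URL."""
    issuer_path = urlsplit(issuer).path
    return f"https://{hostname}{issuer_path}"


def peer_urls(issuer: str, replica_hostnames: List[str]) -> List[str]:
    """Map discovered replica hostnames to peer URLs, skipping the local host."""
    local_host = urlsplit(issuer).hostname  # assumption: issuer host is this node
    return [peer_url(issuer, host) for host in replica_hostnames if host != local_host]


peers = peer_urls(
    "https://ipa1.example.com/idp",
    ["ipa1.example.com", "ipa2.example.com", "ipa3.example.com"],
)
```

With the issuer https://ipa1.example.com/idp, the replica ipa2.example.com maps to https://ipa2.example.com/idp, matching the example in the text.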
Prerequisite — IPA permission grants
The HTTP service principal must be granted two IPA privileges, which together carry three permissions:
| Privilege | Permissions | Purpose |
|---|---|---|
| Ahdapa Topology Read | System: Read Topology Segments | Peer discovery via replication topology |
| Ahdapa IdP Read | Ahdapa - Read user IdP attributes (custom) | Read ipauserauthtype, ipaidpconfiglink, ipaidpsub per user; needed to enforce ipauserauthtype=idp and resolve federated users |
| Ahdapa IdP Read | System: Read External IdP server (built-in) | Read ipaIdP entries for automatic IdP discovery |
These lookups use the service principal credential directly (no S4U2Self impersonation) because the target attributes are not visible to users reading their own entries.
With Ansible, playbooks/ipa_permissions.yml (part of site.yml) handles all of
this in one idempotent run — using ipapermission, ipaprivilege, iparole, and
delegation modules from ansible-freeipa — for every principal in the ipa_nodes
and ahdapa_standalone inventory groups:
ansible-playbook -i inventory.ini contrib/demo/ipa/ansible/playbooks/ipa_permissions.yml
For manual setups, run once as an IPA admin:
# Custom permission — read user IdP attributes
ipa permission-add "Ahdapa - Read user IdP attributes" \
--right=read --right=search --right=compare \
--attrs=ipauserauthtype --attrs=ipaidpconfiglink --attrs=ipaidpsub \
--type=user
# Topology read privilege
ipa privilege-add "Ahdapa Topology Read" \
--desc="Allows ahdapa to read IPA replication topology for peer discovery"
ipa privilege-add-permission "Ahdapa Topology Read" \
--permissions="System: Read Topology Segments"
# IdP read privilege
ipa privilege-add "Ahdapa IdP Read" \
--desc="Allows ahdapa to read IPA IdP configurations and user IdP bindings"
ipa privilege-add-permission "Ahdapa IdP Read" \
--permissions="Ahdapa - Read user IdP attributes,System: Read External IdP server"
# Role — assign both privileges, add each HTTP service principal
ipa role-add "Ahdapa Services" \
--desc="Role for ahdapa service accounts"
ipa role-add-privilege "Ahdapa Services" \
--privileges="Ahdapa Topology Read,Ahdapa IdP Read"
ipa role-add-member "Ahdapa Services" \
--services="HTTP/ipa.example.com@EXAMPLE.COM"
Replace HTTP/ipa.example.com@EXAMPLE.COM with the HTTP service principal of each
IPA server that runs ahdapa.
389-ds indexes for federated user lookups
ahdapa resolves federated users with an LDAP filter that matches both
ipaIdpConfigLink and ipaIdpSub. Without equality indexes on those attributes
the filter falls back to a full table scan (visible as notes=U in the 389-ds
access log) and adds hundreds of milliseconds to every federated login. Run once
per Directory Server instance:
INSTANCE="slapd-EXAMPLE-COM" # realm, dots replaced with hyphens
dsconf $INSTANCE backend index add --attr ipaIdpConfigLink --index-type eq userRoot
dsconf $INSTANCE backend index add --attr ipaIdpSub --index-type eq userRoot
dsconf $INSTANCE backend index reindex \
--attr ipaIdpConfigLink --attr ipaIdpSub --wait userRoot
The Ansible playbook runs these steps automatically when indexes are missing.
Standalone (non-replica) nodes
A standalone node is an IPA-enrolled host that runs ahdapa but is not an IPA replica (no local 389-ds or KDC). Its HTTP service principal must be granted Kerberos constrained delegation (S4U2Proxy) so it can forward impersonated user credentials to the IPA LDAP and HTTP services when performing S4U2Self.
Run once as an IPA admin after enrolling the standalone host:
# Enable S4U2Self on the standalone HTTP principal
ipa service-mod HTTP/ahdapa.example.com@EXAMPLE.COM \
--ok-to-auth-as-delegate=True
# Delegation target — list each IPA replica's HTTP and LDAP principals
ipa servicedelegationtarget-add ahdapa-delegation-targets
ipa servicedelegationtarget-add-member ahdapa-delegation-targets \
--principals=HTTP/ipa.example.com@EXAMPLE.COM
ipa servicedelegationtarget-add-member ahdapa-delegation-targets \
--principals=ldap/ipa.example.com@EXAMPLE.COM
# Repeat the two add-member lines for each additional IPA replica.
# Delegation rule — grant the standalone HTTP principal access to the target
ipa servicedelegationrule-add ahdapa-delegation
ipa servicedelegationrule-add-member ahdapa-delegation \
--principals=HTTP/ahdapa.example.com@EXAMPLE.COM
ipa servicedelegationrule-add-target ahdapa-delegation \
--servicedelegationtargets=ahdapa-delegation-targets
With Ansible, add the standalone node to the [ahdapa_standalone] inventory
group. ipa_permissions.yml applies the delegation automatically for every host
in that group. No ahdapa.toml changes are needed for standalone nodes beyond
the standard IPA stanzas; gssproxy is configured with
allow_constrained_delegation = true regardless of node type.
Gossip encryption
Each ahdapa node generates three cryptographic key pairs on first start:
- An ML-KEM-768 encryption key pair (post-quantum, FIPS 203). Peer nodes use the public key to encrypt gossip messages so that only this node can read them.
- An ECDSA P-256 signing key pair for gossip authentication. This node uses the private key to sign every gossip message it sends so that peers can verify the message came from this node and has not been tampered with.
- A JWT signing key pair using the algorithm configured by [server] jwt_signing_algorithm (default: ES256 / ECDSA P-256). This node uses the private key to sign JWT access tokens and ID tokens it issues. The private key never leaves the node; only the public key is distributed to peers so they can validate tokens this node issued. If the configured algorithm differs from the algorithm of the stored key, a new key is generated automatically on startup.
The ML-KEM-768 and ECDSA P-256 public keys are automatically distributed to peer nodes
through the CRDT gossip mechanism. The JWT signing public key for token validation is
likewise gossiped as part of SigningKeyEntry. Once all nodes have exchanged public keys,
every gossip message is both encrypted (only the recipient can read it) and signed (the
recipient can verify who sent it).
Cross-node token validation: Because every node’s JWT signing public key is available
in the CRDT on all peers, any cluster node can cryptographically verify a token issued by
any other node. The iss claim check in verify_bearer_jwt (used by /userinfo and
/introspect) accepts not only the local [server] issuer but also any issuer URL
listed in gossip.peers and any URL discovered via ipa_topology. This means
introspecting a token issued by node A on node B succeeds without any session forwarding —
the gossiped signing key provides the cryptographic guarantee, and the iss check
confirms the token came from a known cluster member.
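The issuer check described above amounts to a set-membership test. The sketch below uses hypothetical function names, compares issuer URLs as exact strings (an assumption), and leaves out the signature verification that must accompany it.

```python
from typing import Iterable, Set


def trusted_issuers(
    local_issuer: str,
    gossip_peers: Iterable[str],
    topology_urls: Iterable[str],
) -> Set[str]:
    """Local issuer plus every statically configured or discovered peer URL."""
    return {local_issuer, *gossip_peers, *topology_urls}


def iss_is_accepted(iss: str, trusted: Set[str]) -> bool:
    """Accept the iss claim only when it names a known cluster member."""
    return iss in trusted


trusted = trusted_issuers(
    "https://node1.example.com:8080",
    ["https://node2.example.com:8080"],
    [],
)
```

A token whose iss is https://node2.example.com:8080 passes this check on node1, while a token from an issuer outside the cluster fails it even if its signature were otherwise valid.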
Key generation is automatic; all private keys are stored in the node’s local database
(node_keys table) and are never transmitted over the network. The cluster wrapping
key — used to protect session cookies — is CMS-encrypted with the node’s own ML-KEM-768
public key and stored locally; when a new node needs it, it pulls the key on demand
from an existing peer via GET /api/gossip/wrapping-key. The response is a full
SignedData(EnvelopedData) blob: the serving node signs the response with its gossip
ECDSA P-256 key and encrypts it to the requester’s ML-KEM-768 key, so both the sender’s
identity and the payload’s integrity are verified before the wrapping key is accepted.
Controlling which nodes can join
To prevent unauthorized nodes from self-registering, set gossip.allowed_node_ids to
the explicit list of node identifiers that are permitted to participate:
[gossip]
peers = ["https://node2.example.com:8080"]
allowed_node_ids = ["node1.example.com", "node2.example.com"]
The node_id is the value of [server] node_id when set; otherwise it falls back to
the HOSTNAME environment variable, then to the system hostname, at startup. Gossip
messages that attempt to register a node_id not in the combined allowlist are
silently dropped before merge.
When allowed_node_ids is empty and ipa_topology is false, the combined allowlist
is empty and no node can self-register (fail-closed). Enable ipa_topology so that
hostnames are discovered automatically, or list allowed_node_ids explicitly.
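The fail-closed behaviour follows directly from how the combined allowlist is built. A minimal sketch, with hypothetical function names:

```python
from typing import Iterable, Set


def combined_allowlist(
    allowed_node_ids: Iterable[str],
    topology_hosts: Iterable[str],
    ipa_topology: bool,
) -> Set[str]:
    """Union of the static allowlist and, when enabled, discovered IPA replicas."""
    combined = set(allowed_node_ids)
    if ipa_topology:
        combined |= set(topology_hosts)
    return combined


def may_register(node_id: str, allowlist: Set[str]) -> bool:
    """Membership test only: an empty allowlist therefore admits nobody."""
    return node_id in allowlist


static_only = combined_allowlist(["node1.example.com", "node2.example.com"], [], False)
nothing_configured = combined_allowlist([], ["ipa2.example.com"], False)
topology_only = combined_allowlist([], ["ipa2.example.com"], True)
```

There is no "allow all when empty" branch, so leaving both settings unset cannot open the cluster by accident.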
Bootstrapping a new cluster with CMS gossip
Important: The first gossip exchange between two nodes requires each node to already know the other’s ML-KEM-768 public key. On a completely fresh cluster, neither node knows the other’s key, so the gossip loop skips peers it has no KEM key for and waits.
IPA-integrated deployments (recommended): When ipa_topology = true and
gssapi.initiator_principal is set, the topology refresh task automatically seeds each
peer’s ML-KEM-768 key and ECDSA P-256 gossip signing key via POST /api/gossip/register-kem
using the node’s Kerberos machine credential (HTTP/<hostname>@<REALM>). This happens
before the first gossip round, so all IPA-enrolled nodes have pinned signing keys and
known KEM keys without any manual intervention. No database copying or manual key seeding
is needed.
Static-peer deployments: Use one of the following procedures:
- Start Node A. Its public keys are registered in its own CRDT.
- Copy Node A’s database to Node B before starting Node B (or start Node A first and let it run a few seconds, then start Node B). Node B will contain A’s NodeEntry and vice versa after the first successful merge.
- Alternatively, fetch each node’s KEM key via GET /api/gossip/kem-info and seed it on all peers via POST /api/admin/nodes/seed (requires keys:rotate permission):
  # Fetch node A's KEM key.
  KEM_KEY=$(curl -sf https://nodeA.example.com:8080/api/gossip/kem-info \
    | python3 -c "import sys,json; print(json.load(sys.stdin)['kem_public_key_der'])")
  # Seed it into node B.
  curl -sf -b /tmp/nb.cookie \
    -X POST https://nodeB.example.com:8080/api/admin/nodes/seed \
    -H 'Content-Type: application/json' \
    -d "{\"node_id\":\"nodeA\",\"kem_public_key_der\":\"$KEM_KEY\"}"
  Or with ahdapactl:
  ahdapactl cluster nodes seed --node-id nodeA --kem-key <base64url>
Once the first successful gossip exchange occurs, nodes learn each other’s KEM keys and all subsequent exchanges are encrypted automatically. The cluster wrapping key is re-sealed for the updated recipient set within one gossip cycle.
Joining a new node
IPA-integrated deployments: No manual key seeding is required. Once the new IPA
replica appears in the replication topology, the topology refresh task on each existing
node automatically discovers it, calls POST /api/gossip/register-kem to seed the new
node’s ML-KEM-768 key and gossip signing key (authenticated with the existing node’s
Kerberos machine credential), and begins gossiping with it. Similarly, the new node
seeds its keys on all existing peers after its first topology refresh. The new node
pulls the cluster wrapping key on the first successful gossip exchange.
Static-peer deployments:
- Start the new node. Its ML-KEM-768 key pair is generated automatically.
- Add it to allowed_node_ids on all existing nodes (if configured).
- Add the new node’s address to gossip.peers on all existing nodes and restart them (ahdapa does not reload configuration at runtime).
- Within two gossip cycles (default 10 s), existing nodes learn the new node’s KEM key.
The new node detects the cluster’s wrapping_key_id via gossip and pulls the actual wrapping key from a peer via GET /api/gossip/wrapping-key. It can then unseal session cookies from other nodes once the pull completes.
Automatic KEM re-registration after peer restart
When a peer restarts and loses its CRDT state (e.g. after a database wipe or a fresh
reinstall), its KEM key and gossip signing key are gone. Subsequent gossip pushes from
existing nodes are rejected with 401 Unauthorized because the peer no longer recognises
their signing keys.
The gossip loop detects this automatically: a 401 response from a peer triggers an
immediate call to POST /api/gossip/register-kem on that peer, seeding this node’s
ML-KEM-768 public key and ECDSA P-256 gossip signing key. Re-registration is throttled
to at most once per 60 seconds per peer so that a persistently-rejecting peer does not
cause excessive Kerberos ticket acquisition. Once re-registration succeeds, gossip
resumes automatically on the next round.
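The per-peer throttle can be pictured as a small table of last-attempt timestamps. The class below is a hypothetical sketch of that idea, with the clock passed in explicitly so the behaviour is easy to follow; it does not perform the register-kem call itself.

```python
from typing import Dict


class ReRegistrationThrottle:
    """Allow at most one re-registration attempt per peer per interval."""

    def __init__(self, min_interval_secs: float = 60.0) -> None:
        self._min_interval = min_interval_secs
        self._last_attempt: Dict[str, float] = {}

    def should_register(self, peer: str, now: float) -> bool:
        """Return True (and record the attempt) if this peer is not throttled."""
        last = self._last_attempt.get(peer)
        if last is not None and now - last < self._min_interval:
            return False
        self._last_attempt[peer] = now
        return True


throttle = ReRegistrationThrottle()
first = throttle.should_register("https://node2.example.com:8080", now=0.0)
too_soon = throttle.should_register("https://node2.example.com:8080", now=30.0)
other_peer = throttle.should_register("https://node3.example.com:8080", now=30.0)
later = throttle.should_register("https://node2.example.com:8080", now=60.0)
```

In a running process the `now` argument would typically come from a monotonic clock such as time.monotonic(), so that wall-clock adjustments do not shorten or lengthen the window.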
This recovery is fully automatic in IPA-integrated deployments where
gssapi.initiator_principal is set. In static-peer deployments without a machine
credential, the 401 is logged as a warning and the gossip push to that peer is skipped
until the peer is manually re-seeded (see Bootstrapping a new cluster).
Removing a node
- Remove the node from gossip.peers on all remaining nodes.
- Remove it from allowed_node_ids if configured.
- Rotate the cluster wrapping key via PUT /api/admin/keys/cluster (or ahdapactl cluster key rotate). The removed node’s KEM key is no longer included in future re-seals, so it loses access to the wrapping key after rotation.
The removed node’s CRDT entry (signing key, KEM key, etc.) remains visible in
GET /api/admin/nodes until it ages out past gossip.tombstone_ttl_secs. This is
harmless: with the node absent from every peer list and the wrapping key rotated, it
cannot participate in or decrypt cluster traffic.
Security considerations
Gossip and internal endpoint access control
POST /api/gossip/sync, GET /api/gossip/wrapping-key, and the internal forwarding
endpoints (/api/internal/token/auth-code, /api/internal/quorum/vote) are served
on the same port as the public OAuth2 endpoints — port-level firewall rules cannot
isolate them without also blocking public OAuth2 clients. Access is enforced at the
application layer by two complementary mechanisms:
Cryptographic authentication (primary)
Every gossip sync payload is a CMS SignedData(EnvelopedData) structure. The
receiver verifies the ECDSA P-256 outer signature against the sender’s pinned
gossip signing key. A sender whose key is not registered in the CRDT is rejected
with 401 Unauthorized before any payload is read or applied. The ML-KEM-768
inner encryption additionally ensures the payload is opaque to any party that does
not hold the addressed node’s private key.
Node admission allowlist (secondary)
The allowed_node_ids list and IPA topology discovery together control which
node_ids may exchange CRDT state. Gossip messages from unlisted node_ids are
dropped at the admission filter after signature verification. The combined allowlist
fails closed: an empty union admits nobody.
Proxy-layer path restriction (optional)
If a reverse proxy fronts the cluster, its <Location> (Apache) or location (nginx)
block can additionally restrict gossip paths to the cluster subnet as a
defence-in-depth measure; see Reverse Proxy Setup for Apache and nginx examples.
This is optional: the cryptographic and allowlist controls above already prevent
any useful action by an unauthenticated caller.
Protect the wrapping key
The cluster wrapping key is the root of trust for session cookies. Each node stores its
own CMS-encrypted copy locally (node_keys.wrapping_key_cms_der); the 32-byte key
never appears in gossip payloads.
- If a node is decommissioned, delete it from the cluster. Future key rotations will not serve the wrapping key to its (now-absent) KEM public key, so it loses access.
- If you need to manually rotate the wrapping key, follow the provisioning procedure (steps 3–5): log in to all nodes first, push the new key, then re-authenticate. Peers detect the UUID change via gossip and pull the new key automatically.
TLS for inter-node traffic
Configure TLS on all nodes ([tls] section) and use https:// peer URLs in
gossip.peers. TLS encrypts the transport; the CMS envelope encrypts the application
payload. Both layers together prevent passive monitoring and active interception. See
Configuration Reference.
Running the demo locally
contrib/demo/cluster/run.sh implements all of the above steps against three local
nodes on ports 8080–8082 and verifies convergence end-to-end. It is also used as a
CI integration test:
contrib/demo/cluster/run.sh # non-interactive: exits 0 on pass, 1 on fail
contrib/demo/cluster/run.sh --interactive # keeps nodes running until Ctrl-C