Multi-node Cluster

A single ahdapa instance is sufficient for most deployments. Run multiple nodes when you need horizontal redundancy — any node can serve all OAuth2 and OIDC endpoints, and gossip keeps their state in sync automatically.

How the cluster works

Each node holds its full cluster state in memory and in its local database. Every gossip.interval_secs seconds (default 5 s) a node contacts each peer listed under gossip.peers. By default the node sends only the CRDT entries that changed since the last successful exchange with that peer (a delta), rather than the full state. On first contact, or after any error, it falls back to a full-state push. The peer merges the received state and replies (also as a delta or full state) with its own. One round-trip brings both nodes to the same state. See Gossip Protocol for full protocol details.

The cluster wrapping key is the AEAD key for session cookies and auth codes. All nodes must share the same wrapping key for session cookies to be interchangeable across nodes. The wrapping key itself is stored node-locally (not gossiped); only a UUID identifier is gossiped so that nodes can detect key rotation. Gossip messages are authenticated and encrypted using CMS (see Gossip encryption below).

Distributed modes

The [cluster] configuration section controls how tightly the cluster coordinates token issuance. Four modes are available, ordered from least to most coordination:

JTI cache: Each auth code contains a unique JWT ID (JTI). When an auth code is exchanged for tokens, the JTI is recorded in an in-memory cache on the receiving node so that a second exchange attempt is rejected. This prevents a stolen auth code from being replayed — but only on the node that holds the cache entry.

| Mode | Auth code single-use | Cross-node auth code exchange | Session revocation | Quorum |
| --- | --- | --- | --- | --- |
| off (default) | Node-local JTI cache (issuing node only) | Node-pinned (must be exchanged on the issuing node) | Node-local DB only | No |
| eventual | Node-local JTI cache (issuing node only) | Any node (~gossip-interval replay window) | CRDT-replicated (all nodes) | No |
| forwarding | Forward to origin node | Zero replay window via forwarding | CRDT-replicated (all nodes) | No |
| strict | Forward to origin node | Zero replay window via forwarding | CRDT-replicated (all nodes) | k-of-n peer pre-approval |

Modes are strictly ordered: each higher mode is a superset of the features below it. The default (off) is correct for single-node deployments and benchmarks. For high-availability clusters, eventual is the recommended starting point.

off — no cross-node coordination

Auth codes are single-use enforced by a node-local JTI cache. They can only be exchanged on the node that issued them. Session revocation is node-local only: a logout on node A does not propagate to node B. Suitable for single-node deployments.
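
The node-local single-use check can be pictured as a small in-memory map keyed by JTI. The following is a minimal sketch only: the class name `JtiCache`, the `consume` method, and the TTL-based eviction are assumptions made for illustration, not ahdapa's actual implementation.

```python
import time
from typing import Dict, Optional


class JtiCache:
    """Hypothetical in-memory single-use cache keyed by auth code JTI."""

    def __init__(self, ttl_secs: float = 60.0) -> None:
        # Assumed: an entry only needs to outlive the auth code itself.
        self._ttl = ttl_secs
        self._seen: Dict[str, float] = {}

    def consume(self, jti: str, now: Optional[float] = None) -> bool:
        """Return True on the first use of a JTI, False on any replay."""
        now = time.monotonic() if now is None else now
        # Drop expired entries so the cache stays bounded.
        self._seen = {j: exp for j, exp in self._seen.items() if exp > now}
        if jti in self._seen:
            return False
        self._seen[jti] = now + self._ttl
        return True


node_a = JtiCache()
node_b = JtiCache()
print(node_a.consume("code-123"))  # first exchange on node A
print(node_a.consume("code-123"))  # replay on node A
print(node_b.consume("code-123"))  # node B has no entry for this JTI
```

The last call shows why a purely node-local cache cannot detect a replay that arrives at a different node, which is the gap the higher modes address.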

eventual — CRDT session revocation

Session revocations (logout, back-channel logout) are written into the revoked_sessions LwwMap in the CRDT and propagate to all peers via gossip within one gossip interval (default 5 s). Any cluster node rejects cookies whose iat is before the stored revoked_at for that subject.
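
The revocation check described above can be sketched as a last-writer-wins map from subject to revocation time. This is an illustrative model, not the real revoked_sessions structure: the class and method names are hypothetical, and revoked_at is assumed to double as the last-writer-wins timestamp.

```python
from typing import Dict


class RevokedSessions:
    """Hypothetical LWW map: subject -> revoked_at (Unix seconds)."""

    def __init__(self) -> None:
        self.entries: Dict[str, int] = {}

    def revoke(self, subject: str, revoked_at: int) -> None:
        # Keep the newest timestamp so repeated or reordered writes converge.
        if revoked_at > self.entries.get(subject, -1):
            self.entries[subject] = revoked_at

    def merge(self, other: "RevokedSessions") -> None:
        # Merging replays the peer's entries through the same rule.
        for subject, revoked_at in other.entries.items():
            self.revoke(subject, revoked_at)

    def is_revoked(self, subject: str, cookie_iat: int) -> bool:
        # Reject a cookie issued before the stored revocation time.
        revoked_at = self.entries.get(subject)
        return revoked_at is not None and cookie_iat < revoked_at


node_a, node_b = RevokedSessions(), RevokedSessions()
node_a.revoke("alice", 1_700_000_100)               # logout on node A
print(node_b.is_revoked("alice", 1_700_000_000))    # before gossip reaches B
node_b.merge(node_a)                                # one gossip exchange
print(node_b.is_revoked("alice", 1_700_000_000))    # old cookie now rejected
print(node_b.is_revoked("alice", 1_700_000_200))    # cookie from a later login
```

Because the merge keeps the maximum timestamp per subject, applying the same update twice or in a different order leaves every node with the same map.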

Auth codes may be exchanged on any cluster node with a replay window of approximately one gossip interval. This is acceptable for most HA deployments where occasional replay is tolerable.

Propagation latency: revocation is not instantaneous. A logout issued on one node reaches all peers within one gossip interval (default 5 s). If the auth code replay window must be shorter than that, use forwarding or strict mode; session cookie revocation remains gossip-propagated in eventual, forwarding, and strict alike.

forwarding — zero-window auth code exchange

Adds auth code forwarding on top of eventual. When a node receives an auth code exchange request for a code it did not issue, it uses the internal endpoint POST /api/internal/token/auth-code to forward the request to the origin node. The origin node performs single-use enforcement, eliminating the gossip-interval replay window.

Forwarding supports hub-spoke topologies via a 1-hop relay: if node A and node B are not directly peered (e.g. in a hub-spoke topology), A sends the request to a relay C that has both A and B as RecipientInfos in the CMS envelope. C forwards the original encrypted body to B. The X-Ahdapa-Final-Dest header carries the ultimate destination. A hop-count guard returns 508 Loop Detected when the hop count reaches 2, preventing forwarding loops.
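
The relay decision and the loop guard can be sketched as a single routing function. The text does not say how the hop count is carried or incremented, so this sketch assumes it is the number of relays a request has already passed through (0 when it leaves the first node); the `Action` enum and `route` function are hypothetical names.

```python
from enum import Enum

MAX_HOPS = 2  # from the text: the guard trips when the hop count reaches 2


class Action(Enum):
    HANDLE_LOCALLY = "handle locally"
    RELAY = "relay to final destination"
    LOOP_DETECTED = "508 Loop Detected"


def route(self_id: str, final_dest: str, hop_count: int) -> Action:
    """Decide what a node does with an incoming forwarded exchange."""
    if hop_count >= MAX_HOPS:
        return Action.LOOP_DETECTED
    if final_dest == self_id:
        return Action.HANDLE_LOCALLY
    # Not the final destination: pass the encrypted body on unchanged.
    return Action.RELAY


# A -> relay C -> origin B, as in the hub-spoke example above.
print(route("C", final_dest="B", hop_count=0))  # C relays
print(route("B", final_dest="B", hop_count=1))  # B handles the exchange
print(route("D", final_dest="B", hop_count=2))  # a second relay is refused
```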

strict — quorum pre-approval

Adds k-of-n quorum pre-approval on top of forwarding. Before signing any access token, the issuer broadcasts a VoteRequest to all live peers via POST /api/internal/quorum/vote and waits for k approvals within quorum_timeout_ms milliseconds. A minority partition cannot unilaterally issue tokens.

The effective quorum size k is controlled by cluster.quorum_k:

  • 0 (default): majority — floor(live_peer_count / 2) + 1
  • positive integer: exact count required

When quorum cannot be reached and cluster.quorum_fallback = true, a warning is logged and the token is issued anyway. When quorum_fallback = false (default), the token request is rejected with an error.
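
The quorum arithmetic and the fallback rule above can be written out as follows. This is a sketch of the documented behaviour, not the actual code; the function names are hypothetical.

```python
def effective_quorum(quorum_k: int, live_peer_count: int) -> int:
    """Return the number of approvals required before a token is signed."""
    if quorum_k < 0:
        raise ValueError("quorum_k must be 0 (majority) or a positive integer")
    if quorum_k == 0:
        # Majority: floor(live_peer_count / 2) + 1
        return live_peer_count // 2 + 1
    return quorum_k


def may_issue(approvals: int, quorum_k: int, live_peer_count: int,
              quorum_fallback: bool = False) -> bool:
    """Apply the quorum rule, then the quorum_fallback setting."""
    if approvals >= effective_quorum(quorum_k, live_peer_count):
        return True
    # Quorum not reached: issue anyway only when fallback is enabled.
    return quorum_fallback


print(effective_quorum(0, 4))      # majority of 4 live peers -> 3
print(effective_quorum(2, 5))      # exact count -> 2
print(may_issue(1, 0, 4))          # below quorum, fallback off -> False
print(may_issue(1, 0, 4, True))    # below quorum, fallback on -> True
```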

Internal node-to-node endpoints

Both forwarding and strict modes use the following internal endpoints. These endpoints return 404 Not Found when distributed_mode = off or eventual.

| Endpoint | Purpose |
| --- | --- |
| POST /api/internal/token/auth-code | Auth code exchange forwarded from another cluster node. Body is SignedData(EnvelopedData) (CMS-authenticated). |
| POST /api/internal/quorum/vote | Quorum vote request from the issuing node. Body is SignedData(EnvelopedData). Returns a VoteResponse JSON with approved: true/false. |

Like /api/gossip/sync, both endpoints authenticate the request body against the sender’s pinned gossip signing key (ECDSA P-256). Unsigned requests, and requests from nodes whose signing key is not yet registered in the CRDT, are rejected with 401 Unauthorized.

Node configuration

Each node needs its own config file. The sections that differ between nodes are [gossip].peers (which lists the other nodes) and whatever address-specific settings your deployment uses ([server].issuer, TLS paths, etc.).

For FreeIPA-integrated deployments, you can replace the static peer list with automatic topology-based discovery — see IPA topology discovery.

Minimum gossip stanza (static peers)

# node1.toml
[gossip]
peers         = ["https://node2.example.com:8080", "https://node3.example.com:8080"]
interval_secs = 5

# node2.toml
[gossip]
peers         = ["https://node1.example.com:8080", "https://node3.example.com:8080"]
interval_secs = 5

# node3.toml
[gossip]
peers         = ["https://node1.example.com:8080", "https://node2.example.com:8080"]
interval_secs = 5

Set a stable identity for each node. The preferred method is the [server] node_id configuration key; ahdapa also falls back to the HOSTNAME environment variable, then to the system hostname, then to a random UUID (which changes on every restart and should be avoided in clustered deployments):

# node1.toml
[server]
node_id = "node1.example.com"
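
The fallback order for the node identity can be sketched as below. The function name and signature are hypothetical; only the order of the four sources comes from the text.

```python
import os
import socket
import uuid
from typing import Mapping, Optional


def resolve_node_id(configured: Optional[str],
                    env: Mapping[str, str] = os.environ) -> str:
    """Pick a node identity using the documented fallback order."""
    if configured:                      # 1. [server] node_id
        return configured
    hostname_env = env.get("HOSTNAME")  # 2. HOSTNAME environment variable
    if hostname_env:
        return hostname_env
    system_hostname = socket.gethostname()  # 3. system hostname
    if system_hostname:
        return system_hostname
    # 4. Last resort: a random UUID, which changes on every restart.
    return str(uuid.uuid4())


print(resolve_node_id("node1.example.com"))
print(resolve_node_id(None, {"HOSTNAME": "node2.example.com"}))
```

Only the first source is fully under the operator's control, which is why setting node_id explicitly is the safest choice in a cluster.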

Full-mesh vs hub-and-spoke

ahdapa gossip is full-mesh by default: every node lists every other node as a peer. Full-mesh uses N×(N−1) pushes per interval. In steady state each push carries only the delta of entries that changed since the last successful sync with that peer; a full-state push (the worst case for bandwidth) occurs only on first contact or after an error.

The bandwidth figures in the table below are theoretical maximums: they assume every gossip round pushes a full state. In practice, two optimisations suppress traffic in converged clusters:

  1. Generation-skip: if the local CRDT has not changed since the last successful sync with a peer (CRDT_GENERATION unchanged), the push is skipped entirely.
  2. Delta exchange: when a push does occur, only entries newer than the peer’s last known generation are included.
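
The two optimisations can be modelled with a per-node generation counter and a per-peer record of the last synced generation. This is a simplified sketch under assumptions: the class `GossipSender`, its methods, and the idea of stamping each entry with the generation at which it changed are illustrative and may differ from how ahdapa tracks CRDT_GENERATION.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class GossipSender:
    generation: int = 0
    # key -> (value, generation at which the entry last changed)
    entries: Dict[str, Tuple[object, int]] = field(default_factory=dict)
    # peer -> local generation at the last successful sync (absent = never)
    last_synced: Dict[str, int] = field(default_factory=dict)

    def put(self, key: str, value: object) -> None:
        self.generation += 1
        self.entries[key] = (value, self.generation)

    def build_push(self, peer: str) -> Optional[Dict[str, object]]:
        """Return None to skip the round, otherwise the entries to send."""
        since = self.last_synced.get(peer)
        if since is None:
            # First contact or previous error: full-state push.
            return {k: v for k, (v, _) in self.entries.items()}
        if since == self.generation:
            return None  # generation-skip: nothing changed for this peer
        # Delta: only entries newer than the peer's last known generation.
        return {k: v for k, (v, g) in self.entries.items() if g > since}

    def mark_synced(self, peer: str) -> None:
        self.last_synced[peer] = self.generation

    def mark_failed(self, peer: str) -> None:
        # Forget the peer so the next push falls back to full state.
        self.last_synced.pop(peer, None)


node = GossipSender()
node.put("client:a", 1)
print(node.build_push("node2"))  # {'client:a': 1}  (full state)
node.mark_synced("node2")
print(node.build_push("node2"))  # None  (skipped)
node.put("client:b", 2)
print(node.build_push("node2"))  # {'client:b': 2}  (delta)
```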

A live token-issuing cluster observed 93% of rounds skipped (7 actual pushes / 108 attempts in a 35 s window). When pushes do occur, deltas are typically a small fraction of the full state. Pushes happen only when a client is created/deleted, a key is rotated, or a node joins/leaves. Steady-state bandwidth in a converged cluster is near zero.

Worst-case (full-state) bandwidth at the default 5-second interval with a 3-node baseline push size of ~7.5 KB (empirically observed; includes CMS envelope overhead):

| Nodes | Full-mesh | Hub-and-spoke |
| --- | --- | --- |
| 3 | ~65 MB/h | ~43 MB/h |
| 5 | ~300 MB/h | ~120 MB/h |
| 7 | ~820 MB/h | ~235 MB/h |

Per-push size growth:

  • Each additional cluster node: +~1,530 B (ML-KEM-768 public key dominates)
  • Each registered OAuth2 client: +~150 B per push body (compact CBOR, 2-char field names)
  • Each active session (refresh family): +~60 B — expired families are purged every gossip round, so growth is bounded by max_refresh_token_age
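
The worst-case figures can be approximated from the numbers quoted in this section. The sketch below rests on assumptions the text does not state: ~7.5 KB is taken as 7,500 bytes, every push is answered by a reply of the same size, and hub-and-spoke has 2 × (N − 1) directed links. Client and session growth is ignored.

```python
BASE_PUSH_BYTES = 7_500   # 3-node baseline push size quoted in the text
PER_NODE_BYTES = 1_530    # growth per additional node quoted in the text
INTERVAL_SECS = 5         # default gossip interval


def push_size(nodes: int) -> int:
    """Estimated full-state push size for a cluster of this many nodes."""
    return BASE_PUSH_BYTES + (nodes - 3) * PER_NODE_BYTES


def directed_links(nodes: int, topology: str) -> int:
    if topology == "full-mesh":
        return nodes * (nodes - 1)
    if topology == "hub-and-spoke":
        # Assumed: each spoke pushes to the hub and the hub to each spoke.
        return 2 * (nodes - 1)
    raise ValueError(f"unknown topology: {topology}")


def worst_case_mb_per_hour(nodes: int, topology: str) -> float:
    rounds_per_hour = 3600 / INTERVAL_SECS
    # Assumed: each push draws an equally sized reply, hence the factor 2.
    bytes_per_round = directed_links(nodes, topology) * push_size(nodes) * 2
    return bytes_per_round * rounds_per_hour / 1e6


for n in (3, 5, 7):
    mesh = worst_case_mb_per_hour(n, "full-mesh")
    hub = worst_case_mb_per_hour(n, "hub-and-spoke")
    print(f"{n} nodes: full-mesh {mesh:.1f} MB/h, hub-and-spoke {hub:.1f} MB/h")
```

Under these assumptions the estimates come out at roughly 64.8 / 43.2, 304.1 / 121.7, and 823.7 / 235.4 MB/h for 3, 5, and 7 nodes, in line with the rounded values in the table.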

Recommendation: Use full-mesh for up to 5 nodes. Switch to hub-and-spoke for 7+ nodes, or earlier if you have thousands of active sessions.

Hub-and-spoke configuration example (node1 is the hub):

# hub: node1.toml — peers with all spokes
[gossip]
peers = ["https://node2.example.com:8080", "https://node3.example.com:8080",
         "https://node4.example.com:8080", "https://node5.example.com:8080"]

# spoke: node2.toml — peers with hub only
[gossip]
peers = ["https://node1.example.com:8080"]

With hub-and-spoke, state propagates from spoke to spoke in 2 gossip rounds (spoke → hub → spoke) rather than 1, so worst-case propagation is about 2 × interval_secs instead of 1 × interval_secs.

Provisioning a fresh cluster

IPA-integrated deployments: If you are deploying ahdapa on a FreeIPA cluster, the Ansible playbooks in contrib/demo/ipa/ansible/ handle node provisioning, wrapping-key synchronisation, and IPA privilege grants automatically. Run ansible-playbook -i inventory.ini contrib/demo/ipa/ansible/site.yml and skip the manual steps below. See FreeIPA Co-deployment for details.

Using the admin CLI: The ahdapactl cluster bootstrap command automates steps 2–5 below. Run it with the list of node URLs and it handles login, key generation, key distribution, and re-authentication in a single invocation:

ahdapactl cluster bootstrap \
  https://node1.example.com:8080 \
  https://node2.example.com:8080 \
  https://node3.example.com:8080

After the command completes, verify gossip convergence with step 6 below. See Admin CLI for full ahdapactl reference.

When nodes start with empty databases each generates its own random 32-byte wrapping key. Nodes with different wrapping keys cannot exchange session cookies. The bootstrap procedure aligns all nodes to a shared key without copying database files or restarting anything.

Step 1 — start all nodes

Start every node before attempting key synchronisation. Wait until each node’s health endpoint responds:

curl -sf https://node1.example.com:8080/api/auth/info >/dev/null && echo ready

Step 2 — generate a shared wrapping key

Generate 32 cryptographically random bytes and base64url-encode them (no padding). The example uses python3 with standard-library modules only, so it needs no extra dependencies:

CLUSTER_KEY=$(python3 -c "
import secrets, base64
print(base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b'=').decode())
")

Keep this value in a shell variable for the remainder of the provisioning session. Do not write it to disk or log it — it is the root secret for the cluster.

Step 3 — log in to every node before rotating any key

Session cookies are AEAD-sealed with each node’s current wrapping key. Rotating the key immediately invalidates all cookies issued under the previous key. Log in to all nodes before pushing the new key to any of them:

# Obtain one session cookie per node.
curl -c /tmp/n1.cookie -sf -X POST https://node1.example.com:8080/api/auth/login \
     -H 'Content-Type: application/json' \
     -d '{"username":"admin","password":"..."}'

curl -c /tmp/n2.cookie -sf -X POST https://node2.example.com:8080/api/auth/login \
     -H 'Content-Type: application/json' \
     -d '{"username":"admin","password":"..."}'

curl -c /tmp/n3.cookie -sf -X POST https://node3.example.com:8080/api/auth/login \
     -H 'Content-Type: application/json' \
     -d '{"username":"admin","password":"..."}'

Step 4 — push the shared key to every node

Use each node’s own pre-rotation cookie. The PUT /api/admin/keys/cluster endpoint requires keys:rotate permission and takes effect immediately — no restart needed:

for i in 1 2 3; do
  curl -sf -b "/tmp/n${i}.cookie" \
       -X PUT "https://node${i}.example.com:8080/api/admin/keys/cluster" \
       -H 'Content-Type: application/json' \
       -d "{\"key\":\"$CLUSTER_KEY\"}"
  echo "node${i}: key set"
done

Step 5 — re-authenticate

After the key rotation, log in to any one node to get a fresh cookie sealed under the shared key. This cookie is accepted by every node in the cluster:

curl -c /tmp/admin.cookie -sf \
     -X POST https://node1.example.com:8080/api/auth/login \
     -H 'Content-Type: application/json' \
     -d '{"username":"admin","password":"..."}'

Clean up the per-node bootstrap cookies:

rm /tmp/n1.cookie /tmp/n2.cookie /tmp/n3.cookie

Step 6 — verify gossip convergence

You can inspect the gossip health of any node at any time — no authentication required:

curl -sf https://node1.example.com:8080/api/gossip/stats | python3 -m json.tool

The response shows live CRDT counts, the number of successfully completed gossip rounds, the last sync time per peer, and cumulative error counters (persist_errors, wrapping_key_pull_errors). This endpoint is also displayed in the admin web UI on the Cluster Nodes page alongside the list of registered nodes.

Register an OAuth2 client on one node and poll the others until it appears:

# Create on node1.
CLIENT_ID=$(curl -sf -b /tmp/admin.cookie \
  -X POST https://node1.example.com:8080/api/admin/clients \
  -H 'Content-Type: application/json' \
  -d '{"client_name":"test","redirect_uris":[],"scopes":["openid"]}' \
  | python3 -c "import sys,json; print(json.load(sys.stdin)['client_id'])")

# Poll node2 until it appears (≤ 2 × interval_secs).
until curl -sf -b /tmp/admin.cookie \
      https://node2.example.com:8080/api/admin/clients \
      | python3 -c "import sys,json; sys.exit(0 if any(c['client_id']=='$CLIENT_ID' for c in json.load(sys.stdin)) else 1)" \
      2>/dev/null; do
  sleep 1; echo "waiting..."
done
echo "converged on node2"

Delete the test client when done:

curl -sf -b /tmp/admin.cookie \
     -X DELETE "https://node1.example.com:8080/api/admin/clients/$CLIENT_ID"

Adding a node to an existing cluster

  1. Start the new node with an empty database. Let it fully boot.
  2. Log in to the new node while it still uses its freshly generated wrapping key (step 3 pattern above).
  3. Push the cluster’s existing shared wrapping key to the new node (step 4 pattern).
  4. Make the new node reachable by existing nodes:
    • Static peer list: add the new node’s address to the gossip.peers list on every existing node and restart each process. ahdapa does not reload configuration at runtime; a restart is required to pick up the updated peer list.
    • IPA topology (ipa_topology = true): no configuration change is needed. Once the new IPA replica is added to the replication topology, ahdapa automatically discovers it at the next topology refresh (default: within 5 minutes) and begins gossiping with it.
  5. The new node will receive the full cluster state on the first gossip round and converge within interval_secs seconds.

IPA topology discovery

When ahdapa runs on a FreeIPA replica, you can enable automatic peer discovery instead of maintaining a static peers list:

[gossip]
ipa_topology               = true
ipa_topology_interval_secs = 300   # re-query every 5 minutes (default)
interval_secs              = 5

With ipa_topology = true:

  • ahdapa queries cn=topology,cn=ipa,cn=etc,<suffix> at startup (before the first gossip round) and every ipa_topology_interval_secs seconds thereafter.
  • Peer URLs are constructed as https://<hostname><issuer-path> — for example https://ipa2.example.com/idp when the issuer is https://ipa1.example.com/idp.
  • All discovered replica hostnames are automatically added to the gossip admission allowlist (allowed_node_ids), so no manual allowlist configuration is needed.
  • After each successful topology fetch, if gssapi.initiator_principal is set, ahdapa calls POST /api/gossip/register-kem on each newly-discovered peer that does not yet have this node’s ML-KEM-768 key. The request is authenticated with a Kerberos AP-REQ for HTTP@<peer_host> using the local machine credential, and includes both the ML-KEM-768 public key and the ECDSA P-256 gossip signing key. This pre-seeds both keys before the first gossip push so that the signing key is pinned and a rogue node cannot claim the allowlisted identity. The gossip layer rejects messages from senders with no pinned signing key, so seeding must complete before gossip can proceed.
  • If a topology peer returns HTTP 404 (replica exists in IPA but does not yet run ahdapa), the error is logged at debug level and gossip continues with the next peer.
  • Gossip is not disabled when peers is empty, provided ipa_topology = true.
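
The peer URL rule above amounts to keeping the issuer's path and swapping in the discovered hostname. A minimal sketch, where `peer_url` is a hypothetical helper name:

```python
from urllib.parse import urlsplit


def peer_url(issuer: str, peer_hostname: str) -> str:
    """Build https://<hostname><issuer-path> for a discovered replica."""
    issuer_path = urlsplit(issuer).path
    return f"https://{peer_hostname}{issuer_path}"


print(peer_url("https://ipa1.example.com/idp", "ipa2.example.com"))
# https://ipa2.example.com/idp
```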

Prerequisite — IPA permission grants

The HTTP service principal must be granted two IPA privileges, which together carry three permissions:

| Privilege | Permissions | Purpose |
| --- | --- | --- |
| Ahdapa Topology Read | System: Read Topology Segments | Peer discovery via replication topology |
| Ahdapa IdP Read | Ahdapa - Read user IdP attributes (custom) | Read ipauserauthtype, ipaidpconfiglink, ipaidpsub per user; needed to enforce ipauserauthtype=idp and resolve federated users |
| Ahdapa IdP Read | System: Read External IdP server (built-in) | Read ipaIdP entries for automatic IdP discovery |

These lookups use the service principal credential directly (no S4U2Self impersonation) because the target attributes are not visible to users reading their own entries.

With Ansible, playbooks/ipa_permissions.yml (part of site.yml) handles all of this in one idempotent run — using ipapermission, ipaprivilege, iparole, and delegation modules from ansible-freeipa — for every principal in the ipa_nodes and ahdapa_standalone inventory groups:

ansible-playbook -i inventory.ini contrib/demo/ipa/ansible/playbooks/ipa_permissions.yml

For manual setups, run once as an IPA admin:

# Custom permission — read user IdP attributes
ipa permission-add "Ahdapa - Read user IdP attributes" \
    --right=read --right=search --right=compare \
    --attrs=ipauserauthtype --attrs=ipaidpconfiglink --attrs=ipaidpsub \
    --type=user

# Topology read privilege
ipa privilege-add "Ahdapa Topology Read" \
    --desc="Allows ahdapa to read IPA replication topology for peer discovery"
ipa privilege-add-permission "Ahdapa Topology Read" \
    --permission="System: Read Topology Segments"

# IdP read privilege
ipa privilege-add "Ahdapa IdP Read" \
    --desc="Allows ahdapa to read IPA IdP configurations and user IdP bindings"
ipa privilege-add-permission "Ahdapa IdP Read" \
    --permissions="Ahdapa - Read user IdP attributes,System: Read External IdP server"

# Role — assign both privileges, add each HTTP service principal
ipa role-add "Ahdapa Services" \
    --desc="Role for ahdapa service accounts"
ipa role-add-privilege "Ahdapa Services" \
    --privileges="Ahdapa Topology Read,Ahdapa IdP Read"
ipa role-add-member "Ahdapa Services" \
    --services="HTTP/ipa.example.com@EXAMPLE.COM"

Replace HTTP/ipa.example.com@EXAMPLE.COM with the HTTP service principal of each IPA server that runs ahdapa, and repeat the ipa role-add-member command for each one.

389-ds indexes for federated user lookups

ahdapa resolves federated users with an LDAP filter that matches both ipaIdpConfigLink and ipaIdpSub. Without equality indexes on those attributes the filter falls back to a full table scan (visible as notes=U in the 389-ds access log) and adds hundreds of milliseconds to every federated login. Run once per Directory Server instance:

INSTANCE="slapd-EXAMPLE-COM"   # realm, dots replaced with hyphens
dsconf $INSTANCE backend index add --attr ipaIdpConfigLink --index-type eq userRoot
dsconf $INSTANCE backend index add --attr ipaIdpSub        --index-type eq userRoot
dsconf $INSTANCE backend index reindex \
    --attr ipaIdpConfigLink --attr ipaIdpSub --wait userRoot

The Ansible playbook runs these steps automatically when indexes are missing.

Standalone (non-replica) nodes

A standalone node is an IPA-enrolled host that runs ahdapa but is not an IPA replica (no local 389-ds or KDC). Its HTTP service principal must be granted Kerberos constrained delegation (S4U2Proxy) so it can forward impersonated user credentials to the IPA LDAP and HTTP services when performing S4U2Self.

Run once as an IPA admin after enrolling the standalone host:

# Enable S4U2Self on the standalone HTTP principal
ipa service-mod HTTP/ahdapa.example.com@EXAMPLE.COM \
    --ok-to-auth-as-delegate=True

# Delegation target — list each IPA replica's HTTP and LDAP principals
ipa servicedelegationtarget-add ahdapa-delegation-targets
ipa servicedelegationtarget-add-member ahdapa-delegation-targets \
    --principals=HTTP/ipa.example.com@EXAMPLE.COM
ipa servicedelegationtarget-add-member ahdapa-delegation-targets \
    --principals=ldap/ipa.example.com@EXAMPLE.COM
# Repeat the two add-member lines for each additional IPA replica.

# Delegation rule — grant the standalone HTTP principal access to the target
ipa servicedelegationrule-add ahdapa-delegation
ipa servicedelegationrule-add-member ahdapa-delegation \
    --principals=HTTP/ahdapa.example.com@EXAMPLE.COM
ipa servicedelegationrule-add-target ahdapa-delegation \
    --servicedelegationtargets=ahdapa-delegation-targets

With Ansible, add the standalone node to the [ahdapa_standalone] inventory group. ipa_permissions.yml applies the delegation automatically for every host in that group. No ahdapa.toml changes are needed for standalone nodes beyond the standard IPA stanzas; gssproxy is configured with allow_constrained_delegation = true regardless of node type.

Gossip encryption

Each ahdapa node generates three cryptographic key pairs on first start:

  • An ML-KEM-768 encryption key pair (post-quantum, FIPS 203). Peer nodes use the public key to encrypt gossip messages so that only this node can read them.
  • An ECDSA P-256 signing key pair for gossip authentication. This node uses the private key to sign every gossip message it sends so that peers can verify the message came from this node and has not been tampered with.
  • A JWT signing key pair using the algorithm configured by [server] jwt_signing_algorithm (default: ES256 / ECDSA P-256). This node uses the private key to sign JWT access tokens and ID tokens it issues. The private key never leaves the node; only the public key is distributed to peers so they can validate tokens this node issued. If the configured algorithm differs from the algorithm of the stored key, a new key is generated automatically on startup.

The ML-KEM-768 and ECDSA P-256 public keys are automatically distributed to peer nodes through the CRDT gossip mechanism. The JWT signing public key for token validation is likewise gossiped as part of SigningKeyEntry. Once all nodes have exchanged public keys, every gossip message is both encrypted (only the recipient can read it) and signed (the recipient can verify who sent it).

Cross-node token validation: Because every node’s JWT signing public key is available in the CRDT on all peers, any cluster node can cryptographically verify a token issued by any other node. The iss claim check in verify_bearer_jwt (used by /userinfo and /introspect) accepts not only the local [server] issuer but also any issuer URL listed in gossip.peers and any URL discovered via ipa_topology. This means introspecting a token issued by node A on node B succeeds without any session forwarding — the gossiped signing key provides the cryptographic guarantee, and the iss check confirms the token came from a known cluster member.

Key generation is automatic; all private keys are stored in the node’s local database (node_keys table) and are never transmitted over the network. The cluster wrapping key — used to protect session cookies — is CMS-encrypted with the node’s own ML-KEM-768 public key and stored locally; when a new node needs it, it pulls the key on demand from an existing peer via GET /api/gossip/wrapping-key. The response is a full SignedData(EnvelopedData) blob: the serving node signs the response with its gossip ECDSA P-256 key and encrypts it to the requester’s ML-KEM-768 key, so both the sender’s identity and the payload’s integrity are verified before the wrapping key is accepted.

Controlling which nodes can join

To prevent unauthorized nodes from self-registering, set gossip.allowed_node_ids to the explicit list of node identifiers that are permitted to participate:

[gossip]
peers            = ["https://node2.example.com:8080"]
allowed_node_ids = ["node1.example.com", "node2.example.com"]

The node_id is resolved at startup: the [server] node_id configuration key if set, otherwise the HOSTNAME environment variable, otherwise the system hostname (see Node configuration). Gossip messages that attempt to register a node_id not in the combined allowlist are silently dropped before merge.

When allowed_node_ids is empty and ipa_topology is false, the combined allowlist is empty and no node can self-register (fail-closed). Enable ipa_topology so that hostnames are discovered automatically, or list allowed_node_ids explicitly.
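
The combined allowlist behaves like a set union that admits nobody when empty. A minimal sketch, assuming a hypothetical `is_admitted` helper and plain string comparison of node identifiers:

```python
from typing import Iterable


def is_admitted(node_id: str,
                allowed_node_ids: Iterable[str],
                topology_hosts: Iterable[str]) -> bool:
    """Fail closed: only node_ids in the union of both sources are admitted."""
    combined = set(allowed_node_ids) | set(topology_hosts)
    return node_id in combined


static = ["node1.example.com", "node2.example.com"]
print(is_admitted("node2.example.com", static, []))   # listed explicitly
print(is_admitted("rogue.example.com", static, []))   # not listed
print(is_admitted("node1.example.com", [], []))       # empty union admits nobody
```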

Bootstrapping a new cluster with CMS gossip

Important: The first gossip exchange between two nodes requires each node to already know the other’s ML-KEM-768 public key. On a completely fresh cluster, neither node knows the other’s key, so the gossip loop skips peers it has no KEM key for and waits.

IPA-integrated deployments (recommended): When ipa_topology = true and gssapi.initiator_principal is set, the topology refresh task automatically seeds each peer’s ML-KEM-768 key and ECDSA P-256 gossip signing key via POST /api/gossip/register-kem using the node’s Kerberos machine credential (HTTP/<hostname>@<REALM>). This happens before the first gossip round, so all IPA-enrolled nodes have pinned signing keys and known KEM keys without any manual intervention. No database copying or manual key seeding is needed.

Static-peer deployments: Use one of the following procedures:

  1. Start Node A. Its public keys are registered in its own CRDT.

  2. Copy Node A’s database to Node B before starting Node B (or start Node A first and let it run a few seconds, then start Node B). Node B will contain A’s NodeEntry and vice versa after the first successful merge.

  3. Alternatively, fetch each node’s KEM key via GET /api/gossip/kem-info and seed it on all peers via POST /api/admin/nodes/seed (requires keys:rotate permission):

    # Fetch node A's KEM key.
    KEM_KEY=$(curl -sf https://nodeA.example.com:8080/api/gossip/kem-info \
      | python3 -c "import sys,json; print(json.load(sys.stdin)['kem_public_key_der'])")
    
    # Seed it into node B.
    curl -sf -b /tmp/nb.cookie \
      -X POST https://nodeB.example.com:8080/api/admin/nodes/seed \
      -H 'Content-Type: application/json' \
      -d "{\"node_id\":\"nodeA\",\"kem_public_key_der\":\"$KEM_KEY\"}"
    

    Or with ahdapactl: ahdapactl cluster nodes seed --node-id nodeA --kem-key <base64url>

Once the first successful gossip exchange occurs, nodes learn each other’s KEM keys and all subsequent exchanges are encrypted automatically. The cluster wrapping key is re-sealed for the updated recipient set within one gossip cycle.

Joining a new node

IPA-integrated deployments: No manual key seeding is required. Once the new IPA replica appears in the replication topology, the topology refresh task on each existing node automatically discovers it, calls POST /api/gossip/register-kem to seed the new node’s ML-KEM-768 key and gossip signing key (authenticated with the existing node’s Kerberos machine credential), and begins gossiping with it. Similarly, the new node seeds its keys on all existing peers after its first topology refresh. The new node pulls the cluster wrapping key on the first successful gossip exchange.

Static-peer deployments:

  1. Start the new node — its ML-KEM-768 key pair is generated automatically.
  2. Add it to allowed_node_ids on all existing nodes (if configured).
  3. Add the new node’s address to gossip.peers on all existing nodes and restart them; ahdapa does not reload configuration at runtime.
  4. Within two gossip cycles (default 10 s), existing nodes learn the new node’s KEM key. The new node detects the cluster’s wrapping_key_id via gossip and pulls the actual wrapping key from a peer via GET /api/gossip/wrapping-key. It can then unseal session cookies from other nodes once the pull completes.

Automatic KEM re-registration after peer restart

When a peer restarts and loses its CRDT state (e.g. after a database wipe or a fresh reinstall), its records of the other nodes’ KEM keys and gossip signing keys are gone. Subsequent gossip pushes from existing nodes are rejected with 401 Unauthorized because the peer no longer recognises their signing keys.

The gossip loop detects this automatically: a 401 response from a peer triggers an immediate call to POST /api/gossip/register-kem on that peer, seeding this node’s ML-KEM-768 public key and ECDSA P-256 gossip signing key. Re-registration is throttled to at most once per 60 seconds per peer so that a persistently-rejecting peer does not cause excessive Kerberos ticket acquisition. Once re-registration succeeds, gossip resumes automatically on the next round.

This recovery is fully automatic in IPA-integrated deployments where gssapi.initiator_principal is set. In static-peer deployments without a machine credential, the 401 is logged as a warning and the gossip push to that peer is skipped until the peer is manually re-seeded (see Bootstrapping a new cluster).
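
The per-peer throttle on re-registration can be sketched as a timestamp map. The class and method names are hypothetical; only the limit of one attempt per 60 seconds per peer comes from the text.

```python
import time
from typing import Dict, Optional


class ReRegistrationThrottle:
    """Allow at most one re-registration attempt per peer per interval."""

    def __init__(self, min_interval_secs: float = 60.0) -> None:
        self._min_interval = min_interval_secs
        self._last_attempt: Dict[str, float] = {}

    def should_register(self, peer: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        last = self._last_attempt.get(peer)
        if last is not None and now - last < self._min_interval:
            return False  # too soon: skip this round for the peer
        self._last_attempt[peer] = now
        return True


throttle = ReRegistrationThrottle()
print(throttle.should_register("node2", now=0.0))   # first 401: register
print(throttle.should_register("node2", now=5.0))   # next round: throttled
print(throttle.should_register("node3", now=5.0))   # other peers are independent
print(throttle.should_register("node2", now=60.0))  # interval elapsed: retry
```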

Removing a node

  1. Remove the node from gossip.peers on all remaining nodes.
  2. Remove it from allowed_node_ids if configured.
  3. Rotate the cluster wrapping key via PUT /api/admin/keys/cluster (or ahdapactl cluster key rotate) — the removed node’s KEM key is no longer included in future re-seals, so it loses access to the wrapping key after rotation.

The removed node’s CRDT entry (signing key, KEM key, etc.) remains visible in GET /api/admin/nodes until it ages out past gossip.tombstone_ttl_secs. This is harmless: with the node absent from every peer list and the wrapping key rotated, it cannot participate in or decrypt cluster traffic.

Security considerations

Gossip and internal endpoint access control

POST /api/gossip/sync, GET /api/gossip/wrapping-key, and the internal forwarding endpoints (/api/internal/token/auth-code, /api/internal/quorum/vote) are served on the same port as the public OAuth2 endpoints — port-level firewall rules cannot isolate them without also blocking public OAuth2 clients. Access is enforced at the application layer by two complementary mechanisms:

Cryptographic authentication (primary)

Every gossip sync payload is a CMS SignedData(EnvelopedData) structure. The receiver verifies the ECDSA P-256 outer signature against the sender’s pinned gossip signing key. A sender whose key is not registered in the CRDT is rejected with 401 Unauthorized before any payload is read or applied. The ML-KEM-768 inner encryption additionally ensures the payload is opaque to any party that does not hold the addressed node’s private key.

Node admission allowlist (secondary)

The allowed_node_ids list and IPA topology discovery together control which node_ids may exchange CRDT state. Gossip messages from unlisted node_ids are dropped at the admission filter after signature verification. The combined allowlist fails closed: an empty union admits nobody.

Proxy-layer path restriction (optional)

If a reverse proxy fronts the cluster, its Apache <Location> or nginx location block can additionally restrict gossip paths to the cluster subnet as a defence-in-depth measure; see Reverse Proxy Setup for Apache and nginx examples. This is optional: the cryptographic and allowlist controls above already prevent any useful action by an unauthenticated caller.

Protect the wrapping key

The cluster wrapping key is the root of trust for session cookies. Each node stores its own CMS-encrypted copy locally (node_keys.wrapping_key_cms_der); the 32-byte key never appears in gossip payloads.

  • If a node is decommissioned, delete it from the cluster. Future key rotations will not serve the wrapping key to its (now-absent) KEM public key, so it loses access.
  • If you need to manually rotate the wrapping key, follow the provisioning procedure (steps 3–5): log in to all nodes first, push the new key, then re-authenticate. Peers detect the UUID change via gossip and pull the new key automatically.

TLS for inter-node traffic

Configure TLS on all nodes ([tls] section) and use https:// peer URLs in gossip.peers. TLS encrypts the transport; the CMS envelope encrypts the application payload. Both layers together prevent passive monitoring and active interception. See Configuration Reference.

Running the demo locally

contrib/demo/cluster/run.sh implements all of the above steps against three local nodes on ports 8080–8082 and verifies convergence end-to-end. It is also used as a CI integration test:

contrib/demo/cluster/run.sh            # non-interactive: exits 0 on pass, 1 on fail
contrib/demo/cluster/run.sh --interactive  # keeps nodes running until Ctrl-C