Gossip Protocol
The gossip protocol synchronises IdpCrdt state across cluster nodes. It is implemented in src/routes/gossip.rs. The CMS envelope construction lives in crates/ahdapa-cms/.
Design
The protocol uses delta-based exchange by default, falling back to full-state on first
contact or after an error. After a successful round, a node sends only the CRDT entries
that changed since the last successful exchange with each peer (a sparse IdpCrdt delta),
rather than the full state every time. Peers exchange generation counters in each envelope
to coordinate what they already have. One round-trip still brings both nodes to the same
merged state.
Full-state pushes occur in the following cases:
- First contact with a peer (no prior peer_last_gen entry).
- After any connection error or non-2xx response (generation tracking is cleared).
- When a pull_wrapping_key failure occurs (generation tracking is cleared to force retry).
Full-state exchange is the safe baseline because CRDT merge is always additive: a delta
merged into a full state, or a full state merged into a delta, produces the same result
as two full states merged. Old nodes that do not understand the is_delta field decode
it as false (full state) and merge safely. See Payload size and bandwidth for the full bandwidth model.
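The additive-merge claim can be illustrated with a toy model. The sketch below is not the real IdpCrdt: ToyCrdt, Entry, the per-key last-writer-wins rule, and the tie-break on value are all assumptions made for illustration. It only shows that merging a sparse delta gives the same result as merging the full state, as long as the receiver already holds everything up to the delta's starting generation.

```rust
use std::collections::BTreeMap;

/// One toy entry: a last-writer-wins value plus the sender-local generation
/// at which it was written. Hypothetical layout, not the real IdpCrdt.
#[derive(Clone, Debug, PartialEq)]
struct Entry {
    value: String,
    timestamp: u64,
    generation: u64,
}

#[derive(Clone, Debug, Default, PartialEq)]
struct ToyCrdt {
    entries: BTreeMap<String, Entry>,
}

impl ToyCrdt {
    fn put(&mut self, key: &str, value: &str, timestamp: u64, generation: u64) {
        let entry = Entry { value: value.to_string(), timestamp, generation };
        self.entries.insert(key.to_string(), entry);
    }

    /// Sparse delta: only the entries written after generation `since`.
    fn delta_since(&self, since: u64) -> ToyCrdt {
        let entries = self
            .entries
            .iter()
            .filter(|(_, e)| e.generation > since)
            .map(|(k, e)| (k.clone(), e.clone()))
            .collect();
        ToyCrdt { entries }
    }

    /// Additive merge: per key, keep the entry with the higher timestamp
    /// (ties broken by value so the result is deterministic). Nothing is
    /// ever removed, so merging a subset can never undo a previous merge.
    fn merge(&mut self, other: &ToyCrdt) {
        for (key, theirs) in &other.entries {
            match self.entries.get(key) {
                Some(ours)
                    if (ours.timestamp, &ours.value) >= (theirs.timestamp, &theirs.value) => {}
                _ => {
                    self.entries.insert(key.clone(), theirs.clone());
                }
            }
        }
    }
}
```

In this model, a receiver that was in sync at generation 2 ends up in the same state whether it merges sender.delta_since(2) or the sender's full state.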
Protocol
Endpoint
POST /api/gossip/sync
Content-Type: application/pkcs7-mime
X-Ahdapa-Node-Id: <sender's node_id>
<DER SignedData wrapping EnvelopedData>
The handler verifies and decrypts the message, applies admission filters, merges the received CRDT, persists the result, and replies with its own state — either a delta (when request_delta_since is set in the inbound envelope) or the full CRDT — in the same CMS format.
CMS wire format
Gossip messages use a two-layer CMS structure:
OUTER: SignedData {
eContentType = id-envelopedData
eContent = <inner EnvelopedData DER>
certificates = { sender_self_signed_cert } // carries sender's P-256 public key
signerInfos = { SignerInfo {
signatureAlgorithm = id-ecPublicKey (P-256)
signature = ECDSA-P256-Sign(sha256(eContent))
} }
}
INNER: EnvelopedData {
recipientInfos = SET OF OtherRecipientInfo {
oriType = id-ori-kem
oriValue = KEMRecipientInfo {
kem = id-alg-ml-kem-768
kemct = <ML-KEM-768 encapsulated ciphertext>
wrap = id-aes256-wrap
encryptedKey = AES-256-KeyWrap(kek, CEK) // 40 bytes
}
} // one ORI per recipient
encryptedContentInfo {
contentEncryptionAlgorithm = id-aes256-gcm + GcmParameters(nonce)
encryptedContent = AES-256-GCM(CEK, nonce, CBOR) ‖ tag[16]
}
}
Key derivation: kek = HKDF-SHA256(ml_kem_shared_secret, info="ahdapa-cms-kek", 32)
Key material
Each node has three key pairs stored in the local node_keys table (private keys are never gossiped):
| Key | Type | PKCS#8 DER column | SPKI DER column | Published in CRDT |
|---|---|---|---|---|
| KEM encryption | ML-KEM-768 | private_key_der | public_key_der | NodeEntry.kem_public_key_der |
| Gossip signing | ECDSA P-256 | signing_private_key_der | signing_public_key_der | NodeEntry.gossip_signing_pub_key_der |
| JWT signing | Configured by jwt_signing_algorithm (default: ES256) | jwt_signing_priv_der | (derived from jwt_signing_priv_der) | SigningKeyEntry.public_key_der (public only) |
A minimal self-signed X.509 certificate for the P-256 key is also stored
(signing_certificate_der) and embedded in every outbound SignedData so the
receiver can extract the sender’s public key for SPKI comparison without needing a CA.
Keys are generated on first start by bootstrap_node_kem_key() in src/routes/mod.rs
and reused across restarts.
Sender logic (sign_and_seal)
1. check peer_last_gen[peer] == current_gen → skip (CRDT unchanged since last sync)
2. select payload:
- if peer_last_gen[peer] exists: crdt.delta_since(peer_last_gen[peer]) → is_delta=true
- otherwise (first contact or after error): crdt.clone() → is_delta=false
3. serialize payload IdpCrdt to CBOR (ciborium) → crdt_bytes
4. look up peer's kem_public_key_der in local CRDT by hostname match
→ skip peer if KEM key not found
5. wrap in GossipEnvelope {
crdt: crdt_bytes,
issued_at: now,
is_delta,
my_gen: current_gen,
request_delta_since: peer_response_gen.get(peer), // ask peer for delta response
}
6. serialize GossipEnvelope to CBOR → plaintext
7. ahdapa_cms::sign_and_seal(plaintext, [peer_kem_spki],
own_signing_priv_pkcs8, own_signing_cert_der)
a. seal(plaintext, recipients):
i. generate random CEK (256-bit) and nonce (96-bit)
ii. AES-256-GCM(CEK, nonce, plaintext) → ciphertext ‖ tag
iii. for each recipient: ML-KEM-768 encapsulate → (kemct, ss)
kek = HKDF-SHA256(ss, "ahdapa-cms-kek", 32)
AES-256 key-wrap(kek, CEK) → encryptedKey
encode KEMRecipientInfo + OtherRecipientInfo
iv. EnvelopedDataBuilder.build() → enveloped_der
b. CmsContentInfo::sign(enveloped_der, own_cert, own_priv_key) → signed_der
8. POST /api/gossip/sync with Content-Type: application/pkcs7-mime
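Steps 1, 2, and 5 above amount to a small decision over the two per-peer generation values. The following is a minimal sketch of that decision only; PushPlan, EnvelopeMeta, plan_push, and envelope_meta are hypothetical names, and the CBOR and CMS steps are left out.

```rust
/// What the sender does for one peer in one round (hypothetical type).
#[derive(Debug, PartialEq)]
enum PushPlan {
    /// peer_last_gen equals current_gen: nothing changed, no push.
    Skip,
    /// A previous sync succeeded: send crdt.delta_since(since).
    Delta { since: u64 },
    /// First contact or after an error: send the full state.
    Full,
}

fn plan_push(peer_last_gen: Option<u64>, current_gen: u64) -> PushPlan {
    match peer_last_gen {
        Some(last) if last == current_gen => PushPlan::Skip,
        Some(last) => PushPlan::Delta { since: last },
        None => PushPlan::Full,
    }
}

/// The generation-related GossipEnvelope fields, without the payload bytes.
#[derive(Debug, PartialEq)]
struct EnvelopeMeta {
    is_delta: bool,
    my_gen: u64,
    request_delta_since: Option<u64>,
}

/// Builds the envelope metadata for a planned push, or None when the
/// round is skipped for this peer.
fn envelope_meta(
    plan: &PushPlan,
    current_gen: u64,
    peer_response_gen: Option<u64>,
) -> Option<EnvelopeMeta> {
    let is_delta = match plan {
        PushPlan::Skip => return None,
        PushPlan::Delta { .. } => true,
        PushPlan::Full => false,
    };
    Some(EnvelopeMeta {
        is_delta,
        my_gen: current_gen,
        // Ask the peer to answer with only what it wrote since this generation.
        request_delta_since: peer_response_gen,
    })
}
```

For example, plan_push(Some(5), 7) yields PushPlan::Delta { since: 5 }, while plan_push(None, 7) yields PushPlan::Full.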
Receiver logic (verify_and_open)
1. read X-Ahdapa-Node-Id header → sender_node_id
2. look up sender's gossip_signing_pub_key_der in local CRDT
→ None (no pinned key) → reject 401; TOFU is no longer accepted
3. ahdapa_cms::verify_and_open(body, own_kem_priv_pkcs8, sender_signing_pub_spki)
a. cms.certs()[0] → embedded signer cert → extract SPKI
b. compare embedded SPKI against pinned sender_signing_pub_spki; mismatch → 401
c. cms.verify(NO_SIGNER_CERT_VERIFY) → validates ECDSA signature; → 401 on failure
d. open(enveloped_der, own_kem_priv_pkcs8):
i. find OtherRecipientInfo with oriType = id-ori-kem
ii. ML-KEM-768 decapsulate(own_priv, kemct) → ss
kek = HKDF-SHA256(ss, "ahdapa-cms-kek", 32)
AES-256 key-unwrap(kek, encryptedKey) → CEK
iii. parse GcmParameters → nonce
iv. AES-256-GCM decrypt(CEK, nonce, encryptedContent) → CBOR bytes
4. ciborium::from_reader(CBOR bytes) → GossipEnvelope { crdt: Vec<u8>, issued_at: i64, is_delta: bool, my_gen: u64, request_delta_since: Option<u64> }
5. reject if issued_at < now - tombstone_ttl_secs (default 7 days) → replay prevention
6. ciborium::from_reader(envelope.crdt) → peer_crdt
7. apply admission filters (see below)
8. merge + persist
9. determine response payload:
- if envelope.request_delta_since is Some(since): crdt.delta_range(since, pre_merge_gen) → is_delta=true
- otherwise: crdt.clone() → is_delta=false
wrap in GossipEnvelope { crdt: CBOR(response_crdt), issued_at: now, is_delta, my_gen: post_merge_gen, request_delta_since: None }
sign_and_seal → response
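Steps 5 and 9 are the two places where the receiver branches on envelope metadata rather than on cryptography. A hedged sketch of both checks follows; the function and type names are hypothetical, and only the comparison in step 5 and the selection rule in step 9 are taken from the text above.

```rust
/// Default tombstone_ttl_secs: 7 days.
const DEFAULT_TOMBSTONE_TTL_SECS: i64 = 604_800;

/// Step 5: an envelope is rejected when issued_at < now - tombstone_ttl_secs.
fn is_within_replay_window(issued_at: i64, now: i64, tombstone_ttl_secs: i64) -> bool {
    issued_at >= now.saturating_sub(tombstone_ttl_secs)
}

/// Step 9: what the receiver sends back (hypothetical type).
#[derive(Debug, PartialEq)]
enum ResponsePayload {
    /// crdt.delta_range(since, until), sent with is_delta = true.
    Delta { since: u64, until: u64 },
    /// crdt.clone(), sent with is_delta = false.
    Full,
}

fn select_response(request_delta_since: Option<u64>, pre_merge_gen: u64) -> ResponsePayload {
    match request_delta_since {
        Some(since) => ResponsePayload::Delta { since, until: pre_merge_gen },
        None => ResponsePayload::Full,
    }
}
```

The delta is bounded by the pre-merge generation, as in step 9, while my_gen in the response envelope carries the post-merge generation.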
Admission filters
The receiver applies two layered filters before merging:
Layer 1 — node allowlist (gossip.allowed_node_ids): the combined static and
topology-derived allowlist is always enforced. An empty union of both lists admits
nobody (fail-closed). Any new NodeEntry whose node_id is not in the combined
allowlist is dropped from peer_crdt before merge. This protects against rogue nodes
self-registering and obtaining the cluster wrapping key.
Layer 2 — self-registration rule: a sender may only add its own NodeEntry via
gossip. Any new entry (not already in the local CRDT) whose node_id does not match
the X-Ahdapa-Node-Id header is dropped. This is defense-in-depth only since the
header is forgeable.
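The two layers can be sketched as a single filter over the node entries in the inbound CRDT. This is an illustration under assumptions: node entries are reduced to a node_id mapped to opaque key bytes, and admit_new_nodes is a hypothetical helper, not the function used in src/routes/gossip.rs.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns the node entries that may be merged. Entries already known
/// locally pass through; both filters apply only to new entries.
fn admit_new_nodes(
    incoming: BTreeMap<String, Vec<u8>>, // node_id -> key bytes (stand-in for NodeEntry)
    local_node_ids: &BTreeSet<String>,
    allowlist: &BTreeSet<String>, // static list merged with the topology-derived list
    sender_node_id: &str,         // value of the X-Ahdapa-Node-Id header
) -> BTreeMap<String, Vec<u8>> {
    incoming
        .into_iter()
        .filter(|(node_id, _)| {
            if local_node_ids.contains(node_id) {
                return true;
            }
            // Layer 1: fail-closed allowlist (an empty set admits nobody).
            // Layer 2: a sender may only introduce its own entry.
            allowlist.contains(node_id) && node_id.as_str() == sender_node_id
        })
        .collect()
}
```

With an empty allowlist the filter drops every new entry, which matches the fail-closed behaviour described for layer 1.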
KEM self-registration and signing-key pinning
A node whose KEM key is not yet in the CRDT cannot receive encrypted gossip — the
sender skips peers with no known KEM key. POST /api/gossip/register-kem seeds both
the ML-KEM-768 public key and the ECDSA P-256 gossip signing key before the first
gossip exchange.
In IPA deployments, after each topology refresh, the local node calls
register_self_with_peer() for every newly discovered peer that does not yet have
this node’s KEM key. That function:
- Acquires a Kerberos service ticket for HTTP@<peer_host> using the local machine credential (gss_initiator).
- POSTs this node’s ML-KEM-768 public key and its ECDSA P-256 gossip signing public key to <peer_url>/api/gossip/register-kem with Authorization: Negotiate <AP-REQ>.
- The peer verifies the AP-REQ, extracts the authenticated principal (HTTP/<hostname>@<REALM> via ServicePrincipal::parse), and stores both keys in the NodeEntry under <hostname>, provided that hostname matches the node_id in the request body, is in the allowlist, and (when gossip.kerberos_realm is set) the principal’s realm matches the expected realm.
The insert uses a three-case match: insert-fresh (neither key known),
upsert-signing-key-only (KEM key known but signing key absent), or no-op (both keys
already set). Once the signing key is pinned, gossip_sync rejects any message from that
sender whose embedded ECDSA key does not match the pinned value; there is no TOFU
fallback. Self-registration requires gssapi.initiator_principal to be set so that
AppState::ipa.gss_initiator is Some; the mechanism is a no-op when it is absent.
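The three-case match can be sketched as follows. NodeKeys, RegisterOutcome, and register_keys are hypothetical names, and an absent entry is modelled as None; the real NodeEntry carries more fields. The sketch is meant to show that a pinned signing key is never overwritten by a later registration.

```rust
/// The two published keys of a node (stand-in for the relevant NodeEntry fields).
#[derive(Clone, Debug, PartialEq)]
struct NodeKeys {
    kem_public_key_der: Vec<u8>,
    gossip_signing_pub_key_der: Vec<u8>,
}

#[derive(Debug, PartialEq)]
enum RegisterOutcome {
    InsertedFresh,
    SigningKeyAdded,
    NoOp,
}

/// Applies a register-kem request to the existing entry (if any) and
/// returns the resulting entry together with the case that was taken.
fn register_keys(
    existing: Option<NodeKeys>,
    kem_key: &[u8],
    signing_key: &[u8],
) -> (NodeKeys, RegisterOutcome) {
    match existing {
        // Insert-fresh: neither key known.
        None => (
            NodeKeys {
                kem_public_key_der: kem_key.to_vec(),
                gossip_signing_pub_key_der: signing_key.to_vec(),
            },
            RegisterOutcome::InsertedFresh,
        ),
        // Upsert-signing-key-only: keep the known KEM key, add the signing key.
        Some(mut entry) if entry.gossip_signing_pub_key_der.is_empty() => {
            entry.gossip_signing_pub_key_der = signing_key.to_vec();
            (entry, RegisterOutcome::SigningKeyAdded)
        }
        // No-op: both keys already set, so the pinned values stay untouched.
        Some(entry) => (entry, RegisterOutcome::NoOp),
    }
}
```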
Background loop
routes::gossip::run(state) is spawned from main.rs after AppState is constructed.
When ipa_topology = true, a separate task (topology::run_topology_refresh) is also
spawned; it populates AppState::dynamic_peers and AppState::dynamic_allowed_nodes
before the first gossip round.
tokio::spawn(routes::gossip::run(state.clone()));
// When ipa_topology = true:
tokio::spawn(topology::run_topology_refresh(state.clone()));
The gossip loop maintains two per-peer maps:
- peer_last_gen[peer]: the local CRDT_GENERATION after the last successful sync with this peer. Used to skip pushing when nothing has changed locally, and to compute the delta payload (delta_since(peer_last_gen[peer])).
- peer_response_gen[peer]: the peer’s CRDT_GENERATION reported in their last response envelope. Sent back in the next push as request_delta_since so the peer can respond with only new entries it has written since that generation.
Both maps are cleared on any error so the next round falls back to a full-state exchange.
loop:
sleep(interval_secs) [or wake on gossip_notify signal]
purge_expired_families(now) // remove expired refresh families before push
current_gen = CRDT_GENERATION.load()
// effective peer list = gossip.peers ∪ dynamic_peers (from IPA topology)
// stale entries for removed peers are pruned from both maps
any_synced = false
for each peer in (gossip.peers + state.dynamic_peers):
if peer_last_gen[peer] == current_gen: skip (CRDT unchanged, log DEBUG)
find peer's KEM key in local CRDT (hostname match on node_id)
if no KEM key: warn and skip
// Compute payload
if peer_last_gen[peer] exists:
payload = crdt.delta_since(peer_last_gen[peer]) // sparse delta
is_delta = true
else:
payload = crdt.clone() // full state
is_delta = false
envelope = GossipEnvelope {
crdt: CBOR(payload),
issued_at: now,
is_delta,
my_gen: current_gen,
request_delta_since: peer_response_gen.get(peer),
}
sign_and_seal(CBOR(envelope), [peer_kem_spki])
POST {peer}/api/gossip/sync
if response is 404 and peer was topology-discovered: log at DEBUG (peer not yet running ahdapa); continue
if success:
verify_and_open(response, own_kem_priv, peer_signing_pub)
CBOR deserialize → peer_envelope (GossipEnvelope)
CBOR deserialize peer_envelope.crdt → peer_crdt // may be delta or full state
merge into local CRDT; purge_expired_families(now)
persist_to_db
if persist fails: gossip_stats.persist_errors += 1
peer_last_gen[peer] = CRDT_GENERATION.load() // post-merge gen
if peer_envelope.my_gen > 0:
peer_response_gen[peer] = peer_envelope.my_gen
if peer's wrapping_key_id ≠ local_wrapping_key_id:
GET {peer}/api/gossip/wrapping-key
→ SignedData(EnvelopedData) + X-Ahdapa-Node-Id header
look up peer's pinned signing key; reject if absent
verify_and_open(blob, own_kem_priv, peer_signing_pub) → raw_key
update in-memory key pair; persist to node_keys
if pull fails:
clear peer_last_gen[peer]; clear peer_response_gen[peer]
gossip_stats.wrapping_key_pull_errors += 1
any_synced = true
if error: clear peer_last_gen[peer]; clear peer_response_gen[peer]
// Update round statistics — only when at least one peer synced successfully
if any_synced:
gossip_stats.rounds_completed += 1
gossip_stats.last_round_at = now
every ~1 hour:
cleanup_expired_families(db, now) // DB-level purge
cleanup_old_tombstones(db, now - tombstone_ttl_secs)
If a peer is unreachable, the error is logged and the loop continues to the next peer. The loop does not back off — it retries on every interval.
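The bookkeeping around the two per-peer maps in the loop above can be sketched in isolation. PeerGenerations and its method names are hypothetical; the update rules (record both values on success, ignore my_gen = 0, clear both on error, prune removed peers) follow the pseudocode.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct PeerGenerations {
    /// Local generation after the last successful sync with each peer.
    peer_last_gen: HashMap<String, u64>,
    /// The peer's own generation as reported in its last response.
    peer_response_gen: HashMap<String, u64>,
}

impl PeerGenerations {
    /// True when nothing changed locally since the last successful sync.
    fn should_skip(&self, peer: &str, current_gen: u64) -> bool {
        self.peer_last_gen.get(peer) == Some(&current_gen)
    }

    /// Value to send as request_delta_since in the next push.
    fn request_delta_since(&self, peer: &str) -> Option<u64> {
        self.peer_response_gen.get(peer).copied()
    }

    fn record_success(&mut self, peer: &str, post_merge_gen: u64, peer_my_gen: u64) {
        self.peer_last_gen.insert(peer.to_string(), post_merge_gen);
        if peer_my_gen > 0 {
            self.peer_response_gen.insert(peer.to_string(), peer_my_gen);
        }
    }

    /// Any error forces the next round back to a full-state exchange.
    fn record_error(&mut self, peer: &str) {
        self.peer_last_gen.remove(peer);
        self.peer_response_gen.remove(peer);
    }

    /// Drops state for peers that are no longer in the effective peer list.
    fn prune(&mut self, active_peers: &[&str]) {
        self.peer_last_gen.retain(|peer, _| active_peers.contains(&peer.as_str()));
        self.peer_response_gen.retain(|peer, _| active_peers.contains(&peer.as_str()));
    }
}
```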
The topology refresh task runs an initial fetch immediately on startup (before the first
gossip sleep) and then sleeps for ipa_topology_interval_secs (minimum 30 s, default
300 s). On LDAP error, the previous peer list is kept unchanged and a warning is logged.
After each successful topology fetch, if gss_initiator is available, the topology task
also calls register_self_with_peer() for each newly-discovered peer whose KEM key is
not yet in the CRDT. This pre-seeds the key via POST /api/gossip/register-kem with a
Kerberos AP-REQ so that the legitimate node wins the OR-Map first-write-wins race before
the first gossip round fires.
Cluster wrapping key
The 32-byte cluster wrapping key (used for session cookies) is stored node-locally
in node_keys.wrapping_key_cms_der as a CMS EnvelopedData blob sealed to the node’s
own ML-KEM-768 public key. It is never gossiped in plaintext or as a multi-recipient
blob.
Only a short UUID string (wrapping_key_id) is gossiped in the CRDT. When a node
observes a different UUID after a gossip merge, it fetches the actual key on demand:
GET /api/gossip/wrapping-key
X-Ahdapa-Node-Id: <requester's node_id>
Response: 200 OK
Content-Type: application/octet-stream
X-Ahdapa-Node-Id: <responder's node_id>
Body: SignedData(EnvelopedData) DER
The response is a full SignedData(EnvelopedData) blob produced by sign_and_seal(),
sealed to exactly one recipient (the requester’s ML-KEM-768 public key) and signed
with the responder’s ECDSA P-256 gossip signing key. The requester looks up the
responder’s pinned signing key in the CRDT (from the X-Ahdapa-Node-Id header) and
calls verify_and_open(). A response from a node with no pinned signing key is
rejected. Confidentiality is ensured by the inner ML-KEM-768 encryption; integrity and
sender authentication are ensured by the outer ECDSA P-256 signature.
Node statistics endpoint
GET /api/gossip/stats
Intentionally unauthenticated, like /api/gossip/kem-info, because the admin web UI
fetches it before an admin session is established. Only aggregate counts and gossip
health indicators are exposed; no key material, user data, or token content is returned.
Response body (JSON):
{
"node_id": "ipa1.example.com",
"crdt_generation": 42,
"counts": {
"clients": 3,
"signing_keys": 2,
"cluster_nodes": 3,
"refresh_families": 7,
"revoked_sessions": 1,
"scope_definitions": 8,
"ipa_idp_overrides": 0
},
"peers": ["https://ipa2.example.com/idp", "https://ipa3.example.com/idp"],
"active_signing_kid": "abc123",
"kem_enrolled": true,
"gossip_signing_enrolled": true,
"gossip": {
"started_at": 1716000000,
"rounds_completed": 12,
"last_round_at": 1716000060,
"peer_last_sync": { "ipa2.example.com": 1716000058, "ipa3.example.com": 1716000059 },
"persist_errors": 0,
"wrapping_key_pull_errors": 0
}
}
Field notes:
- crdt_generation: current value of the CRDT_GENERATION atomic counter.
- counts.*: live (non-tombstoned) entry counts for each CRDT collection.
- peers: union of configured gossip.peers and topology-discovered peers.
- active_signing_kid: the kid of the currently active JWT signing key.
- kem_enrolled / gossip_signing_enrolled: whether both cryptographic identities are registered in the CRDT.
- gossip.started_at: Unix timestamp when the gossip background task started.
- gossip.rounds_completed: number of gossip rounds in which at least one peer was successfully synced. Idle rounds (CRDT unchanged, all pushes skipped) and rounds where all peers fail do not increment this counter.
- gossip.last_round_at: Unix timestamp of the most recent round that synced at least one peer. null until the first successful sync.
- gossip.peer_last_sync: Unix timestamp of the most recent successful inbound sync from each peer (recorded by the /api/gossip/sync receiver).
- gossip.persist_errors: cumulative DB persist failures since startup (incremented after both inbound sync and outbound merge failures).
- gossip.wrapping_key_pull_errors: cumulative failures to pull the cluster wrapping key from a peer after detecting a UUID change.
This endpoint is used by the admin web UI Cluster Nodes page to display per-node runtime gossip health alongside the static CRDT node entries from GET /api/admin/nodes.
Kerberos KEM self-registration endpoint
POST /api/gossip/register-kem
Authorization: Negotiate <base64-AP-REQ>
Content-Type: application/json
{
"node_id": "<hostname>",
"kem_public_key_der": "<base64url-ML-KEM-768-SPKI-DER>",
"gossip_signing_pub_key_der": "<base64url-ECDSA-P256-SPKI-DER>"
}
Used by topology-discovered peers to seed both their ML-KEM-768 public key and their
ECDSA P-256 gossip signing key before the first gossip round. All three fields are
required; missing or empty fields return 400 Bad Request. The server:
- Returns 503 Service Unavailable if the GSSAPI server credential is unavailable (state.gss_cred is None, which indicates a configuration or keytab problem).
- Calls try_spnego() to accept the Kerberos AP-REQ. Returns 401 Negotiate if absent, 401 if the token is invalid.
- Calls ServicePrincipal::parse() on the authenticated principal. Rejects with 403 if the principal is not HTTP/<host>@<REALM> (user principals and non-HTTP service types are excluded).
- When gossip.kerberos_realm is set, rejects with 403 if the principal’s realm does not match, which prevents cross-realm trust escalation.
- Checks that req.node_id.to_lowercase() == authed_host. Rejects with 403 if they differ: a machine can only register its own identity.
- Checks that authed_host is in the topology-derived or static allowlist. Rejects with 403 if not admitted.
- Applies a three-case match on the existing CRDT entry for this node_id:
  - Insert-fresh: neither key known → insert NodeEntry with both keys.
  - Upsert-signing-key-only: KEM key present but gossip_signing_pub_key_der empty → update the entry to add the signing key.
  - No-op: both keys already present → return 200 OK immediately (idempotent).
- Returns 200 OK, optionally with a WWW-Authenticate: Negotiate <mutual-auth-token> header if GSSAPI produced a mutual-authentication output token.
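The 403 checks in the list above (principal shape, realm, node_id match, allowlist) can be sketched as one function. This is an illustration only: parse_http_principal is a hypothetical stand-in for ServicePrincipal::parse, whose real behaviour is not shown in this document, and lowercasing the host inside the parser is an assumption made so that the comparison with req.node_id.to_lowercase() works.

```rust
/// Every variant maps to a 403 response in the handler described above.
#[derive(Debug, PartialEq)]
enum Rejection {
    NotHttpServicePrincipal,
    RealmMismatch,
    NodeIdMismatch,
    NotAllowlisted,
}

/// Hypothetical parser: accepts only "HTTP/<host>@<REALM>" and returns
/// (lowercased host, realm). User principals and other services yield None.
fn parse_http_principal(principal: &str) -> Option<(String, String)> {
    let (service_host, realm) = principal.rsplit_once('@')?;
    let (service, host) = service_host.split_once('/')?;
    if service != "HTTP" || host.is_empty() || realm.is_empty() {
        return None;
    }
    Some((host.to_lowercase(), realm.to_string()))
}

/// Returns the authenticated hostname when the caller may register keys
/// for `node_id`, or the reason for rejection.
fn authorize_registration(
    principal: &str,
    node_id: &str,
    expected_realm: Option<&str>, // gossip.kerberos_realm; None means unchecked
    allowlist: &[&str],
) -> Result<String, Rejection> {
    let (authed_host, realm) =
        parse_http_principal(principal).ok_or(Rejection::NotHttpServicePrincipal)?;
    if let Some(expected) = expected_realm {
        if realm != expected {
            return Err(Rejection::RealmMismatch);
        }
    }
    // A machine can only register its own identity.
    if node_id.to_lowercase() != authed_host {
        return Err(Rejection::NodeIdMismatch);
    }
    if !allowlist.contains(&authed_host.as_str()) {
        return Err(Rejection::NotAllowlisted);
    }
    Ok(authed_host)
}
```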
At startup, bootstrap_wrapping_key() reads node_keys.wrapping_key_cms_der. If
present, it decrypts the blob to recover the 32-byte key. If absent (first start), it
generates a fresh key, seals it to the node’s own KEM key, and stores the result in
node_keys. A UUID is generated and published to the CRDT as wrapping_key_id with
timestamp=1 so that the established cluster’s UUID wins the LWW merge on the first
gossip round.
When the cluster wrapping key is rotated via PUT /api/admin/keys/cluster, the node
re-seals the new key to its own KEM key, stores it in node_keys, and updates
crdt.wrapping_key_id to a new UUID. Peers detect the UUID change via gossip and pull
the new key via the on-demand endpoint.
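The effect of publishing a fresh wrapping_key_id with timestamp=1 can be shown with a toy last-writer-wins register. LwwRegister and needs_wrapping_key_pull are hypothetical names, and the tie-break on value for equal timestamps is an assumption of this sketch, not a documented property of the real CRDT.

```rust
/// Toy last-writer-wins register holding the gossiped wrapping_key_id.
#[derive(Clone, Debug, PartialEq)]
struct LwwRegister {
    value: String,
    timestamp: u64,
}

impl LwwRegister {
    /// Keeps whichever side has the higher timestamp (ties broken by value
    /// so that both nodes converge on the same answer).
    fn merge(&mut self, other: &LwwRegister) {
        if (other.timestamp, &other.value) > (self.timestamp, &self.value) {
            *self = other.clone();
        }
    }
}

/// After a merge, a node pulls the key on demand when the UUID in the
/// CRDT differs from the UUID of the key it holds locally.
fn needs_wrapping_key_pull(local_wrapping_key_id: &str, merged: &LwwRegister) -> bool {
    local_wrapping_key_id != merged.value
}
```

A freshly bootstrapped node with timestamp 1 loses the merge against any established cluster value with a larger timestamp, in either merge direction, and then pulls the cluster key.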
Convergence
| Scenario | Convergence |
|---|---|
| Single node | Instant (no peers) |
| Two-node cluster (KEM keys known) | After 1 gossip round (≤ interval_secs seconds) |
| Three-node cluster, all connected | After 1–2 gossip rounds |
| Partition healed after T seconds | After ≤ 2 gossip rounds from partition heal |
| New node joining (static peers) | After 2 gossip rounds (learn KEM key → pull wrapping key via on-demand endpoint) |
| New node joining (IPA topology, gss_initiator set) | After 1 gossip round: both the KEM key and the gossip signing key are pre-seeded via Kerberos register-kem before the first gossip push, and the wrapping key is pulled on the first exchange. Requires both nodes to complete their mutual register-kem calls before the first gossip interval fires; this holds in practice because the topology refresh runs immediately on startup, before the first gossip sleep. |
New signing key propagation: a key added on node A is available on node B after at most 1 gossip round from A to B. Resource servers should cache JWKS with a short TTL (≤ interval_secs × 2) to avoid key-not-found errors during propagation.
Security considerations
| Property | Value |
|---|---|
| Confidentiality | AES-256-GCM per-recipient (inner EnvelopedData) |
| Integrity | AES-256-GCM auth tag + ECDSA P-256 signature |
| Sender authentication | ECDSA P-256 over eContent (outer SignedData); signing key pinned via register-kem before first gossip |
| Node admission control | allowed_node_ids allowlist (layer 1, fail-closed on empty) + self-registration rule (layer 2) |
| Replay prevention | GossipEnvelope.issued_at checked against now - tombstone_ttl_secs |
| Post-quantum | ML-KEM-768 for key encapsulation (FIPS 203) |
- /api/gossip/sync SHOULD be firewalled to the cluster’s subnet as defense-in-depth. CMS encryption ensures confidentiality even if traffic is captured, but network isolation prevents unauthorized nodes from attempting to self-register.
- The allowlist is fail-closed. When both the static allowed_node_ids list and the topology-derived allowlist are empty, no node can self-register via gossip or the wrapping-key endpoint. This is intentional: operators must either configure an explicit allowlist or enable ipa_topology so that hostnames are discovered automatically.
- Gossip envelopes carry a timestamp (issued_at). Envelopes older than tombstone_ttl_secs (default 7 days) are rejected, preventing an attacker from replaying a captured gossip message after its tombstones have been GC-purged.
- The gossip reqwest::Client has a 10-second request timeout. Slow peers do not block the gossip loop.
- ECDSA P-256, not Ed25519, is used for gossip signing. OpenSSL’s CMS_sign() API requires a key type that has a default digest algorithm; Ed25519 (PureEdDSA) does not satisfy this requirement. P-256 provides the same 128-bit security level.
- The JWT signing algorithm is configurable, not gossip signing. Each node generates its own JWT signing key pair (algorithm set by [server] jwt_signing_algorithm, default: ES256) stored in node_keys.jwt_signing_priv_der. The private key never leaves the node; only the public key is gossiped in SigningKeyEntry. This is distinct from the ECDSA P-256 gossip signing key.
Payload size and bandwidth
Raw field sizes
The binary fields that dominate gossip payload size (measured from a three-node demo cluster):
| Field | Bytes | Gossiped |
|---|---|---|
| ML-KEM-768 public key SPKI (NodeEntry.kem_public_key_der) | 1,206 | Yes |
| ECDSA P-256 gossip signing pub key SPKI (NodeEntry.gossip_signing_pub_key_der) | 91 | Yes |
| JWT signing private key DER (SigningKeyEntry.private_key_der) | varies by algorithm | No: #[serde(skip_serializing)]; stays in node_keys |
| JWT signing public key SPKI DER (SigningKeyEntry.public_key_der) | varies by algorithm | Yes |
| ECDSA P-256 gossip signing certificate (node_keys.signing_certificate_der) | 291 | No (local only) |
| ML-KEM-768 private key PKCS#8 (node_keys.private_key_der) | 2,498 | No (local only) |
The ML-KEM-768 public key is the dominant field by a factor of ~13× over the next largest gossiped value.
Per-entity CBOR contribution
The CRDT is serialised as CBOR (ciborium). CBOR stores binary fields as raw bytes (no base64 overhead). Approximate CBOR size per entry:
| Entity | ~CBOR bytes | Dominant field |
|---|---|---|
| NodeEntry (one cluster node) | ~1,530 B | ML-KEM-768 pub key (1,206 B raw) |
| SigningKeyEntry (one JWT signing key) | ~100 B (ES256) to ~2,600 B (ML-DSA-87) | JWT public key (size varies by algorithm); private key not gossiped |
| ClientEntry (typical OAuth2 client) | ~150 B | UUIDs + scopes; short serde field names (2 chars) keep the CBOR compact |
| RefreshFamilyState (one active session) | ~100 B | UUIDs + counters |
CMS envelope overhead
Each gossip message is sent to exactly one peer, so the CMS overhead is constant regardless of cluster size:
| Layer | Bytes |
|---|---|
| Outer SignedData headers + ECDSA P-256 signature (64 B) + signer cert (291 B) | ~555 B |
| Inner EnvelopedData KEMRecipientInfo: ML-KEM-768 ct (1,088 B) + wrapped CEK (40 B) + headers | ~1,233 B |
| AEAD overhead (12 B nonce + 16 B GCM tag) | 28 B |
| Fixed CMS overhead per gossip message | ~1,816 B |
Wire size per gossip push
Total CMS-encrypted wire size for one outbound push (one recipient). The cluster wrapping key blob is no longer in the gossip body; only its UUID is gossiped:
| Scenario | Wire bytes |
|---|---|
| 3 nodes, 3 signing keys, 4 clients, 0 sessions (demo measured) | ~7,454 B |
| 3 nodes, 5 clients, 50 sessions | ~8 KB |
| 5 nodes, 5 clients, 50 sessions | ~11 KB |
| 10 nodes, 10 clients, 100 sessions | ~19 KB |
Bandwidth per gossip cycle
The topology is full-mesh: each node pushes to every configured peer and receives a response. Total cluster bandwidth per cycle (worst case, full-state) = N × (N−1) × 2 × wire_bytes. In practice, delta exchange reduces per-push payload to the size of changed entries only.
These figures are theoretical maximums — they assume every gossip round produces an actual push. In practice, the generation-skip optimisation suppresses pushes when the CRDT has not changed since the last successful round. In the demo cluster (active token issuance, no schema mutations), 93% of rounds were skipped, reducing steady-state bandwidth to near zero. Pushes happen only when the CRDT actually changes (client creation/deletion, key rotation, node join/leave).
At interval_secs = 2 (demo default; config default is 5 s):
| Nodes | Wire/msg | Per round (2 s) | Per hour (theoretical max) |
|---|---|---|---|
| 2 | 7.5 KB | 30 KB | 54 MB |
| 3 | 7.5 KB | 90 KB | 162 MB |
| 5 | 10.5 KB | 420 KB | 756 MB |
| 10 | 19 KB | 3.4 MB | 6.1 GB |
The O(N²) topology is practical for the expected deployment range of 2–5 nodes. Above ~10 nodes the bandwidth cost becomes significant and a partial-mesh peer configuration (each node lists only a subset of peers) should be considered.
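The worst-case formula N × (N−1) × 2 × wire_bytes is easy to evaluate for other cluster sizes and intervals. The helper names below are hypothetical, and the sketch assumes the response is the same size as the push, as the formula does.

```rust
/// Worst-case bytes per gossip cycle in a full mesh: every node pushes to
/// every other node and receives a response of the same size.
fn bytes_per_round(nodes: u64, wire_bytes: u64) -> u64 {
    nodes * nodes.saturating_sub(1) * 2 * wire_bytes
}

/// Theoretical maximum per hour, assuming every round produces a push.
/// An interval of 0 is treated as 1 second to avoid dividing by zero.
fn bytes_per_hour(nodes: u64, wire_bytes: u64, interval_secs: u64) -> u64 {
    let rounds_per_hour = 3600 / interval_secs.max(1);
    bytes_per_round(nodes, wire_bytes) * rounds_per_hour
}
```

For three nodes at 7,500 B per message, bytes_per_round gives 90,000 B per round and bytes_per_hour with a 2-second interval gives 162,000,000 B, matching the three-node row of the table.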
Marginal cost per added entity
Measured at a three-node baseline:
| Change | Extra bytes per gossip message |
|---|---|
| +1 cluster node | +~1,530 B (NodeEntry: ML-KEM-768 pub key 1,206 B + other fields; CBOR-encoded, no wrapping key blob) |
| +1 removed node (tombstone) | +~220 B (tombstone metadata; key fields absent) |
| +1 OAuth2 client | +~150 B |
| +1 active refresh token family (session) | +~100 B |
| +1 JWT signing key (rotation) | +~100 B (public key only; private key not gossiped) |
Sessions and clients are cheap. Nodes are the dominant cost because every node contributes 1,206 B of ML-KEM-768 public key material (gossiped in NodeEntry). The previous per-node cost of +1,644 B from the cluster wrapping key CMS blob is eliminated — the wrapping key is no longer gossiped.
Tombstone accumulation and GC
Removed nodes, deleted clients, and revoked signing keys leave OrMap tombstones. Each
tombstone adds ~220 B to gossip messages until it is garbage-collected. Tombstones older
than gossip.tombstone_ttl_secs (default: 7 days) are purged from both the in-memory
CRDT and the database approximately once per hour. This bounds tombstone growth even in
high-churn deployments.
The TTL must exceed the longest expected node downtime: a node that is offline longer than the TTL may re-gossip entries that were since deleted (those entries would be re-merged on reconnection). The default 7-day TTL is conservative and suitable for most deployments.
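Tombstone GC reduces to dropping every tombstone whose deletion time is older than now - tombstone_ttl_secs. The sketch below uses a hypothetical Slot enum and a plain map in place of the real OrMap; only the cutoff rule is taken from the text.

```rust
use std::collections::BTreeMap;

/// A map slot is either a live value or a tombstone with its deletion time.
#[derive(Debug, PartialEq)]
enum Slot {
    Live(String),
    Tombstone { deleted_at: i64 },
}

/// Removes tombstones older than the TTL and returns how many were purged.
/// Live entries are never touched.
fn purge_old_tombstones(
    map: &mut BTreeMap<String, Slot>,
    now: i64,
    tombstone_ttl_secs: i64,
) -> usize {
    let cutoff = now - tombstone_ttl_secs;
    let before = map.len();
    map.retain(|_, slot| match slot {
        Slot::Tombstone { deleted_at } => *deleted_at >= cutoff,
        Slot::Live(_) => true,
    });
    before - map.len()
}
```

A node that was offline for longer than the TTL never sees the purged tombstone, which is why its stale live entry can be re-merged on reconnection.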
Configuration
| Key | Default | Description |
|---|---|---|
| peers | [] | Peer node base URLs. Gossip is disabled when this list is empty and ipa_topology is false. |
| interval_secs | 5 | Push interval in seconds. |
| allowed_node_ids | [] | Allowlist of node_id values permitted to self-register. The allowlist is always enforced: an empty union of both the static list and the topology-derived list admits nobody (fail-closed). When ipa_topology = true, discovered replica hostnames are appended automatically. |
| tombstone_ttl_secs | 604800 | Seconds to retain OR-Map tombstones before GC. Also the maximum age of accepted gossip envelopes (issued_at window). Must exceed the longest expected node downtime. Default: 7 days. |
| ipa_topology | false | When true, a background task (src/topology.rs) queries cn=topology,cn=ipa,cn=etc,<suffix> for ipaReplTopoSegment entries and derives gossip peer URLs of the form https://<hostname><base_path>. The peer list is stored in AppState::dynamic_peers and the allowlist in AppState::dynamic_allowed_nodes; both are merged with any statically configured peers and allowed_node_ids at each gossip round. |
| ipa_topology_interval_secs | 300 | How often (seconds) to re-query the IPA topology. Only used when ipa_topology = true. Minimum: 30 s. |
| kerberos_realm | (unset) | Expected Kerberos realm for register-kem callers (e.g. "IPA.EXAMPLE.COM"). When set, principals whose realm does not match are rejected with 403, preventing cross-realm trust escalation. When unset, realm is not checked. |
[gossip]
peers = ["https://node2.example.com:8080", "https://node3.example.com:8080"]
interval_secs = 5
allowed_node_ids = ["node1.example.com", "node2.example.com", "node3.example.com"]
tombstone_ttl_secs = 604800 # 7 days
For IPA-integrated deployments, the static peers list can be omitted entirely when
ipa_topology = true. A single-node deployment requires no gossip configuration at all.