
Mesh Networking

RivetOS supports a multi-node mesh that lets agents delegate tasks to each other across instances. One node can ask another node’s agent to handle a task — even if that agent runs on different hardware or uses a different LLM provider.


Every mesh-enabled node runs an agent channel — an HTTPS server that receives delegated tasks and routes them to the local DelegationEngine. When one node needs an agent it doesn’t host locally, it looks up the target node in the shared mesh.json registry and sends the task over mTLS.

┌─────────────────────────────────┐      HTTPS/mTLS       ┌─────────────────────────────────┐
│ ct110 - opus                    │ ────────────────────▶ │ ct111 - grok                    │
│                                 │   POST /api/message   │                                 │
│ MeshDelegationEngine            │                       │ AgentChannelServer (port 3000)  │
│ mesh.json (NFS r/w)             │ ◀──────────────────── │ mesh.json (NFS r/w)             │
└─────────────────────────────────┘   delegation result   └─────────────────────────────────┘
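The lookup step in the diagram can be sketched as follows. This is a minimal illustration, not the actual MeshDelegationEngine code: the registry shape (`MeshRegistry`, `MeshNode`, and the `agents` and `port` fields) is hypothetical, since the real mesh.json schema is not shown here. Only the `.mesh` hostname convention and the `/api/message` path come from this page.

```typescript
// Hypothetical registry shape; the real mesh.json schema may differ.
interface MeshNode {
  name: string;     // node name, e.g. "ct111"
  agents: string[]; // agents hosted on that node (assumed field)
  port: number;     // agent channel port (assumed field)
}

interface MeshRegistry {
  nodes: MeshNode[];
}

// Return the first node that hosts the requested agent, or undefined if none does.
function findNodeForAgent(registry: MeshRegistry, agent: string): MeshNode | undefined {
  for (const node of registry.nodes) {
    if (node.agents.indexOf(agent) !== -1) {
      return node;
    }
  }
  return undefined;
}

// Build the delegation URL from the <nodeName>.mesh name so the cert SAN matches.
function delegationUrl(node: MeshNode): string {
  return `https://${node.name}.mesh:${node.port}/api/message`;
}

const registry: MeshRegistry = {
  nodes: [
    { name: "ct110", agents: ["opus"], port: 3000 },
    { name: "ct111", agents: ["grok"], port: 3000 },
  ],
};

const target = findNodeForAgent(registry, "grok");
console.log(target ? delegationUrl(target) : "no node hosts this agent");
// prints "https://ct111.mesh:3000/api/message"
```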

All nodes read and write a single mesh.json file at /rivet-shared/mesh.json (NFS-mounted from the datahub CT). This is the source of truth — no extra coordination service needed.

| Mode | How it works |
| --- | --- |
| static | Peer list hard-coded in config. Good for stable infra. |
| seed | New node contacts a seed's /api/mesh endpoint to bootstrap its view. |
| mdns | mDNS discovery (local network). |

As of Phase 0.5, all mesh agent-channel traffic uses mutual TLS (mTLS). There is no plaintext fallback and no bearer-token authentication on the agent channel. A peer that presents a certificate signed by the mesh CA is trusted; every other connection is rejected during the TLS handshake.

  1. Each node has a certificate issued by the mesh CA (/rivet-shared/rivet-ca/).
  2. The agent channel server requires a client cert and verifies it against the CA chain.
  3. The delegation client builds an mTLS connection using the same cert pair.
  4. Connections to remote nodes use <nodeName>.mesh DNS names so the cert SANs match.

/rivet-shared/rivet-ca/
  intermediate/
    ca-chain.pem          ← CA chain (validates all node certs)
  issued/
    ct110.crt             ← ct110 node cert (CN=ct110, SAN=ct110.mesh + mesh IP)
    ct110.key             ← ct110 node private key
    ct111.crt / .key      ← same for ct111…ct114
    <agent>@<node>.crt    ← agent certs (reserved, unused on the wire in Phase 0.5)

Permissions: rivet:rivet, NFS-visible on all nodes.
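The server and client sides of the handshake described above can be sketched as two option builders. This is an illustrative sketch, not RivetOS source: `NodePem`, `serverTlsOptions`, and `clientTlsOptions` are hypothetical names. The option keys themselves (`ca`, `cert`, `key`, `requestCert`, `rejectUnauthorized`, `hostname`, `port`) are the standard Node.js TLS/HTTPS options, so the returned objects could be passed to `https.createServer` and `https.request` respectively.

```typescript
// PEM contents for one node, already read from disk by the caller.
interface NodePem {
  ca: string;   // contents of intermediate/ca-chain.pem
  cert: string; // contents of issued/<node>.crt
  key: string;  // contents of issued/<node>.key
}

// Agent channel server: demand a client cert and verify it against the CA chain.
function serverTlsOptions(pem: NodePem) {
  return {
    ca: pem.ca,
    cert: pem.cert,
    key: pem.key,
    requestCert: true,        // ask every client for a certificate
    rejectUnauthorized: true, // fail the handshake if it does not chain to the CA
  };
}

// Delegation client: present the same cert pair and trust only the mesh CA.
function clientTlsOptions(pem: NodePem, nodeName: string, port: number) {
  return {
    hostname: `${nodeName}.mesh`, // must match a SAN in the server's cert
    port,
    ca: pem.ca,
    cert: pem.cert,
    key: pem.key,
    rejectUnauthorized: true,     // never disable verification
  };
}
```

Keeping both builders fed from the same `NodePem` mirrors the point that a node uses one cert pair for both roles.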


Minimal mesh config (tls: true → default paths)

mesh:
  enabled: true
  node_name: ct110            # must match the cert CN
  tls: true                   # use /rivet-shared/rivet-ca/issued/<node_name>.{crt,key}
  agent_channel_port: 3000
  storage_dir: /rivet-shared
  heartbeat_interval_ms: 30000
  stale_threshold_ms: 90000
  discovery:
    mode: seed
    seed_host: ct110.mesh     # use the .mesh hostname so it matches the cert SAN
    seed_port: 3000

Explicit certificate paths (tls as an object)

mesh:
  enabled: true
  node_name: ct110
  tls:
    ca_path: /rivet-shared/rivet-ca/intermediate/ca-chain.pem
    cert_path: /rivet-shared/rivet-ca/issued/ct110.crt
    key_path: /rivet-shared/rivet-ca/issued/ct110.key
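The two config forms can be read as one resolution step: `tls: true` expands to the default paths for the node, while an object supplies explicit paths. The sketch below illustrates that idea under stated assumptions. `resolveTlsPaths` is a hypothetical helper, and both the per-key fallback to defaults and the error on `tls: false` are assumptions rather than documented behaviour.

```typescript
interface TlsPaths {
  ca_path: string;
  cert_path: string;
  key_path: string;
}

const CA_DIR = "/rivet-shared/rivet-ca";

// Resolve the mesh.tls setting into concrete PEM paths for a node.
function resolveTlsPaths(
  tls: boolean | Partial<TlsPaths> | undefined,
  nodeName: string,
): TlsPaths {
  // Assumption: a missing or false value is treated as "TLS not configured".
  if (tls === undefined || tls === false) {
    throw new Error("mesh.tls is required: the mesh refuses to start without it");
  }
  const defaults: TlsPaths = {
    ca_path: `${CA_DIR}/intermediate/ca-chain.pem`,
    cert_path: `${CA_DIR}/issued/${nodeName}.crt`,
    key_path: `${CA_DIR}/issued/${nodeName}.key`,
  };
  if (tls === true) {
    return defaults;
  }
  // Assumption: keys omitted from the object fall back to their defaults.
  return {
    ca_path: tls.ca_path ?? defaults.ca_path,
    cert_path: tls.cert_path ?? defaults.cert_path,
    key_path: tls.key_path ?? defaults.key_path,
  };
}

console.log(resolveTlsPaths(true, "ct110").cert_path);
// prints "/rivet-shared/rivet-ca/issued/ct110.crt"
```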

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| mesh.enabled | bool | false | Enable mesh networking. |
| mesh.node_name | string | hostname | Node identifier; must match the cert CN. |
| mesh.tls | bool or object | | mTLS config. Required: the mesh refuses to start without it. |
| mesh.tls.ca_path | string | /rivet-shared/rivet-ca/intermediate/ca-chain.pem | CA chain PEM path. |
| mesh.tls.cert_path | string | /rivet-shared/rivet-ca/issued/<node_name>.crt | Node cert PEM path. |
| mesh.tls.key_path | string | /rivet-shared/rivet-ca/issued/<node_name>.key | Node private key PEM path. |
| mesh.agent_channel_port | number | 3000 | HTTPS port for the agent channel. |
| mesh.storage_dir | string | /rivet-shared | Directory containing mesh.json. |
| mesh.heartbeat_interval_ms | number | 30000 | How often to write a heartbeat. |
| mesh.stale_threshold_ms | number | 90000 | Age before a node is considered stale. |
| mesh.discovery.mode | string | | One of seed, static, mdns. |
| mesh.discovery.seed_host | string | | Seed node hostname. Use <nodeName>.mesh. |
| mesh.discovery.seed_port | number | 3100 | Seed node port. |
| mesh.secret | string | | Deprecated: no longer used for agent-channel auth. Retained for update --mesh orchestration. |
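The relationship between `heartbeat_interval_ms` and `stale_threshold_ms` can be illustrated with a small staleness check. This is a sketch only: `isStale` is a hypothetical helper, and whether the real comparison is strict (`>`) or inclusive (`>=`) at the boundary is an assumption. With the defaults, the threshold spans three heartbeat intervals.

```typescript
const HEARTBEAT_INTERVAL_MS = 30000; // mesh.heartbeat_interval_ms default
const STALE_THRESHOLD_MS = 90000;    // mesh.stale_threshold_ms default

// A node counts as stale once its last heartbeat is older than the threshold.
// Assumption: the boundary is exclusive (an age equal to the threshold is still fresh).
function isStale(
  lastHeartbeatMs: number,
  nowMs: number,
  thresholdMs: number = STALE_THRESHOLD_MS,
): boolean {
  return nowMs - lastHeartbeatMs > thresholdMs;
}

// How many heartbeat intervals fit inside the stale threshold.
const intervalsBeforeStale = STALE_THRESHOLD_MS / HEARTBEAT_INTERVAL_MS;

console.log(intervalsBeforeStale); // prints 3
console.log(isStale(0, 90000));    // prints false
console.log(isStale(0, 90001));    // prints true
```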

dnsmasq on every CT resolves <nodeName>.mesh to the node’s mesh IP. Always use .mesh names for seed hosts and anywhere you reference a peer by URL. This ensures the cert SAN matches the connection hostname and TLS succeeds without rejectUnauthorized: false.


All endpoints are served over HTTPS. The TLS handshake requires a valid client certificate; connections without one are rejected before any HTTP code runs.

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/mesh/ping | Liveness probe. Returns { status, node, tls, cn }. |
| POST | /api/message | Receive a delegated task. Body: MessageRequest. |
| GET | /api/mesh | Return mesh registry for seed sync. |
| GET | /api/agents | List local agents. |

Every accepted request logs peer.cn=<nodeName>. You can grep for it in journalctl -u rivetos or wherever your log sink is:

INFO [AgentChannel] Received mesh delegation peer.cn=ct110 from opus → grok: Summarise...

TLS handshake failures log at WARN:

WARN [AgentChannel] TLS handshake failed from 192.168.10.112: peer did not return a certificate
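For scripted log checks, the two line formats above can be parsed with a pair of regular expressions. The helpers below are hypothetical and rely only on the sample lines shown here; the real log format may carry extra fields, and the address pattern assumes an IPv4 address or hostname followed by a colon.

```typescript
// Extract the peer CN from an accepted-request line, or null if the line has none.
function extractPeerCn(line: string): string | null {
  const match = /peer\.cn=(\S+)/.exec(line);
  return match?.[1] ?? null;
}

// Extract the remote address from a TLS handshake failure line, or null otherwise.
function extractHandshakeFailureAddress(line: string): string | null {
  const match = /TLS handshake failed from ([^\s:]+):/.exec(line);
  return match?.[1] ?? null;
}

const accepted =
  "INFO [AgentChannel] Received mesh delegation peer.cn=ct110 from opus → grok: Summarise...";
const rejected =
  "WARN [AgentChannel] TLS handshake failed from 192.168.10.112: peer did not return a certificate";

console.log(extractPeerCn(accepted));                  // prints "ct110"
console.log(extractHandshakeFailureAddress(rejected)); // prints "192.168.10.112"
```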

The Phase 0.5 cutover (all nodes upgraded together for shared-CA mTLS) is complete on the supported releases. The historical procedure was documented in MIGRATION.md, which has since been removed; see CHANGELOG.md (Phase 0.5 entry) for the original steps and rationale.