Enclave Migration
This document specifies enclave migration: the controlled transfer of an enclave from one node to another while preserving its full event history and identity. It defines the Migrate event, the two migration modes (Peaceful Handoff and Forced Takeover), checkpoint verification, split-brain prevention, the backup pattern that makes Forced Takeover possible, and the .enc snapshot format with the endpoints that move it between nodes.
Table of Contents
- Migrate Event
- Migration Modes
- Checkpoint Verification
- Split-Brain Prevention
- Backup Pattern
- Enclave Snapshot Format
- Snapshot Endpoints
Migrate Event
Type: Migrate
{
  "new_sequencer": "<new_seq_pub>",
  "prev_seq": 1234,
  "ct_root": "<ct_root_at_prev_seq>"
}

| Field | Required | Description |
|---|---|---|
| new_sequencer | Yes | Public key of the new sequencer node |
| prev_seq | Yes | Sequence number of the last event BEFORE the Migrate event (Migrate will have seq = prev_seq + 1) |
| ct_root | Yes | CT root at prev_seq (proves log and state before Migrate) |
- Only Owner can issue Migrate.
- The commit MUST be signed by Owner.
Once a node accepts a Migrate commit:
- The node MUST reject all other pending commits.
- The Migrate event MUST be the final event from this sequencer.
- No concurrent commits are allowed during migration.
- The node MUST immediately close the current bundle after finalizing Migrate.
The Migrate event does NOT need to be alone in its bundle. Events accepted before the Migrate commit MAY be in the same bundle. The bundle closes immediately after Migrate is finalized, regardless of bundle.size or bundle.timeout configuration.
Example with bundle.size = 10:
- Events seq=100-105 are in an open bundle
- Migrate commit arrives, finalized as seq=106
- Bundle closes immediately with events seq=100-106
- No events seq=107+ are possible from this sequencer
This ensures a clean handoff with no ambiguity about which events belong to which sequencer.
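The bundle-closing rule can be sketched as a toy model. This is a minimal illustration, not a node implementation: the `Sequencer` class, its fields, and the `"Message"` event type are hypothetical names, and only the size-based close is modeled (no `bundle.timeout`).

```python
from dataclasses import dataclass, field


@dataclass
class Sequencer:
    """Toy sequencer that models only the bundle-closing rule for Migrate."""
    next_seq: int
    bundle_size: int = 10  # stands in for the bundle.size setting
    open_bundle: list = field(default_factory=list)
    closed_bundles: list = field(default_factory=list)
    migrated: bool = False

    def accept(self, event_type: str) -> int:
        if self.migrated:
            # Migrate was the final event from this sequencer.
            raise RuntimeError("sequencer has migrated; commit rejected")
        seq = self.next_seq
        self.next_seq += 1
        self.open_bundle.append((seq, event_type))
        if event_type == "Migrate":
            # Close immediately, regardless of how full the bundle is.
            self.migrated = True
            self._close_bundle()
        elif len(self.open_bundle) >= self.bundle_size:
            self._close_bundle()
        return seq

    def _close_bundle(self) -> None:
        self.closed_bundles.append(self.open_bundle)
        self.open_bundle = []
```

Starting at `next_seq=100`, accepting six ordinary events and then a Migrate reproduces the example: the Migrate is assigned seq 106, the bundle closes with seq 100-106, and any later `accept` call is rejected.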
Migration Modes
Peaceful Handoff (old node online):
1. Owner submits Migrate commit to old node
2. Old node finalizes Migrate as the last event
3. Old node transfers full event log to new node
4. New node verifies log matches ct_root
5. New node continues sequencing; next event will be seq = prev_seq + 2 (Migrate event was prev_seq + 1)
6. Owner updates Registry (reg_enclave)

Forced Takeover (old node offline):
1. Owner or Backup has a copy of the event log
2. Owner signs Migrate commit (unfinalized)
3. Owner submits log + unfinalized commit to new node
4. New node verifies log integrity (see verification below)
5. New node finalizes the Migrate event (special case)
6. New node becomes the sequencer
7. Owner updates any discovery mechanism in use (e.g., the Registry enclave via reg_enclave); this step is outside the core migration protocol.
In forced mode, the sequencer field of the Migrate event will be the NEW node, not the old one. This is the only event type where sequencer discontinuity is allowed.
Before finalizing Migrate, the new node MUST rebuild state from scratch:
1. Replay all events from the log, computing SMT state after each state-changing event
2. Verify computed SMT root matches state_hash in final bundle
3. Verify the from field of the Migrate commit has owner trait set in the computed SMT (RBAC namespace)
4. Recompute CT root; it MUST match ct_root in Migrate commit
5. Verify commit signature (sig) is valid for the computed commit hash
6. Verify prev_seq equals the last event's sequence number in the log
If any check fails, the new node MUST reject the migration. This full replay ensures the new node has a correct, verified copy of enclave state.
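The checks above can be sketched as a single function that reports which ones failed. This is a sketch under stated assumptions: the SMT replay, owner-trait lookup, CT root computation, and signature check are passed in as callables because this document does not define their algorithms, and every type and function name here (`Event`, `MigrateCommit`, `failed_checks`, the hook names) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Event:
    seq: int
    body: bytes


@dataclass(frozen=True)
class MigrateCommit:
    new_sequencer: str
    prev_seq: int
    ct_root: str
    sender: str  # the commit's `from` field
    sig: str


def failed_checks(
    log: Sequence[Event],
    commit: MigrateCommit,
    final_state_hash: str,
    *,
    replay_smt_root: Callable[[Sequence[Event]], str],
    has_owner_trait: Callable[[Sequence[Event], str], bool],
    compute_ct_root: Callable[[Sequence[Event]], str],
    signature_valid: Callable[[MigrateCommit], bool],
) -> List[str]:
    """Return the names of failed checks; an empty list means accept."""
    failures = []
    if replay_smt_root(log) != final_state_hash:
        failures.append("state_hash")
    if not has_owner_trait(log, commit.sender):
        failures.append("owner")
    if compute_ct_root(log) != commit.ct_root:
        failures.append("ct_root")
    if not signature_valid(commit):
        failures.append("sig")
    if not log or log[-1].seq != commit.prev_seq:
        failures.append("prev_seq")
    return failures
```

A new node would reject the migration whenever the returned list is non-empty; collecting all failures rather than stopping at the first is a design choice made here for diagnostics, not something the protocol requires.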
Alternatively, a folded validity proof over (r_0 … state_hash) attests that every transition was valid without replaying the log — see zk.md → Folding. (Replay remains the baseline; the validity proof is an O(1) optimization, not a replacement for data availability.)
Checkpoint Verification
The ct_root field serves as a checkpoint:
- CT root commits to both the event log AND state (via state_hash in leaves)
- New node recomputes CT root from received log
- If mismatch, migration is rejected
Split-Brain Prevention
After migration:
- Old node's sequencer key is no longer valid.
- Any events finalized by old node after Migrate are invalid.
- Sequencer discovery is out of scope for core; clients periodically re-check whatever discovery mechanism they use (e.g., the Registry enclave).
If a client queries the old node after Migrate:
- Old node MAY still serve read requests (CT proofs, state proofs) — data is valid.
- Old node MUST reject new commits (new sequencer handles writes).
- Client discovers migration via its discovery mechanism (e.g., a Registry lookup) or via ENCLAVE_NOT_FOUND on commit.
- Client migrates to new node for subsequent operations.
No explicit "migrated" error is required on reads; normal operation continues until the client syncs with its discovery mechanism.
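The client-side behavior described above might look like the following. This is only a sketch: `send` and `discover` are hypothetical callables standing in for the client's transport and whatever discovery mechanism it uses, and the `EnclaveNotFound` exception is an assumed mapping of the ENCLAVE_NOT_FOUND error.

```python
from typing import Callable, Tuple


class EnclaveNotFound(Exception):
    """Stand-in for the ENCLAVE_NOT_FOUND error returned on commit."""


def commit_with_rediscovery(
    commit: dict,
    node: str,
    send: Callable[[str, dict], str],
    discover: Callable[[], str],
) -> Tuple[str, str]:
    """Submit a commit; on ENCLAVE_NOT_FOUND, re-resolve the sequencer once.

    Returns (node_used, receipt) so the caller can remember the new node.
    """
    try:
        return node, send(node, commit)
    except EnclaveNotFound:
        new_node = discover()
        if new_node == node:
            # Discovery has not caught up yet; let the caller decide.
            raise
        return new_node, send(new_node, commit)
```

Reads are deliberately left out of the sketch, since the old node may keep serving them and no error signals the migration there.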
Backup Pattern
To enable forced takeover, ensure someone has the full event log:
Option 1: Owner maintains backup
- Owner's client stores all events locally

Option 2: Backup service
- Define a custom role with P (Push) permission for all event types
- Assign to a backup service

Schema example (a backup trait with P ops on all events):
{ "event": "*", "operator": "backup", "ops": ["P"] }

The wildcard * means the backup trait receives push for ALL event types. This enables disaster recovery if the node goes offline.
Important: Forced takeover requires Owner signature on the Migrate commit. If the Owner is offline and cannot sign:
- Forced takeover is blocked (intentional security — only Owner can authorize sequencer changes)
- Mitigation: ensure Owner's client/key is highly available, or use Transfer(owner) before sequencer goes offline
Enclave Snapshot Format
When the old node transfers full state to the new node (Peaceful Handoff step 3) or a Backup supplies the log to a new sequencer (Forced Takeover step 3), the bytes on the wire need a stable interchange format. The protocol defines .enc as the canonical format.
A .enc file is the complete enclave state: all finalized events, the SMT, the CT log, and the per-event metadata needed to rebuild the enclave on any compliant node. The format is hash-pinned (sha256 footer) and version-tagged so the receiving node can refuse incompatible kernels rather than silently corrupt state.
File Layout
┌──────────────────────────────────────────────────────────┐
│  Header (32 bytes, fixed)                                │
│                                                          │
│  offset  size  field                                     │
│  ──────  ────  ────────────────────────────────────────  │
│  0       4     magic         b"ENC\x01"                  │
│  4       4     layout_ver    u32 LE                      │
│  8       4     kernel_ver    u32 LE (semver-packed)      │
│  12      4     flags         u32 LE (see Flags)          │
│  16      8     payload_size  u64 LE (bytes that follow)  │
│  24      8     reserved      0u64                        │
├──────────────────────────────────────────────────────────┤
│  Payload (variable, payload_size bytes)                  │
│                                                          │
│  The complete enclave state in the producing kernel's    │
│  natural layout (see Payload Encoding below).            │
├──────────────────────────────────────────────────────────┤
│  Footer (32 bytes, fixed)                                │
│                                                          │
│  sha256(header || payload)                               │
└──────────────────────────────────────────────────────────┘
Total: 32 + payload_size + 32 bytes
Magic
b"ENC\x01" — four bytes, fixed. Lets restorers reject non-snapshot inputs immediately.
layout_ver (u32 LE)
Version of THIS file format. v1 = the format above. A future v2 MAY move fields or change framing; restorers MUST refuse on unknown layout_ver.
kernel_ver (u32 LE)
Semver of the kernel that produced the snapshot, packed:
- bits 0-15: patch
- bits 16-23: minor
- bits 24-31: major
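As an illustration of the bit layout, packing and unpacking can be written as below. The function names are hypothetical; the shifts and masks follow the bit ranges listed above (16 bits of patch, 8 of minor, 8 of major).

```python
from typing import Tuple


def pack_kernel_ver(major: int, minor: int, patch: int) -> int:
    """Pack a semver triple into the u32 kernel_ver field."""
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF and 0 <= patch <= 0xFFFF):
        raise ValueError("version component out of range for kernel_ver")
    return (major << 24) | (minor << 16) | patch


def unpack_kernel_ver(packed: int) -> Tuple[int, int, int]:
    """Return (major, minor, patch) from a packed kernel_ver."""
    return (packed >> 24) & 0xFF, (packed >> 16) & 0xFF, packed & 0xFFFF
```

For example, version 1.2.3 packs to `0x01020003`.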
| Producer / restorer relationship | Restorer action |
|---|---|
| Same kernel_ver byte-for-byte | Accept. |
| Differs in patch only, ≥ 1.0 | Accept; MAY warn. |
| Differs in minor only, ≥ 1.0 | Accept; SHOULD warn. |
| Differs in major | REFUSE. Requires explicit out-of-band migration. |
| Any difference at all, pre-1.0 kernels (0.x.y) | REFUSE. Pre-1.0 layouts can shift between any release; defer loosening until a kernel reaches 1.0 with a documented backwards-compat contract. |
On refusal, the restorer MUST surface KERNEL_VERSION_MISMATCH (the named error code in node-api.md §Snapshot Endpoints) along with both versions.
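The compatibility table can be read as a small decision function. The sketch below makes two assumptions that the table does not state explicitly: a version pair counts as "pre-1.0" if either side has major 0, and a pair that differs in minor is treated as the "minor" row even if patch differs too. The `Action` enum and `restorer_action` name are hypothetical.

```python
from enum import Enum
from typing import Tuple

Version = Tuple[int, int, int]  # (major, minor, patch)


class Action(Enum):
    ACCEPT = "accept"
    ACCEPT_MAY_WARN = "accept; MAY warn"
    ACCEPT_SHOULD_WARN = "accept; SHOULD warn"
    REFUSE = "refuse with KERNEL_VERSION_MISMATCH"


def restorer_action(producer: Version, restorer: Version) -> Action:
    if producer == restorer:
        return Action.ACCEPT
    if producer[0] == 0 or restorer[0] == 0:
        # Assumption: any difference involving a 0.x.y kernel is refused.
        return Action.REFUSE
    if producer[0] != restorer[0]:
        return Action.REFUSE
    if producer[1] != restorer[1]:
        # Assumption: minor differences take precedence over patch ones.
        return Action.ACCEPT_SHOULD_WARN
    return Action.ACCEPT_MAY_WARN
```

On `Action.REFUSE`, the caller would still need to report both versions alongside the error code.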
Flags (u32 LE)
bit 0: COMPRESSED - payload is brotli-compressed
bit 1: SELF_CONTAINED - payload prepends a copy of the kernel binary for offline verification
bit 2: ENCRYPTED - payload encrypted; key derivation in implementation-defined metadata
bits 3-31: reserved - MUST be 0

v1 producers MAY set all flags to 0 (no compression, no embedded kernel, no encryption). Restorers MUST refuse on unknown flag bits.
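A restorer's flag check might be sketched as follows. The constant and function names are illustrative only; the bit positions are the ones listed above.

```python
FLAG_NAMES = {
    1 << 0: "COMPRESSED",
    1 << 1: "SELF_CONTAINED",
    1 << 2: "ENCRYPTED",
}
KNOWN_MASK = sum(FLAG_NAMES)  # 0b111; bits 3-31 are reserved


def parse_flags(flags: int) -> set:
    """Return the names of the set flags, refusing any reserved bit."""
    unknown = flags & ~KNOWN_MASK
    if unknown:
        raise ValueError(f"unknown flag bits set: {unknown:#010x}")
    return {name for bit, name in FLAG_NAMES.items() if flags & bit}
```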
Footer
sha256(header || payload), 32 bytes. Restorers MUST recompute and refuse on mismatch — this is what makes the snapshot integrity-checked even when transferred over an untrusted channel (e.g., R2 / S3 / IPFS / hand-off via USB drive).
Payload Encoding
Two payload encodings are defined; restorers detect by sniffing the first four bytes of the payload:
- Compact (payload[0..4] == "ENC\x01"): just the populated storage buffer. Typically 200 KB to a few MB for a single-bundle enclave; 10-30× smaller than the raw form. RECOMMENDED for migration over the network.
- Raw memory dump (anything else, typically zero bytes from a WASM data segment): the entire kernel address space as a single byte blob. Typically 65 MB even for small enclaves (the kernel reserves a full bump-allocator arena). Useful for forensic snapshots that capture allocator state exactly.
Both encodings carry the same logical enclave state — every event, every SMT entry, every CT leaf. The choice is bytes-on-the-wire vs. memory-layout fidelity. A future layout_ver = 2 can assign a flag bit to make the distinction explicit instead of sniffed.
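The sniffing rule reduces to a four-byte prefix comparison. A minimal sketch, with a hypothetical function name and return labels:

```python
COMPACT_MAGIC = b"ENC\x01"


def payload_encoding(payload: bytes) -> str:
    """Classify a v1 payload by its first four bytes."""
    return "compact" if payload[:4] == COMPACT_MAGIC else "raw"
```

Anything that does not start with the magic, including an empty or short payload, falls through to the raw interpretation here; a real restorer would likely validate the payload further before installing it.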
Snapshot Procedure
1. Kernel writes its full state to a contiguous byte buffer `payload`.
2. Compute header bytes: magic || layout_ver || kernel_ver || flags || payload_size || reserved
3. Compute footer = sha256(header || payload)
4. Emit: header || payload || footer
Restore Procedure
1. Read first 32 bytes → header.
   Verify magic == "ENC\x01". Reject otherwise.
   Verify layout_ver is known. Reject otherwise.
2. Read next payload_size bytes → payload.
3. Read next 32 bytes → expected_footer.
4. Compute actual_footer = sha256(header || payload).
   Reject on mismatch.
5. Apply kernel_ver compatibility rules. Reject on a major bump, or on any difference involving a pre-1.0 kernel.
6. Sniff payload[0..4] to pick the encoding; install into a fresh kernel instance.
7. Run the kernel's self-test entry point. Reject on non-zero result.
8. The enclave is live; sequencing can resume.

Steps 1-5 are MUST. Step 6 is implementation-dependent on the encoding. Step 7 is OPTIONAL but RECOMMENDED: a self-test catches structural corruption the sha256 footer alone cannot (e.g., bit-flips in the page table that happen to balance).
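The framing side of both procedures can be sketched with the standard `struct` and `hashlib` modules. This is a minimal sketch, not a reference implementation: the function names are hypothetical, the reader covers only restore steps 1-4, and flag validation, the kernel_ver compatibility decision, installation, and the self-test are left to the caller.

```python
import hashlib
import struct

MAGIC = b"ENC\x01"
# magic, layout_ver, kernel_ver, flags (u32 LE), payload_size, reserved (u64 LE)
HEADER = struct.Struct("<4sIIIQQ")  # 32 bytes, no padding
FOOTER_SIZE = 32


def write_snapshot(payload: bytes, kernel_ver: int, flags: int = 0) -> bytes:
    """Frame a payload as header || payload || sha256(header || payload)."""
    header = HEADER.pack(MAGIC, 1, kernel_ver, flags, len(payload), 0)
    footer = hashlib.sha256(header + payload).digest()
    return header + payload + footer


def read_snapshot(blob: bytes):
    """Validate framing and return (kernel_ver, flags, payload)."""
    if len(blob) < HEADER.size + FOOTER_SIZE:
        raise ValueError("truncated snapshot")
    header = blob[:HEADER.size]
    magic, layout_ver, kernel_ver, flags, size, _reserved = HEADER.unpack(header)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if layout_ver != 1:
        raise ValueError(f"unknown layout_ver {layout_ver}")
    end = HEADER.size + size
    payload = blob[HEADER.size:end]
    footer = blob[end:end + FOOTER_SIZE]
    if len(payload) != size or len(footer) != FOOTER_SIZE:
        raise ValueError("truncated snapshot")
    if hashlib.sha256(header + payload).digest() != footer:
        raise ValueError("footer mismatch")
    return kernel_ver, flags, payload
```

A round trip such as `read_snapshot(write_snapshot(b"state", 0x01020003))` returns the original payload, while flipping any byte of the header or payload makes the footer comparison fail.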
Snapshot Endpoints
The .enc byte format is wire-format only; the protocol surface that moves snapshots between nodes is two HTTP endpoints. They are specified in node-api.md §Snapshot Endpoints:
| Method | Path | Body | Purpose |
|---|---|---|---|
| GET | /enclaves/:id/snapshot | — | Download the complete enclave state as application/octet-stream .enc |
| POST | /enclaves/:id/restore | application/octet-stream .enc | Bootstrap a fresh enclave on the receiving node from a .enc payload |
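As a rough illustration, the two requests could be constructed with Python's standard `urllib.request`. The base URL and enclave id are placeholders, the helper names are hypothetical, and no authorization headers are shown because those depend on operator policy; node-api.md remains the authoritative description of these endpoints.

```python
from urllib.request import Request


def snapshot_request(base_url: str, enclave_id: str) -> Request:
    """Build (but do not send) the GET that downloads a .enc snapshot."""
    return Request(f"{base_url}/enclaves/{enclave_id}/snapshot", method="GET")


def restore_request(base_url: str, enclave_id: str, enc_bytes: bytes) -> Request:
    """Build (but do not send) the POST that uploads a .enc snapshot."""
    return Request(
        f"{base_url}/enclaves/{enclave_id}/restore",
        data=enc_bytes,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )
```

Either object can then be passed to `urllib.request.urlopen` once the node address and any required credentials are known.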
::: extension-point id=migration-snapshot-restore-authz class=local_policy reason: authorization is operator-deployment policy; the wire format and verification rules are the same regardless of who's authorized to call
Authorization for the /enclaves/:id/snapshot and /enclaves/:id/restore endpoints is set by the operator (typically Owner only for restore, public-with-RBAC-projection for snapshot). The wire format and verification rules are the same regardless of who's authorized to call them.
:::