Enclave Migration

This document specifies enclave migration: the controlled transfer of an enclave from one node to another while preserving its full event history and identity. It defines the Migrate event, the two migration modes (peaceful handoff and forced takeover), checkpoint verification, split-brain prevention, the backup pattern that makes forced takeover possible, and the .enc snapshot format used to move enclave state between nodes.



Migrate Event

Type: Migrate

Content Structure:
{
  "new_sequencer": "<new_seq_pub>",
  "prev_seq": 1234,
  "ct_root": "<ct_root_at_prev_seq>"
}
Fields:
| Field | Required | Description |
|---|---|---|
| new_sequencer | Yes | Public key of the new sequencer node |
| prev_seq | Yes | Sequence number of the last event BEFORE the Migrate event (Migrate will have seq = prev_seq + 1) |
| ct_root | Yes | CT root at prev_seq (proves log and state before Migrate) |
Authorization:
  • Only Owner can issue Migrate.
  • The commit MUST be signed by Owner.
Migration Barrier:

Once a node accepts a Migrate commit:

  • The node MUST reject all other pending commits.
  • The Migrate event MUST be the final event from this sequencer.
  • No concurrent commits are allowed during migration.
  • The node MUST immediately close the current bundle after finalizing Migrate.
Bundle Handling:

The Migrate event does NOT need to be alone in its bundle. Events accepted before the Migrate commit MAY be in the same bundle. The bundle closes immediately after Migrate is finalized, regardless of bundle.size or bundle.timeout configuration.

Example with bundle.size = 10:

  • Events seq=100-105 are in an open bundle
  • Migrate commit arrives, finalized as seq=106
  • Bundle closes immediately with events seq=100-106
  • No events seq=107+ are possible from this sequencer

This ensures a clean handoff with no ambiguity about which events belong to which sequencer.
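The barrier and bundle rules above can be sketched as a toy sequencer. This is a minimal illustration, not a node implementation: the `Sequencer` class, its field names, and the `"Message"` event type are hypothetical, and normal bundle closing by `bundle.size` or `bundle.timeout` is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Sequencer:
    """Toy sequencer illustrating the migration barrier (hypothetical model)."""
    next_seq: int
    open_bundle: list = field(default_factory=list)
    closed_bundles: list = field(default_factory=list)
    migrated: bool = False

    def accept(self, event_type: str) -> int:
        # Once Migrate is finalized, this sequencer emits no further events.
        if self.migrated:
            raise RuntimeError("sequencer has migrated; commit rejected")
        seq = self.next_seq
        self.next_seq += 1
        self.open_bundle.append((seq, event_type))
        if event_type == "Migrate":
            # Close immediately, regardless of bundle.size / bundle.timeout.
            self.migrated = True
            self.close_bundle()
        return seq

    def close_bundle(self) -> None:
        self.closed_bundles.append(self.open_bundle)
        self.open_bundle = []


node = Sequencer(next_seq=100)
for _ in range(6):                      # seq=100-105 sit in the open bundle
    node.accept("Message")              # "Message" is a placeholder event type
migrate_seq = node.accept("Migrate")    # finalized as seq=106, bundle closes
```

Any later call to `node.accept(...)` raises, which mirrors the rule that no events with seq=107+ can come from this sequencer.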


Migration Modes

Peaceful Handoff (old node online):
  1. Owner submits Migrate commit to old node
  2. Old node finalizes Migrate as the last event
  3. Old node transfers full event log to new node
  4. New node verifies log matches ct_root
  5. New node continues sequencing; next event will be seq = prev_seq + 2 (Migrate event was prev_seq + 1)
  6. Owner updates any discovery mechanism in use (e.g., the Registry enclave via reg_enclave)
Forced Takeover (old node offline):
  1. Owner or Backup has a copy of the event log
  2. Owner signs Migrate commit (unfinalized)
  3. Owner submits log + unfinalized commit to new node
  4. New node verifies log integrity (see verification below)
  5. New node finalizes the Migrate event (special case)
  6. New node becomes the sequencer
  7. Owner updates any discovery mechanism in use (e.g., the Registry enclave via reg_enclave) — outside the core migration protocol.

In forced mode, the sequencer field of the Migrate event will be the NEW node, not the old one. This is the only event type where sequencer discontinuity is allowed.

Forced Takeover Verification:

Before finalizing Migrate, the new node MUST rebuild state from scratch:

  1. Replay all events from the log, computing SMT state after each state-changing event
  2. Verify computed SMT root matches state_hash in final bundle
  3. Verify the from field of the Migrate commit has owner trait set in the computed SMT (RBAC namespace)
  4. Recompute CT root — MUST match ct_root in Migrate commit
  5. Verify commit signature (sig) is valid for the computed commit hash
  6. Verify prev_seq equals the last event's sequence number in the log

If any check fails, the new node MUST reject the migration. This full replay ensures the new node has a correct, verified copy of enclave state.
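The six checks can be arranged as a single gate that reports the first failure. The sketch below assumes the SMT replay, CT root computation, RBAC lookup, and signature check are supplied as callables, because their definitions live outside this document; every name here (`MigrateCommit`, `Checks`, `verify_forced_takeover`, the `"seq"` key) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class MigrateCommit:
    sender: str      # the commit's `from` field
    prev_seq: int
    ct_root: bytes
    sig: bytes


@dataclass
class Checks:
    # Hypothetical hooks; the real SMT, CT, and signature rules are defined elsewhere.
    replay_smt_root: Callable[[Sequence[dict]], bytes]
    has_owner_trait: Callable[[Sequence[dict], str], bool]
    compute_ct_root: Callable[[Sequence[dict]], bytes]
    signature_valid: Callable[[MigrateCommit], bool]


def verify_forced_takeover(
    log: Sequence[dict],
    final_state_hash: bytes,
    commit: MigrateCommit,
    checks: Checks,
) -> Optional[str]:
    """Return the name of the first failed check, or None if all six pass."""
    if not log:
        return "empty_log"
    if checks.replay_smt_root(log) != final_state_hash:   # steps 1-2
        return "state_hash"
    if not checks.has_owner_trait(log, commit.sender):    # step 3
        return "owner_trait"
    if checks.compute_ct_root(log) != commit.ct_root:     # step 4
        return "ct_root"
    if not checks.signature_valid(commit):                # step 5
        return "signature"
    if commit.prev_seq != log[-1]["seq"]:                 # step 6
        return "prev_seq"
    return None
```

A new node would finalize Migrate only when this returns `None`; any other value means the migration is rejected.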

Alternatively, a folded validity proof over (r_0 … state_hash) attests that every transition was valid without replaying the log — see zk.md → Folding. (Replay remains the baseline; the validity proof is an O(1) optimization, not a replacement for data availability.)


Checkpoint Verification

The ct_root field serves as a checkpoint:

  • CT root commits to both the event log AND state (via state_hash in leaves)
  • New node recomputes CT root from received log
  • If mismatch, migration is rejected

Split-Brain Prevention

After migration:

  • Old node's sequencer key is no longer valid.
  • Any events finalized by old node after Migrate are invalid.
  • Sequencer discovery is out of scope for core; clients periodically re-check whatever discovery mechanism they use (e.g., the Registry enclave).
Client Recovery after Migrate:

If a client queries the old node after Migrate:

  1. Old node MAY still serve read requests (CT proofs, state proofs) — data is valid.
  2. Old node MUST reject new commits (new sequencer handles writes).
  3. Client discovers migration via its discovery mechanism (e.g., a Registry lookup) or via ENCLAVE_NOT_FOUND on commit.
  4. Client migrates to new node for subsequent operations.

No explicit "migrated" error is required on reads; normal operation continues until the client syncs with its discovery mechanism.
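A client-side view of this recovery flow might look like the following sketch. The transport (`send`) and the discovery lookup (`discover`) are passed in as callables because neither is defined by the core protocol; `EnclaveNotFound` is a hypothetical exception standing in for an ENCLAVE_NOT_FOUND response.

```python
from typing import Callable


class EnclaveNotFound(Exception):
    """Stand-in for an ENCLAVE_NOT_FOUND response to a commit."""


def commit_with_recovery(
    commit: dict,
    node_url: str,
    send: Callable[[str, dict], str],
    discover: Callable[[], str],
) -> tuple[str, str]:
    """Try the known node; on ENCLAVE_NOT_FOUND, re-resolve once and retry.

    Returns (node_url_used, result) so the caller can remember the new node.
    """
    try:
        return node_url, send(node_url, commit)
    except EnclaveNotFound:
        new_url = discover()
        if new_url == node_url:
            # Discovery has not caught up yet; let the caller decide when to retry.
            raise
        return new_url, send(new_url, commit)
```

Reads are not covered here, since the old node may keep serving them and no explicit error is required.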


Backup Pattern

To enable forced takeover, ensure someone has the full event log:

Option 1: Owner maintains backup
  • Owner's client stores all events locally
Option 2: Dedicated Backup role
  • Define a custom role with P (Push) permission for all event types
  • Assign to a backup service

Schema example — define a backup trait with P ops on all events:

{ "event": "*", "operator": "backup", "ops": ["P"] }

The wildcard * means the backup trait receives push for ALL event types. This enables disaster recovery if the node goes offline.
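To make the wildcard concrete, here is a small sketch of how a node could decide which traits receive a push for a given event type. The matching rule (exact event name or `*`) is an assumption based on the description above, and the second schema entry (`"Chat"` / `"member"`) is invented purely for contrast.

```python
def push_recipients(schema: list[dict], event_type: str) -> set[str]:
    """Return the operators (traits) that would get a push for event_type.

    Assumes a rule matches when its "event" is "*" or equals event_type.
    """
    recipients = set()
    for rule in schema:
        matches = rule["event"] == "*" or rule["event"] == event_type
        if matches and "P" in rule["ops"]:
            recipients.add(rule["operator"])
    return recipients


schema = [
    {"event": "*", "operator": "backup", "ops": ["P"]},
    {"event": "Chat", "operator": "member", "ops": ["P"]},  # hypothetical rule
]
```

Under this reading, the backup trait shows up in the result for every event type, including Migrate itself.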

Important: Forced takeover requires Owner signature on the Migrate commit. If the Owner is offline and cannot sign:

  • Forced takeover is blocked (intentional security — only Owner can authorize sequencer changes)
  • Mitigation: ensure Owner's client/key is highly available, or use Transfer(owner) before sequencer goes offline

Enclave Snapshot Format

When the old node transfers full state to the new node (Peaceful Handoff step 3) or a Backup supplies the log to a new sequencer (Forced Takeover step 3), the bytes on the wire need a stable interchange format. The protocol defines .enc as the canonical format.

A .enc file is the complete enclave state: all finalized events, the SMT, the CT log, and the per-event metadata needed to rebuild the enclave on any compliant node. The format is hash-pinned (sha256 footer) and version-tagged so the receiving node can refuse incompatible kernels rather than silently corrupt state.

File Layout

┌──────────────────────────────────────────────────────────┐
│ Header (32 bytes, fixed)                                  │
│                                                           │
│   offset  size  field                                     │
│   ─────   ────  ──────────────────────────────────────    │
│   0       4     magic         b"ENC\x01"                  │
│   4       4     layout_ver    u32 LE                      │
│   8       4     kernel_ver    u32 LE   (semver-packed)    │
│   12      4     flags         u32 LE   (see Flags)        │
│   16      8     payload_size  u64 LE   (bytes that follow)│
│   24      8     reserved      0u64                        │
├──────────────────────────────────────────────────────────┤
│ Payload (variable, payload_size bytes)                    │
│                                                           │
│   The complete enclave state in the producing kernel's    │
│   natural layout (see Payload Encoding below).            │
├──────────────────────────────────────────────────────────┤
│ Footer (32 bytes, fixed)                                  │
│                                                           │
│   sha256(header || payload)                               │
└──────────────────────────────────────────────────────────┘
 
Total: 32 + payload_size + 32 bytes
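The fixed header maps directly onto a little-endian struct. The sketch below parses it with Python's standard `struct` module; the function names and the returned dictionary shape are illustrative choices, not part of the format.

```python
import struct

# magic, layout_ver, kernel_ver, flags, payload_size, reserved (all little-endian)
HEADER = struct.Struct("<4sIIIQQ")
MAGIC = b"ENC\x01"
FOOTER_SIZE = 32  # sha256 digest


def parse_header(buf: bytes) -> dict:
    """Decode the 32-byte header and check the magic."""
    if len(buf) < HEADER.size:
        raise ValueError("truncated header")
    magic, layout_ver, kernel_ver, flags, payload_size, reserved = HEADER.unpack_from(buf)
    if magic != MAGIC:
        raise ValueError("not an .enc snapshot")
    return {
        "layout_ver": layout_ver,
        "kernel_ver": kernel_ver,
        "flags": flags,
        "payload_size": payload_size,
        "reserved": reserved,
    }


def total_size(header: dict) -> int:
    """Total file size: header + payload + footer."""
    return HEADER.size + header["payload_size"] + FOOTER_SIZE
```

Because the `<` prefix disables alignment padding, `HEADER.size` is exactly 32, matching the layout above.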

Magic

b"ENC\x01" — four bytes, fixed. Lets restorers reject non-snapshot inputs immediately.

layout_ver (u32 LE)

Version of THIS file format. v1 = the format above. A future v2 MAY move fields or change framing; restorers MUST refuse on unknown layout_ver.

kernel_ver (u32 LE)

Semver of the kernel that produced the snapshot, packed:

  • bits 0-15: patch
  • bits 16-23: minor
  • bits 24-31: major
Compatibility rules:
| Producer / restorer relationship | Restorer action |
|---|---|
| Same kernel_ver byte-for-byte | Accept. |
| Differs in patch only, ≥ 1.0 | Accept; MAY warn. |
| Differs in minor only, ≥ 1.0 | Accept; SHOULD warn. |
| Differs in major | REFUSE. Requires explicit out-of-band migration. |
| Any difference at all, pre-1.0 kernels (0.x.y) | REFUSE. Pre-1.0 layouts can shift between any release; defer loosening until a kernel reaches 1.0 with a documented backwards-compat contract. |

On refusal, the restorer MUST surface KERNEL_VERSION_MISMATCH (the named error code in node-api.md §Snapshot Endpoints) along with both versions.
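The packing and the compatibility rules can be expressed in a few lines. This is a sketch under one stated assumption: when both minor and patch differ (a case the rules do not list), it treats the minor difference as the deciding one. The function names and the returned action strings are hypothetical.

```python
def pack_kernel_ver(major: int, minor: int, patch: int) -> int:
    """Pack semver into a u32: major in bits 24-31, minor in 16-23, patch in 0-15."""
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF and 0 <= patch <= 0xFFFF):
        raise ValueError("version component out of range")
    return (major << 24) | (minor << 16) | patch


def unpack_kernel_ver(value: int) -> tuple[int, int, int]:
    return (value >> 24) & 0xFF, (value >> 16) & 0xFF, value & 0xFFFF


def restorer_action(producer: int, restorer: int) -> str:
    """Map a producer/restorer pair to an action per the compatibility rules."""
    if producer == restorer:
        return "accept"
    p_major, p_minor, _ = unpack_kernel_ver(producer)
    r_major, r_minor, _ = unpack_kernel_ver(restorer)
    if p_major == 0 or r_major == 0:
        return "refuse"              # any difference involving a pre-1.0 kernel
    if p_major != r_major:
        return "refuse"              # major difference
    if p_minor != r_minor:
        return "accept_should_warn"  # assumption: minor difference dominates patch
    return "accept_may_warn"         # patch-only difference
```

A `"refuse"` result is where a restorer would surface KERNEL_VERSION_MISMATCH together with both unpacked versions.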

Flags (u32 LE)

bit 0: COMPRESSED        — payload is brotli-compressed
bit 1: SELF_CONTAINED    — payload prepends a copy of the kernel binary for offline verification
bit 2: ENCRYPTED         — payload encrypted; key derivation in implementation-defined metadata
bit 3-31: reserved       — MUST be 0

v1 producers MAY set all flags to 0 (no compression, no embedded kernel, no encryption). Restorers MUST refuse on unknown flag bits.
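A restorer's flag handling could be sketched with an `IntFlag`. The class and function names are illustrative; only the three bit positions come from the list above.

```python
from enum import IntFlag


class SnapshotFlags(IntFlag):
    COMPRESSED = 1 << 0      # payload is brotli-compressed
    SELF_CONTAINED = 1 << 1  # payload prepends a copy of the kernel binary
    ENCRYPTED = 1 << 2       # payload is encrypted


KNOWN_FLAG_BITS = 0b111  # bits 3-31 are reserved and must be 0


def parse_flags(raw: int) -> SnapshotFlags:
    """Reject any reserved bit before interpreting the known ones."""
    if raw & ~KNOWN_FLAG_BITS:
        raise ValueError(f"unknown flag bits set: {raw & ~KNOWN_FLAG_BITS:#x}")
    return SnapshotFlags(raw)
```

Checking the reserved bits first keeps the refusal independent of how a given Python version treats out-of-range `IntFlag` values.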

Footer

sha256(header || payload), 32 bytes. Restorers MUST recompute and refuse on mismatch — this is what makes the snapshot integrity-checked even when transferred over an untrusted channel (e.g., R2 / S3 / IPFS / hand-off via USB drive).

Payload Encoding

Two payload encodings are defined; restorers detect by sniffing the first four bytes of the payload:

  • Compact (payload[0..4] == "ENC\x01") — just the populated storage buffer. Typical 200 KB – few MB for a single-bundle enclave; 10-30× smaller than the raw form. RECOMMENDED for migration over the network.
  • Raw memory dump (anything else, typically zero bytes from a WASM data segment) — the entire kernel address space as a single byte blob. Typical 65 MB even for small enclaves (the kernel reserves a full bump-allocator arena). Useful for forensic snapshots that capture allocator state exactly.

Both encodings carry the same logical enclave state — every event, every SMT entry, every CT leaf. The choice is bytes-on-the-wire vs. memory-layout fidelity. A future layout_ver = 2 can assign a flag bit to make the distinction explicit instead of sniffed.

Snapshot Procedure

1. Kernel writes its full state to a contiguous byte buffer `payload`.
2. Compute header bytes: magic || layout_ver || kernel_ver || flags || payload_size || reserved
3. Compute footer = sha256(header || payload)
4. Emit: header || payload || footer
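The four steps translate almost line for line into code. The sketch below assumes the kernel has already produced `payload` as bytes (step 1); `write_snapshot` and its parameter defaults are hypothetical.

```python
import hashlib
import struct


def write_snapshot(payload: bytes, kernel_ver: int, flags: int = 0, layout_ver: int = 1) -> bytes:
    """Frame an already-serialized payload as header || payload || footer."""
    # Step 2: magic || layout_ver || kernel_ver || flags || payload_size || reserved
    header = struct.pack("<4sIIIQQ", b"ENC\x01", layout_ver, kernel_ver, flags, len(payload), 0)
    # Step 3: the footer hashes the header and payload together
    footer = hashlib.sha256(header + payload).digest()
    # Step 4: emit the three parts back to back
    return header + payload + footer
```

For example, `write_snapshot(b"state", kernel_ver=0x01000000)` yields a 69-byte blob: 32 bytes of header, 5 of payload, and 32 of footer.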

Restore Procedure

1. Read first 32 bytes → header.
   Verify magic == "ENC\x01".  Reject otherwise.
   Verify layout_ver is known.  Reject otherwise.
2. Read next payload_size bytes → payload.
3. Read next 32 bytes → expected_footer.
4. Compute actual_footer = sha256(header || payload).
   Reject on mismatch.
5. Apply kernel_ver compatibility rules. Reject on a major difference, or on any difference between pre-1.0 kernels.
6. Sniff payload[0..4] to pick the encoding; install into a fresh kernel instance.
7. Run the kernel's self-test entry point. Reject on non-zero result.
8. The enclave is live; sequencing can resume.

Steps 1-5 are MUST. Step 6 is implementation-dependent on the encoding. Step 7 is OPTIONAL but RECOMMENDED: the sha256 footer only detects changes made after the snapshot was written, so a self-test is what catches structural corruption that was already present in the payload when the footer was computed.
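The byte-level part of the restore procedure (steps 1 through 4, the flag check, and the encoding sniff from step 6) can be sketched as one function. Installing the payload into a kernel and running the self-test are out of scope here, and the kernel_ver decision from step 5 is left to the caller. Requiring the blob to end exactly at the footer is an assumption of this sketch, and all names are hypothetical.

```python
import hashlib
import struct

HEADER = struct.Struct("<4sIIIQQ")
KNOWN_LAYOUTS = {1}
KNOWN_FLAG_BITS = 0b111


class SnapshotRejected(Exception):
    pass


def verify_snapshot(blob: bytes) -> tuple[int, str, bytes]:
    """Return (kernel_ver, encoding, payload) or raise SnapshotRejected."""
    if len(blob) < HEADER.size:
        raise SnapshotRejected("truncated header")
    header = blob[:HEADER.size]
    magic, layout_ver, kernel_ver, flags, payload_size, _reserved = HEADER.unpack(header)
    if magic != b"ENC\x01":                      # step 1
        raise SnapshotRejected("bad magic")
    if layout_ver not in KNOWN_LAYOUTS:          # step 1
        raise SnapshotRejected("unknown layout_ver")
    if flags & ~KNOWN_FLAG_BITS:
        raise SnapshotRejected("unknown flag bits")
    end = HEADER.size + payload_size
    if len(blob) != end + 32:                    # assumption: no trailing bytes
        raise SnapshotRejected("length does not match payload_size")
    payload = blob[HEADER.size:end]              # step 2
    if hashlib.sha256(header + payload).digest() != blob[end:]:  # steps 3-4
        raise SnapshotRejected("footer mismatch")
    encoding = "compact" if payload[:4] == b"ENC\x01" else "raw"  # step 6 sniff
    return kernel_ver, encoding, payload
```

The caller would then apply the kernel_ver compatibility rules to the returned version before installing the payload.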


Snapshot Endpoints

The .enc byte format is wire-format only; the protocol surface that moves snapshots between nodes is two HTTP endpoints. They are specified in node-api.md §Snapshot Endpoints:

| Method | Path | Body | Purpose |
|---|---|---|---|
| GET | /enclaves/:id/snapshot | (none) | Download the complete enclave state as application/octet-stream .enc |
| POST | /enclaves/:id/restore | application/octet-stream .enc | Bootstrap a fresh enclave on the receiving node from a .enc payload |
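As a rough illustration of the two endpoints, the sketch below builds the corresponding HTTP requests with Python's standard `urllib.request` without sending them. The base URL and enclave id are placeholders, and no authorization headers are shown because those depend on operator policy.

```python
import urllib.request


def snapshot_request(base_url: str, enclave_id: str) -> urllib.request.Request:
    """GET the enclave's .enc snapshot."""
    return urllib.request.Request(
        f"{base_url}/enclaves/{enclave_id}/snapshot",
        method="GET",
    )


def restore_request(base_url: str, enclave_id: str, enc_bytes: bytes) -> urllib.request.Request:
    """POST a .enc blob to bootstrap the enclave on the receiving node."""
    return urllib.request.Request(
        f"{base_url}/enclaves/{enclave_id}/restore",
        data=enc_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
```

Passing either request to `urllib.request.urlopen` would perform the call; a migration tool would typically download from the old node, verify the footer locally, and only then post to the new node.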

::: extension-point id=migration-snapshot-restore-authz class=local_policy reason: authorization is operator-deployment policy; the wire format and verification rules are the same regardless of who's authorized to call

Authorization for the /enclaves/:id/snapshot and /enclaves/:id/restore endpoints is set by the operator (typically Owner only for restore, public-with-RBAC-projection for snapshot). The wire format and verification rules are the same regardless of who's authorized to call them. :::