Delta Sync: Implemented Auction Delta Sync (Issue #4207)

Background

Every auction cycle, the autopilot was broadcasting a full orderbook snapshot to every connected solver. On mainnet, over 90% of that payload was identical between consecutive auctions, mostly static limit orders sitting unchanged. As we scale toward a unified mega-auction across chains, resending that redundant data to every solver on every cycle doesn’t hold up.

The fix: replace full snapshots with incremental updates.


What Changed

Autopilot Side

SolvableOrdersCache now emits structured change events. The core idea is that instead of rebuilding the full auction from scratch and shipping it wholesale, the cache now tracks what actually changed between cycles and emits a DeltaEnvelope containing only those changes:

DeltaEvent::OrderAdded(order)
DeltaEvent::OrderRemoved(uid)
DeltaEvent::OrderUpdated(order)
DeltaEvent::PriceChanged { token, price }
DeltaEvent::BlockChanged { block }
DeltaEvent::JitOwnersChanged { ... }
DeltaEvent::AuctionChanged { new_auction_id }

Each envelope carries a monotonically increasing delta_sequence (global, never resets) and an auction_sequence (per-auction counter). This separation matters: auction_sequence resets at auction boundaries, but delta_sequence is the authoritative counter for replay and gap detection.
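To make the two counters concrete, here is a minimal sketch of how they could be assigned. The `Sequencer` type, its methods, and the `has_gap` helper are hypothetical names for illustration, and the envelope is reduced to its counters; the real `DeltaEnvelope` carries events as well.

```rust
/// Reduced envelope: only the counters discussed above (events omitted).
#[derive(Debug, Clone, PartialEq)]
pub struct DeltaEnvelope {
    pub auction_id: u64,
    /// Global counter, never resets. Used for replay and gap detection.
    pub delta_sequence: u64,
    /// Per-auction counter, resets at each auction boundary.
    pub auction_sequence: u64,
}

/// Hypothetical helper that hands out both counters.
#[derive(Debug, Default)]
pub struct Sequencer {
    auction_id: u64,
    next_delta: u64,
    next_in_auction: u64,
}

impl Sequencer {
    /// Start a new auction: only the per-auction counter resets.
    pub fn begin_auction(&mut self, auction_id: u64) {
        self.auction_id = auction_id;
        self.next_in_auction = 0;
    }

    pub fn next_envelope(&mut self) -> DeltaEnvelope {
        let envelope = DeltaEnvelope {
            auction_id: self.auction_id,
            delta_sequence: self.next_delta,
            auction_sequence: self.next_in_auction,
        };
        self.next_delta += 1;
        self.next_in_auction += 1;
        envelope
    }
}

/// A gap exists when the incoming global sequence skips ahead of the
/// last applied one. The per-auction counter plays no role here.
pub fn has_gap(last_applied: u64, incoming: u64) -> bool {
    incoming > last_applied + 1
}
```

Because `auction_sequence` restarts at zero, a consumer that keyed gap detection on it would see a false regression at every auction boundary, which is why only `delta_sequence` is compared.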

Three new HTTP endpoints on the autopilot API:

  • GET /delta/snapshot — full state for bootstrapping a new connection

  • GET /delta/stream?after_sequence=N — SSE stream of incremental envelopes

  • GET /delta/checksum — SHA-256 hash of current order UIDs and prices for integrity verification

The stream uses Server-Sent Events with a dedicated unbounded control channel for critical resync_required events so they’re never dropped when the main bounded buffer fills up.
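The split between the bounded data buffer and the unbounded control channel can be sketched with the standard library's channels. This is an illustration only: the `Subscriber` types are hypothetical, envelopes are reduced to their sequence number, and sending `resync_required` on overflow is an assumption about when that event fires. The real stream is async SSE rather than `std::sync::mpsc`.

```rust
use std::sync::mpsc::{channel, sync_channel, Receiver, Sender, SyncSender, TrySendError};

#[derive(Debug, PartialEq)]
pub enum Control {
    ResyncRequired,
}

/// Producer half: a bounded queue for envelopes, an unbounded one for control.
pub struct Subscriber {
    data_tx: SyncSender<u64>,
    control_tx: Sender<Control>,
}

/// Consumer half, as seen by the stream handler.
pub struct SubscriberRx {
    pub data_rx: Receiver<u64>,
    pub control_rx: Receiver<Control>,
}

pub fn subscriber(capacity: usize) -> (Subscriber, SubscriberRx) {
    let (data_tx, data_rx) = sync_channel(capacity);
    let (control_tx, control_rx) = channel();
    (
        Subscriber { data_tx, control_tx },
        SubscriberRx { data_rx, control_rx },
    )
}

impl Subscriber {
    /// Returns true if the envelope was queued. When the bounded buffer is
    /// full the envelope is dropped, but the resync signal still gets
    /// through because the control channel has no capacity limit.
    pub fn publish(&self, delta_sequence: u64) -> bool {
        match self.data_tx.try_send(delta_sequence) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) => {
                let _ = self.control_tx.send(Control::ResyncRequired);
                false
            }
            Err(TrySendError::Disconnected(_)) => false,
        }
    }
}
```

The point of the design is that a slow consumer can lose data but never the instruction to recover from that loss.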

Incremental cache rebuild (opt-in). An IndexedAuctionState tracks which orders are currently filtered (in-flight, no balance, no price, invalid signature) so that on each cycle we only re-evaluate orders that actually changed. This runs as a shadow comparison first (AUTOPILOT_DELTA_SYNC_SHADOW_COMPARE=true) and can be promoted to primary (AUTOPILOT_DELTA_SYNC_INCREMENTAL_PRIMARY=true).
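A rough sketch of the indexing idea, assuming orders are keyed by a string UID and that the caller supplies the list of changed UIDs plus an evaluation callback. The field names, `FilterReason` variants, and `refresh` signature are illustrative and not taken from the implementation.

```rust
use std::collections::{HashMap, HashSet};

/// Why an order is currently excluded (mirrors the reasons listed above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FilterReason {
    InFlight,
    NoBalance,
    NoPrice,
    InvalidSignature,
}

#[derive(Default)]
pub struct IndexedAuctionState {
    /// Orders currently excluded from the auction, with the reason.
    filtered: HashMap<String, FilterReason>,
    /// Orders currently included in the auction.
    solvable: HashSet<String>,
}

impl IndexedAuctionState {
    /// Re-evaluate only the orders in `changed`. Every other order keeps
    /// the verdict it received in an earlier cycle.
    pub fn refresh<F>(&mut self, changed: &[&str], mut evaluate: F)
    where
        F: FnMut(&str) -> Option<FilterReason>,
    {
        for &uid in changed {
            match evaluate(uid) {
                Some(reason) => {
                    self.solvable.remove(uid);
                    self.filtered.insert(uid.to_string(), reason);
                }
                None => {
                    self.filtered.remove(uid);
                    self.solvable.insert(uid.to_string());
                }
            }
        }
    }

    pub fn is_solvable(&self, uid: &str) -> bool {
        self.solvable.contains(uid)
    }

    pub fn filter_reason(&self, uid: &str) -> Option<FilterReason> {
        self.filtered.get(uid).copied()
    }
}
```

The cost per cycle is then proportional to the number of changed orders rather than to the size of the orderbook.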

Delta history retention is controlled by the MIN_DELTA_HISTORY_RETAINED, DEFAULT_DELTA_HISTORY_MAX_AGE, and MAX_DELTA_HISTORY constants. Age-based pruning uses the monotonic created_at_instant timestamp rather than the wire published_at, which can drift under load.
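One way the three limits could interact is sketched below. The constant values are placeholders chosen for the example (the write-up does not state them), and the `StoredEnvelope` struct and `prune` function are hypothetical.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

// Placeholder values for illustration only.
const MIN_DELTA_HISTORY_RETAINED: usize = 2;
const MAX_DELTA_HISTORY: usize = 4;
const DEFAULT_DELTA_HISTORY_MAX_AGE: Duration = Duration::from_secs(60);

pub struct StoredEnvelope {
    pub delta_sequence: u64,
    /// Monotonic clock reading taken when the envelope was created.
    pub created_at_instant: Instant,
}

/// Drop old envelopes from the front of the history (oldest first).
pub fn prune(history: &mut VecDeque<StoredEnvelope>, now: Instant) {
    // Hard cap on the number of retained envelopes.
    while history.len() > MAX_DELTA_HISTORY {
        history.pop_front();
    }
    // Age-based pruning, never going below the minimum retained count.
    while history.len() > MIN_DELTA_HISTORY_RETAINED {
        let oldest = history.front().expect("length checked above");
        let age = now.saturating_duration_since(oldest.created_at_instant);
        if age <= DEFAULT_DELTA_HISTORY_MAX_AGE {
            break;
        }
        history.pop_front();
    }
}
```

Using `Instant` here means a wall-clock adjustment cannot make recent envelopes look old, which is the reason for not pruning on `published_at`.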


Driver Side

The driver maintains a local Replica that applies snapshot + delta envelopes to keep a local copy of the order set and prices:

replica.apply_snapshot(snapshot)?;
replica.apply_delta(envelope)?;

The replica validates sequence continuity, rejects stale envelopes, and requests a resnapshot on any gap or protocol version mismatch. It also validates a boot_id UUID (generated fresh on each autopilot start) so a restart triggers a clean resnapshot rather than applying deltas across sessions.
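The validation order described above might look roughly like this. It is a simplified sketch: the `Snapshot` and `Envelope` shapes, the `ApplyError` variants, the `PROTOCOL_VERSION` value, and the use of a `String` for `boot_id` and a `u64` as the order payload are all assumptions made to keep the example dependency-free.

```rust
use std::collections::BTreeMap;

pub const PROTOCOL_VERSION: u32 = 1; // assumed value

#[derive(Debug, PartialEq)]
pub enum ApplyError {
    /// Envelope at or before the last applied sequence; safe to ignore.
    Stale,
    /// Gap, new boot_id, or protocol mismatch; fetch a fresh snapshot.
    ResnapshotNeeded,
    /// The snapshot itself uses a protocol this replica cannot read.
    UnsupportedProtocol,
}

pub struct Snapshot {
    pub boot_id: String,
    pub protocol_version: u32,
    pub delta_sequence: u64,
    pub orders: BTreeMap<String, u64>,
}

pub struct Envelope {
    pub boot_id: String,
    pub protocol_version: u32,
    pub delta_sequence: u64,
    pub upserted: Vec<(String, u64)>,
    pub removed: Vec<String>,
}

#[derive(Default)]
pub struct Replica {
    boot_id: Option<String>,
    last_sequence: u64,
    orders: BTreeMap<String, u64>,
}

impl Replica {
    pub fn apply_snapshot(&mut self, snapshot: Snapshot) -> Result<(), ApplyError> {
        if snapshot.protocol_version != PROTOCOL_VERSION {
            return Err(ApplyError::UnsupportedProtocol);
        }
        self.boot_id = Some(snapshot.boot_id);
        self.last_sequence = snapshot.delta_sequence;
        self.orders = snapshot.orders;
        Ok(())
    }

    pub fn apply_delta(&mut self, envelope: Envelope) -> Result<(), ApplyError> {
        // Deltas from another autopilot session must never be applied.
        let same_session = self.boot_id.as_deref() == Some(envelope.boot_id.as_str());
        if !same_session || envelope.protocol_version != PROTOCOL_VERSION {
            return Err(ApplyError::ResnapshotNeeded);
        }
        if envelope.delta_sequence <= self.last_sequence {
            return Err(ApplyError::Stale);
        }
        if envelope.delta_sequence != self.last_sequence + 1 {
            return Err(ApplyError::ResnapshotNeeded);
        }
        for uid in &envelope.removed {
            self.orders.remove(uid);
        }
        for (uid, order) in envelope.upserted {
            self.orders.insert(uid, order);
        }
        self.last_sequence = envelope.delta_sequence;
        Ok(())
    }

    pub fn order_count(&self) -> usize {
        self.orders.len()
    }
}
```

Note that a rejected envelope leaves the replica untouched, so the caller can decide between ignoring it (`Stale`) and restarting from a snapshot.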

A background delta_sync task runs the fetch-snapshot → follow-stream loop with automatic retry and exponential backoff:

Syncing → apply snapshot → follow SSE stream
         ↑                        |
         └──── Resyncing ←────────┘ (on resync_required or gap)
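The loop in the diagram can be modeled as a small state machine plus a capped backoff schedule. This is a sketch under assumptions: the `Following` state name, the `SyncEvent` variants, and the `backoff` parameters are invented for the example, and the actual task is presumably async.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SyncState {
    Syncing,
    Following, // assumed name for "following the SSE stream"
    Resyncing,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SyncEvent {
    SnapshotApplied,
    SnapshotFailed,
    ResyncRequired,
    GapDetected,
}

pub fn transition(state: SyncState, event: SyncEvent) -> SyncState {
    use SyncEvent::*;
    use SyncState::*;
    match (state, event) {
        // A successful snapshot always leads to following the stream.
        (Syncing | Resyncing, SnapshotApplied) => Following,
        // Any loss of continuity sends us back for a new snapshot.
        (Following, ResyncRequired | GapDetected) => Resyncing,
        // Everything else (including a failed snapshot) keeps the state;
        // the caller sleeps for `backoff(attempt, ..)` and retries.
        (s, _) => s,
    }
}

/// Exponential backoff: base * 2^attempt, capped at `max`.
pub fn backoff(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}
```

With a 100 ms base and a 5 s cap, the delays would run 100 ms, 200 ms, 400 ms and so on until they hit the cap.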

Thin solve requests. Solvers that set supports_thin_solve_request: true in their config receive a solve request with an empty orders array. The driver reconstructs the full auction from its local replica before passing it to the solver. This is gated behind DRIVER_DELTA_SYNC_USE_REPLICA=true and falls back gracefully to the full body if the replica isn’t ready.

New HTTP headers carry metadata for thin requests so the driver can skip deserializing the body entirely when the replica is already warm:

X-Auction-Body-Mode: thin | full
X-Auction-Id: 42
X-Auction-Deadline: 2025-01-02T03:04:05Z
X-Auction-Tokens: 0x...:1,0x...:0
X-Auction-Jit-Order-Owners: 0x...,0x...
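For illustration, a driver could interpret two of these headers and pick its auction source along the following lines. The function names and the `AuctionSource` enum are hypothetical, and the decision rule is just one reading of the fallback behaviour described above. The value format of `X-Auction-Tokens` is not parsed here because its meaning is not spelled out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BodyMode {
    Thin,
    Full,
}

/// Interpret X-Auction-Body-Mode. A missing or unknown value is treated
/// as `Full`, which is the conservative choice.
pub fn parse_body_mode(value: Option<&str>) -> BodyMode {
    match value.map(str::trim) {
        Some(v) if v.eq_ignore_ascii_case("thin") => BodyMode::Thin,
        _ => BodyMode::Full,
    }
}

/// Split a comma-separated header such as X-Auction-Jit-Order-Owners.
pub fn parse_owner_list(value: &str) -> Vec<String> {
    value
        .split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_owned)
        .collect()
}

#[derive(Debug, PartialEq)]
pub enum AuctionSource {
    Replica,
    FullBody,
}

/// Use the replica only when the request is thin, the
/// DRIVER_DELTA_SYNC_USE_REPLICA flag is on, and the replica is warm.
pub fn choose_source(mode: BodyMode, use_replica: bool, replica_ready: bool) -> AuctionSource {
    if mode == BodyMode::Thin && use_replica && replica_ready {
        AuctionSource::Replica
    } else {
        AuctionSource::FullBody
    }
}
```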

Correctness Guarantees

The trickiest part was ensuring the incremental projection always matches the canonical full rebuild. After computing delta events, the code applies them back to the previous auction state and compares the result against the canonical rebuild:

  • Match → use the reconstructed state (preserving non-delta fields like block from the full rebuild)

  • Mismatch → fall back to canonical auction, recompute events from scratch, increment a metric

A randomized property test runs 120 iterations of random order/price mutations and verifies that apply_delta_events_to_auction(base, compute_delta_events(base, next)) == next holds throughout.
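The round-trip property can be shown on a toy model. In this sketch an auction is reduced to a map from a numeric UID to an opaque order value, prices and the other event kinds are left out, and randomness comes from a small hand-rolled generator so that the example has no dependencies. The function names follow the write-up, but their signatures here are assumptions.

```rust
use std::collections::BTreeMap;

/// Toy auction: uid -> opaque order value.
pub type Auction = BTreeMap<u32, u64>;

#[derive(Debug, Clone, PartialEq)]
pub enum DeltaEvent {
    OrderAdded(u32, u64),
    OrderRemoved(u32),
    OrderUpdated(u32, u64),
}

pub fn compute_delta_events(base: &Auction, next: &Auction) -> Vec<DeltaEvent> {
    let mut events = Vec::new();
    for (uid, order) in next {
        match base.get(uid) {
            None => events.push(DeltaEvent::OrderAdded(*uid, *order)),
            Some(old) if old != order => events.push(DeltaEvent::OrderUpdated(*uid, *order)),
            Some(_) => {} // unchanged, no event
        }
    }
    for uid in base.keys() {
        if !next.contains_key(uid) {
            events.push(DeltaEvent::OrderRemoved(*uid));
        }
    }
    events
}

pub fn apply_delta_events_to_auction(mut base: Auction, events: &[DeltaEvent]) -> Auction {
    for event in events {
        match event {
            DeltaEvent::OrderAdded(uid, order) | DeltaEvent::OrderUpdated(uid, order) => {
                base.insert(*uid, *order);
            }
            DeltaEvent::OrderRemoved(uid) => {
                base.remove(uid);
            }
        }
    }
    base
}

/// Small deterministic linear congruential generator for the test.
pub fn next_rand(seed: &mut u64) -> u64 {
    *seed = seed
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *seed >> 33
}

/// Apply a few random inserts, updates, and removals.
pub fn mutate(auction: &Auction, seed: &mut u64) -> Auction {
    let mut next = auction.clone();
    for _ in 0..4 {
        let uid = (next_rand(seed) % 8) as u32;
        if next_rand(seed) % 3 == 0 {
            next.remove(&uid);
        } else {
            next.insert(uid, next_rand(seed));
        }
    }
    next
}
```

A test then chains mutations, checking after each step that applying the computed events to the previous state reproduces the next state exactly.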

Checksum verification compares SHA-256 hashes of sorted order UIDs and token prices between the driver replica and the autopilot at health check time, surfacing any divergence without blocking the hot path.
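The important property of such a checksum is that both sides canonicalize the state (sorted UIDs, sorted prices) before hashing, so iteration order cannot cause a false mismatch. The sketch below shows that step only. To stay dependency-free it substitutes a 64-bit FNV-1a hash for SHA-256, and the byte layout, types, and function names are assumptions rather than the actual wire format.

```rust
/// FNV-1a, 64-bit. Stand-in for SHA-256 so the sketch needs no crate.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Hash the order UIDs and token prices in a canonical (sorted) order.
pub fn state_checksum(order_uids: &[&str], prices: &[(&str, u128)]) -> u64 {
    let mut uids: Vec<&str> = order_uids.to_vec();
    uids.sort_unstable();
    let mut prices: Vec<(&str, u128)> = prices.to_vec();
    prices.sort_unstable();

    let mut buf = Vec::new();
    for uid in uids {
        buf.extend_from_slice(uid.as_bytes());
        buf.push(b'\n');
    }
    buf.push(b'|'); // separator between the two sections
    for (token, price) in prices {
        buf.extend_from_slice(token.as_bytes());
        buf.push(b'=');
        buf.extend_from_slice(&price.to_be_bytes());
        buf.push(b'\n');
    }
    fnv1a(&buf)
}
```

The driver would compute the same function over its replica and compare the result with the value served by GET /delta/checksum.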


Rollout

Everything is feature-flagged:

| Flag | Default | Purpose |
| --- | --- | --- |
| AUTOPILOT_DELTA_SYNC_ENABLED | false | Enable delta API endpoints |
| AUTOPILOT_DELTA_SYNC_SHADOW_COMPARE | false | Run incremental path alongside full, compare results |
| AUTOPILOT_DELTA_SYNC_INCREMENTAL_PRIMARY | false | Promote incremental path to primary |
| DRIVER_DELTA_SYNC_ENABLED | true | Enable driver replica background task |
| DRIVER_DELTA_SYNC_USE_REPLICA | false | Use replica for solve request preprocessing |
| supports_thin_solve_request | false | Per-solver config to opt into thin mode |

The plan is to enable shadow comparison first, watch the delta_shadow_compare{result="mismatch"} metric for a week, then flip incremental primary, then gradually roll out thin mode solver by solver.

References

Issue: https://github.com/cowprotocol/services/issues/4207
Implementation Branch: https://github.com/ashleychandy/services/tree/delta-sync

This is a great write-up — thanks for putting it together and sharing the implementation details as well.

We’ve flagged it internally with the team. They’ll take a closer look, and we’ll circle back here if there’s feedback or discussion to bring in :+1:

Thanks for flagging this internally.

I’ve updated the write-up. The previous version was incorrect, so I’ve replaced it entirely with the correct issue and implementation details. Happy to clarify or expand on anything if needed.
