Oxide and Friends

Futurelock

Nov 7, 2025
Oxide engineers Dave Pacheco, John Gallagher, and Eliza Weisman discuss "futurelock", a deadlock-like hang they found in async Rust. Dave walks through the investigation of a Nexus instance that became unresponsive during a live update. John explains how he reproduced the deadlock, and Eliza covers the relevant semaphore implementation details. Together they explore the conceptual differences between tasks and futures, the hidden challenges of concurrency, and practical fixes for avoiding this pathology.
ANECDOTE

Dogfood Live-Update Hang

  • John described a live-update test where one Nexus instance became unresponsive during a weekend update and the team had to debug a mysterious hang.
  • Dave and John dug into logs, database checks, and task-level tracing before escalating to deep analysis.
ADVICE

Instrument Tasks Early

  • Use fine-grained runtime probes and task tracing to narrow where async code is stalling.
  • Instrument Tokio tasks and library internals early, so that actionable traces are captured before the stalled state changes or is lost.
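One way to picture this kind of instrumentation is a wrapper future that records how often, and how recently, the future inside it was polled. The sketch below is illustrative only and is not the tooling discussed in the episode: `PollProbe`, `YieldOnce`, `block_on`, `run_demo`, and the 100 ms threshold are all hypothetical names and values, and the busy-polling `block_on` stands in for a real runtime such as Tokio. It uses only the standard library.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// Hypothetical probe: counts polls of the inner future and reports long gaps
/// between consecutive polls, which can hint that the future is being starved.
struct PollProbe<F> {
    name: &'static str,
    inner: Pin<Box<F>>,
    polls: u32,
    last_poll: Option<Instant>,
}

impl<F: Future> PollProbe<F> {
    fn new(name: &'static str, inner: F) -> Self {
        PollProbe { name, inner: Box::pin(inner), polls: 0, last_poll: None }
    }
}

impl<F: Future> Future for PollProbe<F> {
    /// The inner output plus the number of polls it took to finish.
    type Output = (F::Output, u32);

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // PollProbe is Unpin because the inner future is boxed and pinned.
        let this = self.get_mut();
        let now = Instant::now();
        if let Some(prev) = this.last_poll {
            let gap = now.duration_since(prev);
            // Assumed threshold, chosen only for illustration.
            if gap > Duration::from_millis(100) {
                eprintln!("{}: {:?} since previous poll", this.name, gap);
            }
        }
        this.last_poll = Some(now);
        this.polls += 1;
        match this.inner.as_mut().poll(cx) {
            Poll::Ready(value) => Poll::Ready((value, this.polls)),
            Poll::Pending => Poll::Pending,
        }
    }
}

/// Helper future that returns Pending exactly once, then Ready.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Waker that does nothing; sufficient because block_on polls in a loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor for the demo only: re-polls the future until it completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Runs a future that yields twice and returns (result, poll count).
fn run_demo() -> (u32, u32) {
    block_on(PollProbe::new("demo", async {
        YieldOnce(false).await;
        YieldOnce(false).await;
        7
    }))
}
```

Calling `run_demo()` from a `main` function returns the inner result together with the poll count. In a real service, the same wrapper idea could feed a tracing or metrics system instead of `eprintln!`, so that a future which stops being polled shows up as a growing gap rather than as silence.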
ANECDOTE

Core Dump And Ghidra Reverse Engineering

  • John captured a core dump while the receiver was inside Tokio's receive to inspect channel internals offline.
  • He then used Ghidra to reverse-engineer the heavily inlined compiled Rust code and validate the channel state.