From Rust to Reality: The Hidden Journey of fetch_max

This article explores how Rust's atomic operation fetch_max works under the hood, tracing it from a single line of Rust through several compilation layers down to machine code, and showing how much work a modern compiler does along the way.

---

Introduction

QuestDB is an open-source time-series database optimized for demanding workloads. It focuses on ultra-low latency, high ingestion throughput, and multi-tier storage. Native support for Parquet and SQL ensures data portability and AI readiness.

---

The Starting Point: A Job Interview Question

A common interview question asks the candidate to track the maximum value seen across multiple producer threads. In Java, the usual answer is a compare-and-swap (CAS) loop, often written as updateAndGet() with a lambda.

One candidate used Rust and simply called fetch_max on an atomic integer. This was surprising because Rust provides fetch_max as a first-class atomic operation, unlike Java or C++.

---

The Investigation: How Does Rust's fetch_max Work?

The author investigates how Rust implements fetch_max, given that x86-64 has no native atomic max instruction.

---

Layer 1: The Rust Code

At the source level, the example is a single fetch_max call on an atomic. The call atomically fetches the current value, compares it with the new one, stores the new value if it is greater, and returns the old value, all in a thread-safe way. No loops or retry logic are visible at this level.

---

Layer 2: Macro Expansion

fetch_max is generated by a macro called atomic_int! in Rust's standard library. The macro defines fetch_max in terms of a placeholder, $max_fn, which expands to atomic_umax for unsigned types and atomic_max for signed types. For an unsigned atomic, the call therefore goes to atomic_umax.

---

Layer 3: LLVM IR

Rust compiles to LLVM Intermediate Representation (IR) before assembly. The IR for fetch_max contains an atomicrmw umax instruction, which means LLVM sees a single atomic read-modify-write (RMW) operation that computes an unsigned maximum. However, x86-64 has no native atomic umax instruction.

---

Seeing LLVM IR Yourself

Run rustc --emit=llvm-ir main.rs to generate an LLVM IR file (main.ll) containing this instruction.
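For a concrete main.rs to feed to that command, here is a minimal sketch. It assumes a u64 maximum shared by a few producer threads. The names max_with_fetch_max and cas_max are illustrative and do not come from the original article. The second function spells out by hand the kind of CAS retry loop the interview question usually elicits, so the two approaches can be compared side by side.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical helper: each inner Vec plays the role of one producer thread.
/// Every producer publishes its values with a single fetch_max call each.
fn max_with_fetch_max(batches: &[Vec<u64>]) -> u64 {
    let max = AtomicU64::new(0);
    thread::scope(|s| {
        for batch in batches {
            let max = &max; // share the atomic by reference with the thread
            s.spawn(move || {
                for &v in batch {
                    // Returns the previous value, which is ignored here.
                    max.fetch_max(v, Ordering::Relaxed);
                }
            });
        }
    });
    // All scoped threads have been joined by the time scope() returns.
    max.load(Ordering::Relaxed)
}

/// Hand-written equivalent of fetch_max built on a CAS retry loop.
/// Like fetch_max, it returns the value that was stored before the update.
fn cas_max(cell: &AtomicU64, val: u64) -> u64 {
    let mut current = cell.load(Ordering::Relaxed);
    loop {
        let desired = current.max(val);
        match cell.compare_exchange_weak(
            current,
            desired,
            Ordering::Relaxed,
            Ordering::Relaxed,
        ) {
            Ok(previous) => return previous,
            // Another thread changed the value (or the weak CAS failed
            // spuriously): retry with the value that was actually observed.
            Err(actual) => current = actual,
        }
    }
}

fn main() {
    let batches: Vec<Vec<u64>> = vec![vec![3, 17, 5], vec![42, 8], vec![23, 4, 16]];
    println!("max via fetch_max: {}", max_with_fetch_max(&batches));

    let cell = AtomicU64::new(10);
    let old = cas_max(&cell, 25);
    println!("cas_max: old = {}, new = {}", old, cell.load(Ordering::Relaxed));
}
```

Ordering::Relaxed is used throughout only to keep the sketch short; a real program should pick the ordering its surrounding code requires.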
---

Interlude: Compiler Intrinsics

The atomic_umax function in Rust is a compiler intrinsic, marked #[rustc_intrinsic]. Intrinsics have no body of their own; they map directly to LLVM's atomic RMW instructions. The Rust compiler replaces calls to atomic_umax with LLVM's atomicrmw umax instruction.

---

Layer 4: The LLVM Atomic Expand Pass

LLVM runs a pass called AtomicExpandPass, which asks the target architecture whether it supports atomic umax natively. For x86-64 the answer is no, so LLVM rewrites the atomic RMW umax into a compare-and-swap (CAS) loop.

The CAS loop, in pseudocode:

1. Load the expected value.
2. Compute desired = max(expected, val).
3. Attempt cmpxchg(ptr, expected, desired).
4. If it succeeds, done; otherwise retry with the newly observed expected value.

The expanded loop can be inspected in the IR that LLVM produces after this pass.

---

Layer 5: The Final Assembly (x86-64)

Generating assembly with rustc --emit=asm main.rs reveals the CPU instructions that implement the CAS loop. Key points in the assembly (AT&T syntax):

Seed read: the current value is loaded once, before the loop.
The maximum is computed without branching, using sub and a conditional move (cmova).
lock cmpxchgq performs the atomic CAS.
A branch retries the loop until the CAS succeeds.

---

The Beauty of Abstraction

The journey from high-level Rust to assembly involves: