miri/concurrency/weak_memory.rs

//! Implementation of C++11-consistent weak memory emulation using store buffers
//! based on Dynamic Race Detection for C++11 ("the paper"):
//! <https://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2017/POPL.pdf>
//!
//! This implementation will never generate weak memory behaviours forbidden by the C++11 model,
//! but it is incapable of producing all possible weak behaviours allowed by the model. There are
//! certain weak behaviours observable on real hardware but not while using this.
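//!
//! As a rough illustration (a sketch of ours, not taken from the paper; the statics `X`, `Y` and
//! the results `r1`/`r2` are made up for this example), the classic store-buffering litmus test
//! exhibits a weak behaviour this emulation can produce:
//!
//! ```rust,ignore
//! use std::sync::atomic::{AtomicU32, Ordering::Relaxed};
//! use std::thread;
//!
//! static X: AtomicU32 = AtomicU32::new(0);
//! static Y: AtomicU32 = AtomicU32::new(0);
//!
//! let t1 = thread::spawn(|| {
//!     X.store(1, Relaxed);
//!     Y.load(Relaxed) // r1
//! });
//! let t2 = thread::spawn(|| {
//!     Y.store(1, Relaxed);
//!     X.load(Relaxed) // r2
//! });
//! let (r1, r2) = (t1.join().unwrap(), t2.join().unwrap());
//! // Under a sequentially consistent interleaving at least one of r1, r2 would be 1, but the
//! // C++11 model allows r1 == 0 && r2 == 0, and the store buffers can produce that outcome:
//! // each load may pick the stale initial value instead of the other thread's store.
//! // Outcomes forbidden by the model (e.g. reading a value that was never stored) never occur.
//! ```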
//!
//! Note that this implementation does not fully take into account C++20's memory model revision to SC accesses
//! and fences introduced by P0668 (<https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0668r5.html>).
//! This implementation is not fully correct under the revised C++20 model and may generate behaviours C++20
//! disallows (<https://github.com/rust-lang/miri/issues/2301>).
//!
//! Modifications are made to the paper's model to address C++20 changes:
//! - If an SC load reads from an atomic store of any ordering, then a later SC load cannot read
//!   from an earlier store in the location's modification order. This is to prevent creating a
//!   backwards S edge from the second load to the first, as a result of C++20's coherence-ordered
//!   before rules. (This seems to rule out behaviors that were actually permitted by the RC11 model
//!   that C++20 intended to copy (<https://plv.mpi-sws.org/scfix/paper.pdf>); a change was
//!   introduced when translating the math to English. According to Viktor Vafeiadis, this
//!   difference is harmless. So we stick to what the standard says, and allow fewer behaviors.)
//! - SC fences are treated like AcqRel RMWs to a global clock, to ensure they induce enough
//!   synchronization with the surrounding accesses. This rules out legal behavior, but it is really
//!   hard to be more precise here (see the sketch right after this list).
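//!
//! As a hedged sketch of ours (not part of the paper's model), adding SC fences to the
//! store-buffering test above illustrates what this fence treatment buys: under this treatment,
//! the fence executed later in the interpreter acquires the clock released by the earlier one, so
//! the load after it can no longer pick the stale initial value from the store buffer, matching
//! what the standard requires here.
//!
//! ```rust,ignore
//! use std::sync::atomic::{AtomicU32, Ordering::{Relaxed, SeqCst}, fence};
//! use std::thread;
//!
//! static X: AtomicU32 = AtomicU32::new(0);
//! static Y: AtomicU32 = AtomicU32::new(0);
//!
//! let t1 = thread::spawn(|| {
//!     X.store(1, Relaxed);
//!     fence(SeqCst);
//!     Y.load(Relaxed) // r1
//! });
//! let t2 = thread::spawn(|| {
//!     Y.store(1, Relaxed);
//!     fence(SeqCst);
//!     X.load(Relaxed) // r2
//! });
//! let (r1, r2) = (t1.join().unwrap(), t2.join().unwrap());
//! // r1 == 0 && r2 == 0 is forbidden in this fenced variant and is never produced here.
//! ```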
//!
//! Rust follows the C++20 memory model (except for the Consume ordering and some operations not performable through C++'s
//! `std::atomic<T>` API). It is therefore possible for this implementation to generate behaviours never observable when the
//! same program is compiled and run natively. Unfortunately, no literature exists at the time of writing which proposes
//! an implementable and C++20-compatible relaxed memory model that supports all atomic operations existing in Rust. The closest one is
//! A Promising Semantics for Relaxed-Memory Concurrency by Jeehoon Kang et al. (<https://www.cs.tau.ac.il/~orilahav/papers/popl17.pdf>).
//! However, this model lacks SC accesses and is therefore unusable by Miri (SC accesses are everywhere in library code).
//!
//! If you find anything that proposes a relaxed memory model that is C++20-consistent, supports all orderings Rust's atomic accesses
//! and fences accept, and is implementable (with operational semantics), please open a GitHub issue!
//!
//! One characteristic of this implementation, in contrast to some other notable operational models such as the ones proposed in
//! Taming Release-Acquire Consistency by Ori Lahav et al. (<https://plv.mpi-sws.org/sra/paper.pdf>) or the Promising Semantics noted above,
//! is that this implementation does not require each thread to hold an isolated view of the entire memory. Here, store buffers are per-location
//! and shared across all threads. This is more memory efficient but does require store elements (representing writes to a location) to record
//! information about reads, whereas in the other two models it is the other way round: each read points to the write it got its value from.
//! Additionally, writes in our implementation do not have globally unique timestamps attached. In the other two models this timestamp is
//! used to make sure a value in a thread's view is not overwritten by a write that occurred earlier than the one in the existing view.
//! In our implementation, this is detected using read information attached to store elements, as there is no data structure representing reads.
//!
//! The C++ memory model is built around the notion of an 'atomic object', so it would be natural
//! to attach store buffers to atomic objects. However, Rust follows LLVM in that it only has
//! 'atomic accesses'. Therefore Miri cannot know when and where atomic 'objects' are being
//! created or destroyed, to manage its store buffers. Instead, we lazily create an
//! atomic object on the first atomic write to a given region, and we destroy that object
//! on the next non-atomic or imperfectly overlapping atomic write to that region.
//! These lazy (de)allocations happen in memory_accessed() on non-atomic accesses, and
//! in get_or_create_store_buffer_mut() on atomic writes.
//!
//! One consequence of this difference is that safe/sound Rust allows more operations on atomic locations
//! than the C++20 atomic API was intended to allow, such as non-atomically accessing
//! a previously atomically accessed location, or accessing previously atomically accessed locations with a differently sized operation
//! (such as accessing the top 16 bits of an AtomicU32). These scenarios are generally left undiscussed in formalizations of the C++ memory model.
//! In Rust, these operations can only be done through a `&mut AtomicFoo` reference or one derived from it, so they
//! can only happen after all previous accesses to the same location. This implementation is adapted to allow them.
//! A mixed-atomicity read that races with writes, or a write that races with reads or writes, will still cause UB to be reported.
//! Mixed-size atomic accesses must not race with any other atomic access, whether read or write, or UB will be reported.
//! You can refer to the test cases in weak_memory/extra_cpp.rs and weak_memory/extra_cpp_unsafe.rs for examples of these
//! operations, and to the sketch below for a minimal illustration.
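//!
//! A minimal sketch of such operations (our own illustration, loosely in the spirit of the tests
//! mentioned above, not copied from them; variable names are ours):
//!
//! ```rust,ignore
//! use std::sync::atomic::{AtomicU32, Ordering::Relaxed};
//!
//! let mut a = AtomicU32::new(0);
//! a.store(1, Relaxed);
//! // Non-atomic write to a previously atomically accessed location: allowed, because the
//! // `&mut` access proves that all earlier (atomic) accesses have already finished.
//! *a.get_mut() = 2;
//! // Differently sized, non-atomic read of part of the same location: also fine in this
//! // single-threaded example, where nothing can race with it.
//! let _half: u16 = unsafe { (a.as_ptr() as *const u16).read() };
//! ```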

// Our implementation and the paper authors' own implementation (tsan11) have some deviations from the operational semantics provided in §5.3:
// 1. In the operational semantics, store elements keep a copy of the atomic object's vector clock (AtomicCellClocks::sync_vector in miri),
// but this is not used anywhere so it's omitted here.
//
// 2. In the operational semantics, each store element keeps the timestamp of a thread when it loads from the store.
// If the same thread loads from the same store element multiple times, then the timestamps at all loads are saved in a list of load elements.
// This is not necessary as later loads by the same thread will always have greater timestamp values, so we only need to record the timestamp of the first
// load by each thread. This optimisation is done in tsan11
// (https://github.com/ChrisLidbury/tsan11/blob/ecbd6b81e9b9454e01cba78eb9d88684168132c7/lib/tsan/rtl/tsan_relaxed.h#L35-L37)
// and here.
//
// 3. §4.5 of the paper wants an SC store to mark all existing stores in the buffer that happen before it
// as SC. This is not done in the operational semantics but implemented correctly in tsan11
// (https://github.com/ChrisLidbury/tsan11/blob/ecbd6b81e9b9454e01cba78eb9d88684168132c7/lib/tsan/rtl/tsan_relaxed.cc#L160-L167)
// and here.
//
// 4. The W_SC ; R_SC case requires the SC load to ignore all but the last store marked SC (stores not marked SC are not
// affected). But this rule is applied to all loads in the paper's ReadsFromSet (last two lines of code), not just SC loads.
// This is implemented correctly in tsan11
// (https://github.com/ChrisLidbury/tsan11/blob/ecbd6b81e9b9454e01cba78eb9d88684168132c7/lib/tsan/rtl/tsan_relaxed.cc#L295)
// and here.

use std::cell::{Ref, RefCell};
use std::collections::VecDeque;

use rustc_data_structures::fx::FxHashMap;

use super::AllocDataRaceHandler;
use super::data_race::{GlobalState as DataRaceState, ThreadClockSet};
use super::range_object_map::{AccessType, RangeObjectMap};
use super::vector_clock::{VClock, VTimestamp, VectorIdx};
use crate::concurrency::GlobalDataRaceHandler;
use crate::*;

pub type AllocState = StoreBufferAlloc;

// Each store buffer must be bounded, otherwise it will grow indefinitely.
// However, bounding the store buffer means restricting the amount of weak
// behaviours observable. The author picked 128 as a good tradeoff,
// so we follow them here.
const STORE_BUFFER_LIMIT: usize = 128;

#[derive(Debug, Clone)]
pub struct StoreBufferAlloc {
    /// Store buffer of each atomic object in this allocation
    // Behind a RefCell because we need to allocate/remove on read access
    store_buffers: RefCell<RangeObjectMap<StoreBuffer>>,
}

impl VisitProvenance for StoreBufferAlloc {
    fn visit_provenance(&self, visit: &mut VisitWith<'_>) {
        let Self { store_buffers } = self;
        for val in store_buffers
            .borrow()
            .iter()
            .flat_map(|buf| buf.buffer.iter().map(|element| &element.val))
        {
            val.visit_provenance(visit);
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub(super) struct StoreBuffer {
    // Stores to this location in modification order
    buffer: VecDeque<StoreElement>,
}

/// Whether a load returned the latest value or not.
#[derive(PartialEq, Eq)]
enum LoadRecency {
    Latest,
    Outdated,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct StoreElement {
    /// The identifier of the vector index, corresponding to a thread
    /// that performed the store.
    store_index: VectorIdx,

    /// Whether this store is SC.
    is_seqcst: bool,

    /// The timestamp of the storing thread when it performed the store.
    timestamp: VTimestamp,

    /// The value of this store. `None` means uninitialized.
    // FIXME: Currently, we cannot represent partial initialization.
    val: Option<Scalar>,

    /// Metadata about loads from this store element,
    /// behind a RefCell so that the load operation can take `&self`.
    load_info: RefCell<LoadInfo>,
}

#[derive(Debug, Clone, PartialEq, Eq, Default)]
struct LoadInfo {
    /// Timestamp of the first load from this store element by each thread.
    timestamps: FxHashMap<VectorIdx, VTimestamp>,
    /// Whether this store element has been read by an SC load.
    sc_loaded: bool,
}

impl StoreBufferAlloc {
    pub fn new_allocation() -> Self {
        Self { store_buffers: RefCell::new(RangeObjectMap::new()) }
    }

    /// When a non-atomic access happens on a location that has been atomically accessed
    /// before without a data race, we can determine that the non-atomic access fully happens
    /// after all the prior atomic writes, so the location no longer needs to exhibit
    /// any weak memory behaviours until further atomic accesses occur.
    pub fn memory_accessed(&self, range: AllocRange, global: &DataRaceState) {
        if !global.ongoing_action_data_race_free() {
            let mut buffers = self.store_buffers.borrow_mut();
            let access_type = buffers.access_type(range);
            match access_type {
                AccessType::PerfectlyOverlapping(pos) => {
                    buffers.remove_from_pos(pos);
                }
                AccessType::ImperfectlyOverlapping(pos_range) => {
                    // We rely on the data-race check making sure this is synchronized.
                    // Therefore we can forget about the old data here.
                    buffers.remove_pos_range(pos_range);
                }
                AccessType::Empty(_) => {
                    // The range had no weak behaviours attached, do nothing
                }
            }
        }
    }

    /// Gets a store buffer associated with an atomic object in this allocation.
    /// Returns `None` if there is no store buffer.
    fn get_store_buffer<'tcx>(
        &self,
        range: AllocRange,
    ) -> InterpResult<'tcx, Option<Ref<'_, StoreBuffer>>> {
        let access_type = self.store_buffers.borrow().access_type(range);
        let pos = match access_type {
            AccessType::PerfectlyOverlapping(pos) => pos,
            // If there is nothing here yet, that means there wasn't an atomic write yet so
            // we can't return anything outdated.
            _ => return interp_ok(None),
        };
        let store_buffer = Ref::map(self.store_buffers.borrow(), |buffer| &buffer[pos]);
        interp_ok(Some(store_buffer))
    }

    /// Gets a mutable store buffer associated with an atomic object in this allocation,
    /// or creates one with the specified initial value if no atomic object exists yet.
    fn get_or_create_store_buffer_mut<'tcx>(
        &mut self,
        range: AllocRange,
        init: Option<Scalar>,
    ) -> InterpResult<'tcx, &mut StoreBuffer> {
        let buffers = self.store_buffers.get_mut();
        let access_type = buffers.access_type(range);
        let pos = match access_type {
            AccessType::PerfectlyOverlapping(pos) => pos,
            AccessType::Empty(pos) => {
                buffers.insert_at_pos(pos, range, StoreBuffer::new(init));
                pos
            }
            AccessType::ImperfectlyOverlapping(pos_range) => {
                // Once we reach here we would've already checked that this access is not racy.
                buffers.remove_pos_range(pos_range.clone());
                buffers.insert_at_pos(pos_range.start, range, StoreBuffer::new(init));
                pos_range.start
            }
        };
        interp_ok(&mut buffers[pos])
    }
}

impl<'tcx> StoreBuffer {
    fn new(init: Option<Scalar>) -> Self {
        let mut buffer = VecDeque::new();
        let store_elem = StoreElement {
            // The thread index and timestamp of the initialisation write
            // are never meaningfully used, so it's fine to leave them as 0
            store_index: VectorIdx::from(0),
            timestamp: VTimestamp::ZERO,
            val: init,
            is_seqcst: false,
            load_info: RefCell::new(LoadInfo::default()),
        };
        buffer.push_back(store_elem);
        Self { buffer }
    }

    /// Reads from the last store in modification order, if any.
    fn read_from_last_store(
        &self,
        global: &DataRaceState,
        thread_mgr: &ThreadManager<'_>,
        is_seqcst: bool,
    ) {
        let store_elem = self.buffer.back();
        if let Some(store_elem) = store_elem {
            let (index, clocks) = global.active_thread_state(thread_mgr);
            store_elem.load_impl(index, &clocks, is_seqcst);
        }
    }

    fn buffered_read(
        &self,
        global: &DataRaceState,
        thread_mgr: &ThreadManager<'_>,
        is_seqcst: bool,
        rng: &mut (impl rand::Rng + ?Sized),
        validate: impl FnOnce() -> InterpResult<'tcx>,
    ) -> InterpResult<'tcx, (Option<Scalar>, LoadRecency)> {
        // Having a live borrow of store_buffer while calling validate_atomic_load is fine
        // because the race detector doesn't touch store_buffer.

        let (store_elem, recency) = {
            // The `clocks` we got here must be dropped before calling validate_atomic_load
            // as the race detector will update it.
            let (.., clocks) = global.active_thread_state(thread_mgr);
            // Load from a valid entry in the store buffer
            self.fetch_store(is_seqcst, &clocks, &mut *rng)
        };

        // Unlike in buffered_atomic_write, thread clock updates have to be done
        // after we've picked a store element from the store buffer, as presented
        // in the ATOMIC LOAD rule of the paper. This is because fetch_store
        // requires access to ThreadClockSet.clock, which is updated by the race detector.
        validate()?;

        let (index, clocks) = global.active_thread_state(thread_mgr);
        let loaded = store_elem.load_impl(index, &clocks, is_seqcst);
        interp_ok((loaded, recency))
    }

    fn buffered_write(
        &mut self,
        val: Scalar,
        global: &DataRaceState,
        thread_mgr: &ThreadManager<'_>,
        is_seqcst: bool,
    ) -> InterpResult<'tcx> {
        let (index, clocks) = global.active_thread_state(thread_mgr);

        self.store_impl(val, index, &clocks.clock, is_seqcst);
        interp_ok(())
    }

    /// Selects a valid store element in the buffer.
    fn fetch_store<R: rand::Rng + ?Sized>(
        &self,
        is_seqcst: bool,
        clocks: &ThreadClockSet,
        rng: &mut R,
    ) -> (&StoreElement, LoadRecency) {
        use rand::seq::IteratorRandom;
        let mut found_sc = false;
        // FIXME: we want an inclusive take_while (stops after a false predicate, but
        // includes the element that gave the false), but such a function doesn't yet
        // exist in the standard library https://github.com/rust-lang/rust/issues/62208
        // so we have to hack around it with keep_searching
        let mut keep_searching = true;
        let candidates = self
            .buffer
            .iter()
            .rev()
            .take_while(move |&store_elem| {
                if !keep_searching {
                    return false;
                }

                keep_searching = if store_elem.timestamp <= clocks.clock[store_elem.store_index] {
                    // CoWR: if a store happens-before the current load,
                    // then we can't read-from anything earlier in modification order.
                    // C++20 §6.9.2.2 [intro.races] paragraph 18
                    false
                } else if store_elem.load_info.borrow().timestamps.iter().any(
                    |(&load_index, &load_timestamp)| load_timestamp <= clocks.clock[load_index],
                ) {
                    // CoRR: if there was a load from this store which happened-before the current load,
                    // then we cannot read-from anything earlier in modification order.
                    // C++20 §6.9.2.2 [intro.races] paragraph 16
                    false
                } else if store_elem.timestamp <= clocks.write_seqcst[store_elem.store_index]
                    && store_elem.is_seqcst
                {
                    // The current non-SC load, which may be sequenced-after an SC fence,
                    // cannot read-before the last SC store executed before the fence.
                    // C++17 §32.4 [atomics.order] paragraph 4
                    false
                } else if is_seqcst
                    && store_elem.timestamp <= clocks.read_seqcst[store_elem.store_index]
                {
                    // The current SC load cannot read-before the last store sequenced-before
                    // the last SC fence.
                    // C++17 §32.4 [atomics.order] paragraph 5
                    false
                } else if is_seqcst && store_elem.load_info.borrow().sc_loaded {
                    // The current SC load cannot read-before a store that an earlier SC load has observed.
                    // See https://github.com/rust-lang/miri/issues/2301#issuecomment-1222720427.
                    // Consequences of C++20 §31.4 [atomics.order] paragraph 3.1, 3.3 (coherence-ordered before)
                    // and 4.1 (coherence-ordered before between SC makes global total order S).
                    false
                } else {
                    true
                };

                true
            })
            .filter(|&store_elem| {
                if is_seqcst && store_elem.is_seqcst {
                    // An SC load needs to ignore all but the last store marked SC
                    // (stores not marked SC are not affected).
                    let include = !found_sc;
                    found_sc = true;
                    include
                } else {
                    true
                }
            });

        let chosen = candidates.choose(rng).expect("store buffer cannot be empty");
        if std::ptr::eq(chosen, self.buffer.back().expect("store buffer cannot be empty")) {
            (chosen, LoadRecency::Latest)
        } else {
            (chosen, LoadRecency::Outdated)
        }
    }

    /// ATOMIC STORE IMPL in the paper (except we don't need the location's vector clock)
    fn store_impl(
        &mut self,
        val: Scalar,
        index: VectorIdx,
        thread_clock: &VClock,
        is_seqcst: bool,
    ) {
        let store_elem = StoreElement {
            store_index: index,
            timestamp: thread_clock[index],
            // In the language provided in the paper, an atomic store takes the value from a
            // non-atomic memory location.
            // But we already have the immediate value here so we don't need to do the memory
            // access.
            val: Some(val),
            is_seqcst,
            load_info: RefCell::new(LoadInfo::default()),
        };
        if self.buffer.len() >= STORE_BUFFER_LIMIT {
            self.buffer.pop_front();
        }
        self.buffer.push_back(store_elem);
        if is_seqcst {
            // Every store that happens before this needs to be marked as SC
            // so that in a later SC load, only the last SC store (i.e. this one) or stores that
            // aren't ordered by happens-before with the last SC store can be picked.
            self.buffer.iter_mut().rev().for_each(|elem| {
                if elem.timestamp <= thread_clock[elem.store_index] {
                    elem.is_seqcst = true;
                }
            })
        }
    }
}

impl StoreElement {
    /// ATOMIC LOAD IMPL in the paper
    /// Unlike the operational semantics in the paper, we don't need to keep track
    /// of the thread timestamp for every single load. Keeping track of the first (smallest)
    /// timestamp of each thread that has loaded from a store is sufficient: if the earliest
    /// load of another thread happens before the current one, then we must stop searching the store
    /// buffer regardless of subsequent loads by the same thread; if the earliest load of another
    /// thread doesn't happen before the current one, then no subsequent load by the other thread
    /// can happen before the current one.
    fn load_impl(
        &self,
        index: VectorIdx,
        clocks: &ThreadClockSet,
        is_seqcst: bool,
    ) -> Option<Scalar> {
        let mut load_info = self.load_info.borrow_mut();
        load_info.sc_loaded |= is_seqcst;
        let _ = load_info.timestamps.try_insert(index, clocks.clock[index]);
        self.val
    }
}

impl<'tcx> EvalContextExt<'tcx> for crate::MiriInterpCx<'tcx> {}
pub(super) trait EvalContextExt<'tcx>: crate::MiriInterpCxExt<'tcx> {
    fn buffered_atomic_rmw(
        &mut self,
        new_val: Scalar,
        place: &MPlaceTy<'tcx>,
        atomic: AtomicRwOrd,
        init: Scalar,
    ) -> InterpResult<'tcx> {
        let this = self.eval_context_mut();
        let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(place.ptr(), 0)?;
        if let (
            crate::AllocExtra {
                data_race: AllocDataRaceHandler::Vclocks(_, Some(alloc_buffers)),
                ..
            },
            crate::MiriMachine {
                data_race: GlobalDataRaceHandler::Vclocks(global), threads, ..
            },
        ) = this.get_alloc_extra_mut(alloc_id)?
        {
            if atomic == AtomicRwOrd::SeqCst {
                global.sc_read(threads);
                global.sc_write(threads);
            }
            let range = alloc_range(base_offset, place.layout.size);
            let buffer = alloc_buffers.get_or_create_store_buffer_mut(range, Some(init))?;
            buffer.read_from_last_store(global, threads, atomic == AtomicRwOrd::SeqCst);
            buffer.buffered_write(new_val, global, threads, atomic == AtomicRwOrd::SeqCst)?;
        }
        interp_ok(())
    }

    fn buffered_atomic_read(
        &self,
        place: &MPlaceTy<'tcx>,
        atomic: AtomicReadOrd,
        latest_in_mo: Scalar,
        validate: impl FnOnce() -> InterpResult<'tcx>,
    ) -> InterpResult<'tcx, Option<Scalar>> {
        let this = self.eval_context_ref();
        'fallback: {
            if let Some(global) = this.machine.data_race.as_vclocks_ref() {
                let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(place.ptr(), 0)?;
                if let Some(alloc_buffers) =
                    this.get_alloc_extra(alloc_id)?.data_race.as_weak_memory_ref()
                {
                    if atomic == AtomicReadOrd::SeqCst {
                        global.sc_read(&this.machine.threads);
                    }
                    let mut rng = this.machine.rng.borrow_mut();
                    let Some(buffer) = alloc_buffers
                        .get_store_buffer(alloc_range(base_offset, place.layout.size))?
                    else {
                        // No old writes available, fall back to base case.
                        break 'fallback;
                    };
                    let (loaded, recency) = buffer.buffered_read(
                        global,
                        &this.machine.threads,
                        atomic == AtomicReadOrd::SeqCst,
                        &mut *rng,
                        validate,
                    )?;
                    if global.track_outdated_loads && recency == LoadRecency::Outdated {
                        this.emit_diagnostic(NonHaltingDiagnostic::WeakMemoryOutdatedLoad {
                            ptr: place.ptr(),
                        });
                    }

                    return interp_ok(loaded);
                }
            }
        }

        // Race detector or weak memory disabled, simply read the latest value
        validate()?;
        interp_ok(Some(latest_in_mo))
    }

    /// Add the given write to the store buffer. (Does not change machine memory.)
    ///
    /// `init` says with which value to initialize the store buffer in case there wasn't a store
    /// buffer for this memory range before.
    fn buffered_atomic_write(
        &mut self,
        val: Scalar,
        dest: &MPlaceTy<'tcx>,
        atomic: AtomicWriteOrd,
        init: Option<Scalar>,
    ) -> InterpResult<'tcx> {
        let this = self.eval_context_mut();
        let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(dest.ptr(), 0)?;
        if let (
            crate::AllocExtra {
                data_race: AllocDataRaceHandler::Vclocks(_, Some(alloc_buffers)),
                ..
            },
            crate::MiriMachine {
                data_race: GlobalDataRaceHandler::Vclocks(global), threads, ..
            },
        ) = this.get_alloc_extra_mut(alloc_id)?
        {
            if atomic == AtomicWriteOrd::SeqCst {
                global.sc_write(threads);
            }

            let buffer = alloc_buffers
                .get_or_create_store_buffer_mut(alloc_range(base_offset, dest.layout.size), init)?;
            buffer.buffered_write(val, global, threads, atomic == AtomicWriteOrd::SeqCst)?;
        }

        // The caller should've written to dest with the vanilla scalar write; we do nothing here.
        interp_ok(())
    }

    /// Caller should never need to consult the store buffer for the latest value.
    /// This function is used exclusively for failed atomic_compare_exchange_scalar
    /// to perform load_impl on the latest store element
    fn perform_read_on_buffered_latest(
        &self,
        place: &MPlaceTy<'tcx>,
        atomic: AtomicReadOrd,
    ) -> InterpResult<'tcx> {
        let this = self.eval_context_ref();

        if let Some(global) = this.machine.data_race.as_vclocks_ref() {
            if atomic == AtomicReadOrd::SeqCst {
                global.sc_read(&this.machine.threads);
            }
            let size = place.layout.size;
            let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(place.ptr(), 0)?;
            if let Some(alloc_buffers) =
                this.get_alloc_extra(alloc_id)?.data_race.as_weak_memory_ref()
            {
                let Some(buffer) =
                    alloc_buffers.get_store_buffer(alloc_range(base_offset, size))?
                else {
                    // No store buffer, nothing to do.
                    return interp_ok(());
                };
                buffer.read_from_last_store(
                    global,
                    &this.machine.threads,
                    atomic == AtomicReadOrd::SeqCst,
                );
            }
        }
        interp_ok(())
    }
}