Potentially-observable store gets elided: asm block does not act as a compiler fence #144351

@RalfJung

I tried this code:

#![feature(atomic_from_mut)]

use std::arch::asm;
use std::sync::atomic::*;

unsafe extern "C" {
    fn bar();
}
#[unsafe(no_mangle)]
pub unsafe fn foo(y: *mut *mut i32) {
    let mut val: i32 = 0;
    let x: *mut i32 = &mut val;

    *x = 42; // the store that ends up being elided
    asm!(""); // replacing this with compiler_fence(Ordering::Release) keeps the store
    AtomicPtr::from_mut(&mut *y).store(x, Ordering::Relaxed); // publish x through y
    loop {}
}

I would have expected the generated code to still contain the store of 42 to *x, since LLVM must treat the empty asm block as a release fence -- and then some other thread might wait for the value of *y to change, and read through that pointer.

However, the resulting LLVM IR after optimizations is (godbolt: https://rust.godbolt.org/z/r86soTWaa):

define void @foo(ptr nocapture noundef writeonly %y) unnamed_addr {
start:
  %val = alloca [4 x i8], align 4
  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %val)
  tail call void asm sideeffect alignstack inteldialect "", "~{dirflag},~{fpsr},~{flags},~{memory}"() #2
  store atomic ptr %val, ptr %y monotonic, align 8
  br label %bb2

bb2:
  br label %bb2
}

If I replace the asm block with an actual fence or even just a compiler_fence, the result looks as expected:

define void @foo(ptr nocapture noundef writeonly %y) unnamed_addr {
start:
  %val = alloca [4 x i8], align 4
  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %val)
  store i32 42, ptr %val, align 4
  fence syncscope("singlethread") release
  store atomic ptr %val, ptr %y monotonic, align 8
  br label %bb1

bb1:
  br label %bb1
}

This is based on an example by @Amanieu.

@nikic this is a bug, right? Surely an empty asm block with all clobbers must have at least all the effects of an arbitrary compiler fence?

@fg-cfh points out that the LLVM documentation for the memory clobber just says that the asm block can write arbitrary memory, but does not mention reading memory. The corresponding docs for GCC also mention reading. I assume this can only be an oversight, since surely in general asm! blocks are allowed to read memory they have access to.
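To make the "asm blocks may read memory" point concrete, here is a minimal sketch of an asm! block that only reads through a pointer. The function name read_via_asm is hypothetical, the asm path is x86-64 only, and the non-x86-64 fallback is a plain load added just so the snippet compiles elsewhere. The readonly option tells the compiler the block does not write memory; as far as I understand the Rust inline-assembly rules, without nomem or readonly a block is assumed to both read and write memory.

```rust
/// Loads an i32 through `p` using inline assembly (x86-64 only).
/// Hypothetical helper for illustration, not part of the original report.
#[cfg(target_arch = "x86_64")]
pub fn read_via_asm(p: &i32) -> i32 {
    use std::arch::asm;

    let out: i32;
    unsafe {
        asm!(
            // Intel syntax (the default for asm! on x86): 32-bit load from [p].
            "mov {out:e}, dword ptr [{p}]",
            p = in(reg) p as *const i32,
            out = lateout(reg) out,
            // readonly: the block reads memory but does not write it.
            options(nostack, readonly, preserves_flags),
        );
    }
    out
}

/// Fallback for other targets: an ordinary load, so the sketch still compiles.
#[cfg(not(target_arch = "x86_64"))]
pub fn read_via_asm(p: &i32) -> i32 {
    *p
}
```

Calling read_via_asm(&v) returns the current value of v, so the compiler cannot drop an earlier store to v that the block might observe. That only holds because the address is passed to the block, which is exactly what does not happen in the example from this issue.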

However, I think the problem here is more subtle: the asm block couldn't read from *x after all, since at the point the asm block runs, neither x nor val has escaped. The issue is that reading and writing are not the only things one can do with memory. We also have to consider "synchronization effects", such as release/acquire fences, which are not associated with a memory access at all -- they change some other part of global state which records which memory events the next relaxed store can "publish" to other threads (and which events the next acquire fence "receives" into the current thread).
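The publish/receive pattern described here can be sketched as a small two-thread program. This is an illustration under assumptions, not the code from the issue: it uses a real fence(Ordering::Release) where the issue has the asm block, the name publish_and_read is hypothetical, and the reader thread stands in for the "other thread" that waits for the pointer to change.

```rust
use std::ptr;
use std::sync::atomic::{fence, AtomicPtr, Ordering};
use std::thread;

/// Writes 42 to a local, publishes its address with fence + relaxed store,
/// and returns the value a second thread reads through that address.
/// Hypothetical helper for illustration only.
pub fn publish_and_read() -> i32 {
    let slot: AtomicPtr<i32> = AtomicPtr::new(ptr::null_mut());
    let mut val: i32 = 0;

    thread::scope(|s| {
        let reader = s.spawn(|| {
            // Wait (relaxed) until the writer has published a pointer.
            let p = loop {
                let p = slot.load(Ordering::Relaxed);
                if !p.is_null() {
                    break p;
                }
                std::hint::spin_loop();
            };
            // The acquire fence "receives" what the release fence published.
            fence(Ordering::Acquire);
            unsafe { *p }
        });

        let x: *mut i32 = &mut val;
        unsafe { *x = 42 };
        // The release fence makes the store above part of what the next
        // relaxed store publishes, even though x has not escaped yet.
        fence(Ordering::Release);
        slot.store(x, Ordering::Relaxed);

        reader.join().unwrap()
    })
}
```

Note that the fence itself never touches val; it only affects what the later relaxed store carries along. If the store of 42 were elided before the fence, the reader would observe 0 instead, which is the miscompilation this issue describes for the asm block.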

Cc @rust-lang/opsem

    Labels

    A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
    C-bug: Category: This is a bug.
    I-unsound: Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness
    P-high: High priority
    T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.