Skip to content

Cranelift: inline probestack makes valgrind unhappy #7454

@narpfel

Description

@narpfel

I tried rustc_codegen_cranelift on some of my projects, and found that even though the binaries appeared to run normally, they produced errors and segfaults in valgrind. Looking at the disassembly, it appeared that valgrind doesn’t like the way Cranelift performs stack probing.

.clif Test Case

This is the most minimal Rust code that I came up with:

fn main() {
    let _xs: [u32; 5_000];
}

which generates the following .clif file:

output file

main.clif/_ZN4main4main17hf30ba8656d3abcbbE.unopt.clif

generated by rustc -Z codegen-backend=cranelift src/main.rs --emit=llvm-ir

set opt_level=none
set tls_model=elf_gd
set libcall_call_conv=isa_default
set probestack_size_log2=12
set probestack_strategy=inline
set bb_padding_log2_minus_one=0
set regalloc_checker=0
set regalloc_verbose_logs=0
set enable_alias_analysis=1
set enable_verifier=0
set is_pic=1
set use_colocated_libcalls=0
set enable_float=1
set enable_nan_canonicalization=0
set enable_pinned_reg=0
set enable_atomics=1
set enable_safepoints=0
set enable_llvm_abi_extensions=1
set unwind_info=1
set preserve_frame_pointers=0
set machine_code_cfg_info=0
set enable_probestack=1
set probestack_func_adjusts_sp=0
set enable_jump_tables=1
set enable_heap_access_spectre_mitigation=1
set enable_table_access_spectre_mitigation=1
set enable_incremental_compilation_cache_checks=0
target x86_64 has_sse3=1 has_ssse3=1 has_sse41=1 has_sse42=1 has_avx=0 has_avx2=0 has_fma=0 has_avx512bitalg=0 has_avx512dq=0 has_avx512vl=0 has_avx512vbmi=0 has_avx512f=0 has_popcnt=1 has_bmi1=0 has_bmi2=0 has_lzcnt=0


function u0:8() system_v {
; symbol _ZN4main4main17hf30ba8656d3abcbbE
; instance Instance { def: Item(DefId(0:3 ~ main[b61b]::main)), args: [] }
; abi FnAbi { args: [], ret: ArgAbi { layout: TyAndLayout { ty: (), layout: Layout { size: Size(0 bytes), align: AbiAndPrefAlign { abi: Align(1 bytes), pref: Align(8 bytes) }, abi: Aggregate { sized: true }, fields: Arbitrary { offsets: [], memory_index: [] }, largest_niche: None, variants: Single { index: 0 }, max_repr_align: None, unadjusted_abi_align: Align(1 bytes) } }, mode: Ignore }, c_variadic: false, fixed_count: 0, conv: Rust, can_unwind: true }

; kind  loc.idx   param    pass mode                            ty
; zst   _0    ()                                0b 1, 8              align=8,offset=
; ret   _0      -          Ignore                               ()

; kind  local ty                              size align (abi,pref)
; stack _1    [u32; 5000_usize]              20000b 4, 4              storage=ss0

    ss0 = explicit_slot 20000

block0:
    nop 
    jump block1

block1:
    nop 
; 
; return
    return
}

Steps to Reproduce

$ rustc -Z codegen-backend=cranelift src/main.rs
$ valgrind ./main

Expected Results

When run in valgrind, this program should not produce any errors.

Actual Results

valgrind complains about out-of-bounds stack writes and then lets the program segfault on a write to an unmapped address:

valgrind output
$ RUSTFLAGS="-Z codegen-backend=cranelift" cargo build
   Compiling project v0.1.0 (/tmp/project)
    Finished dev [unoptimized + debuginfo] target(s) in 0.23s
$ ./target/debug/project
$ echo $?
0
$ valgrind ./target/debug/project
==9258== Memcheck, a memory error detector
==9258== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==9258== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==9258== Command: ./target/debug/project
==9258==
==9258== Invalid write of size 4
==9258==    at 0x10F5FF: project::main (main.rs:1)
==9258==    by 0x10F68F: core::ops::function::FnOnce::call_once (function.rs:250)
==9258==    by 0x10F673: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:154)
==9258==    by 0x10F720: std::rt::lang_start::{{closure}} (rt.rs:167)
==9258==    by 0x1260A6: call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (function.rs:284)
==9258==    by 0x1260A6: do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panicking.rs:552)
==9258==    by 0x1260A6: try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (panicking.rs:516)
==9258==    by 0x1260A6: catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panic.rs:142)
==9258==    by 0x1260A6: {closure#2} (rt.rs:148)
==9258==    by 0x1260A6: do_call<std::rt::lang_start_internal::{closure_env#2}, isize> (panicking.rs:552)
==9258==    by 0x1260A6: try<isize, std::rt::lang_start_internal::{closure_env#2}> (panicking.rs:516)
==9258==    by 0x1260A6: catch_unwind<std::rt::lang_start_internal::{closure_env#2}, isize> (panic.rs:142)
==9258==    by 0x1260A6: std::rt::lang_start_internal (rt.rs:148)
==9258==    by 0x10F6F5: std::rt::lang_start (rt.rs:166)
==9258==    by 0x10F5F5: main (in /tmp/project/target/debug/project)
==9258==  Address 0x1ffeffe5d0 is on thread 1's stack
==9258==  4096 bytes below stack pointer
==9258==
==9258== Invalid write of size 4
==9258==    at 0x10F606: project::main (main.rs:1)
==9258==    by 0x10F68F: core::ops::function::FnOnce::call_once (function.rs:250)
==9258==    by 0x10F673: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:154)
==9258==    by 0x10F720: std::rt::lang_start::{{closure}} (rt.rs:167)
==9258==    by 0x1260A6: call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (function.rs:284)
==9258==    by 0x1260A6: do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panicking.rs:552)
==9258==    by 0x1260A6: try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (panicking.rs:516)
==9258==    by 0x1260A6: catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panic.rs:142)
==9258==    by 0x1260A6: {closure#2} (rt.rs:148)
==9258==    by 0x1260A6: do_call<std::rt::lang_start_internal::{closure_env#2}, isize> (panicking.rs:552)
==9258==    by 0x1260A6: try<isize, std::rt::lang_start_internal::{closure_env#2}> (panicking.rs:516)
==9258==    by 0x1260A6: catch_unwind<std::rt::lang_start_internal::{closure_env#2}, isize> (panic.rs:142)
==9258==    by 0x1260A6: std::rt::lang_start_internal (rt.rs:148)
==9258==    by 0x10F6F5: std::rt::lang_start (rt.rs:166)
==9258==    by 0x10F5F5: main (in /tmp/project/target/debug/project)
==9258==  Address 0x1ffeffd5d0 is on thread 1's stack
==9258==  8192 bytes below stack pointer
==9258==
==9258== Invalid write of size 4
==9258==    at 0x10F60D: project::main (main.rs:1)
==9258==    by 0x10F68F: core::ops::function::FnOnce::call_once (function.rs:250)
==9258==    by 0x10F673: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:154)
==9258==    by 0x10F720: std::rt::lang_start::{{closure}} (rt.rs:167)
==9258==    by 0x1260A6: call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (function.rs:284)
==9258==    by 0x1260A6: do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panicking.rs:552)
==9258==    by 0x1260A6: try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (panicking.rs:516)
==9258==    by 0x1260A6: catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panic.rs:142)
==9258==    by 0x1260A6: {closure#2} (rt.rs:148)
==9258==    by 0x1260A6: do_call<std::rt::lang_start_internal::{closure_env#2}, isize> (panicking.rs:552)
==9258==    by 0x1260A6: try<isize, std::rt::lang_start_internal::{closure_env#2}> (panicking.rs:516)
==9258==    by 0x1260A6: catch_unwind<std::rt::lang_start_internal::{closure_env#2}, isize> (panic.rs:142)
==9258==    by 0x1260A6: std::rt::lang_start_internal (rt.rs:148)
==9258==    by 0x10F6F5: std::rt::lang_start (rt.rs:166)
==9258==    by 0x10F5F5: main (in /tmp/project/target/debug/project)
==9258==  Address 0x1ffeffc5d0 is not stack'd, malloc'd or (recently) free'd
==9258==
==9258==
==9258== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==9258==  Access not within mapped region at address 0x1FFEFFC5D0
==9258==    at 0x10F60D: project::main (main.rs:1)
==9258==    by 0x10F68F: core::ops::function::FnOnce::call_once (function.rs:250)
==9258==    by 0x10F673: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:154)
==9258==    by 0x10F720: std::rt::lang_start::{{closure}} (rt.rs:167)
==9258==    by 0x1260A6: call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (function.rs:284)
==9258==    by 0x1260A6: do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panicking.rs:552)
==9258==    by 0x1260A6: try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (panicking.rs:516)
==9258==    by 0x1260A6: catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panic.rs:142)
==9258==    by 0x1260A6: {closure#2} (rt.rs:148)
==9258==    by 0x1260A6: do_call<std::rt::lang_start_internal::{closure_env#2}, isize> (panicking.rs:552)
==9258==    by 0x1260A6: try<isize, std::rt::lang_start_internal::{closure_env#2}> (panicking.rs:516)
==9258==    by 0x1260A6: catch_unwind<std::rt::lang_start_internal::{closure_env#2}, isize> (panic.rs:142)
==9258==    by 0x1260A6: std::rt::lang_start_internal (rt.rs:148)
==9258==    by 0x10F6F5: std::rt::lang_start (rt.rs:166)
==9258==    by 0x10F5F5: main (in /tmp/project/target/debug/project)
==9258==  If you believe this happened as a result of a stack
==9258==  overflow in your program's main thread (unlikely but
==9258==  possible), you can try to increase the size of the
==9258==  main thread stack using the --main-stacksize= flag.
==9258==  The main thread stack size used in this run was 8388608.
==9258==
==9258== HEAP SUMMARY:
==9258==     in use at exit: 85 bytes in 3 blocks
==9258==   total heap usage: 10 allocs, 7 frees, 2,157 bytes allocated
==9258==
==9258== LEAK SUMMARY:
==9258==    definitely lost: 0 bytes in 0 blocks
==9258==    indirectly lost: 0 bytes in 0 blocks
==9258==      possibly lost: 0 bytes in 0 blocks
==9258==    still reachable: 85 bytes in 3 blocks
==9258==         suppressed: 0 bytes in 0 blocks
==9258== Rerun with --leak-check=full to see details of leaked memory
==9258==
==9258== For lists of detected and suppressed errors, rerun with: -s
==9258== ERROR SUMMARY: 4 errors from 3 contexts (suppressed: 0 from 0)
[1]    9258 segmentation fault (core dumped)  valgrind ./target/debug/project

Versions and Environment

$ rustc --version --verbose -Z codegen-backend=cranelift
rustc 1.75.0-nightly (75b064d26 2023-11-01)
binary: rustc
commit-hash: 75b064d26970ca8e7a487072f51835ebb057d575
commit-date: 2023-11-01
host: x86_64-unknown-linux-gnu
release: 1.75.0-nightly
Cranelift version: 0.101.2

valgrind 3.21.0 (I realise this is not the current version, but I didn’t find anything related to stacks in the changelog for 3.22.0 in case this is a false positive in valgrind.)

Extra Info

This is the disassembly of main:

00000000000075fb <_ZN7project4main17ha1977755d345ddbbE>:
    75fb:	55                   	push   rbp
    75fc:	48 89 e5             	mov    rbp,rsp
    75ff:	89 a4 24 00 f0 ff ff 	mov    DWORD PTR [rsp-0x1000],esp
    7606:	89 a4 24 00 e0 ff ff 	mov    DWORD PTR [rsp-0x2000],esp
    760d:	89 a4 24 00 d0 ff ff 	mov    DWORD PTR [rsp-0x3000],esp
    7614:	89 a4 24 00 c0 ff ff 	mov    DWORD PTR [rsp-0x4000],esp
    761b:	89 a4 24 00 b0 ff ff 	mov    DWORD PTR [rsp-0x5000],esp
    7622:	48 81 ec 20 4e 00 00 	sub    rsp,0x4e20
    7629:	48 81 c4 20 4e 00 00 	add    rsp,0x4e20
    7630:	48 89 ec             	mov    rsp,rbp
    7633:	5d                   	pop    rbp
    7634:	c3                   	ret

valgrind doesn’t like that the stack is written to before the stack pointer is moved.

Inline stack probing was introduced in #4747. Only unrolled stack probing is problematic, the loop is okay for valgrind as the stack pointer is moved before the write.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugIncorrect behavior in the current implementation that needs fixingcraneliftIssues related to the Cranelift code generator

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions