Skip to content

cranelift: Add COFF TLS Support#4546

Merged
jameysharp merged 3 commits into
bytecodealliance:mainfrom
afonso360:x64-tls-coff
Aug 11, 2022
Merged

cranelift: Add COFF TLS Support#4546
jameysharp merged 3 commits into
bytecodealliance:mainfrom
afonso360:x64-tls-coff

Conversation

@afonso360
Copy link
Copy Markdown
Contributor

👋 Hey,

This PR is an initial draft at adding support for resolving TLS variables in COFF file formats (Windows). This is the current main blocker on having cg_clif working on windows.

There are still some pending issues in this PR:

  • Do we need to mark the gs register as used?
  • Declaring the _tls_index symbol. No idea if it is correct or not.

It is my understanding (mostly based on #1885) that we still need to add COFF TLS support to the object crate for this to become functional.

I'm going to look into modifying the object crate to see how easy/hard its going to be to add support for this, but would really appreciate any help.


Based on my tests compiling cranelift/filetests/filetests/isa/x64/tls_coff.clif we emit pretty much the equivalent machine code and relocations as rustc does.

Disassembly of rustc output:
0000000000000000 <getaddr>:
     0: 8b 05 00 00 00 00            	movl	(%rip), %eax  # 6 <getaddr+0x6>
  	0000000000000002:  IMAGE_REL_AMD64_REL32	_tls_index
     6: 65 48 8b 0c 25 58 00 00 00   	movq	%gs:88, %rcx
     f: 48 8b 04 c1                  	movq	(%rcx,%rax,8), %rax
    13: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
  	0000000000000016:  IMAGE_REL_AMD64_SECREL	tls
    1a: c3                           	retq
Disassembly of tls_coff.clif output:
0000000000000000 <u0:0>:
     0: 55                            pushq   %rbp
     1: 48 89 e5                      movq    %rsp, %rbp
     4: 8b 05 00 00 00 00             movl    (%rip), %eax  # a <u0:0+0xa>
              0000000000000006:  IMAGE_REL_AMD64_REL32        _tls_index
     a: 65 48 8b 0c 25 58 00 00 00    movq    %gs:88, %rcx
    13: 48 8b 04 c1                   movq    (%rcx,%rax,8), %rax
    17: 48 8d 80 00 00 00 00          leaq    (%rax), %rax
              000000000000001a:  IMAGE_REL_AMD64_SECREL       .Ldata0
    1e: 48 89 ec                      movq    %rbp, %rsp
    21: 5d                            popq    %rbp
    22: c3                            retq

I think the only difference there is the target of the relocation, because in the tls_coff example I didn't name the symbol.

cc: @bjorn3 @cfallin
cc: #1885

@github-actions github-actions Bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen cranelift:module labels Jul 28, 2022
@bjorn3
Copy link
Copy Markdown
Contributor

bjorn3 commented Jul 28, 2022

Do we need to mark the gs register as used?

The gs register is not allocatable by regalloc, so it doesn't need to be marked as usable.

Comment thread cranelift/codegen/src/isa/x64/lower.rs Outdated
Comment thread cranelift/object/src/backend.rs Outdated
value: 0,
size: 32,
// TODO: Is this correct?
kind: SymbolKind::Tls,
Copy link
Copy Markdown
Contributor

@bjorn3 bjorn3 Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did expect it to not be correct.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like object doesn't distinguish between text, data and tls for COFF: https://github.com/gimli-rs/object/blob/404ae26dac1eafc82ecad4f02ba8ccda57f7cb27/src/write/coff.rs#L597 And I couldn't find anywhere COFF distinguishes between data and tls undefined symbols.

Comment thread cranelift/object/src/backend.rs Outdated
@bjorn3
Copy link
Copy Markdown
Contributor

bjorn3 commented Jul 28, 2022

Thanks a lot for working on this!

@bjorn3
Copy link
Copy Markdown
Contributor

bjorn3 commented Jul 28, 2022

#![feature(thread_local)]                                                                                                                          1 ↵ bjorn@laptopbjorn-lenovo

use std::cell::Cell;

#[thread_local]
pub static FOO: Cell<u8> = Cell::new(42);
pub static mut BAR: u8 = 43;  
pub fn get_foo() -> u8 {
    FOO.get()
}

pub fn set_foo(foo: u8) {
    FOO.set(foo)
}' | rustc +nightly --target x86_64-pc-windows-msvc --emit obj - --crate-type lib -O

produces an object file for which objdump -dr gives:

rust_out.o:     file format pe-x86-64


Disassembly of section .text:

0000000000000000 <_ZN8rust_out7get_foo17h9e4ed845564f4123E>:
   0:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 6 <_ZN8rust_out7get_foo17h9e4ed845564f4123E+0x6>
                        2: R_X86_64_PC32        _tls_index
   6:   65 48 8b 0c 25 58 00    mov    %gs:0x58,%rcx
   d:   00 00 
   f:   48 8b 04 c1             mov    (%rcx,%rax,8),%rax
  13:   8a 80 00 00 00 00       mov    0x0(%rax),%al
                        15: secrel32    _ZN8rust_out3FOO17he424d0489633cb80E
  19:   c3                      retq   

Disassembly of section .text:

0000000000000000 <_ZN8rust_out7set_foo17hb5fb9735547d2b58E>:
   0:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 6 <_ZN8rust_out7set_foo17hb5fb9735547d2b58E+0x6>
                        2: R_X86_64_PC32        _tls_index
   6:   65 48 8b 14 25 58 00    mov    %gs:0x58,%rdx
   d:   00 00 
   f:   48 8b 04 c2             mov    (%rdx,%rax,8),%rax
  13:   88 88 00 00 00 00       mov    %cl,0x0(%rax)
                        15: secrel32    _ZN8rust_out3FOO17he424d0489633cb80E
  19:   c3                      retq   

and objdump --syms:

rust_out.o:     file format pe-x86-64

SYMBOL TABLE:
[  0](sec  1)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .text
AUX scnlen 0x0 nreloc 0 nlnno 0 checksum 0x0 assoc 1 comdat 0
[  2](sec  2)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .data
AUX scnlen 0x0 nreloc 0 nlnno 0 checksum 0x0 assoc 2 comdat 0
[  4](sec  3)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .bss
AUX scnlen 0x0 nreloc 0 nlnno 0 checksum 0x0 assoc 3 comdat 0
[  6](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .text
AUX scnlen 0x1a nreloc 2 nlnno 0 checksum 0xa84d5eee assoc 4 comdat 1
[  8](sec  4)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x0000000000000000 _ZN8rust_out7get_foo17h9e4ed845564f4123E
[  9](sec  5)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .text
AUX scnlen 0x1a nreloc 2 nlnno 0 checksum 0xac4f1caa assoc 5 comdat 1
[ 11](sec  5)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x0000000000000000 _ZN8rust_out7set_foo17hb5fb9735547d2b58E
[ 12](sec  6)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .tls$
AUX scnlen 0x1 nreloc 0 nlnno 0 checksum 0xdbbbc9d6 assoc 6 comdat 1
[ 14](sec  6)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 _ZN8rust_out3FOO17he424d0489633cb80E
[ 15](sec  7)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .data
AUX scnlen 0x1 nreloc 0 nlnno 0 checksum 0xacbcf940 assoc 7 comdat 1
[ 17](sec  7)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 _ZN8rust_out3BAR17hab28e85eef212a9aE
[ 18](sec  8)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .data
AUX scnlen 0x8 nreloc 1 nlnno 0 checksum 0x0 assoc 8 comdat 1
[ 20](sec  8)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 __imp__ZN8rust_out3FOO17he424d0489633cb80E
[ 21](sec  9)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .data
AUX scnlen 0x8 nreloc 1 nlnno 0 checksum 0x0 assoc 9 comdat 1
[ 23](sec  9)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 __imp__ZN8rust_out3BAR17hab28e85eef212a9aE
[ 24](sec -1)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x0000000000000000 @feat.00
[ 25](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 _tls_index
[ 26](sec -2)(fl 0x00)(ty   0)(scl 103) (nx 2) 0x0000000000000000 rust_out.cb515bd7-
File 
File 

Comment thread cranelift/object/src/backend.rs Outdated
@bjorn3
Copy link
Copy Markdown
Contributor

bjorn3 commented Jul 28, 2022

I managed to run a small tls example on windows! 🎉

#![feature(thread_local)]
use std::cell::Cell;
#[thread_local]
static FOO: Cell<u8> = Cell::new(42);
fn main() {
    assert_eq!(FOO.get(), 42);
    std::thread::spawn(|| {
        FOO.set(43);
        assert_eq!(FOO.get(), 43);
    }).join();
    assert_eq!(FOO.get(), 42);
}

@afonso360
Copy link
Copy Markdown
Contributor Author

afonso360 commented Jul 28, 2022

I managed to run a small tls example on windows! 🎉

That's really cool! I'm going to try and compile cg_clif with this, to see where it breaks next. I suspect we don't have a very complete TLS implementation.

@afonso360 afonso360 force-pushed the x64-tls-coff branch 2 times, most recently from 9cc2c07 to 5e62aa9 Compare July 28, 2022 17:51
@afonso360 afonso360 marked this pull request as ready for review July 28, 2022 17:53
@afonso360
Copy link
Copy Markdown
Contributor Author

afonso360 commented Aug 10, 2022

Rebased this on top of main. I've also renamed the TlsIndex symbol to CoffTlsIndex.

With this rebase we've also lost the lookup function. But we can add it later when we need it in the JIT, is this okay @bjorn3 ?

@bjorn3
Copy link
Copy Markdown
Contributor

bjorn3 commented Aug 10, 2022

Fine with me.

@jameysharp
Copy link
Copy Markdown
Contributor

Do you both feel this is ready for merge, then? I've glanced over it and it looks reasonable to me, but I'm not at all familiar with how thread-local storage works on any platform.

@afonso360
Copy link
Copy Markdown
Contributor Author

Yes, this should be ready to merge!

@jameysharp jameysharp enabled auto-merge (squash) August 11, 2022 16:33
Copy link
Copy Markdown
Contributor

@jameysharp jameysharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

@jameysharp jameysharp merged commit c5bc368 into bytecodealliance:main Aug 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:x64 Issues related to x64 codegen cranelift:module cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants