Skip to content

Add read-only data support for x86 of ELF#242

Merged
jargh merged 86 commits into
awslabs:mainfrom
aqjune-aws:x86-elf
May 18, 2026
Merged

Add read-only data support for x86 of ELF#242
jargh merged 86 commits into
awslabs:mainfrom
aqjune-aws:x86-elf

Conversation

@aqjune-aws
Copy link
Copy Markdown
Collaborator

@aqjune-aws aqjune-aws commented Jun 15, 2025

Issue #, if available:

This pull request adds the read-only data section support for the ELF format in the x86-64 architecture.

The overall patch structure is similar to #207 (ELF, Arm64) and #226 (Mach-O, Arm64). However, the high-level structure is slightly simpler than Arm64 patches because one relocation type is necessary whereas the arm case needed 3. What made it even simpler is that the unique relocation type was already implemented in s2n-bignum :)

As the previous Arm64 patches, this adds a tutorial file that illustrates (1) how read-only data can be read and (2) a function that uses the read-only data can be called.
As the Arm64 implementation had adrp_within_bounds, this patch also has riprel32_within_bounds which states that the distance between symbol and RIP must fit within 2^31.

Another small issue is that, the rodata tutorial is not runnable on MacOS (x86-64). This is because Mac's clang thinks that when a static function is called, its destination can be updated after dynamic linking. In Linux, a call to a static function cannot be relocated after dynamic linking (making the tutorial work), whereas call to a global function can be relocated after it.
This implies that proofs may change depending on whether the OS is Linux x86-64 or MacOS x86-64. However, I believe that users of s2n-bignum mainly uses Linux(x86-64), Linux(arm64) or MacOS(arm64) but not MacUS(x86-64) now. :)

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@aqjune-aws aqjune-aws marked this pull request as ready for review June 16, 2025 14:43
Comment thread common/misc.ml
Comment thread x86/proofs/decode.ml
Comment thread x86/proofs/x86.ml
@aqjune-aws aqjune-aws requested a review from jargh June 16, 2025 16:35
@jargh
Copy link
Copy Markdown
Contributor

jargh commented May 16, 2026

This is great! Thank you for not only providing the foundation but also validating that it works well in practice.

I only noticed one small thing, a build failure on Mac OS for make tutorial inside x86. Obviously this is not a very typical thing to do now that x86 Macs are on the way out, but if one wants to explore the proofs for x86 on a Mac it is something we'd want to support. I see:

$ make tutorial
cat tutorial/rodata.S | sed -e 's/\/\/.*//' | cc -E -I../include -DWINDOWS_ABI=0  -xassembler-with-cpp - | tr ';' '\n' | as -arch x86_64 -o tutorial/rodata.o -
<stdin>:20:3: error: unknown directive
  .type f, %function
  ^
<stdin>:30:3: error: unknown directive
  .type g, %function
  ^
<stdin>:39:19: error: unexpected token in '.section' directive
  .section .rodat...

I suspect we just need to guard the .type etc. with a suitable #if, as in the main code and the corresponding arm/tutorial/rodata.S file here:

#if defined(__ELF__)
.section  .rodata
  .global  x
  .type  x, %object
  .size  x, 40
#elif defined(__APPLE__)
.const_data
#endif

@aqjune-aws
Copy link
Copy Markdown
Collaborator Author

Hi John, the compilation error can be resolved, but I think there is one more slightly complicated issue about discrepancy between Linux x86 assembler vs. Apple x86 assembler.

The failure comes from the fact that rodata.S has a function call from g to f, and Apple x86 assembler always introduces a relocation entry for function calls, whereas Linux x86 assembler doesn't introduce a relocation entry.
This makes the rodata.ml will have to diverge between Linux and MacOS version.

Specifically, in rodata.S:

f:         
  ...

g:
  ...
  call   f <- this one

Since rodata.S already has a definition of f, the Linux assembler considers call f as already resolved, not introducing an relocation entry. However, MacOS's assembler is more conservative and always introduces a relocation entry. objdump -r rodata.o on MacOS x86 will show something like:

rodata_mac.o:	file format mach-o 64-bit x86-64

RELOCATION RECORDS FOR [__text]:
OFFSET           TYPE                     VALUE
0000000000000025 X86_64_RELOC_BRANCH      f <--- this!
0000000000000018 X86_64_RELOC_SIGNED      z
000000000000000a X86_64_RELOC_SIGNED      y
0000000000000003 X86_64_RELOC_SIGNED      x

But on x86:

rodata.o:     file format elf64-x86-64

RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE
0000000000000003 R_X86_64_PC32     x-0x0000000000000004
000000000000000a R_X86_64_PC32     y-0x0000000000000004
0000000000000018 R_X86_64_PC32     z-0x0000000000000004

We can have two tutorials on x86 MacOS vs. x86 Linux, but then a question is how to deal with x86 subroutines that will have rodata + calls in general. Luckily curve25519 and edwards25519 assembly subroutines don't seem to have function calls, but they can exist in other operations in the future.

Copy link
Copy Markdown
Contributor

@jargh jargh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thank you! It's great to have all the machinery for .rodata sections set up on both architectures now, and to have clear validation that it works well in practice.

@jargh jargh merged commit 4d9b2c1 into awslabs:main May 18, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants