[OpenMP] Fix various alignment issues#142376
Conversation
When running the `openmp` testsuite on 32-bit SPARC, several tests `FAIL` apparently randomly, but always with the same kind of error: ``` # error: command failed with exit status: -11 ``` The tests die with `SIGBUS`, as can be seen in `truss` output: ``` 26461/1: Incurred fault llvm#5, FLTACCESS %pc = 0x00010EAC 26461/1: siginfo: SIGBUS BUS_ADRALN addr=0x0013D12C 26461/1: Received signal llvm#10, SIGBUS [default] 26461/1: siginfo: SIGBUS BUS_ADRALN addr=0x0013D12C ``` i.e. the code is trying an unaligned access which cannot work on SPARC, a strict-alignment target which enforces natural alignment on access. This explains the apparent randomness of the failures: if the memory happens to be aligned appropriately, the tests work, but fail if not. A `Debug` build reveals much more: - `__kmp_alloc` currently aligns to `sizeof(void *)`, which isn't enough on strict-alignment targets when the data are accessed as types requiring larger alignment. Therefore, this patch increases `alignment` to `SizeQuant`. - 32-bit Solaris/sparc `libc` guarantees 8-byte alignment from `malloc`, so this patch adjusts `SizeQuant` to match. - There's a `SIGBUS` in ``` __kmpc_fork_teams (loc=0x112f8, argc=0, microtask=0x16cc8 <__omp_offloading_ffbc020a_4b1abe_main_l9_debug__.omp_outlined>) at openmp/runtime/src/kmp_csupport.cpp:573 573 *(kmp_int64 *)(&this_thr->th.th_teams_size) = 0L; ``` Casting to a pointer to a type requiring 64-bit alignment when that isn't guaranteed is wrong. Instead, this patch uses `memset` instead. - There's another `SIGBUS` in ``` 0xfef8cb9c in __kmp_taskloop_recur (loc=0x10cb8, gtid=0, task=0x23cd00, lb=0x23cd18, ub=0x23cd20, st=1, ub_glob=499, num_tasks=100, grainsize=5, extras=0, last_chunk=0, tc=500, num_t_min=20, codeptr_ra=0xfef8dbc8 <__kmpc_taskloop(ident_t*, int, kmp_task_t*, int, kmp_uint64*, kmp_uint64*, kmp_int64, int, int, kmp_uint64, void*)+240>, task_dup=0x0) at openmp/runtime/src/kmp_tasking.cpp:5147 5147 p->st = st; ``` `p->st` doesn't currently guarantee the 8-byte alignment required by `kmp_int64 st`. `p` is set in ``` __taskloop_params_t *p = (__taskloop_params_t *)new_task->shareds; ``` but `shareds_offset` is currently `sizeof(void *)` only. Increasing it to `sizeof(kmp_uint64)` to match its use fixes the `SIGBUS`. With these fixes I get clean `openmp` test results on 32-bit SPARC (both Solaris and Linux), with one unrelated exception. Tested on `sparc-sun-solaris2.11`, `sparcv9-sun-solaris2.11`, `sparc-unknown-linux-gnu`, `sparc64-unknown-linux-gnu`, `i386-pc-solaris2.11`, `amd64-pc-solaris2.11`, `i686-pc-linux-gnu`, and `x86_64-pc-linux-gnu`.
| ensure 16 byte alignment */ | ||
|
|
||
| #if KMP_ARCH_X86 || !KMP_HAVE_QUAD | ||
| #if KMP_ARCH_X86 || KMP_ARCH_SPARC || !KMP_HAVE_QUAD |
There was a problem hiding this comment.
Do we want to add anything to the comment above this?
There was a problem hiding this comment.
I found this is complete mess: glibc sysdeps/i386/malloc-alignment. does #define MALLOC_ALIGNMENT 16, contrary to the comment (haven't checked if this has changed historically). Besides, KMP_HAVE_QUAD is a complete misnomer: it's only defined on x86 targets, so should probably be renamed to KMP_X86_HAVE_QUAD to reflect that.
There was a problem hiding this comment.
I've updated this comment now, removing the incorrect Linux reference.
As it turns out, malloc alignment is a total can of worms:
- In theory, the address returned should be aligned to
alignof(max_align_t). - However, implementations often differ. E.g. Solaris 32-bit
libcmalloconly uses 8-byte alignment. At the time this was introduced,__float128didn't exist on x86 yet and it was considered too risky to raise this later for fear of breaking binary compatiblity. Similarly, SPARC uses 8-byte alignment, too, although one could argue that this should be 16 bytes due to the use of 128-bit long double. - Many platforms have different
mallocimplementions that often differ from the one inlibc. - The actual alignment cannot be queried at build or runtime, unfortunately.
That said, it might be prudent to switch to one of memalign, posix_memalign, or _aligned_malloc. GCC's libgomp (in libgomp/alloc.c:gomp_aligned_alloc) went that route, e.g.
|
No, its alignment is the alignment of the member with the strictest alignment requirement (i.e. |
shiltian
left a comment
There was a problem hiding this comment.
LGTM but I'd like @jprotze @jpeyton52 to have a second look.
mjklemm
left a comment
There was a problem hiding this comment.
LGTM, @jpeyton52 please add your blessing, if you're OK with this PR.
openmp/runtime/src/kmp_alloc.cpp
Outdated
| kmp_allocator_t *allocator; // allocator | ||
| } kmp_mem_desc_t; | ||
| static int alignment = sizeof(void *); // align to pointer size by default | ||
| static int alignment = SizeQuant; |
There was a problem hiding this comment.
I cannot find any assignment to this variable. What is the purpose of having this as a static?
There was a problem hiding this comment.
As @jprotze notes we can get rid of this global and make it a constexpr size_t:
| static int alignment = SizeQuant; | |
| constexpr size_t alignment = SizeQuant; |
It is only used once down below in the __kmp_alloc() function (line 1929).
openmp/runtime/src/kmp_alloc.cpp
Outdated
| kmp_allocator_t *allocator; // allocator | ||
| } kmp_mem_desc_t; | ||
| static int alignment = sizeof(void *); // align to pointer size by default | ||
| static int alignment = SizeQuant; |
There was a problem hiding this comment.
As @jprotze notes we can get rid of this global and make it a constexpr size_t:
| static int alignment = SizeQuant; | |
| constexpr size_t alignment = SizeQuant; |
It is only used once down below in the __kmp_alloc() function (line 1929).
- Switch `alignment` to `constexpr size_t`.
|
I believe we're good now? |
When running the `openmp` testsuite on 32-bit SPARC, several tests `FAIL` apparently randomly, but always with the same kind of error: ``` # error: command failed with exit status: -11 ``` The tests die with `SIGBUS`, as can be seen in `truss` output: ``` 26461/1: Incurred fault llvm#5, FLTACCESS %pc = 0x00010EAC 26461/1: siginfo: SIGBUS BUS_ADRALN addr=0x0013D12C 26461/1: Received signal llvm#10, SIGBUS [default] 26461/1: siginfo: SIGBUS BUS_ADRALN addr=0x0013D12C ``` i.e. the code is trying an unaligned access which cannot work on SPARC, a strict-alignment target which enforces natural alignment on access. This explains the apparent randomness of the failures: if the memory happens to be aligned appropriately, the tests work, but fail if not. A `Debug` build reveals much more: - `__kmp_alloc` currently aligns to `sizeof(void *)`, which isn't enough on strict-alignment targets when the data are accessed as types requiring larger alignment. Therefore, this patch increases `alignment` to `SizeQuant`. - 32-bit Solaris/sparc `libc` guarantees 8-byte alignment from `malloc`, so this patch adjusts `SizeQuant` to match. - There's a `SIGBUS` in ``` __kmpc_fork_teams (loc=0x112f8, argc=0, microtask=0x16cc8 <__omp_offloading_ffbc020a_4b1abe_main_l9_debug__.omp_outlined>) at openmp/runtime/src/kmp_csupport.cpp:573 573 *(kmp_int64 *)(&this_thr->th.th_teams_size) = 0L; ``` Casting to a pointer to a type requiring 64-bit alignment when that isn't guaranteed is wrong. Instead, this patch uses `memset` instead. - There's another `SIGBUS` in ``` 0xfef8cb9c in __kmp_taskloop_recur (loc=0x10cb8, gtid=0, task=0x23cd00, lb=0x23cd18, ub=0x23cd20, st=1, ub_glob=499, num_tasks=100, grainsize=5, extras=0, last_chunk=0, tc=500, num_t_min=20, codeptr_ra=0xfef8dbc8 <__kmpc_taskloop(ident_t*, int, kmp_task_t*, int, kmp_uint64*, kmp_uint64*, kmp_int64, int, int, kmp_uint64, void*)+240>, task_dup=0x0) at openmp/runtime/src/kmp_tasking.cpp:5147 5147 p->st = st; ``` `p->st` doesn't currently guarantee the 8-byte alignment required by `kmp_int64 st`. `p` is set in ``` __taskloop_params_t *p = (__taskloop_params_t *)new_task->shareds; ``` but `shareds_offset` is currently aligned to `sizeof(void *)` only. Increasing it to `sizeof(kmp_uint64)` to match its use fixes the `SIGBUS`. With these fixes I get clean `openmp` test results on 32-bit SPARC (both Solaris and Linux), with one unrelated exception. Tested on `sparc-sun-solaris2.11`, `sparcv9-sun-solaris2.11`, `sparc-unknown-linux-gnu`, `sparc64-unknown-linux-gnu`, `i386-pc-solaris2.11`, `amd64-pc-solaris2.11`, `i686-pc-linux-gnu`, and `x86_64-pc-linux-gnu`.
When running the
openmptestsuite on 32-bit SPARC, several testsFAILapparently randomly, but always with the same kind of error:The tests die with
SIGBUS, as can be seen intrussoutput:i.e. the code is trying an unaligned access which cannot work on SPARC, a strict-alignment target which enforces natural alignment on access. This explains the apparent randomness of the failures: if the memory happens to be aligned appropriately, the tests work, but fail if not.
A
Debugbuild reveals much more:__kmp_alloccurrently aligns tosizeof(void *), which isn't enough on strict-alignment targets when the data are accessed as types requiring larger alignment. Therefore, this patch increasesalignmenttoSizeQuant.32-bit Solaris/sparc
libcguarantees 8-byte alignment frommalloc, so this patch adjustsSizeQuantto match.There's a
SIGBUSinCasting to a pointer to a type requiring 64-bit alignment when that isn't guaranteed is wrong. Instead, this patch uses
memsetinstead.There's another
SIGBUSinp->stdoesn't currently guarantee the 8-byte alignment required bykmp_int64 st.pis set inbut
shareds_offsetis currently aligned tosizeof(void *)only. Increasing it tosizeof(kmp_uint64)to match its use fixes theSIGBUS.With these fixes I get clean
openmptest results on 32-bit SPARC (both Solaris and Linux), with one unrelated exception.Tested on
sparc-sun-solaris2.11,sparcv9-sun-solaris2.11,sparc-unknown-linux-gnu,sparc64-unknown-linux-gnu,i386-pc-solaris2.11,amd64-pc-solaris2.11,i686-pc-linux-gnu, andx86_64-pc-linux-gnu.