Conversation
wangye805
left a comment
There was a problem hiding this comment.
Okay, after seeing ROCm/aiter@a64fa18...1b616b0#diff-fbc2a753d0cd5161aad3c68beb6eab5a577bae39fe4cb7b1af1c3b6c8ae963beR72, I sort of understand that aiter compile_ops provide hooks to modify the source generation, compilation flags, and so on in their json file.
In this case, how about move the archive (static lib) part into their compile.py? We can provide another option or env to their compile.py to go the static lib path. Therefore TE side is just loading their lib.a.
| AITER_EMBEDDED_HSA_HEADER=aiter_embedded_hsa.h | ||
| AITER_EMBEDDED_HSA_INCLUDE_DIR=${AITER_EMBEDDED_HSA_HEADER_DIR}) | ||
| else() | ||
| message(STATUS "[AITER-PREBUILT] No supported V3 ASM arch selected; skipping embedded HSA generation.") |
There was a problem hiding this comment.
[AITER-PREBUILT]? [AITER-BUILD]
transformer_engine/common/ck_fused_attn/generate_aiter_embedded_hsa.py
Outdated
Show resolved
Hide resolved
1c08539 to
38eb092
Compare
To do that, we'd need to essentially implement this PR on the AITER side anyways. We can't use their existing hooks exposed in |
There was a problem hiding this comment.
You probably don't need path splitting there either, it should work with DAITER_EMBEDDED_HSA_HEADER=\"{header_path_var}\"'
Also, I wonder if it can be overridden w/o AITERT scripts change with using HIPCC_COMPILE_FLAGS_APPEND or whatever
| --output ${AITER_EMBEDDED_HSA_HEADER} | ||
| --output ${AITER_EMBEDDED_HSA_HEADER_PATH} | ||
| --subdirs ${AITER_EMBEDDED_HSA_SUBDIRS} | ||
| RESULT_VARIABLE AITER_MAKE_HSA_RET |
There was a problem hiding this comment.
Nit: You can use COMMAND_ERROR_IS_FATAL ANY instead so CMake will fail itself on command failure
I'm not saying just building the static library in aiter optest compile.py. I'm referring to adding the archive part after lib.so's are finished building. Perhaps after https://github.com/ROCm/aiter/blob/9c2fef901dfbc1cc0e212038173d9d46160e3e12/op_tests/cpp/mha/compile.py#L98, then run the archive to get the lib.a |
| @@ -0,0 +1,151 @@ | |||
| #!/usr/bin/env python3 | |||
| # Copyright (c) Meta Platforms, Inc. and affiliates. | |||
| # Copyright (c) 2026, Advanced Micro Devices, Inc. All rights reserved. | |||
There was a problem hiding this comment.
Minimally it should say where is it copied from and refer to Pytorch for original file license
Description
Currently AITER relies on reading
AITER_ASM_DIRwhich is not in general thread-safe when we (and potentially others) modify the variable at runtime. AITER supports embedding the HSA table into a header which will be used directly instead of relying on runtime loading of the kernels viaAITER_ASM_DIR. This results in thread-safe operation, without significant cost.There is an AITER PR open for compatibility on their side, however that would still require us to use newer versions of AITER which is pending this PR.
Type of change
Changes
Please list the changes introduced in this PR:
generate_aiter_embedded_hsa.pyto generate header file for the HSA embeddingcompile.pywhich is pending a PR.Checklist: