You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Sooo. Uh, remember when in halide#6655
we've agreed that we want to add support to precisely specify
the CPU for which the code should be *tuned* for,
but not *targeted* for. Aka, similar to clang's `-mtune=` option,
that does not affect the ISA set selection?
So guess what, that's not what we did, apparently.
`CodeGen_LLVM::mcpu()` / `halide_mcpu` actually do specify
the *target* CPU. It was obvious in retrospect, because e.g.
`CodeGen_X86::mattrs()` does not, in fact, ever specify `+avx2`,
yet we get AVX2 :) So we've unintentionally added `-march=` support.
Oops.
While i'd like to add `-march=` support, that was not the goal here.
Fixing this is complicated by the fact that
`llvm::Target::createTargetMachine()` only takes `CPU Target` string,
you can't specify `CPU Tune`.
But this is actually a blessing in disguise,
because it allows us to fix another bug at the same time:
There is a problem with halide "compile to llvm ir assembly",
a lot of information from Halide Target is not //really// lowered
into LLVM Module, but is embedded as a metadata,
that is then extracted by halide `make_target_machine()`.
While that is not a problem in itself, it makes it *impossible*
to dump the LLVM IR, and manually play with it,
because e.g. the CPU [Target] and Attributes (ISA set)
are not actually lowered into the form LLVM understands,
but are in some halide-specific metadata.
So, to fix the first bug, we must lower the CPU Tune
into per-function `"tune-cpu"` metadata,
and while there we might as well lower `"target-cpu"`
and `"target-features"` similarly.
0 commit comments