[TIR] VNNI and ARM dot product intrinsic for tensorization#10925
vinx13 merged 13 commits into apache:main
@T.prim_func
def dot_product_4x4_i8i8i32_neon(
This is equivalent to the TE one in tvm/python/tvm/topi/arm_cpu/tensor_intrin.py (line 536 in 912993f).
@T.prim_func
def dot_product_4x4_i8i8i32_sdot(
This is equivalent to the TE one in tvm/python/tvm/topi/arm_cpu/tensor_intrin.py (line 431 in 912993f).
@T.prim_func
def dot_product_16x4_u8i8i32_vnni(
Equivalent to the TE one in tvm/python/tvm/topi/x86/tensor_intrin.py (line 244 in 912993f).
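For readers who have not seen the TE version, here is a minimal plain-Python sketch of the computation a 16x4 u8i8i32 dot product is generally understood to perform: 4 uint8 values are multiplied with each of 16 rows of 4 int8 values, and each row's sum is accumulated into one of 16 int32 lanes. The function name `dot_product_16x4_u8i8i32_ref` is hypothetical and is not part of this PR; int32 wraparound is not modeled.

```python
def dot_product_16x4_u8i8i32_ref(a, b, c):
    """Hypothetical reference: returns c[i] + sum_k a[k] * b[i][k].

    a: 4 values in the uint8 range
    b: 16 rows of 4 values in the int8 range
    c: 16 accumulator values (int32 overflow is not modeled here)
    """
    assert len(a) == 4 and len(c) == 16
    assert len(b) == 16 and all(len(row) == 4 for row in b)
    assert all(0 <= x <= 255 for x in a)
    assert all(-128 <= x <= 127 for row in b for x in row)
    # One output lane per row of b; the 4-element axis is the reduction axis.
    return [c[i] + sum(a[k] * b[i][k] for k in range(4)) for i in range(16)]


a = [1, 2, 3, 4]
b = [[i] * 4 for i in range(-8, 8)]  # row i holds the same value 4 times
c = [10] * 16
result = dot_product_16x4_u8i8i32_ref(a, b, c)
```

The ARM 4x4 i8i8i32 variants follow the same pattern with 4 output lanes and int8 on both inputs.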
# TODO(masahi): Parametrize the TVMScript description of dot product by
# shape and dtype, and share the common description with x86.
cc @junrushao1994 @yelite, this is one of the common needs for metaprogramming support in TVMScript. I think shape parameterization is possible via specialize, but I'm not sure whether I can use that with the T.Buffer syntax sugar.
A similar need arises for tensorcore (different MMA shapes and data types).
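To make the request concrete, here is a rough sketch of the kind of sharing the TODO asks for, written in plain Python rather than TVMScript. It does not use `specialize` or any TVM API; `make_dot_product_ref` and `DTYPE_RANGES` are hypothetical names introduced only for illustration. One factory takes the shape and input dtypes and returns a reference function, so the x86 and ARM variants differ only in their arguments.

```python
# Hypothetical value ranges for the two input dtypes discussed in this PR.
DTYPE_RANGES = {"int8": (-128, 127), "uint8": (0, 255)}


def make_dot_product_ref(lanes, reduce_width, a_dtype, b_dtype):
    """Build a reference dot product for a given shape and input dtypes.

    The returned function computes c[i] + sum_k a[k] * b[i][k] and does
    not model int32 overflow.
    """
    a_lo, a_hi = DTYPE_RANGES[a_dtype]
    b_lo, b_hi = DTYPE_RANGES[b_dtype]

    def dot_product(a, b, c):
        if len(a) != reduce_width or len(c) != lanes:
            raise ValueError("unexpected shape for a or c")
        if len(b) != lanes or any(len(row) != reduce_width for row in b):
            raise ValueError("unexpected shape for b")
        if any(not a_lo <= x <= a_hi for x in a):
            raise ValueError(f"a has a value outside {a_dtype}")
        if any(not b_lo <= x <= b_hi for row in b for x in row):
            raise ValueError(f"b has a value outside {b_dtype}")
        return [
            c[i] + sum(a[k] * b[i][k] for k in range(reduce_width))
            for i in range(lanes)
        ]

    return dot_product


# The same description, instantiated for the two shapes in this PR.
vnni_ref = make_dot_product_ref(16, 4, "uint8", "int8")
sdot_ref = make_dot_product_ref(4, 4, "int8", "int8")
```

A TVMScript equivalent would need the parser to accept shapes and dtypes captured from the enclosing Python scope, which is the open question raised above.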
vec_b = B.vload([0, 0], dtype="int8x16")
# TODO(masahi): Remove duplication when inlined function call is supported
cc @junrushao1994 @yelite, I want to define and call a convenience function like the one in tvm/python/tvm/topi/arm_cpu/tensor_intrin.py (line 625 in 912993f).
Yes, it is likely doable and @Hzfengsy probably already has something ready
Force-pushed from 82e152a to 07bbb38.
CC @vinx13 would you like to review? Thanks a lot!
Introduces a new directory python/tvm/tir/tensor_intrin where we put intrinsic descriptions written in TVMScript for various HW targets. They can be used by manually tensorized TIR schedules or auto-tensorized ones. More intrinsics, such as tensorcore ones, DP4A, etc., will be added later.

@junrushao1994 @vinx13 @shingjan @Hzfengsy @spectrometerHBH