[topi] add ARM v8.2 udot (uint8) support #3978
Merged
yzhliu merged 8 commits into apache:master on Oct 1, 2019
Contributor
Before merging, it would be good if we can try 2 more optimizations
Member
Author
@anijain2305 @zhiics please review again.
zhiics
approved these changes
Sep 29, 2019
anijain2305
reviewed
Sep 30, 2019
anijain2305
reviewed
Sep 30, 2019
anijain2305
approved these changes
Sep 30, 2019
Contributor
anijain2305
left a comment
LGTM. Minor comments.
Contributor
It would be good if we could share the performance speedup results.
Member
Author
@tqchen could you check the CI instance? It shows "no space left".
Member
Author
@anijain2305 The average speedup is ~2.1x compared to fp32.
Member
CI issue fixed.
Member
Author
Thanks @anijain2305 @zhiics @tqchen
anijain2305
pushed a commit
to anijain2305/tvm
that referenced
this pull request
Oct 17, 2019
* [topi] add ARM v8.2 udot (uint8) support
* fix test case
* fix common conv2d schedule
* add back fp32_time in test
* fix lint
* fix doc, add support for int32_lanes=4, signed int
* fix lint
* add ic_bn % 4 checker in schedule
wweic
pushed a commit
to neo-ai/tvm
that referenced
this pull request
Oct 18, 2019
* [topi] add ARM v8.2 udot (uint8) support
* fix test case
* fix common conv2d schedule
* add back fp32_time in test
* fix lint
* fix doc, add support for int32_lanes=4, signed int
* fix lint
* add ic_bn % 4 checker in schedule
petrex
pushed a commit
to petrex/tvm
that referenced
this pull request
Oct 29, 2019
* master:
  Fix split's last factor issue (apache#4044)
  [COMMUNITY] ajtulloch -> committer (apache#4043)
  [TOPI]Add op argwhere (apache#3994)
  [topi] add ARM v8.2 udot (uint8) support (apache#3978)
  [COMMUNITY] anijain2305 -> reviewer (apache#4036)
  [QNN] Renaming dense operator. (apache#4033)
  [Relay][Compile_engine] Int64 shape handling for outputs. (apache#4031)
  Add dmlc-core to the list of installed header directories. (apache#4035)
  [ARITH] migrate indexdiv/mod to floordiv/mod (apache#4008)
  [Relay] Move prelude to text format (apache#3939)
  make tvm compilable by gcc 4.9.2 (apache#4032)
  [AUTOTVM][DOCS] Add a link to the defining network description of auto-tuning tutorial (apache#4023)
  [ARITH] cleanup the indexmod/div on python side (apache#4028)
  [Fix] Add more pad_mode support for onnx converter (apache#4029)
  Add parser support for ReLU tflite operator (apache#4022)
  Additional MXNet Convolution and Deconvolution tests (apache#4026)
  docs: minor spelling tweaks (apache#4027)
Add a uint8 intrinsic for ARM. Currently it is udot.v2i32.v8i8, which may have too few lanes. Will add more later.

@anijain2305 @zhiics @vinx13 @ZihengJiang
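For readers unfamiliar with the instruction, here is a minimal pure-Python sketch of what a udot.v2i32.v8i8 operation computes, assuming the standard AArch64 UDOT (vector) semantics: each 32-bit output lane accumulates the dot product of four adjacent unsigned 8-bit elements. The function name `udot_v2i32_v8i8` is hypothetical and is not part of TVM or of this PR; it is only a reference model for checking values by hand.

```python
# Reference model of a 2-lane UDOT: 8 uint8 inputs per operand, 2 x 32-bit accumulators.
# Hypothetical helper for illustration; not the TVM intrinsic added by this PR.

MASK32 = 0xFFFFFFFF  # 32-bit lanes wrap around on overflow


def udot_v2i32_v8i8(acc, a, b):
    """Return acc[i] + sum(a[4*i + k] * b[4*i + k] for k in 0..3) for each lane i."""
    if len(acc) != 2 or len(a) != 8 or len(b) != 8:
        raise ValueError("expected 2 accumulator lanes and 8 elements per input")
    if any(not 0 <= x <= 255 for x in list(a) + list(b)):
        raise ValueError("inputs must fit in uint8")

    out = []
    for lane in range(2):
        # Each 32-bit lane consumes its own group of four uint8 elements.
        group = range(4 * lane, 4 * lane + 4)
        dot = sum(a[k] * b[k] for k in group)
        out.append((acc[lane] + dot) & MASK32)
    return out


if __name__ == "__main__":
    print(udot_v2i32_v8i8([0, 0], [1, 2, 3, 4, 5, 6, 7, 8], [1] * 8))
```

Because each lane reduces over four elements, a reduction axis fed to this operation has to be a multiple of 4, which is presumably the motivation for the "ic_bn % 4 checker" mentioned in the commit list.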