[topi] add ARM v8.2 udot (uint8) support by yzhliu · Pull Request #3978 · apache/tvm

yzhliu · 2019-09-20T02:29:09Z

Add a uint8 intrinsic for ARM. Currently it is udot.v2i32.v8i8, which may have too few lanes. Will add more later.

@anijain2305 @zhiics @vinx13 @ZihengJiang
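For readers unfamiliar with the instruction, here is a minimal pure-Python sketch of the arithmetic a `udot` operation performs, assuming the standard ARMv8.2 dot-product semantics: each 32-bit accumulator lane adds the dot product of four adjacent unsigned 8-bit elements. The helper name `udot_reference` is hypothetical; it is not part of this PR or of TVM.

```python
def udot_reference(acc, a, b):
    """Reference model of an unsigned 8-bit dot-product accumulate.

    acc: list of 32-bit accumulator lanes (2 lanes for v2i32.v8i8).
    a, b: lists of uint8 values, four per accumulator lane.
    Returns the new accumulator lanes, wrapped to 32 bits.
    """
    lanes = len(acc)
    if len(a) != 4 * lanes or len(b) != 4 * lanes:
        raise ValueError("a and b must hold four 8-bit elements per lane")
    if any(not 0 <= v <= 255 for v in list(a) + list(b)):
        raise ValueError("inputs must be unsigned 8-bit values")

    out = []
    for i in range(lanes):
        # Each lane reduces its own group of four element-wise products.
        dot = sum(a[4 * i + k] * b[4 * i + k] for k in range(4))
        out.append((acc[i] + dot) & 0xFFFFFFFF)
    return out


# v2i32.v8i8 shape: 2 accumulator lanes, 8 uint8 inputs per operand.
result = udot_reference([0, 10], [1, 2, 3, 4, 5, 6, 7, 8], [1] * 8)
print(result)
```

With the inputs above, lane 0 accumulates 1+2+3+4 and lane 1 accumulates 5+6+7+8 on top of its initial value of 10.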

zhiics

Left a few minor comments; otherwise LGTM.

topi/python/topi/arm_cpu/tensor_intrin.py

topi/python/topi/generic/conv2d.py

anijain2305 · 2019-09-23T02:01:35Z

Before merging, it would be good to try two more optimizations:

1. Currently, udot seems a little slow (~1x speedup). The reason may be that we are not fully utilizing the fused accumulation. We should look at the assembly to double-check that.
2. Please try udot.v4i32.v16i8; that should quadruple the throughput compared to FP32.
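The 4x figure can be read off from lane counts. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes 128-bit vector registers and that a `udot` and an FP32 fused multiply-add issue at the same rate, so it gives only a peak ratio and ignores memory bandwidth and scheduling effects. The function name `macs_per_instruction` is made up for illustration.

```python
VECTOR_BITS = 128  # assumed width of one full NEON register


def macs_per_instruction(element_bits, vector_bits=VECTOR_BITS):
    """Multiply-accumulates one vector instruction performs for a given input width."""
    return vector_bits // element_bits


fp32_macs = macs_per_instruction(32)            # FP32 FMA: 4 lanes
udot_v16i8_macs = macs_per_instruction(8)       # udot.v4i32.v16i8: 16 inputs
udot_v8i8_macs = macs_per_instruction(8, 64)    # udot.v2i32.v8i8: 8 inputs

peak_ratio_v16i8 = udot_v16i8_macs / fp32_macs
peak_ratio_v8i8 = udot_v8i8_macs / fp32_macs
print(peak_ratio_v16i8, peak_ratio_v8i8)
```

Under these assumptions the 128-bit form does four times the work of an FP32 instruction, while the 64-bit form this PR started with does only twice as much.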

yzhliu · 2019-09-28T01:36:23Z

@anijain2305 @zhiics please review again.

topi/python/topi/generic/conv2d.py

topi/python/topi/nn/conv2d.py

anijain2305

LGTM. Minor comments.

anijain2305 · 2019-09-30T00:19:55Z

It would be good to share the performance speedup results.

yzhliu · 2019-09-30T21:56:13Z

@tqchen could you check the CI instance? It shows "no space left".

yzhliu · 2019-09-30T21:56:37Z

@anijain2305 The avg speedup is ~2.1x compared to fp32

tqchen · 2019-10-01T00:21:27Z

CI issue fixed.

yzhliu · 2019-10-01T15:40:10Z

Thanks @anijain2305 @zhiics @tqchen

* [topi] add ARM v8.2 udot (uint8) support
* fix test case
* fix common conv2d schedule
* add back fp32_time in test
* fix lint
* fix doc, add support for int32_lanes=4, signed int
* fix lint
* add ic_bn % 4 checker in schedule

* master:
  Fix split's last factor issue (apache#4044)
  [COMMUNITY] ajtulloch -> committer (apache#4043)
  [TOPI]Add op argwhere (apache#3994)
  [topi] add ARM v8.2 udot (uint8) support (apache#3978)
  [COMMUNITY] anijain2305 -> reviewer (apache#4036)
  [QNN] Renaming dense operator. (apache#4033)
  [Relay][Compile_engine] Int64 shape handling for outputs. (apache#4031)
  Add dmlc-core to the list of installed header directories. (apache#4035)
  [ARITH] migrate indexdiv/mod to floordiv/mod (apache#4008)
  [Relay] Move prelude to text format (apache#3939)
  make tvm compilable by gcc 4.9.2 (apache#4032)
  [AUTOTVM][DOCS] Add a link to the defining network description of auto-tuning tutorial (apache#4023)
  [ARITH] cleanup the indexmod/div on python side (apache#4028)
  [Fix] Add more pad_mode support for onnx converter (apache#4029)
  Add parser support for ReLU tflite operator (apache#4022)
  Additional MXNet Convolution and Deconvolution tests (apache#4026)
  docs: minor spelling tweaks (apache#4027)

yzhliu changed the title ~~Armint8~~ [topi] add ARM v8.2 udot (uint8) support Sep 20, 2019

yzhliu added 5 commits September 19, 2019 20:55

[topi] add ARM v8.2 udot (uint8) support

cf3405a

fix test case

7391602

fix common conv2d schedule

6e671fc

add back fp32_time in test

f47e2a1

fix lint

43340b1

yzhliu force-pushed the armint8 branch from b9efee1 to 43340b1 Compare September 20, 2019 03:56

zhiics reviewed Sep 23, 2019

topi/python/topi/arm_cpu/tensor_intrin.py

topi/python/topi/generic/conv2d.py

topi/python/topi/generic/conv2d.py

yzhliu added 2 commits September 27, 2019 18:28

fix doc, add support for int32_lanes=4, signed int

23a8b02

fix lint

4bee735

zhiics approved these changes Sep 29, 2019

anijain2305 reviewed Sep 30, 2019

topi/python/topi/generic/conv2d.py (outdated)

anijain2305 reviewed Sep 30, 2019

topi/python/topi/nn/conv2d.py

anijain2305 approved these changes Sep 30, 2019

add ic_bn % 4 checker in schedule

d935473
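The `ic_bn % 4` check reflects the fact that each int32 output lane of a dot-product instruction consumes four 8-bit inputs, so the input-channel block has to split evenly into groups of four. Below is a minimal sketch of such a guard. The function name and the additional `oc_bn % int32_lanes` condition are assumptions for illustration; this is not the code added in the commit.

```python
def check_dot_product_blocking(ic_bn, oc_bn, int32_lanes=4, elems_per_lane=4):
    """Return True if the channel blocking fits a dot-product intrinsic.

    ic_bn: input-channel block size (the reduction axis).
    oc_bn: output-channel block size (mapped onto int32 lanes).
    int32_lanes: accumulator lanes per instruction (2 or 4 in this PR).
    elems_per_lane: 8-bit inputs reduced into each lane (4 for udot).
    """
    # The reduction axis must split into whole groups of four 8-bit values.
    if ic_bn % elems_per_lane != 0:
        return False
    # Assumed extra condition: output channels fill whole vectors of lanes.
    if oc_bn % int32_lanes != 0:
        return False
    return True


print(check_dot_product_blocking(ic_bn=8, oc_bn=16))  # blocking fits
print(check_dot_product_blocking(ic_bn=6, oc_bn=16))  # 6 is not a multiple of 4
```

A schedule could use a guard like this to fall back to a non-tensorized path, or fail early with a clear message, when the blocking does not fit.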

yzhliu force-pushed the armint8 branch from ffd7ca0 to d935473 Compare September 30, 2019 20:06

yzhliu merged commit 5cc1764 into apache:master Oct 1, 2019

tqchen mentioned this pull request Nov 8, 2019

[RELEASE][DRAFT] TVM v0.6 Release candidate #4259

Closed

masahi mentioned this pull request Mar 24, 2022

[ARM] Fix NCHWc int8 dot product schedule lowering #10773

Merged

yzhliu merged 8 commits into apache:master from yzhliu:armint8