[TOPI] Add support for grouped conv3d #9873
Conversation
src/arith/rewrite_simplify.cc
if ((floordiv(x, y)).Match(ret) && analyzer_->CanProve(x.Eval() < y.Eval())) {
  return 0;
}
I'm not sure what this is for?
Maybe we need to add a pass test for this.
This is needed so that the simplifier here: https://github.com/apache/tvm/pull/9873/files#diff-04a1aab966320b6f63c390cc8b79f79b543a1a7a3f560ea3a97d18c2bb041a0aR837-R839 will remove the division by groups. Without this, simplification gets stuck when reducing ff // (num_filter // groups) * (in_channel // groups) + rc to ff // num_filter * in_channel + rc: it can't prove that ff // num_filter = 0.
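To make the stuck simplification concrete, here is a plain-Python sketch (not TVM code; the names num_filter, in_channel, groups, ff, rc are taken from the index expression above) of the two facts the new rewrite rule provides: floordiv(x, y) == 0 whenever 0 <= x < y, which is exactly what lets the groups == 1 case collapse to a plain channel offset.

```python
num_filter, in_channel = 8, 6

for ff in range(num_filter):  # output-filter index, so 0 <= ff < num_filter
    # New rewrite rule: floordiv(x, y) == 0 whenever 0 <= x < y.
    assert ff // num_filter == 0

    # With groups == 1 the grouped index expression
    #   ff // (num_filter // groups) * (in_channel // groups) + rc
    # becomes ff // num_filter * in_channel + rc, and the rule above lets
    # the simplifier reduce it all the way down to rc.
    groups = 1
    for rc in range(in_channel):
        idx = ff // (num_filter // groups) * (in_channel // groups) + rc
        assert idx == rc
```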
Yeah, I also noticed this implicit downcasting before, and I'm in favor of this change. Users can always explicitly cast their inputs to fp16. The original code was added before we had the fp16 conversion pass. This may break existing flows if some users depend on this behavior, which is also the default in cudnn/cublas. Personally I don't mind it, but if that's concerning we can always add a workaround to allow the old behavior.
Change conv3d to use the generic conv implementation, which supports grouped convolutions. Also, remove support for non-float16 tensorcore operations as they cause a large degradation in accuracy. The generic conv implementation now supports the auto-scheduler.
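For reference, a naive NumPy sketch of the grouped conv3d semantics this PR adds (NCDHW layout, stride 1, no padding or dilation). This is an illustrative reference implementation, not the TVM TOPI code; the function name and layout assumptions are mine.

```python
import numpy as np

def group_conv3d_ncdhw(data, weight, groups):
    """Naive grouped 3D convolution, NCDHW layout, stride 1, no padding.

    data:   (N, C, D, H, W)
    weight: (O, C // groups, KD, KH, KW)
    """
    n, c, d, h, w = data.shape
    o, cg, kd, kh, kw = weight.shape
    assert c % groups == 0 and o % groups == 0 and cg == c // groups
    od, oh, ow = d - kd + 1, h - kh + 1, w - kw + 1
    out = np.zeros((n, o, od, oh, ow), dtype=data.dtype)
    for ff in range(o):
        # Each output channel reads only its own group's input channels --
        # the same index arithmetic discussed in the review thread above.
        g = ff // (o // groups)
        base = g * cg
        for z in range(od):
            for y in range(oh):
                for x in range(ow):
                    out[:, ff, z, y, x] = np.sum(
                        data[:, base:base + cg, z:z + kd, y:y + kh, x:x + kw]
                        * weight[ff],
                        axis=(1, 2, 3, 4),
                    )
    return out
```

With groups=1 this reduces to an ordinary conv3d, and a grouped call is equivalent to running each group's channel slice through a plain conv independently.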
@masahi @mbrookhart I've finally got this branch green. Could you re-review?
mbrookhart left a comment
LGTM. I would love to align on the tensorcore issues across the ops, if any still remain, but that can be a second PR.
* [TOPI] Add support for grouped conv3d

  Change conv3d to use the generic conv implementation, which supports grouped convolutions. Also, remove support for non-float16 tensorcore operations as they cause a large degradation in accuracy. The generic conv implementation now supports the auto-scheduler.

* correct none check
* add tests for floordiv simplification
* fixed incorrect test for autoscheduler
* formatting
* add groups to winograd
* fix tensorcore
* manually simplify index instead of relying on simplifier
* formatting
* add groups argument to conv3d_ncdhw_winograd_without_weight_transform
* formatting
@mbrookhart @jwfromm