[microNPU] Add support for TFLite FULLY_CONNECTED #10345
manupak merged 18 commits into apache:main
Conversation
This is primarily a legalization to an NPU Conv2d operator. The legalization target is a Conv2d with a 1x1 kernel and weights in HWIO layout, i.e. shape (1, 1, I, O).
Tests the TVM runtime against TFLite for codegen and operator legalization.
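For orientation, a minimal runnable sketch of the shape mapping the legalization performs (standard relay.nn.conv2d is used here purely to illustrate the 1x1 HWIO layout; the actual target op is the NPU's ethosu_conv2d, and all names below are illustrative):

import numpy as np
import tvm
from tvm import relay

ifm_channels, ofm_channels = 16, 8
x = relay.var("x", shape=(1, ifm_channels), dtype="int8")                    # dense IFM, batch 1
w = relay.const(np.zeros((1, 1, ifm_channels, ofm_channels), dtype="int8"))  # 1x1 HWIO weights

ifm = relay.reshape(x, (1, 1, 1, ifm_channels))                              # NHWC with unit spatial dims
conv = relay.nn.conv2d(ifm, w, kernel_size=(1, 1), data_layout="NHWC", kernel_layout="HWIO")
ofm = relay.reshape(conv, (1, ofm_channels))                                 # back to the 2D dense output

mod = tvm.IRModule.from_expr(relay.Function([x], ofm))
print(relay.transform.InferType()(mod))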
lhutton1
left a comment
Thanks @dchauhan-arm, this will be a very useful addition. Although this is still WIP I just wanted to offer a couple of suggestions which may help get this in!
    )
    bias_add = is_op("nn.bias_add")(dense, is_constant())
    req = is_op("qnn.requantize")(
        dense | bias_add, is_constant(), is_constant(), is_constant(), is_constant()
Currently the legalization will fall over if there is no bias present. We should make the bias optional in FullyConnectedParams; see QnnTransposeConv2dParams for an idea.
ack! (and thanks for the pointer on transpose conv2d)
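For reference, a minimal sketch of how the optional bias could be split out when building FullyConnectedParams (the helper name and signature are illustrative, not from the PR; the approach mirrors QnnTransposeConv2dParams):

from typing import Optional, Tuple
from tvm import relay

def split_optional_bias(requantize_op: relay.Call) -> Tuple[relay.Expr, Optional[relay.Expr]]:
    """Return (qnn_dense, bias_constant); bias_constant is None when no nn.bias_add
    was matched, so FullyConnectedParams can treat the bias as optional."""
    producer = requantize_op.args[0]
    if isinstance(producer, relay.Call) and producer.op.name == "nn.bias_add":
        return producer.args[0], producer.args[1]   # dense output, bias constant
    return producer, None                           # requantize consumes qnn.dense directly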
):
    @tf.function
    def fully_connected():
        return tf.keras.layers.Dense(
I'm not too familiar with the Keras API, but I'm not sure this will work. One thing we could do instead is use tf.matmul which gets legalized to fully connected in TFLite under the conditions we will use it for. e.g. something like this would be a starting point:
@tf.function
def dense_layer(x):
    w = tf.constant(
        np.random.uniform(size=[units, units]),
        dtype=tf.float32,
    )
    return tf.matmul(x, w)

_compare_tvm_with_tflite(dense_layer, [(1, units)], accel_type)
Happy to keep the Keras implementation if we get it working though, just wanted to offer an alternative :)
this is a very welcome change, I'll try and make this work!
        requantize_op.args[RequantArgs.IFM_ZERO_POINT.value],
    )
    self.ifm = TensorParams(
        qnn_dense.args[QDenseArgs.ifm.value],
ifm should be in capitals, i.e. QDenseArgs.IFM.value
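For context, the enum being referenced maps argument positions of qnn.dense to names, roughly like this (a sketch based on the qnn.dense argument order; the exact definition lives in the ethos-u util module):

from enum import Enum

class QDenseArgs(Enum):
    """Positional arguments of qnn.dense (data, weight, zero points, scales)."""
    IFM = 0
    WEIGHTS = 1
    IFM_ZERO_POINT = 2
    WEIGHTS_ZERO_POINT = 3
    IFM_SCALE = 4
    WEIGHTS_SCALE = 5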
ekalda
left a comment
Thanks @dchauhan-arm, broadly looks good, I left some suggestions for improvements :)
    # IFM reshapes
    ifm = post.args[0]
    if len(params.ifm.shape) != 4 or not params.ifm.shape[1] == params.ifm.shape[2] == 1:
        ifm = relay.reshape(ifm, (-1, 1, 1, params.ifm.shape[-1]))
Suggested change:
-        ifm = relay.reshape(ifm, (-1, 1, 1, params.ifm.shape[-1]))
+        ifm = relay.reshape(ifm, (1, 1, 1, params.ifm.shape[-1]))
should be safer since the NPU doesn't support IFMs with a batch size other than 1, and this kind of fully connected wouldn't be offloaded anyway
    def callback(self, pre, post, node_map):
        params = ethosu_patterns.FullyConnectedParams(post.op.body)
        params.ifm.tensor = post.args[0]
        activation_map = {"clip": "CLIP"}
nit: we don't expect that dict to expand, so we can just do if activation == "clip": etc
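i.e. something like this (a sketch, using the variable names from the snippet above):

# Sketch: no lookup table needed, just map the one supported activation directly.
if activation == "clip":
    activation = "CLIP"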
        if len(params.ofm.shape) != 4 or not params.ofm.shape[1] == params.ofm.shape[2] == 1:
            ethosu_fc = relay.reshape(ethosu_fc, params.ofm.shape)
I suspect there isn't a test case that exercises this path since, on line 1700, this pass runs after the no-op legalizer, so the last reshape won't have a following identity op and will fall over in TE
@ir.transform.module_pass(opt_level=1)
class LegalizeFullyConnected:
    """This is the pass that wraps the AddRewriter"""
| """This is the pass that wraps the AddRewriter""" | |
| """This is the pass that wraps the FullyConnectedRewriter""" |
| """ | ||
|
|
||
| composite_name = "ethosu.fully_connected" | ||
| activation_map = {"clip": "CLIP"} |
Same nit about the clip dict as before :)
    if activation_function == "RELU":
        assert str(op.attrs.activation) == "CLIP"

    dense_pattern_table = [
nit: it would be better to keep the naming consistent, so maybe rename this to fc_pattern_table or fully_connected_pattern_table
    # check IFM
    ifm = op.args[0].checked_type
    assert list([1, 3, units, 1]) == list([1, 3, units, 1])
This assert doesn't check anything... Some things to potentially check (see the sketch after this list):
- That we have ended up with an ethosu_conv2d op (taking into account that there might be reshape ops before and after the conv2d)
- That the IFM is in a shape of (1, 1, 1, c)
- That the weights are in a shape (o, 1, 1, c), with o being the output channels of the weights
- That the kernel and dilation are (1, 1)
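A sketch of what those checks could look like (assuming op is the matched call in the legalized module and that the NPU convolution is registered as contrib.ethosu.conv2d; variable names are illustrative):

# Sketch of the suggested assertions on the legalized ethosu_conv2d call.
assert op.op.name == "contrib.ethosu.conv2d"

# IFM reshaped to a single spatial point: (1, 1, 1, ifm_channels)
assert list(op.args[0].checked_type.shape) == [1, 1, 1, ifm_channels]

# Weights in OHWI with a 1x1 kernel: (ofm_channels, 1, 1, ifm_channels)
assert list(op.args[1].checked_type.shape) == [ofm_channels, 1, 1, ifm_channels]

# 1x1 kernel and unit dilation
assert list(op.attrs.kernel_shape) == [1, 1]
assert list(op.attrs.dilation) == [1, 1]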
Address comments, update codegen test, fix linting.
Address more comments, ensure qnn.dense is lowered to NPU, fix linting
Fix linting, update legalization test and codegen test for completeness.
lhutton1
left a comment
Thanks for the update @dchauhan-arm, looking much better! Mostly just a few stylistic suggestions below
import tvm  # type: ignore
from tvm import relay
-from tvm.relay.expr import Constant, Call  # type: ignore
+from tvm.relay.expr import Constant, Call
Nit: I don't think we need to change this
        if not np.all(np.array(self.ifm.shape[:-1]) == 1):
            # As we reshape the ifm from
            # [n0, n1, ... , n_m] to [n0 * n1 * ... * n_{m-1}, n_m]
            # all except the last dims need to be 1.
            return False
I don't think we need this due to reasoning in the above comment and since we already check that the batch size == 1 with check_batch_size above and we know that the ifm must be 2D
    return True

# optional_bias_add = (
#     is_op("nn.bias_add")(dense, is_constant()) | dense
# )
    dense = is_op("qnn.dense")(
        wildcard(), is_constant(), is_constant(), is_constant(), is_constant(), is_constant()
    )
    optional_bias_add = is_op("nn.bias_add")(dense, is_constant()) | dense
I think this should just be optional_bias_add = is_op("nn.bias_add")(dense, is_constant())
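Put together, the pattern with an optional bias could read roughly like this (a sketch; it mirrors the snippet quoted above):

from tvm.relay.dataflow_pattern import is_constant, is_op, wildcard

# Sketch: requantize accepts either the qnn.dense output directly or a
# nn.bias_add on top of it, which makes the bias optional in the match.
dense = is_op("qnn.dense")(
    wildcard(), is_constant(), is_constant(), is_constant(), is_constant(), is_constant()
)
optional_bias_add = is_op("nn.bias_add")(dense, is_constant())
req = is_op("qnn.requantize")(
    dense | optional_bias_add, is_constant(), is_constant(), is_constant(), is_constant()
)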
    # check OFM
    ofm = op.checked_type
    assert [ofm.shape[2], ofm.shape[3]] == [1, ofm_channels]
Same as above, let's alter this to check the whole OFM shape: assert ofm.shape == ...
This would need to be assert list(ofm.shape) == [1, 1, 1, ofm_channels]
    assert [ofm.shape[2], ofm.shape[3]] == [1, ofm_channels]
    # assert list(ofm.shape) == list(expected_ofm_shape)
    assert str(ofm.dtype) == dtype
    # assert ofm.shape[3] == ofm_channels
    assert weights_ohwi.shape[0] == ofm_channels
    assert weights_ohwi.shape[1] == 1
    assert weights_ohwi.shape[2] == 1
    assert weights_ohwi.shape[3] == ifm_shape[1]
Nit: we could do this with one assert, e.g. assert list(weights_ohwi) == [ofm_channels, 1, 1, ifm_shape[1]]
    # (1, 1),
    # (1, 1),
    # (1, 1),
    # )
    # (1, 1),
    # (1, 1),
    # )
    assert list(op.attrs.padding) == [0, 0, 0, 0]
Nit: might also be worth checking the op name is an NPU convolution here as well
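e.g. (a sketch; assumes the NPU convolution is registered as contrib.ethosu.conv2d):

# Sketch: confirm the matched op is the NPU convolution before checking its attrs.
assert op.op.name == "contrib.ethosu.conv2d"
assert list(op.attrs.padding) == [0, 0, 0, 0]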
Looks like CI only failed because of a flaky GPU test :)
Address comments, fix linting. Certain legalization test assertions were updated. Co-authored-by: Rishabh Jain <rishabh.jain2@arm.com>
Fix assertion in legalization test.
    # check IFM
    ifm = op.args[0].checked_type
    assert [ifm.shape[2], ifm.shape[3]] == list(ifm_shape)
To get this change to work we would need assert list(ifm.shape) == [1, 1] + list(ifm_shape), since ifm.shape is not a list
    # check OFM
    ofm = op.checked_type
    assert [ofm.shape[2], ofm.shape[3]] == [1, ofm_channels]
This would need to be assert list(ofm.shape) == [1, 1, 1, ofm_channels]
    # check weights
    weights_ohwi = op.args[1].data.asnumpy()
    assert str(weights_ohwi.dtype) == dtype
    assert list(weights_ohwi) == [ofm_channels, 1, 1, ifm_shape[1]]
This one's just missing .shape, i.e. assert list(weights_ohwi.shape) == [ofm_channels, 1, 1, ifm_shape[1]]
ack! good spot on the list(ifm.shape)
Address comments, fixing assertion on ifm and ofm shape.
lhutton1
left a comment
Thanks @dchauhan-arm, LGTM!.. Providing CI stays green
manupak
left a comment
LGTM!
Thanks @dchauhan-arm @lhutton1 @ekalda @jainris !
* [microNPU] Add support for TFLite FULLY_CONNECTED
  This is primarily a legalization to an NPU Conv2d operator. The legalization target is Conv2d with 1 1 I O (HWIO)
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Test TVM runtime against TFLite for codegen and operator legalization.
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Fix linting
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Address comments, update codegen test, fix linting.
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Address more comments, ensure qnn.dense is lowered to NPU, fix linting
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Fix linting, update legalization test and codegen test for completeness.
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Address comments, fix linting. Certain legalization test assertions were updated.
  Co-authored-by: Rishabh Jain <rishabh.jain2@arm.com>
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Fix assertion in legalization test.
* [microNPU] Add support for TFLite FULLY_CONNECTED
  Address comments, fixing assertion on ifm and ofm shape.

Co-authored-by: Rishabh Jain <rishabh.jain2@arm.com>