[TOPI] FIFO buffer op, to accelerate sequence modeling with dilated convolutions #4039
Merged: vinx13 merged 6 commits into apache:master on Oct 10, 2019
TODO.
cc @vinx13 @merrymercy would be great if you can help comment and review
anijain2305 (Contributor) reviewed on Oct 2, 2019 and left a comment:
Thanks for the contribution. I will have to look into the details to understand the compute, but overall looks good to me. Will do one more round by tomorrow.
zhiics (Member) reviewed on Oct 3, 2019 and left a comment:
Thanks for the contribution. I left some minor reviews. Otherwise, looks good to me.
kevinthesun reviewed on Oct 6, 2019
vinx13 approved these changes on Oct 10, 2019
anijain2305 pushed a commit to anijain2305/tvm that referenced this pull request on Oct 17, 2019
…onvolutions (apache#4039)
* Add FIFO buffer op to enable explicit computation re-use in convolution
* Add a test
* Add end-to-end test with 1D convolution
* Add a stub in MXNet frontend
* Address reviewer comments
* Add back stub for MXNet frontend
wweic pushed a commit to neo-ai/tvm that referenced this pull request on Oct 18, 2019
…onvolutions (apache#4039)
* Add FIFO buffer op to enable explicit computation re-use in convolution
* Add a test
* Add end-to-end test with 1D convolution
* Add a stub in MXNet frontend
* Address reviewer comments
* Add back stub for MXNet frontend
Motivation. Dilated convolutions have emerged as an effective alternative to recurrent units for modeling sequences. For example, WaveNet [1] uses a stack of dilated convolutional layers to generate raw audio waveforms from text. Snips [2] modifies the WaveNet architecture to detect a keyword in an audio stream.
To capture temporal context, the WaveNet architecture feeds a sliding window over the input sequence into the first convolutional layer. As noted in [2] and [3], computing the convolution over the sliding window results in redundant computation: because consecutive windows overlap, many of the same intermediate activations are recomputed at every step.

This pull request implements a FIFO buffer operator that caches the intermediate outputs of each convolutional layer, eliminating the redundant computation. The approach resembles [4], except that here the re-use is explicit and inherent in the model. Note that caching applies only at inference time, not during training.
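
To make the re-use concrete, here is a minimal pure-Python sketch of the caching idea on a toy two-layer network with scalar features and kernel size 2. It is an illustration under stated assumptions, not the TOPI operator: the helpers `dilated_tap`, `naive_output`, and `streaming_outputs`, the weights, and the dilations are all hypothetical.

```python
from collections import deque

def dilated_tap(window, weights):
    # Kernel size 2: combine the oldest and newest entries of the window,
    # which are `dilation` steps apart.
    return weights[0] * window[0] + weights[1] * window[-1]

def naive_output(x, t, w1, w2, d1, d2):
    # Sliding-window evaluation: recompute both layer-1 activations that
    # layer 2 needs, starting from the raw input every time.
    h_old = dilated_tap(x[t - d2 - d1 : t - d2 + 1], w1)
    h_new = dilated_tap(x[t - d1 : t + 1], w1)
    return w2[0] * h_old + w2[1] * h_new

def streaming_outputs(x, w1, w2, d1, d2):
    # One FIFO per layer input; maxlen makes append() drop the oldest entry.
    x_fifo = deque([0.0] * (d1 + 1), maxlen=d1 + 1)
    h_fifo = deque([0.0] * (d2 + 1), maxlen=d2 + 1)
    outputs = []
    for sample in x:
        x_fifo.append(sample)
        h_fifo.append(dilated_tap(x_fifo, w1))   # one new layer-1 value per step
        outputs.append(dilated_tap(h_fifo, w2))  # layer 2 re-uses cached values
    return outputs

# Hypothetical toy input and weights (assumptions for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
w1, w2 = (0.5, 1.0), (2.0, -1.0)
d1, d2 = 1, 2
streamed = streaming_outputs(x, w1, w2, d1, d2)
```

Once the buffers have warmed up (here, from step `d1 + d2` onward), the streamed outputs match the sliding-window ones, while each step computes only one new activation per layer.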

Semantics. The FIFO buffer op should behave like
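
As a rough illustration, assuming the op appends the new data after the buffer contents along `axis` and drops the oldest entries so that the buffer shape is unchanged, the behavior can be sketched in NumPy. The function name `fifo_buffer_reference` is hypothetical and is not the TOPI API.

```python
import numpy as np

def fifo_buffer_reference(data, buffer, axis):
    """Hypothetical NumPy reference for the assumed semantics (not TOPI code)."""
    # Append the new data after the existing buffer contents along `axis`.
    joined = np.concatenate([buffer, data], axis=axis)
    # Keep only the most recent buffer.shape[axis] entries, dropping the oldest.
    begin = data.shape[axis]
    end = begin + buffer.shape[axis]
    return np.take(joined, np.arange(begin, end), axis=axis)

buffer = np.array([[1, 2, 3, 4]])  # shape (1, 4), oldest entry first
data = np.array([[5, 6]])          # two new time steps
updated = fifo_buffer_reference(data, buffer, axis=1)
```

Under this assumption the updated buffer has the same shape as the old one, with the two oldest entries replaced by the new data at the end.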
Usage. See topi/tests/python/test_fifo_buffer.py
Limitation. Currently, the buffer op exists only in TOPI. To make it useful, we want to merge it into MXNet and other frameworks. Alternatively, we could conceivably implement a custom pass in Relay so that the user can annotate a stack of convolutional layers.
References
[1] "WaveNet: A Generative Model for Raw Audio." https://arxiv.org/abs/1609.03499
[2] "Efficient keyword spotting using dilated convolutions and gating." https://arxiv.org/abs/1811.07684
[3] "Fast Wavenet Generation Algorithm." https://arxiv.org/abs/1611.09482
[4] "Deep reuse: streamline CNN inference on the fly via coarse-grained computation reuse." https://dl.acm.org/citation.cfm?id=3330384
Special thanks to Thibaud Senechal (Amazon) for initially suggesting the concept of FIFO buffer.
cc @yongwww @wweic @zhiics @kevinthesun @anijain2305