[VTA] Performance optimize, remove unnecessary contigious memory use.#4246
Merged
tmoreau89 merged 2 commits intoapache:masterfrom Nov 2, 2019
huajsj:master
Merged
[VTA] Performance optimize, remove unnecessary contigious memory use.#4246tmoreau89 merged 2 commits intoapache:masterfrom huajsj:master
tmoreau89 merged 2 commits intoapache:masterfrom
huajsj:master
Conversation
Issue: Uop maintain a cache vector to copy uop data into contigious DRAM memory for FPGA/Simulator use, but this cache vector not get clear after FPGA/Simulator core run, in Resnet18 case, if we printf the cache size in UopQueue::ReadBarrier function, we can saw such cache size keep increase, this would cause no use data copy and unnecessary contigous DRAM memory malloc. Analysis: This issue caused by not clear cache_ vector when do uop_queue_.Reset(). Solution: Override BaseQueue Reset function in UopQueue and add cache_ clear logic.
Contributor
Author
|
Hi @tmoreau89, could you help to review this patch? Regards |
tmoreau89
reviewed
Nov 1, 2019
Contributor
|
Thanks for the catch @huajsj; was the fix tested in sim and hardware? |
Contributor
Author
Hi @tmoreau89 , Thanks for the follow up, yes , the fix get tested in both sim and pynq FPGA and work fine. Regards |
zxy844288792
pushed a commit
to neo-ai/tvm
that referenced
this pull request
Nov 13, 2019
…apache#4246) * [VTA] Performance optimize, remove unnecessary contigious memory use. Issue: Uop maintain a cache vector to copy uop data into contigious DRAM memory for FPGA/Simulator use, but this cache vector not get clear after FPGA/Simulator core run, in Resnet18 case, if we printf the cache size in UopQueue::ReadBarrier function, we can saw such cache size keep increase, this would cause no use data copy and unnecessary contigous DRAM memory malloc. Analysis: This issue caused by not clear cache_ vector when do uop_queue_.Reset(). Solution: Override BaseQueue Reset function in UopQueue and add cache_ clear logic. * address review comments, remove spacing.
tqchen
pushed a commit
to tqchen/tvm
that referenced
this pull request
Mar 29, 2020
…apache#4246) * [VTA] Performance optimize, remove unnecessary contigious memory use. Issue: Uop maintain a cache vector to copy uop data into contigious DRAM memory for FPGA/Simulator use, but this cache vector not get clear after FPGA/Simulator core run, in Resnet18 case, if we printf the cache size in UopQueue::ReadBarrier function, we can saw such cache size keep increase, this would cause no use data copy and unnecessary contigous DRAM memory malloc. Analysis: This issue caused by not clear cache_ vector when do uop_queue_.Reset(). Solution: Override BaseQueue Reset function in UopQueue and add cache_ clear logic. * address review comments, remove spacing.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue:
Uop maintain a cache vector to copy uop data into contigious DRAM memory for
FPGA/Simulator use, but this cache vector not get clear after FPGA/Simulator
core run, in Resnet18 case, if we printf the cache size in UopQueue::ReadBarrier
function, we can saw such cache size keep increase, this would cause
no use data copy and unnecessary contigous DRAM memory malloc.
Analysis:
This issue caused by not clear cache_ vector when do
uop_queue_.Reset().
Solution:
Override BaseQueue Reset function in UopQueue and add cache_ clear
logic.