
[VTA] Performance optimize, remove unnecessary contiguous memory use. #4246

Merged
tmoreau89 merged 2 commits into apache:master from huajsj:master
Nov 2, 2019

@huajsj huajsj commented Nov 1, 2019

Issue:
UopQueue maintains a cache vector (cache_) that is used to copy uop data into
contiguous DRAM memory for the FPGA/simulator, but this vector is not cleared
after the FPGA/simulator core runs. In the ResNet18 case, printing the cache
size in the UopQueue::ReadBarrier function shows that it keeps increasing. This
causes useless data copies and unnecessary contiguous DRAM memory allocation.

Analysis:
The issue is caused by the cache_ vector not being cleared when
uop_queue_.Reset() is called.

Solution:
Override the BaseQueue Reset function in UopQueue and add logic to clear
cache_.
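
The fix described above can be illustrated with a minimal, self-contained sketch. This is not the VTA runtime code: the `Kernel` struct, the `begin_`/`end_` fields, the `Push` and `CacheSize` methods, the `MaxCacheSizeOverRuns` helper, and the use of a virtual `Reset` are all assumptions made for illustration. Only the names `BaseQueue`, `UopQueue`, `Reset` and `cache_` come from the description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a uop kernel; the real VTA type is richer.
struct Kernel {
  std::vector<unsigned> uops;
};

// Simplified base queue that only tracks a staging range.
// Making Reset() virtual is an assumption of this sketch.
class BaseQueue {
 public:
  virtual ~BaseQueue() = default;
  virtual void Reset() {
    begin_ = 0;
    end_ = 0;
  }

 protected:
  std::size_t begin_ = 0;
  std::size_t end_ = 0;
};

class UopQueue : public BaseQueue {
 public:
  // Remember the kernel so its uops can later be copied to DRAM.
  void Push(const Kernel* kernel) {
    cache_.push_back(kernel);
    end_ += kernel->uops.size();
  }

  std::size_t CacheSize() const { return cache_.size(); }

  // The fix: clear cache_ in addition to the base-class reset,
  // so stale entries do not accumulate across runs.
  void Reset() override {
    cache_.clear();
    BaseQueue::Reset();
  }

 private:
  std::vector<const Kernel*> cache_;
};

// Simulates several runs (push, observe, reset) and returns the
// largest cache size observed before any reset.
std::size_t MaxCacheSizeOverRuns(int runs) {
  assert(runs >= 0);
  UopQueue queue;
  Kernel kernel{{1u, 2u, 3u}};
  std::size_t max_seen = 0;
  for (int i = 0; i < runs; ++i) {
    queue.Push(&kernel);
    max_seen = std::max(max_seen, queue.CacheSize());
    queue.Reset();
  }
  return max_seen;
}
```

With the override in place, the observed cache size stays bounded by the number of kernels pushed in a single run. If the `cache_.clear()` call were dropped, the same loop would report a size that grows with the number of runs, which matches the symptom described under "Issue".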


huajsj commented Nov 1, 2019

Hi @tmoreau89, could you help review this patch?

Regards
Hua

tmoreau89 commented

Thanks for the catch @huajsj; was the fix tested in sim and hardware?


huajsj commented Nov 1, 2019

> Thanks for the catch @huajsj; was the fix tested in sim and hardware?

Hi @tmoreau89, thanks for the follow-up. Yes, the fix was tested in both the simulator and on a PYNQ FPGA, and it works fine.

Regards
Hua

@tmoreau89 tmoreau89 left a comment

LGTM, thanks

@tmoreau89 tmoreau89 merged commit 008aa83 into apache:master Nov 2, 2019
zxy844288792 pushed a commit to neo-ai/tvm that referenced this pull request Nov 13, 2019
…apache#4246)

* [VTA] Performance optimize, remove unnecessary contiguous memory use.

* address review comments, remove spacing.
tqchen pushed a commit to tqchen/tvm that referenced this pull request Mar 29, 2020
…apache#4246)
