-
Notifications
You must be signed in to change notification settings - Fork 3.9k
[AutoTVM] Add batch_matmul to tunable operations #4242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 7 commits
Commits
Show all changes
12 commits
Select commit
Hold shift + click to select a range
d260174
Batch matmul tuning running but with errors.
jwfromm a7de1cf
Default x86 schedule as good as before.
jwfromm eca29b3
Code Cleanup
jwfromm d817355
Remove unused argument.
jwfromm 649c47c
improved template documentation.
jwfromm e4f5e58
Silly lint fix
jwfromm 4905aba
Removed leftover comment.
jwfromm dcbface
Moved cfg declaration to schedule for batch_matmul
jwfromm d00513a
Moved x86 dense cfg declaration to schedule.
jwfromm 479decb
lint fix
jwfromm fa2d760
Removed duplicate cfg declaration in dense.
jwfromm 2464981
Reverted changes to dense.
jwfromm File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part should belong to the schedule instead of the declaration. I suggest moving them to the schedule function like other ops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually extremely similar to the topi dense declaration in x86 as it's based directly on it. I would argue that the functional similarity between dense and batch_matmul encourages us to keep the syntax as close as possible to make transferring optimizations simple. If you feel strongly that configuration declarations should be in the schedule, I'd by happy to move both the batch_matmul and the dense declarations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally don't think that's the best practice, because it would be tedious when we want to reuse the compute function on the different target (e.g., CUDA). It would also be confusing when someone tries to improve the schedule in the future, so I think it would be great to change both dense and batch_norm. On the other hand, I would also be happy to know other's opinion.
cc @icemelon9
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to move both to be in the schedule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for moving this into the schedule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest push moves the configuration declaration in both batch_matmul and dense to scheduling function. However, note that in the dense declaration, some splits are actually used in the computation (
tile_kindense_no_packfor example) and so cannot be moved. This means that the declarations arent all located in the same place. Do you guys prefer it this way or should we leave dense alone and only change batch_matmul?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case I personally prefer to leave the dense there, and file a separate PR to refactor the dense compute.
cc @yzhliu @icemelon9 @Laurawly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@comaniac I agree thats the best way to proceed. I reverted the changes to dense in the latest commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@comaniac In certain case, the config space must be defined in the compute as the compute needs to use it to define intermediate compute stage.