
Auto-tuning scripts to maximize GPU kernel performance#62

Merged
tqchen merged 1 commit into main from tuning on May 7, 2023
Conversation

@junrushao
Member

@junrushao junrushao commented May 3, 2023

@MarcelDelhez

I may be missing something, but it would be very interesting to have documentation on how to train a model, ideally from scratch and using PDFs or a website (HTML pages), the way whitead/paper-qa does.
At the very least, documentation with a script for tuning on one's own data would be helpful.

@funnbot

funnbot commented May 4, 2023

I was able to use meta_schedule.relax_integration.extract_tasks() to get the tasks and their weights, convert them to tune contexts, and pass them to tune_tasks (which tune_tir calls internally).

It's also easy to load the cached pickle from the build script and filter out dynamic functions while looping over the module's functions, without a separate split step.

I still can't figure out how the database in this repo has tuning records for NT_matmul; maybe a custom schedule?
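The "filter out dynamic functions" step described above can be sketched in plain Python. In the real pipeline the module would be a TVM IRModule loaded from the build script's cached pickle and the entries would be TIR PrimFuncs; here `FuncInfo` and `has_dynamic_shape` are hypothetical stand-ins so the sketch runs without TVM installed.

```python
# Hypothetical sketch of filtering dynamic-shape functions out of a
# module before tuning. FuncInfo stands in for a TIR PrimFunc; in the
# real workflow you would loop over an IRModule's functions instead.
from dataclasses import dataclass

@dataclass
class FuncInfo:
    name: str
    shape: tuple  # symbolic dims represented as strings, e.g. "n"

def has_dynamic_shape(func: FuncInfo) -> bool:
    # A dimension is dynamic if it is symbolic rather than a concrete int.
    return any(not isinstance(dim, int) for dim in func.shape)

def filter_static_funcs(mod: dict) -> dict:
    # Keep only statically shaped functions, so the tuner never sees
    # dynamic workloads; no separate split step is needed.
    return {name: f for name, f in mod.items() if not has_dynamic_shape(f)}

mod = {
    "NT_matmul": FuncInfo("NT_matmul", ("n", 4096, 4096)),  # dynamic seq len
    "softmax": FuncInfo("softmax", (1, 32, 128)),           # fully static
}
static = filter_static_funcs(mod)
print(sorted(static))
```

The same one-pass dictionary comprehension works over an IRModule's function map: the only TVM-specific part is how "dynamic" is detected on a real PrimFunc's buffer shapes.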

@yzh119
Member

yzh119 commented May 4, 2023

Hi @MarcelDelhez, what we mean by "tuning" here is tuning kernel performance (making the kernels faster), not fine-tuning model weights.
Supporting fine-tuning in MLC-LLM is indeed very important, but it's out of the scope of this PR; feel free to create an issue to discuss it :)

@MarcelDelhez

I agree that tuning was not the correct word. My concern was more about learning from one's own documents.
Such a bot is useful when it can answer questions from one's own documents. So my word 'tuning' should be replaced by fine-tuning, but in that case there is also a need for forgetting outdated information and ... a bundle of GPUs to train a model, and I can easily understand that is out of your scope.

@junrushao junrushao force-pushed the tuning branch 3 times, most recently from bc53a4c to e263a41 on May 6, 2023 02:13
@junrushao junrushao changed the title from "[WIP] Convenient script for auto tuning" to "Auto-tuning scripts to maximize GPU kernel performance" on May 6, 2023
@junrushao junrushao force-pushed the tuning branch 2 times, most recently from 2e07023 to fb71db8 on May 6, 2023 20:55
@junrushao
Member Author

Agreed that "tuning" is a pretty overloaded term. In this particular case, I am referring to an "auto-tuning compiler", which is key to GPU performance. With TVM Unity auto-tuning, MLC LLM is able to generate performant code for 3B/7B models on average phones, as fast as 10 tok/sec.

@tqchen tqchen merged commit 909f267 into main May 7, 2023
@tqchen tqchen deleted the tuning branch May 8, 2023 12:29
