Reimplement the pooling instance allocation strategy #5661
This commit is a reimplementation of the strategy by which the pooling instance allocator selects a slot for a module. Previously there was a choice amongst three different algorithms: "reuse affinity", "next available", and "random". The default was "reuse affinity" but some new data has come to light which shows that this may not always be a good default.

Notably the pooling allocator will retain some memory per-slot in the pooling instance allocator, for example instance data or memory data if so configured. This means that a currently unused, but previously used, slot can contribute to the RSS usage of a program using Wasmtime. Consequently the RSS impact here is O(max slots) which can be counter-intuitive for embedders. This particularly affects "reuse affinity" because the algorithm for picking a slot when there are no affine slots is "pick a random slot", which means eventually all slots will get used.

In discussions about possible ways to tackle this, an alternative to "pick a strategy" arose and is now implemented in this commit. Concretely the new allocation algorithm for a slot is now:

* First pick the most recently used affine slot, if one exists.
* Otherwise if the number of affine slots to other modules is above some threshold N then pick the least-recently used affine slot.
* Otherwise pick a slot that's affine to nothing.

The "N" in this algorithm is configurable and setting it to 0 is the same as the old "next available" strategy while setting it to infinity is the same as the "reuse affinity" algorithm. Setting it to something in the middle provides a knob to allow a modest "cache" of affine slots while not allowing the total set of slots used to grow too much beyond the maximal concurrent set of modules. The "random" strategy is now no longer possible and was removed to help simplify the allocator.
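The three-step selection above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `FreeSlots`, `ModuleId`, `alloc`, and `free` are hypothetical names, the free slots are assumed to be tracked as a "warm" queue ordered by last use plus a "cold" list, and the fallback to a warm slot when no cold slot remains is an assumption that the description does not spell out.

```rust
use std::collections::VecDeque;

/// Hypothetical identifier for a compiled module (assumption for this sketch).
type ModuleId = u32;

/// Free (currently unused) slots, split into warm and cold groups.
/// All names here are illustrative, not Wasmtime's actual types.
struct FreeSlots {
    /// Unused slots that keep affinity to the module that last ran in them,
    /// ordered from least recently used (front) to most recently used (back).
    warm: VecDeque<(usize, ModuleId)>,
    /// Unused slots that are affine to nothing.
    cold: Vec<usize>,
    /// The threshold "N" from the description.
    max_unused_warm_slots: usize,
}

impl FreeSlots {
    fn new(num_slots: usize, max_unused_warm_slots: usize) -> Self {
        FreeSlots {
            warm: VecDeque::new(),
            // Reversed so that `pop` hands out slot 0 first.
            cold: (0..num_slots).rev().collect(),
            max_unused_warm_slots,
        }
    }

    /// Picks a slot for `module`, or returns `None` if every slot is in use.
    fn alloc(&mut self, module: ModuleId) -> Option<usize> {
        // 1. Prefer the most recently used slot affine to this module.
        if let Some(pos) = self.warm.iter().rposition(|&(_, m)| m == module) {
            return self.warm.remove(pos).map(|(slot, _)| slot);
        }
        // 2. Every remaining warm slot is affine to some other module. If
        //    there are more than N of them, recycle the least recently used.
        if self.warm.len() > self.max_unused_warm_slots {
            return self.warm.pop_front().map(|(slot, _)| slot);
        }
        // 3. Otherwise take a slot affine to nothing. Falling back to a warm
        //    slot when no cold slot is left is an assumption of this sketch.
        self.cold
            .pop()
            .or_else(|| self.warm.pop_front().map(|(slot, _)| slot))
    }

    /// Returns `slot` to the pool, remembering which module last used it.
    fn free(&mut self, slot: usize, module: ModuleId) {
        self.warm.push_back((slot, module));
    }
}
```

In this sketch a slot never becomes cold again once it has been used, so the number of slots ever touched is bounded by roughly the peak number of concurrent instances plus N, which is the RSS behavior the description is aiming for.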
cfallin left a comment:
This is a really clean and pleasing generalization of the old allocator -- thanks for this!
I didn't see any issues at all with the code; so in the absence of something more substantial, I just have some comment-request and naming nits :-) Overall it's quite clear already though.
    rand::thread_rng().gen()
};
let rng = SmallRng::from_seed(seed);
pub fn new(max_instances: u32, max_unused_warm_slots: u32) -> Self {
debug_assert!(max_unused_warm_slots <= max_instances) ?
Adding this assertion would require adding validation to the PoolingAllocatorConfig in wasmtime as well, to provide a better error message than tripping the assertion. Thinking about that, though, I think it may not be worth it, since it's not really a problem if max_unused_warm_slots is bigger than the number of slots. It's a bit silly, but it can also perhaps be helpful to always pass a large value here to say "always keep everything warm".
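To illustrate why an oversized value is harmless, here is a small sketch. It assumes the threshold check is a plain "more unused warm slots than the limit" comparison, as the PR description words it; `should_recycle_warm` is a hypothetical helper name, not a function from this PR.

```rust
/// Hypothetical helper: returns true when the allocator should recycle the
/// least recently used warm slot instead of taking a cold one.
/// Assumes the check is "unused warm slots above the limit".
fn should_recycle_warm(unused_warm_slots: u32, max_unused_warm_slots: u32) -> bool {
    unused_warm_slots > max_unused_warm_slots
}

/// Checks that a limit at or above the pool size never triggers recycling,
/// since the number of unused warm slots cannot exceed the pool size.
fn never_recycles(max_instances: u32, max_unused_warm_slots: u32) -> bool {
    (0..=max_instances).all(|n| !should_recycle_warm(n, max_unused_warm_slots))
}
```

Under that assumption, any limit at or above the number of slots simply means the recycling branch is never taken, which matches the "always keep everything warm" reading.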