Skip to content

Conversation

@MatthewFluet
Copy link
Collaborator

As suggested by @JohnReppy, rather than storing the alternate return addresses of a PCall on the (dynamic) call stack, one can store them in a (static) table. In MLton, we can store the alternate return addresses in the struct GC_frameInfo that corresponds to the PCALL_CONT_FRAME.

As suggested by @JohnReppy at POPL'24, one can exploit the static
relationship among the return addresses at a PCall to avoid storing the
alternate return addresses on the (dynamic) call stack; rather, the
`cont`/`seq` return address on the stack should be sufficient to determine
the corresponding `parl`/`sync` and `parr`/`spwn` return addresses.

In MLton, we can use the `GC_frameInfo` that is associated with each return
address to store the `parl`/`sync` and `parr`/`spwn` return addresses in a
`PCALL_CONT_FRAME`.

One advantage is that it avoids two writes (of the `parl`/`sync` and
`parr`/`spwn` return addresses) at each `PCall`.  Although it is expected
that this overhead is small (the top of the call stack ought to be in
cache), it does nicely reduce the overhead of `PCall` to exactly the same
as `Call`.
@shwestrick
Copy link
Collaborator

Nice! I was able to measure a few percent improvement in a couple examples (fully parallel fib, and fully parallel wc). I confirmed in fib that the only difference in the generated code is eliminating those two store instructions.

@shwestrick shwestrick merged commit ef2588b into MPLLang:main Jun 4, 2024
@MatthewFluet MatthewFluet deleted the pcall-return-addresses-in-frame-info branch June 4, 2024 08:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants