[d3d9] Bind framebuffer rework #5112
Merged
Briefly tested with:
Hari-c137
approved these changes
Aug 13, 2025
misyltoad
requested changes
Aug 13, 2025
misyltoad
requested changes
Aug 13, 2025
We track the hazards for each texture slot and clear the bits at the beginning of this function. Because of that we cannot just look at the recently changed render targets here.
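A minimal sketch of why the full set of render targets has to be revisited once the bits are cleared. It assumes a hypothetical layout of 4 render target slots and 16 sampler slots identified by plain integer IDs; `TextureId`, `kMaxRenderTargets`, `kMaxSamplers`, and `RecomputeRtHazards` are illustrative names, not DXVK's actual members:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins: 0 means "nothing bound", any other value is a texture ID.
using TextureId = uint32_t;
constexpr std::size_t kMaxRenderTargets = 4;
constexpr std::size_t kMaxSamplers = 16;

// Recomputes the hazard mask from scratch. Bit i is set when sampler slot i
// holds a texture that is also bound as a render target.
uint32_t RecomputeRtHazards(
    const std::array<TextureId, kMaxRenderTargets>& renderTargets,
    const std::array<TextureId, kMaxSamplers>& samplers) {
  // The mask starts out cleared, so a render target that did not change
  // since the last call still has to be checked again.
  uint32_t hazards = 0;

  for (TextureId rt : renderTargets) {
    if (rt == 0)
      continue;

    for (std::size_t slot = 0; slot < samplers.size(); slot++) {
      if (samplers[slot] == rt)
        hazards |= 1u << slot;
    }
  }

  return hazards;
}
```

If only the recently changed render targets were scanned here, hazards belonging to unchanged ones would be lost when the mask is reset.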
The size and sample count mismatch can be simplified because it only applies to 1x1 RT0 on Windows.
misyltoad
approved these changes
Aug 13, 2025
misyltoad
left a comment
Looks fine in theory. Definitely needs more stress testing, though.
bmwiedemann pushed a commit to bmwiedemann/openSUSE that referenced this pull request on Dec 31, 2025
https://build.opensuse.org/request/show/1324809 by user msmeissn + anag_factory
- Update to version 2.7.1:
  * Fixed a regression that would cause MSAA resolves to look like MSAA was not working in a number of D3D9 games.
  * Improved performance in some D3D9 games by avoiding unnecessary render pass barriers, such as Dead Space 2 (PR gh#doitsujin/dxvk#5112).
  * Fixed the way tessellation factors are interpreted for line tessellation. This fixes particle rendering in DCS World and potentially other issues. (gh#doitsujin/dxvk#4327)
  * Added the d3d9.modeCountCompatibility configuration option to work around buffer overflows in older games that do not expect monitors to support more than ~16 different display modes. This fixes a crash in AquaNox 2.
  * Alone in the Dark: Worked around a crash on start. (PR gh#doitsujin/dxvk#5158)
  *
Successor to the scary half of #5092
Fixes #5116
First of all, it rewrites the D3D9DeviceEx::BindFramebuffer logic based on what I found during testing:
- Ignoring binds based on the write mask is an exception for 1x1 RTs.
- If there's a size mismatch across render targets (and/or with the DS), the draw just gets skipped on Windows.
I didn't implement the skipping because it's another thing that would be checked on every draw and not a single game has hit that case so far. So I'll just ignore that until it becomes necessary.
After that there's another optimization to avoid needless layout transitions.
In the last couple of months, I optimized the framebuffer binding logic to bind
the render targets that the game tells us to bind instead of trying to be smart
and only binding the ones that the shader writes to and that have a non-zero write mask.
The goal behind this was to avoid having to break up render passes based on state or
shader changes. Besides that, we don't use VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL
anymore for the render pass depth stencil attachment. It is still used as the default layout for depth stencil images that can be sampled.
Those changes massively reduced the number of render passes (and barriers) in some games.
The exception to both of these is when the same texture is bound for sampling and as an RT or DS.
In that case, we first try to unbind it (for RTs) or bind it as DEPTH_READONLY (for the DS),
if the shader write masks, blend state write masks, and depth write state allow it, before pulling out
the big guns with LAYOUT_FEEDBACK_LOOP.
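The fallback order described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the code in the PR: `HazardResolution`, `RtHazardState`, `DsHazardState`, and both `Resolve*` functions are hypothetical names, and the state is reduced to the few flags the text mentions.

```cpp
#include <cstdint>

// Hypothetical outcome of handling a texture that is bound both for
// sampling and as an attachment.
enum class HazardResolution {
  None,                // no hazard, bind normally
  UnbindRenderTarget,  // cheap fix for RTs: drop the attachment
  BindDepthReadOnly,   // cheap fix for DS: bind with a read-only depth layout
  FeedbackLoop         // last resort: feedback loop layout
};

// Hypothetical, simplified per-RT state.
struct RtHazardState {
  bool     sampledAndBound; // same texture is sampled and bound as this RT
  bool     shaderWritesRt;  // the pixel shader writes to this RT output
  uint32_t colorWriteMask;  // blend state write mask, 0 = nothing is written
};

// Hypothetical, simplified depth stencil state.
struct DsHazardState {
  bool sampledAndBound;   // same texture is sampled and bound as DS
  bool depthWriteEnabled; // depth write state
};

HazardResolution ResolveRtHazard(const RtHazardState& s) {
  if (!s.sampledAndBound)
    return HazardResolution::None;

  // Nothing would be written to the RT, so it can simply be left unbound.
  if (!s.shaderWritesRt || s.colorWriteMask == 0)
    return HazardResolution::UnbindRenderTarget;

  return HazardResolution::FeedbackLoop;
}

HazardResolution ResolveDsHazard(const DsHazardState& s) {
  if (!s.sampledAndBound)
    return HazardResolution::None;

  // Depth is only read, so a read-only depth layout is enough.
  if (!s.depthWriteEnabled)
    return HazardResolution::BindDepthReadOnly;

  return HazardResolution::FeedbackLoop;
}
```

The point of the ordering is that the feedback loop path is only reached when the attachment really is written while being sampled.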
Dead Space 2 has a deferred lighting pass where it ping pongs between two shaders.
One shader is very simple and only used to update the stencil buffer.
The other shader samples the depth buffer and renders something to a render target.
Neither of those writes to the depth buffer.
Both of those use the same depth stencil surface, but because of the different shader
sampler masks, we transition the depth buffer to READONLY when the shader that samples it
gets bound. Then, when the next shader gets bound, the "hazard" is cleared because the sampler mask
indicates it's not sampled anymore. This makes us rebind the depth buffer as RW, on the assumption that binding as RW by default lets us avoid splitting the render pass for every change to the depth write state.
The game does ping-pong between those two shaders after a single draw call each, so we also switch between RW and READONLY after a single draw call each.
The PR solves this by only checking whether the shader actually uses the texture bound at a slot if there wasn't a hazard before, or if the hazard cannot be resolved by unbinding the RT (or binding the DS as read-only).
This reduces the number of render passes at the beginning of Dead Space 2 from 70 to 14 and later parts of the game see similar improvements.
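To make the ping-pong effect concrete, here is a toy model of the depth binding decision. It is a sketch under simplifying assumptions, not DXVK code: `Draw`, `DepthBinding`, and `CountTransitions` are hypothetical names, the depth texture is assumed to stay bound at a sampler slot for the whole sequence, and the feedback loop case is left out of the model.

```cpp
#include <vector>

// Hypothetical per-draw state, reduced to what matters for the depth binding.
struct Draw {
  bool shaderSamplesDepth; // sampler mask of the bound shader covers the depth texture
  bool depthWriteEnabled;  // depth write state
};

enum class DepthBinding { ReadWrite, ReadOnly };

// Counts how often the depth binding flips, which in this model stands in
// for a render pass split. With stickyHazard = false the shader's sampler
// mask is consulted on every draw. With stickyHazard = true an existing
// hazard is kept as long as read-only binding can still resolve it.
int CountTransitions(const std::vector<Draw>& draws, bool stickyHazard) {
  DepthBinding current = DepthBinding::ReadWrite;
  bool hadHazard = false;
  int transitions = 0;

  for (const Draw& d : draws) {
    const bool canResolveReadOnly = !d.depthWriteEnabled;

    bool hazard;
    if (stickyHazard && hadHazard && canResolveReadOnly) {
      // The texture is still bound at the sampler slot, so keep the cheap
      // resolution instead of re-checking the shader.
      hazard = true;
    } else {
      hazard = d.shaderSamplesDepth;
    }

    // Simplification: a hazard that cannot be resolved read-only would need
    // the feedback loop path, which this model treats as ReadWrite.
    const DepthBinding wanted = (hazard && canResolveReadOnly)
      ? DepthBinding::ReadOnly
      : DepthBinding::ReadWrite;

    if (wanted != current) {
      transitions++;
      current = wanted;
    }

    hadHazard = hazard;
  }

  return transitions;
}
```

For six draws that alternate between a non-sampling and a sampling shader with depth writes disabled, this model flips the binding five times when the shader is checked on every draw and once with the sticky hazard. These counts belong to the toy model only and are not the render pass numbers measured in the game.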