Render instance data refactor #1713
Merged
This change refactors how the renderer handles per-instance data. Per-instance data contains things like the object-to-world transform and color. It is used to render the same mesh in multiple places with a single draw call.
Previously, the per-instance data was collected during rendering itself. This had a few issues: the renderer had to iterate through all render data of a batch, collect the instance data from it, and then upload it to the GPU. This was done per submesh and per view, so even for a mesh with just one submesh the same data could be collected and uploaded e.g. three times per frame (shadow pass, depth pre-pass, and main pass).
With this refactoring, the instance data is allocated and filled during extraction. For static objects it is collected and uploaded only once and then reused. Dynamic objects still fill their data every frame, but it is uploaded only once per frame. Furthermore, the instance data is no longer duplicated per submesh.
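To make the static/dynamic distinction concrete, here is a minimal CPU-side sketch. It is not the code from this PR: `InstanceData`, `InstanceBuffer`, `ExtractedObject`, and `extract` are hypothetical names, the payload layout is assumed, and `flush` only counts uploads instead of talking to a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-instance payload; the real layout is an assumption here.
struct InstanceData {
    float objectToWorld[12]; // 3x4 object-to-world transform
    float color[4];
};

// Hypothetical CPU-side mirror of the GPU instance buffer.
class InstanceBuffer {
public:
    // Reserve one slot and return its offset into the buffer.
    std::uint32_t allocate() {
        m_data.emplace_back();
        return static_cast<std::uint32_t>(m_data.size() - 1);
    }

    // Write a slot and remember that the buffer needs an upload.
    void write(std::uint32_t offset, const InstanceData& value) {
        assert(offset < m_data.size());
        m_data[offset] = value;
        m_dirty = true;
    }

    // Called once per frame after extraction; stands in for the GPU upload.
    // Returns true if an upload happened.
    bool flush() {
        if (!m_dirty) return false;
        ++m_uploadCount;
        m_dirty = false;
        return true;
    }

    std::size_t uploadCount() const { return m_uploadCount; }

private:
    std::vector<InstanceData> m_data;
    bool m_dirty = false;
    std::size_t m_uploadCount = 0;
};

// Hypothetical record kept per extracted object.
struct ExtractedObject {
    std::uint32_t instanceOffset; // slot in the instance buffer
    bool isDynamic;               // dynamic objects refill their slot every frame
    bool written;                 // static objects are filled only once
};

// Extraction step: static objects are filled once, dynamic ones every frame.
void extract(InstanceBuffer& buffer, std::vector<ExtractedObject>& objects,
             const InstanceData& value) {
    for (ExtractedObject& obj : objects) {
        if (obj.isDynamic || !obj.written) {
            buffer.write(obj.instanceOffset, value);
            obj.written = true;
        }
    }
}
```

In this sketch a scene with only static objects triggers one upload in total, while a scene with at least one dynamic object triggers exactly one upload per frame, independent of the number of views or submeshes.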
Since the data in the GPU buffer can now be in arbitrary order, a new indirection vertex buffer is introduced, which contains various data offsets:
These offsets are collected during batching in the extraction phase, so the renderer no longer has to look at individual render data and can operate on whole batches. The gain depends heavily on the scene, but I measured speedups of 3x on the render thread.
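The batching idea can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `RenderItem`, `Batch`, `BatchedFrame`, and `buildBatches` are hypothetical names, batches are keyed by a plain mesh id, and the indirection buffer holds a single instance offset per entry rather than the several offsets mentioned above.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical extracted item: which mesh to draw and where its
// instance data lives in the (arbitrarily ordered) instance buffer.
struct RenderItem {
    std::uint32_t meshId;
    std::uint32_t instanceOffset;
};

// One batch is one instanced draw call over a contiguous indirection range.
struct Batch {
    std::uint32_t meshId;
    std::uint32_t firstIndirection; // first entry in the indirection buffer
    std::uint32_t instanceCount;    // number of instances in this draw
};

struct BatchedFrame {
    std::vector<std::uint32_t> indirection; // contents of the indirection vertex buffer
    std::vector<Batch> batches;
};

// Group items by mesh and emit their offsets contiguously per batch,
// so the renderer only needs the batch list, not the individual items.
BatchedFrame buildBatches(std::vector<RenderItem> items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const RenderItem& a, const RenderItem& b) {
                         return a.meshId < b.meshId;
                     });

    BatchedFrame frame;
    for (const RenderItem& item : items) {
        if (frame.batches.empty() || frame.batches.back().meshId != item.meshId) {
            // Start a new batch at the current end of the indirection buffer.
            frame.batches.push_back(
                {item.meshId, static_cast<std::uint32_t>(frame.indirection.size()), 0});
        }
        frame.indirection.push_back(item.instanceOffset);
        ++frame.batches.back().instanceCount;
    }
    return frame;
}
```

With this layout, the instance data itself never has to be reordered or copied per batch; only the small offset entries are arranged contiguously, and each draw call reads its range of the indirection buffer.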
There are some more advantages of this approach:
Breaking changes: