Switch clustering to use per-froxel linked lists if storage buffers are in use. by pcwalton · Pull Request #22811 · bevyengine/bevy

pcwalton · 2026-02-05T06:11:05Z

The data structure used for clustering currently consists of a heap of clusterable object indices, plus an *offset and counts* structure for each froxel. That is, each froxel's data consists of an offset that represents the first index in a heap of indices, followed by the number of point lights, spot lights, reflection probes, irradiance volumes, and decals belonging to that froxel respectively. The indices of spot lights are assumed to immediately follow the indices of point lights, the indices of reflection probes are assumed to immediately follow the indices of spot lights, and so on. This tightly-packed structure is cache- and memory-efficient, which is especially important on WebGL 2, where the size of uniform buffers is extremely limited.
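
For illustration, here is a minimal sketch of how a froxel's slice of the index heap can be recovered from the offset and counts. The type `OffsetAndCounts`, its field names, and the `range` helper are hypothetical names chosen for this example and are not the actual Bevy definitions; only the ordering of the five object types comes from the description above.

```rust
use std::ops::Range;

// Hypothetical indices for the five clusterable object types, in the
// order in which their indices are packed in the heap.
pub const POINT_LIGHT: usize = 0;
pub const SPOT_LIGHT: usize = 1;
pub const REFLECTION_PROBE: usize = 2;
pub const IRRADIANCE_VOLUME: usize = 3;
pub const DECAL: usize = 4;

/// Hypothetical mirror of the per-froxel "offset and counts" record.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct OffsetAndCounts {
    /// Position in the index heap of this froxel's first index.
    pub offset: u32,
    /// Number of objects of each type, in the order listed above.
    pub counts: [u32; 5],
}

impl OffsetAndCounts {
    /// Returns the half-open range of heap positions holding the indices
    /// of objects of type `kind`. Each type's run starts immediately after
    /// the runs of all preceding types.
    pub fn range(&self, kind: usize) -> Range<usize> {
        let preceding: u32 = self.counts[..kind].iter().sum();
        let start = (self.offset + preceding) as usize;
        start..start + self.counts[kind] as usize
    }
}
```

With `offset = 0` and `counts = [2, 1, 3, 0, 1]`, the spot light indices occupy heap positions `2..3` and the single decal index sits at position `6`. No per-type offsets are stored, which is what keeps the record small.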

Unfortunately, this data structure inhibits *GPU clustering*, which we would like to perform in the future. In GPU clustering, we process every froxel-light pair in parallel, and the number of froxels that a light covers may exceed the workgroup size, so we're limited to atomic memory accesses for synchronization. There's no easy way I can see to build up such a tightly-packed data structure in parallel like this; the best we could do would be to build a linked list or a chunked linked list and have a second pass that compresses the linked list down, but the second pass would itself add unnecessary overhead.

To fix this problem and prepare for GPU clustering, this patch changes the data structure used for clustering to instead have one singly linked list per clusterable object type in each froxel. Each froxel's offset and counts structure is replaced by 5 linked list heads, one per object type, each pointing to an offset in the heap. Each element in the heap is a pair that contains the ID of a clusterable object and the offset in the heap of the next pair in the list. The list is terminated by `0xffffffffu`.
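
A minimal sketch of the linked-list layout and how one list can be walked follows. The names `Node`, `LIST_END`, and `collect_list` are hypothetical and chosen for this example. It also assumes that an empty list is represented by a head equal to the terminator, which the description does not state explicitly.

```rust
/// Terminator value for a list, as described above.
pub const LIST_END: u32 = 0xffff_ffff;

/// Hypothetical heap element: a clusterable object ID paired with the
/// heap offset of the next element in the same list.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Node {
    pub object_id: u32,
    pub next: u32,
}

/// Walks one list starting at `head` and returns the object IDs in
/// traversal order. Assumes (hypothetically) that an empty list has
/// `head == LIST_END`.
pub fn collect_list(heap: &[Node], head: u32) -> Vec<u32> {
    let mut ids = Vec::new();
    let mut current = head;
    while current != LIST_END {
        let node = heap[current as usize];
        ids.push(node.object_id);
        current = node.next;
    }
    ids
}
```

Because each element names its successor explicitly, the elements of a list do not have to be adjacent in the heap, so lists for different froxels and object types can be interleaved freely.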

The CPU clustering code is unchanged; `assign_objects_to_clusters` still creates offsets and counts in the same way. During extraction, the offset-and-count model is converted to a linked list. The reason for this is that the uniform cluster data structure (as opposed to the storage cluster data structure), which is still used on WebGL 2, needs to remain tightly packed because uniform space is still at a premium. It was easier to keep the code identical for now than to add complexity to `assign_objects_to_clusters`. Unfortunately, the shader code did incur a fair bit of complexity through added `#ifdef`s; when we drop WebGL 2, these can be removed.
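
The conversion during extraction could look roughly like the sketch below. This is not the code from the patch: the function `to_linked_lists`, the types, and the choice to emit nodes in the same order as the packed indices are all assumptions made for illustration, as is representing an empty list with a head equal to the terminator.

```rust
pub const LIST_END: u32 = 0xffff_ffff;

/// Hypothetical per-froxel record produced by CPU clustering.
#[derive(Clone, Copy, Debug)]
pub struct OffsetAndCounts {
    pub offset: u32,
    pub counts: [u32; 5],
}

/// Hypothetical heap element of the linked-list representation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Node {
    pub object_id: u32,
    pub next: u32,
}

/// Converts packed offsets and counts into per-froxel list heads plus a
/// node heap. Returns `(heads, heap)`, where `heads[f][kind]` is the heap
/// offset of the first node for that froxel and object type.
pub fn to_linked_lists(
    froxels: &[OffsetAndCounts],
    indices: &[u32],
) -> (Vec<[u32; 5]>, Vec<Node>) {
    let mut heap = Vec::with_capacity(indices.len());
    let mut heads = Vec::with_capacity(froxels.len());
    for froxel in froxels {
        // Assumption: an empty list is marked by a head of LIST_END.
        let mut froxel_heads = [LIST_END; 5];
        let mut start = froxel.offset as usize;
        for (kind, &count) in froxel.counts.iter().enumerate() {
            let count = count as usize;
            if count > 0 {
                froxel_heads[kind] = heap.len() as u32;
            }
            let run = &indices[start..start + count];
            for (i, &object_id) in run.iter().enumerate() {
                // Link to the node pushed next, or terminate the list.
                let next = if i + 1 < count {
                    heap.len() as u32 + 1
                } else {
                    LIST_END
                };
                heap.push(Node { object_id, next });
            }
            start += count;
        }
        heads.push(froxel_heads);
    }
    (heads, heap)
}
```

In this sketch every packed index becomes exactly one node, so the node heap has the same number of elements as the index heap, with each element carrying an extra `next` field.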

I tested the relevant examples and verified that they're unchanged.


pcwalton requested review from atlv24 and tychedelia February 5, 2026 06:11

pcwalton added the A-Rendering (Drawing game state to the screen) label Feb 5, 2026

github-project-automation bot added this to Rendering and Rendering (2026 Proposal) Feb 5, 2026

github-project-automation bot moved this to Needs SME Triage in Rendering (2026 Proposal) Feb 5, 2026

pcwalton added the S-Needs-Review (Needs reviewer attention (from anyone!) to move forward) and C-Performance (A change motivated by improving speed, memory usage or compile times) labels Feb 5, 2026

Typo police

22acf2b

pcwalton wants to merge 2 commits into bevyengine:main from pcwalton:per-froxel-linked-lists
