Improved perf a bunch by collapsing the atomics down to one per wave, and by frustum- and depth-culling particles before they get appended to the giant vertex buffer.
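For anyone who hasn't seen the one-atomic-per-wave trick, here's a rough CPU-side sketch in C++ of how I read it. This is not the actual shader: `kWaveSize`, `isVisible`, `AppendStats`, and `cullAndAppend` are made-up names, the cull test is a placeholder box check, and the waves run one after another instead of in parallel. On a real GPU the count and the per-lane rank would come from wave intrinsics (HLSL's `WaveActiveCountBits` and `WavePrefixCountBits`, for example) rather than loops.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kWaveSize = 32;  // assumed wave width; some GPUs use 64

struct Particle { float x, y, z; };

struct AppendStats {
    std::uint32_t appended = 0;   // particles written to the output buffer
    std::uint32_t atomicOps = 0;  // fetch_add calls made on the shared cursor
};

// Placeholder for the frustum + depth test (hypothetical): keep points
// inside a clip-space-like box.
bool isVisible(const Particle& p) {
    return p.x >= -1.0f && p.x <= 1.0f &&
           p.y >= -1.0f && p.y <= 1.0f &&
           p.z >= 0.0f && p.z <= 1.0f;
}

// Culls every particle and appends the survivors to `out`, which must be
// pre-sized to at least in.size(). Waves are processed sequentially here;
// on a GPU they would run concurrently, which is why the cursor is atomic.
AppendStats cullAndAppend(const std::vector<Particle>& in,
                          std::vector<Particle>& out) {
    assert(out.size() >= in.size());
    std::atomic<std::uint32_t> cursor{0};
    AppendStats stats;

    for (std::size_t waveStart = 0; waveStart < in.size(); waveStart += kWaveSize) {
        const std::size_t laneCount = std::min(kWaveSize, in.size() - waveStart);

        // Step 1: every lane runs the cull test; count the survivors.
        bool keep[kWaveSize] = {};
        std::uint32_t survivors = 0;
        for (std::size_t lane = 0; lane < laneCount; ++lane) {
            keep[lane] = isVisible(in[waveStart + lane]);
            if (keep[lane]) ++survivors;
        }
        if (survivors == 0) continue;  // fully culled wave: no atomic at all

        // Step 2: one atomic reserves space for the whole wave.
        const std::uint32_t base =
            cursor.fetch_add(survivors, std::memory_order_relaxed);
        ++stats.atomicOps;

        // Step 3: each surviving lane writes at base + its rank among survivors.
        std::uint32_t rank = 0;
        for (std::size_t lane = 0; lane < laneCount; ++lane) {
            if (keep[lane]) out[base + rank++] = in[waveStart + lane];
        }
    }

    stats.appended = cursor.load(std::memory_order_relaxed);
    return stats;
}
```

With 100 particles where every other one fails the cull test, this does 4 atomic adds (one per wave of 32) instead of 50 (one per surviving particle), and a fully culled wave skips the atomic entirely.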
I think there's still some time wasted hammering a single atomic counter, which I could cut down by splitting the output into N separate vertex buffers instead.
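A rough C++ sketch of what the N-buffer split might look like, again modeled on the CPU. Everything here is hypothetical: `Vertex`, `Bucket`, and `ShardedVertexBuffer` are invented names, and picking the bucket with `waveIndex % N` is just one assumed policy. The idea is that each bucket carries its own counter, so concurrent waves mostly hit different atomics.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vertex { float x, y, z; };

// One append target: its own storage and its own counter.
struct Bucket {
    std::vector<Vertex> vertices;
    std::atomic<std::uint32_t> cursor{0};
};

class ShardedVertexBuffer {
public:
    ShardedVertexBuffer(std::size_t bucketCount, std::size_t capacityPerBucket)
        : buckets_(bucketCount) {
        assert(bucketCount > 0);
        for (Bucket& b : buckets_) b.vertices.resize(capacityPerBucket);
    }

    // Appends one wave's surviving vertices to the bucket chosen by wave
    // index. Still one atomic per wave, but spread across N counters.
    // Returns the index of the bucket that was used.
    std::size_t append(std::size_t waveIndex, const Vertex* survivors,
                       std::uint32_t count) {
        const std::size_t bucketIndex = waveIndex % buckets_.size();
        Bucket& bucket = buckets_[bucketIndex];
        const std::uint32_t base =
            bucket.cursor.fetch_add(count, std::memory_order_relaxed);
        assert(base + count <= bucket.vertices.size());  // sketch: no overflow handling
        for (std::uint32_t i = 0; i < count; ++i) {
            bucket.vertices[base + i] = survivors[i];
        }
        return bucketIndex;
    }

    // Number of vertices currently stored in one bucket.
    std::uint32_t count(std::size_t bucketIndex) const {
        return buckets_[bucketIndex].cursor.load(std::memory_order_relaxed);
    }

    // Total vertices across all buckets.
    std::uint32_t total() const {
        std::uint32_t sum = 0;
        for (const Bucket& b : buckets_) {
            sum += b.cursor.load(std::memory_order_relaxed);
        }
        return sum;
    }

private:
    std::vector<Bucket> buckets_;
};
```

The tradeoff is that the output is no longer one contiguous buffer, so drawing it would presumably need N draws (or one indirect draw per bucket) or a compaction pass afterwards. Whether that beats the contention on a single counter is something to measure, not assume.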