Nikita Lisitsa
lisyarus.bsky.social
He/him

I teach C++ & computer graphics and make videogames

Working on a medieval village building game: https://youtube.com/playlist?list=PLSGI94QoFYJwGaieAkqw5_qfoupdppxHN&cbrd=1

Check out my cozy road building traffic sim: https://t.ly/FfOwR
Yay, the Chromium build finished, and I was able to attach the Xcode Metal profiler to it and get a WebGPU capture! Now gotta figure out what all this means 😅

Though judging by high ALU utilization and low texture read utilization, it's compute bound, in line with my earlier tests.
November 29, 2025 at 9:20 PM
"8+ years exp graphics dev learns to actually profile stuff" status update:

After some guidance I decided to re-fetch chrome sources and now it builds without problems (3 hrs in, 70% done).

Also I downloaded actual Nsight Graphics and fed my game into it, now trying to make sense of the results
November 29, 2025 at 7:24 PM
More profiling! Replaced the DDA voxel marching algorithm with just always walking along the +Y axis, removing most of the computations in the shader. Expected this to be at most ~2x faster than the true algorithm (fewer voxels visited). Timings here are relative to the base (true) case.
November 25, 2025 at 8:29 PM
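A minimal CPU-side sketch of the DDA voxel marching mentioned in the post above, assuming unit-sized voxels in a cubic grid and the classic Amanatides-Woo stepping scheme. The function name, the `is_solid` callback and the step limit are illustrative assumptions, not the author's shader code.

```python
import math

def dda_traverse(origin, direction, grid_size, is_solid, max_steps=512):
    """Walk voxels along a ray. Returns (hit_voxel or None, empty voxels visited)."""
    # Voxel that contains the ray origin.
    voxel = [math.floor(c) for c in origin]
    step = [0, 0, 0]             # +1 / -1 / 0 per axis
    t_max = [math.inf] * 3       # ray parameter at the next boundary per axis
    t_delta = [math.inf] * 3     # ray parameter needed to cross one voxel per axis
    for i in range(3):
        d = direction[i]
        if d > 0:
            step[i] = 1
            t_delta[i] = 1.0 / d
            t_max[i] = (voxel[i] + 1 - origin[i]) / d
        elif d < 0:
            step[i] = -1
            t_delta[i] = -1.0 / d
            t_max[i] = (voxel[i] - origin[i]) / d
    for steps in range(max_steps):
        if not all(0 <= voxel[i] < grid_size for i in range(3)):
            return None, steps   # left the grid without hitting anything
        if is_solid(*voxel):
            return tuple(voxel), steps
        # Advance along the axis whose boundary is closest.
        axis = t_max.index(min(t_max))
        voxel[axis] += step[axis]
        t_max[axis] += t_delta[axis]
    return None, max_steps       # step budget exhausted
```

The returned step count is the kind of quantity the step-count visualizations in this feed color-code: rays that miss everything or run into the step limit are the expensive ones.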
More profiling stats! Replaced reading voxel data in the inner raytracer loop with computing it in the shader using the same exact noise function. Results barely differ from the base case, except it's slower at low voxel density, meaning I'm probably compute-bound in this case.
November 25, 2025 at 7:52 PM
First profiling results! Expectedly, on a fully-dense scene (256³ cube) chunk size doesn't matter - first hit is the result, no raytracing needed. On average density small chunks are better due to more fine-grained empty space skipping. On empty scenes larger chunks are better.
November 25, 2025 at 2:44 PM
Finally have some more time to work on this. The benefit of a noise-based scene is that 1) I can easily adjust its density, and 2) I can generate it in the shader on the fly. This will help in profiling the raytracer in a variety of cases and checking whether it's compute/memory bound.
November 25, 2025 at 2:10 PM
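A rough sketch of the density knob described in the post above: a voxel counts as solid when a per-voxel pseudo-random value falls below a threshold. The integer hash here is a hypothetical stand-in and is not the dot noise used in the actual scene; `hash01`, `is_solid` and `fill_fraction` are illustrative names.

```python
def hash01(x, y, z):
    """Deterministic pseudo-random value in [0, 1) for integer voxel coords.

    Assumption: a simple integer hash, NOT the noise function from the posts.
    """
    h = (x * 73856093) ^ (y * 19349663) ^ (z * 83492791)
    h &= 0xFFFFFFFF
    h = ((h ^ (h >> 15)) * 0x2C1B3C6D) & 0xFFFFFFFF
    h = ((h ^ (h >> 12)) * 0x297A2D39) & 0xFFFFFFFF
    h ^= h >> 15
    return h / 2**32

def is_solid(x, y, z, density):
    """Voxel is filled when its hash value is below the density threshold."""
    return hash01(x, y, z) < density

def fill_fraction(size, density):
    """Fraction of filled voxels in a size^3 cube, for sanity-checking the knob."""
    filled = sum(
        is_solid(x, y, z, density)
        for x in range(size) for y in range(size) for z in range(size)
    )
    return filled / size**3
```

Because the same function can run either when filling a texture or inside the traversal loop, swapping one for the other changes only where the data comes from, which is what makes the memory-vs-compute comparison possible.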
He looks like a perfect 50/50 blend of ma bois!
November 24, 2025 at 4:28 PM
Once again visualizing the number of steps the raytracer takes through the voxel map (more yellow = more steps, red = hit a voxel). Looks quite surreal
November 23, 2025 at 7:59 PM
Trying a new voxel scene to test memory vs compute GPU performance in my raytracer, using @xordev.com's dot noise: mini.gmshaders.com/p/dot-noise
November 23, 2025 at 9:10 AM
Soo chunk size profiling results are kinda funny: the best chunk size is 4x4x4, just like the 64-wide octree nodes I unsuccessfully tried earlier! Maybe if I store octree node data in a 3D texture, I can make the best out of the 2 approaches...
November 22, 2025 at 8:06 PM
Rewrote the voxel raytracer to use a two-level chunk system (a 3D texture atlas for 16³-sized chunks, another 3D world-space texture referencing the atlas). Without any optimizations, for primary (camera) rays it's already 25% faster; not so much for random bounce rays.
November 22, 2025 at 5:23 PM
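A small CPU-side sketch of the two-level layout from the post above, assuming a dictionary in place of the world-space 3D texture and a list of flat byte arrays in place of the 3D texture atlas. The class and method names are hypothetical.

```python
CHUNK = 16  # chunk edge length in voxels, as in the post

def _offset(x, y, z):
    """Flat index of a voxel inside its chunk."""
    lx, ly, lz = x % CHUNK, y % CHUNK, z % CHUNK
    return (lz * CHUNK + ly) * CHUNK + lx

class ChunkedVoxels:
    def __init__(self):
        # Top level: chunk coords -> slot in the atlas.
        # Stand-in for the world-space texture that references the atlas.
        self.index = {}
        # Bottom level: densely packed nonempty chunks.
        # Stand-in for the 3D texture atlas.
        self.atlas = []

    def set(self, x, y, z, value):
        key = (x // CHUNK, y // CHUNK, z // CHUNK)
        slot = self.index.get(key)
        if slot is None:
            # Allocate a new chunk only when something is written into it.
            slot = len(self.atlas)
            self.atlas.append(bytearray(CHUNK ** 3))
            self.index[key] = slot
        self.atlas[slot][_offset(x, y, z)] = value

    def get(self, x, y, z):
        slot = self.index.get((x // CHUNK, y // CHUNK, z // CHUNK))
        if slot is None:
            return 0  # unallocated chunk: a raymarcher can skip it whole
        return self.atlas[slot][_offset(x, y, z)]
```

The `slot is None` branch is where the empty-space skipping comes from: one top-level lookup rules out 16³ voxels at once.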
Rewriting my voxel storage once again, this time using a two-level tree, aka chunking. Here I'm raymarching the chunk storage, where individual unrelated nonempty 16³-sized chunks are packed in some order. Quite a surreal view of the scene :)
November 22, 2025 at 4:34 PM
Right now in the 1 sample per frame with just 1 bounce scenario, octree traversal ends up being 30% slower than raw 3D texture traversal. With 0 bounces (just the camera rays, which are very coherent) it seems to be about 50% slower. I think I messed something up really bad 😅
November 21, 2025 at 7:30 PM
I noticed that a lot of my octree traversal perf problems are due to rays not hitting anything when they should've (and thus looping until the max step count is reached). Here I'm visualizing whole warps that had such rays (for better readability), using subgroupAny()
November 21, 2025 at 6:52 PM
Thank you so much! I reserve my YouTube for my main project devlogs :) (see pinned post, new one coming soon)

In this project I'm trying to make a game (more like a toy, though) where you can put/remove blocks and the environment reacts to it by computing sort of realistic lighting, like here:
November 21, 2025 at 3:44 PM
Visualizing the number of steps it takes for the octree traversal algorithm to find an intersection (yellow = higher)
November 21, 2025 at 2:55 PM
First iteration of sparse octree traversal works! It's already about 2x faster than direct 3D texture traversal for primary camera rays, but much slower for incoherent random Monte Carlo rays. Need to optimize the hell out of it now
November 21, 2025 at 1:33 PM
I'm rewriting my voxel thing to use wide sparse octrees and it's going exactly as expected :)
November 20, 2025 at 9:23 PM
I just wanted to make games, exhibit 4562:

(this is from here: agraphicsguynotes.com/posts/understanding_the_math_behind_restir_gi)
November 19, 2025 at 12:50 PM
Still no luck integrating ReSTIR into full GI (using a simpler way than what the ReSTIR GI paper does). Left is ground truth, right is my attempt. It runs faster, but is clearly darker...
November 19, 2025 at 12:48 PM
Clamped some weights too conservatively and my lighting got funny
November 19, 2025 at 12:47 PM
Messed up scene generation a bit and got some nice
v i b e s
November 19, 2025 at 9:57 AM
I think I've messed up the weights again! (It's quite easy in ReSTIR, lol.) Here's a more correct image (I hope?). Anyway it doesn't matter much until I combine it using MIS into full GI computation
November 19, 2025 at 9:51 AM
First ReSTIR test using 16 reservoir proposal samples (right image) compared to uniform direction sampling and ignoring indirect light (left image) with many spread out lights (~3k light-emitting voxel faces here). Quite insane noise reduction going on in here!
November 19, 2025 at 9:40 AM
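A minimal sketch of the reservoir step behind the test in the post above, assuming plain resampled importance sampling with a streaming weighted reservoir (the building block that ReSTIR reuses across pixels and frames). The names `Reservoir`, `ris_pick`, `target_pdf` and `source_pdf` are illustrative, and this is not the author's implementation.

```python
import random

class Reservoir:
    """Streaming weighted reservoir that keeps a single sample."""
    def __init__(self):
        self.sample = None
        self.w_sum = 0.0   # running sum of resampling weights
        self.count = 0     # number of candidates seen (M)

    def update(self, sample, weight, rng):
        self.w_sum += weight
        self.count += 1
        # Keep the new candidate with probability weight / w_sum.
        if weight > 0 and rng.random() * self.w_sum < weight:
            self.sample = sample

def ris_pick(candidates, target_pdf, source_pdf, rng):
    """Pick one candidate roughly proportionally to target_pdf.

    Returns (sample, W), where W is the contribution weight that stands in
    for 1 / pdf(sample) in the final estimator.
    """
    r = Reservoir()
    for x in candidates:
        r.update(x, target_pdf(x) / source_pdf(x), rng)
    if r.sample is None:
        return None, 0.0
    W = r.w_sum / (r.count * target_pdf(r.sample))
    return r.sample, W
```

With 16 candidates per pixel, `candidates` would hold 16 light samples drawn from the cheap source distribution, and `target_pdf` would be an unshadowed estimate of each light's contribution. Getting `W` wrong tends to show up as an image that is too bright or too dark, which matches how finicky the posts say the weights are.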
First test with ReSTIR, just for direct lighting for now. Figuring out the weights was a bit finicky, ngl. This scene is a bit too easy since all light sources are in the same spot, gonna try placing a bunch of light sources tomorrow.
November 18, 2025 at 9:50 PM