this is the problem model in the first image. each color is a different shader that had to be generated by tangerine to render the voxel, and the average complexity of the generated shaders is also high.
the second image is the generated part for one of these shaders. it's basically the object structure w/ all the params pulled out.
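A rough sketch of what "structure with the params pulled out" can look like, in Python for brevity. The nested-tuple tree format and the `extract` helper are made up for illustration and are not tangerine's actual representation. The idea is that two models with the same shape but different numbers end up with the same structure key, so they could share one generated shader and differ only in the data uploaded alongside it.

```python
# Hypothetical sketch: split a CSG tree into a parameter-free structure
# (which would key a generated shader) and a flat parameter list (which
# would be uploaded as data). Node names and layout are assumptions.

def extract(node):
    """Return (structure, params) for a nested-tuple CSG tree."""
    op = node[0]
    if op in ("sphere", "box"):
        # Leaf: the numeric arguments are parameters, not structure.
        return (op,), list(node[1:])
    if op == "move":
        # ("move", x, y, z, child): the offset is parameter data.
        child_struct, child_params = extract(node[4])
        return (op, child_struct), list(node[1:4]) + child_params
    # Set operator: ("union" | "diff" | "inter", lhs, rhs)
    lhs_struct, lhs_params = extract(node[1])
    rhs_struct, rhs_params = extract(node[2])
    return (op, lhs_struct, rhs_struct), lhs_params + rhs_params


a = ("union", ("sphere", 1.0), ("move", 2.0, 0.0, 0.0, ("box", 1.0, 1.0, 1.0)))
b = ("union", ("sphere", 0.5), ("move", 0.0, 3.0, 0.0, ("box", 2.0, 1.0, 0.5)))

struct_a, params_a = extract(a)
struct_b, params_b = extract(b)
# Same structure, different parameters: one shader could serve both.
```

Here `struct_a` and `struct_b` compare equal while `params_a` and `params_b` differ, which is the property that lets structurally identical subtrees reuse a compiled shader.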
incidentally this happens to also be my plan for how to deal with shader compiler hitches in general, i've just been procrastinating on it because opengl doesn't believe in async shader compiling
here's some very entertaining reading about the same problem in a completely different project https://dolphin-emu.org/blog/2017/07/30/ubershaders/
update: I hacked together an interpreted mode and it works great for this :D
anyways, here's the code for just the interpreter if you are curious. it is quite short. https://github.com/Aeva/tangerine/blob/excelsior/shaders/interpreter.glsl
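For anyone who wants the general shape of an interpreted mode without reading GLSL, here is a minimal sketch in Python. It assumes the SDF is flattened into a postfix program with inline parameters and evaluated with a small value stack. The opcodes and encoding are invented for this example; the real interpreter.glsl linked above has its own format.

```python
import math

# Hypothetical opcodes, chosen only for this sketch.
OP_SPHERE, OP_BOX, OP_UNION, OP_INTER, OP_DIFF = range(5)


def eval_program(program, point):
    """Evaluate a postfix SDF program at a point using a value stack."""
    x, y, z = point
    stack = []
    i = 0
    while i < len(program):
        op = program[i]
        i += 1
        if op == OP_SPHERE:
            # Operands: center x, y, z and radius.
            cx, cy, cz, r = program[i:i + 4]
            i += 4
            stack.append(math.sqrt((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2) - r)
        elif op == OP_BOX:
            # Operands: center x, y, z and half extents.
            cx, cy, cz, hx, hy, hz = program[i:i + 6]
            i += 6
            qx, qy, qz = abs(x - cx) - hx, abs(y - cy) - hy, abs(z - cz) - hz
            outside = math.sqrt(max(qx, 0.0) ** 2 + max(qy, 0.0) ** 2 + max(qz, 0.0) ** 2)
            inside = min(max(qx, qy, qz), 0.0)
            stack.append(outside + inside)
        else:
            # Set operators pop two distances and push one.
            rhs = stack.pop()
            lhs = stack.pop()
            if op == OP_UNION:
                stack.append(min(lhs, rhs))
            elif op == OP_INTER:
                stack.append(max(lhs, rhs))
            elif op == OP_DIFF:
                stack.append(max(lhs, -rhs))
            else:
                raise ValueError(f"unknown opcode {op}")
    return stack.pop()


# Unit sphere at the origin with a box carved out of its +x side.
program = [
    OP_SPHERE, 0.0, 0.0, 0.0, 1.0,
    OP_BOX, 1.0, 0.0, 0.0, 0.5, 0.5, 0.5,
    OP_DIFF,
]
center = eval_program(program, (0.0, 0.0, 0.0))
carved = eval_program(program, (0.9, 0.0, 0.0))
```

The appeal is that one fixed shader can evaluate any model, so nothing has to compile when the model changes. The cost is a loop and branches per sample, which is why a compiled shader can still be worth swapping in later.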
when i get around to adding occlusion culling this should become quite fast, as a lot of the frame time is burned rendering voxels you can't see. visibility feedback could also be used to prioritize the compiling queue. this also might mean a wysiwyg editor could be possible since the time to render is instant once the octree is solved. lots of exciting stuff.
@jonbro yes :D the root of the tree contains the entire model, which would be too slow to render compiled or otherwise. the octree splits to eliminate dead space, and as it does so each node removes the parts of the CSG tree that can't affect it, resulting in a simpler SDF
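A minimal sketch of that pruning step, assuming every primitive carries an axis-aligned bounding box and that a subtree whose bounds miss the octree cell cannot change where the surface is inside that cell. The tuple layout and the `prune` helper are hypothetical; this is not a transcription of what sdfs.cpp does.

```python
def overlaps(a, b):
    """Axis-aligned boxes given as ((minx, miny, minz), (maxx, maxy, maxz))."""
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))


def prune(node, cell):
    """Return the part of `node` that can matter inside `cell`, or None."""
    op = node[0]
    if op == "leaf":
        # ("leaf", name, bounds)
        return node if overlaps(node[2], cell) else None
    lhs = prune(node[1], cell)
    rhs = prune(node[2], cell)
    if op == "union":
        # A missing operand contributes nothing to a union.
        if lhs is None:
            return rhs
        if rhs is None:
            return lhs
    elif op == "inter":
        # An intersection is empty if either side is empty here.
        if lhs is None or rhs is None:
            return None
    elif op == "diff":
        # Nothing to cut from, or nothing doing the cutting.
        if lhs is None:
            return None
        if rhs is None:
            return lhs
    return (op, lhs, rhs)


a = ("leaf", "a", ((0, 0, 0), (1, 1, 1)))
b = ("leaf", "b", ((5, 5, 5), (6, 6, 6)))
c = ("leaf", "c", ((0.5, 0.5, 0.5), (1.5, 1.5, 1.5)))
tree = ("union", ("diff", a, c), b)

near_origin = prune(tree, ((0, 0, 0), (2, 2, 2)))
far_corner = prune(tree, ((4, 4, 4), (8, 8, 8)))
```

Note this only preserves where the surface is within the cell. Distances to dropped shapes are lost, so a real implementation would likely pad the cell or otherwise account for that.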
@jonbro yes. here's the implementation if you're interested https://github.com/Aeva/tangerine/blob/excelsior/tangerine/sdfs.cpp#L1156
@aeva I guess I can't think of an alternate way to approach :D
this is really cool to see an end to end implementation of this.
@aeva awesome! I'm not sure I'm ready to revive my toy voxel sdf thingy, but these notes are gonna be my starting point if i do.
I gave up at the culling SDF ops stage, so I could never really have complex models :(
@jonbro so far this approach is working quite well for me. the main problem is the distance fields aren't exact after any set operators, so it can't cull as aggressively on the CPU as i would like it to. it also definitely needs clustered occlusion culling. I think this strat has promise though.
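A small 2D illustration of the "not exact after set operators" problem, as a sketch. Intersecting two circles with `max()` gives a value that is only a lower bound on the true distance outside the shape. The brute-force grid search is there just to approximate the true distance for comparison, and the shapes and sample point are arbitrary.

```python
import math


def circle(cx, cy, r):
    """Exact signed distance to a circle."""
    return lambda x, y: math.hypot(x - cx, y - cy) - r


a = circle(-0.5, 0.0, 1.0)
b = circle(0.5, 0.0, 1.0)


def lens(x, y):
    # Intersection via max(): correct sign, but not an exact distance.
    return max(a(x, y), b(x, y))


px, py = 0.0, 2.0
approx = lens(px, py)

# Approximate the true distance: nearest grid point that lies inside the lens.
brute = min(
    math.hypot(px - i / 100, py - j / 100)
    for i in range(-50, 51)
    for j in range(-100, 101)
    if lens(i / 100, j / 100) <= 0.0
)
# approx is roughly 1.06 while brute is roughly 1.14, so max() underestimates.
```

Because the field under-reports how far away the surface is, a test like "is the surface farther than this cell's radius" fails more often than it should, which fits the point about not being able to cull as aggressively on the CPU.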
@aeva I don’t know if it works on any other platform but on macOS you could achieve async shader compilation by using shared contexts — each context could compile one shader at once. ‘Course, some of the compilation was still deferred to see the relevant state at draw time, so you tended to need to draw with each shader and the correct state bound, so it was a huge pain in the butt…
@OneSadCookie i had a go at the shared contexts approach, and it ended up causing a lot of mystery behavior, like long half minute hangs in strange places like timing queries and such. the problem with shared contexts is that when you opt into them, the driver then turns on a ton of hazard tracking and synchronization it normally doesn't have to do, and thus is not as well QA'd as the main stuff.
@OneSadCookie there's also an extension for parallel shader compiling that doesn't work properly on nvidia, and there's also the shader binary trick where you compile it in another process and then load the binary. a common problem to all of these is that opengl likes to recompile shaders it already compiled the first time you use them and every time you change pipeline state for Reasons
@OneSadCookie i'm planning on jettisoning gl for vk because of this, but the going is slow because vk is supremely unpleasant to write for some reason.
@aeva ah yeah, that is all very sucks. And I haven’t written Vk myself but I’ve seen somebody’s setup code and some of the complexity reflected through WGPU, so I can understand the desire to stick with GL for a bit longer!
@aeva You could try compiling to SPIR-V in a separate thread, that way the driver only has to go from bytecode to machine code.
I use shaderc:
It has several options, like optimization level, source language, and target environment (I'm using HLSL + vulkan).
It's shipped as part of the vulkan SDK, so I just LoadLibrary and load its functions to use from C.
There's also glslang:
Also dxc can compile hlsl to dxil or spir-v:
I am not 100% sure, but it looks like it would not be a huge undertaking if you are already using GL3.3 core.
There are some code snippets on the GL wiki linked above, it looks pretty trivial to change the cpu code side.
@lh0xfb i'm targeting 4.2 with some extensions. i don't think i'm using anything particularly exotic
@aeva That should be fine; I think the main requirement is explicit attribute and binding locations, so that everything links together consistently.
Here's a site where someone shared their before / after of converting their GLSL shader over to work with SPIR-V: https://eleni.mutantstargoat.com/hikiko/opengl-spirv/
I think it may also affect your cpu code with regard to relying on GL to perform reflection on your shader, eg. to find the binding id of a uniform. With SPIR-V you'd instead need to know which binding id you want to update.
It's still a hell of a lot simpler than vulkan, where you have to also provide even more info about *everything* and do descriptor set allocation, writes, and lifetime / state management such that descriptors are not modified while the gpu is using them.
I do the laziest/simplest thing possible, where I only have one global descriptor set per frame, I update it once before any rendering is done, and reclaim it a couple frames later.
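The lifetime policy described there can be sketched without any Vulkan calls. This is only the bookkeeping, under the assumption that by the time a slot comes around again the GPU has finished the frame that last used it (real code would wait on a fence to guarantee that). `FrameRing` and `make_resource` are hypothetical names.

```python
class FrameRing:
    """One resource per in-flight frame, reused `frames_in_flight` frames later."""

    def __init__(self, frames_in_flight, make_resource):
        # Allocate everything up front; nothing is created per frame.
        self.slots = [make_resource(i) for i in range(frames_in_flight)]
        self.frame = 0

    def begin_frame(self):
        # Assumption: the GPU is done with whatever last used this slot.
        slot = self.slots[self.frame % len(self.slots)]
        self.frame += 1
        return slot


ring = FrameRing(3, lambda i: {"id": i})
used = [ring.begin_frame()["id"] for _ in range(4)]
```

Writing to the returned slot once at the start of the frame and never touching it again until it cycles back is what keeps descriptors from being modified while the GPU is still reading them.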