Finally taking a bit more time to catch up with all the new instancing options, great work, very convenient.
GLSL is becoming more and more overrated, as @elburz would say.
For anyone interested, attached is a half-comprehensive toe. Half because the GLSL methods only implement translate and rotate, and also I haven’t tried instancing from SOPs or DATs, though I imagine the performance would be similar to CHOPs, since the data comes from CPU memory.
2020_07_28_instancing_perf.toe (10.2 KB)
My bare-bones performance comparison involves going fullscreen with vsync off and looking at the FPS with the different methods.
It seems that if the instancing data is coming from CHOPs, the transform matrices are computed on the CPU and sent to the shader, which I think was always the case (one can see the geo cooking quite a bit).
So manually computing the transform matrices in the shader boosts performance quite a bit (at least when you have a lot of instances of a low-vertex-count object), and performance gets pretty close to instancing from TOPs (minus the CPU cost of animating the CHOP channels).
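For reference, here is a rough sketch of what building the matrix in the shader can look like, using only core GLSL. The function name instanceTransform is hypothetical, and the rotation order (X, then Y, then Z) is an assumption that would need to match whatever order the instancing setup uses; it is not necessarily what the attached toe does.

```glsl
// Hypothetical helper: build a per-instance transform from a translate
// and an Euler rotation given in degrees.
// Assumed rotation order: X first, then Y, then Z (rot = Rz * Ry * Rx).
mat4 instanceTransform(vec3 t, vec3 rDeg)
{
    vec3 r = radians(rDeg);
    vec3 c = cos(r);
    vec3 s = sin(r);

    // GLSL matrix constructors are column-major: each row of three
    // values written below is one column of the matrix.
    mat3 rx = mat3(1.0,  0.0, 0.0,
                   0.0,  c.x, s.x,
                   0.0, -s.x, c.x);
    mat3 ry = mat3(c.y, 0.0, -s.y,
                   0.0, 1.0,  0.0,
                   s.y, 0.0,  c.y);
    mat3 rz = mat3( c.z, s.z, 0.0,
                   -s.z, c.z, 0.0,
                    0.0, 0.0, 1.0);
    mat3 rot = rz * ry * rx;

    // Rotation goes in the upper-left 3x3, translation in the last column.
    return mat4(vec4(rot[0], 0.0),
                vec4(rot[1], 0.0),
                vec4(rot[2], 0.0),
                vec4(t, 1.0));
}
```

In the vertex shader the result would be multiplied with the object-space position before the usual world and projection transforms, so the CPU only has to upload the raw translate and rotate values per instance.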
It seems the old-school way of using a samplerBuffer to send custom attributes to the shader is a bit faster than the new custom instancing attributes, though I’ll probably still use those since they're much more convenient.
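For anyone who hasn't used the samplerBuffer approach, a minimal sketch in core GLSL looks roughly like this. The uniform name uInstanceData and the one-vec4-per-instance layout are assumptions for illustration, and gl_InstanceID is the core built-in (the host application may offer its own instance-index accessor instead).

```glsl
// Hypothetical uniform: a buffer texture holding one vec4 per instance.
uniform samplerBuffer uInstanceData;

vec4 fetchInstanceData()
{
    // texelFetch on a samplerBuffer takes an integer index, no filtering.
    return texelFetch(uInstanceData, gl_InstanceID);
}
```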
Of course, the fastest performance is instancing from TOPs, and in that case it seems the transform matrices are already computed directly in the shader (hurray), and performance is pretty much identical between built-in instancing, sampling textures manually and using the custom attributes.
A couple of questions to conclude:
It seems if using the translate/rotate… instancing params, only the full transform matrix can be accessed in the shader, according to https://docs.derivative.ca/Write_a_GLSL_Material
mat4 TDInstanceMat(); These matrices will contain the entire transform, including TX, TY, TZ, SX, SY, SZ as well as Rotate To.
A while ago, the wiki had :
If you aren't using rotation, then individual arrays for each of TX, TY, TZ, SX, SY and SZ will be sent to the GPU (if they are used). The fewer channels you use, the less space these will take up. vec3 TDInstanceTranslate(); vec3 TDInstanceScale();
I'm guessing that is gone, and if I need separate translate, scale and rotate, I can just use the custom attributes? That is what I did in the shared toe.
Thank you for reading!