This one is a bit confusing. With https://github.com/vinz9/CudaSortTOP I’ve also noticed that when the plugin is loaded as a custom OP rather than through the CPlusPlus TOP, the output format looks like what is set in getOutputFormat() (32-bit mono), but the cooking time (about 4x slower) reflects a different format (32-bit RGBA).
Here is what I’m doing:
bool CudaSortTOP::getOutputFormat(TOP_OutputFormat* format, const OP_Inputs* inputs, void* reserved)
{
    // Request a single-channel (red only) 32-bit float output
    format->redChannel = true;
    format->greenChannel = false;
    format->blueChannel = false;
    format->alphaChannel = false;
    format->bitsPerChannel = 32;
    format->floatPrecision = true;
    return true;
    //return false;
}
See this video: look at the format in the info as well as the cooking time, which goes from 1.3 ms to 6 ms on the GPU when the plugin is loaded as a custom OP.
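For what it's worth, a slowdown of roughly 4x is what I would expect if the texture were really being allocated as 32-bit RGBA instead of 32-bit mono. Here is a quick back-of-the-envelope sketch of that reasoning. FormatSketch and bytesPerPixel are names I made up for illustration (they are not part of the TouchDesigner API), and it assumes the channels are tightly packed and that cook time scales with the amount of pixel data, which may not hold exactly.

```cpp
#include <cstddef>

// Hypothetical stand-in for the fields set in getOutputFormat();
// this is NOT the real TOP_OutputFormat struct from the TouchDesigner headers.
struct FormatSketch {
    bool redChannel;
    bool greenChannel;
    bool blueChannel;
    bool alphaChannel;
    int  bitsPerChannel;
};

// Bytes per pixel, assuming enabled channels are tightly packed with no padding.
constexpr std::size_t bytesPerPixel(const FormatSketch& f) {
    return (static_cast<std::size_t>(f.redChannel) + f.greenChannel +
            f.blueChannel + f.alphaChannel) *
           static_cast<std::size_t>(f.bitsPerChannel) / 8;
}

// The format I request (32-bit mono) versus the one the timing suggests (32-bit RGBA).
constexpr FormatSketch kMono32{true, false, false, false, 32};
constexpr FormatSketch kRgba32{true, true, true, true, 32};

// Checked at compile time: 4 bytes per pixel versus 16 bytes per pixel.
static_assert(bytesPerPixel(kMono32) == 4, "mono 32-bit should be 4 bytes per pixel");
static_assert(bytesPerPixel(kRgba32) == 16, "RGBA 32-bit should be 16 bytes per pixel");
```

So RGBA would carry 4x the data per pixel, which is in the same ballpark as the 1.3 ms to 6 ms jump I am seeing.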
Let me know if you can reproduce, thank you!