Hello, all!
I’m working on an installation and we’re running into a strange issue. In short, Touch is reporting a solid 60 fps, but we’re experiencing an intermittent stutter on output. We have a couple of theories as to why it may be happening (it may be a combination of the two, or something else entirely), and I’d love to get some input from the community to help work through it.
Our setup:
Xeon CPU, Win 10
2x Quadro P6000 (1 attached to UHD display output, 1 to 1080p operator display)
4x Render instances of Touch (2 per GPU)
1x Compositing instance running on GPU 1 (outputting to single UHD display)
Everything is correctly affinitied to the right GPU and assigned the correct NUMA node
Theory 1: Not Full Screen Exclusive
For some reason, the compositing instance does not seem to be going into Full Screen Exclusive mode. The Window Op is set to fill the entirety of the UHD display, and I believe all of the other settings are set correctly to allow FSE. However, the telltale “blink” doesn’t happen when we switch application focus, and we’re seeing this stutter, which leads me to believe it’s not exclusive.
Is there something I’m missing that is preventing enabling FSE? The TD doc page mentions it “may need mosaic,” but is this only the case when you’re attempting to span displays? Is there a problem with having one display in fullscreen while another is displaying the desktop? Is there some way to force FSE? We looked into Win 10’s “Fullscreen optimizations” but enabling / disabling seemed to have no effect.
Theory 2: Render Instances Out of Sync With Compositor
For performance, we’ve offloaded the majority of the rendering to multiple render instances that get composited together just before output in another instance. We have to use the Shared Mem TOP (instead of Spout) to get frames from GPU 0 to GPU 1; however, I’ve seen the stutter happen using Spout as well. All instances are running at 60 fps, but we think that if the render time varies within a frame (or if the point within the frame at which each render instance writes to its shared mem varies), the compositor may occasionally read an old frame, so the same frame gets displayed twice.
Here’s a diagram of what I think’s going on (using the compositor and one render instance):
In this case, the first two frames would draw as expected, and the third would appear to stutter.
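To make the timing argument concrete, here is a rough simulation of one render instance and the compositor, both running at exactly 60 fps. It is only a sketch under assumptions, not a measurement of our system: it assumes the compositor reads shared memory at the very start of each of its frames, and that the render instance publishes at a fixed offset (`phase`) plus some uniform random `jitter`. The function and parameter names are made up for illustration.

```python
import random

FRAME = 1.0 / 60.0  # frame period in seconds at 60 fps


def count_repeats(n_frames, phase, jitter, seed=0):
    """Count compositor frames that show the same render frame as the previous one.

    phase  : assumed nominal offset (seconds, 0 <= phase < FRAME) between the
             compositor's read and the render instance's publish.
    jitter : assumed maximum random variation (seconds, < FRAME / 2) of the
             publish time around that offset.
    """
    rng = random.Random(seed)
    # Time at which render frame k lands in shared memory.
    publish = [k * FRAME + phase + rng.uniform(-jitter, jitter)
               for k in range(n_frames)]

    newest = -1        # index of the newest frame available to the compositor
    last_shown = None  # index the compositor displayed on its previous frame
    repeats = 0
    for i in range(n_frames):
        read_time = i * FRAME  # compositor reads at the start of its frame
        # Advance to the newest frame published at or before the read.
        while newest + 1 < n_frames and publish[newest + 1] <= read_time:
            newest += 1
        # Same real frame as last time means a visible duplicate.
        if last_shown is not None and newest >= 0 and newest == last_shown:
            repeats += 1
        last_shown = newest
    return repeats


if __name__ == "__main__":
    # Publish lands mid-frame, well away from the compositor's read.
    print("mid-frame offset:", count_repeats(1000, 0.5 * FRAME, 0.1 * FRAME))
    # Publish lands right around the compositor's read.
    print("near-zero offset:", count_repeats(1000, 0.0, 0.1 * FRAME))
```

In this model, duplicates only show up when the publish time straddles the compositor's read time, which would explain why both processes can report a solid 60 fps while the output still hitches.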
We’ve tried a number of things: increasing the frame rate of the render nodes, changing CPU priorities (all the way up to realtime), showing / minimizing render instance windows, putting certain windows in the foreground, making them focused / defocused, realtime flag on / off, turning vsync on / off, etc. Nothing has completely solved the problem. We seem to have had the best luck with all instances in realtime priority, drawing all windows (with all visible), with the render instance in focus and with a Hog CHOP to increase compositor frame time, but it may be coincidental (it’s a very intermittent problem).
So, if this is what’s going on, the question becomes: is there a way to sync the TD processes so that the compositor draws only after each render instance has updated its texture?
This has led to a number of questions in our search for a solution:
- Is vsync a good “metronome” to keep the separate processes in sync? Can we keep windows minimized and not drawn while still vsyncing?
- Is there anything like OnPreRender (à la Unity) that would let us wait until the end of the frame to get frames / composite?
- Does Shared Mem Out “publish” its texture on cook or is it just maintaining a reference to a texture? If it’s on cook, would OnFrameStart or End be a way of reliably scheduling the texture update?
- Is there any way to get when Shared Mem In has a new texture? The Info CHOP doesn’t seem to have any relevant outputs.
- Would Sync In / Out CHOPs help? Sync Out from the compositor, when it’s no longer waiting for any clients via sync_external, then render?
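One way to confirm Theory 2 before trying to fix it would be to have each render instance send a frame counter along with its texture and log what the compositor sees on each of its frames. The sketch below only covers the analysis side. How the counter travels (a separate channel, a value encoded in a pixel, etc.) is left open and is an assumption here, and `classify_frames` is a hypothetical helper name, not a TouchDesigner function.

```python
def classify_frames(counters):
    """Label each compositor frame from the render-instance counter it saw.

    counters: list of ints, one per compositor frame, holding the frame
              counter received from a render instance on that frame
              (the transport for this counter is assumed, not shown).
    """
    events = []
    prev = None
    for c in counters:
        if prev is None:
            events.append("first")   # nothing to compare against yet
        elif c == prev:
            events.append("repeat")  # same source frame shown twice
        elif c == prev + 1:
            events.append("ok")      # exactly one new source frame
        else:
            events.append("skip")    # counter jumped (forward or backward)
        prev = c
    return events


if __name__ == "__main__":
    # Made-up counter log: frame 11 is read twice, frame 12 is never shown.
    print(classify_frames([10, 11, 11, 13]))
```

If the logged repeats line up with the visible hitches, that points at the sync theory rather than at Full Screen Exclusive.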
If you’ve made it to the bottom of this post, thank you! I’d love thoughts on these theories or any others you might have!
P.S. I found some related posts that may be relevant (thoughts?):
This one is a little old and doesn’t quite match our symptoms (also, we can’t currently afford to run at 120 Hz).
This one mentions issues sharing across GPUs, but for performance reasons. We’ve found both shared mem and NDI performant and stable, aside from this occasional frame hitch.
Thanks again!