Advice for Direct Display Out TOP with multi-GPU setup

We would like to use the Direct Display Out TOP for our production setup, in which each card has its own Mosaic configuration and TouchDesigner affinity instance. However, we’re running into the following issues:

  1. When all displays are detached from a card, the -gpuformonitor flag no longer works, as there are no displays associated with that card visible to Windows. We are currently planning to work around this by balancing our display outputs so that we can attach a dummy EDID emulator, but this isn’t ideal. Generally, it would be helpful if there were a way to start TD with the PCI bus index supplied by the ddisplay utility (see the launch sketch after this list). This is also a frustration I’ve run into when configuring CUDA devices in PyTorch in TD.
  2. I’m unsure of the best way to run in perform mode when there are no monitors attached, or, more generally, what the recommended approach is for running this configuration in production.
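
For reference on item 1, launching the per-GPU affinity instances currently looks something like the sketch below; the \\.\DISPLAYn names and project paths are illustrative, and the exact argument format -gpuformonitor expects is an assumption here:

```python
import subprocess

# Sketch: one affinity instance per GPU, each pinned to its GPU via a
# monitor attached to that card. Display names, paths, and the exact
# -gpuformonitor argument format are all illustrative assumptions.
TD = r'C:\Program Files\Derivative\TouchDesigner\bin\TouchDesigner.exe'

for monitor, project in [
    (r'\\.\DISPLAY1', r'C:\projects\out_a.toe'),
    (r'\\.\DISPLAY3', r'C:\projects\out_b.toe'),
]:
    subprocess.Popen([TD, '-gpuformonitor', monitor, project])
```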

We recognize this technology isn’t fully mature yet, but are really excited about the possibility of using it in production as it will allow us to avoid some aspects of interfacing with an external A/V team.

Thanks!

  1. There currently isn’t any way to target a GPU that has no monitor connected to it on the desktop, since a connected display is what lets a specific GPU be identified. I’m looking into ways to do this. I don’t actually see a bus ID in ddisplay’s output. What value are you looking at? The ‘OS AdapterId’ is the same as the deviceLUID that DX/Vulkan use; the problem with using that value is that it can change on system reboot.
    CUDA and Vulkan do have a concept of a bus ID, which we could use. It takes the form [domain]:[bus]:[device].[function]. The problem is that not all GPUs report it (non-NVIDIA in particular), so it isn’t usable as a general way to target a GPU. But it may be the best solution for this.

  2. Not sure what is best here yet, since this is all new to us too. I think a perform window with ‘Draw Window’ turned off on the Window COMP is likely best; that ensures no windowing work is done on the GPU where that window resides, since that isn’t the GPU the process is driving. (A sketch of that setup follows below.)
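
A minimal sketch of that setup, assuming a Window COMP at /perform and that the ‘Draw Window’ toggle maps to the Python parameter name drawwindow (both names are assumptions):

```python
# Run inside the affinity instance. Assumes the perform Window COMP is
# at /perform and that 'Draw Window' has the Python name 'drawwindow';
# both are assumptions here.
w = op('/perform')
w.par.drawwindow = False  # do no actual drawing for this window
w.par.winopen.pulse()     # open the (non-drawing) perform window
```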

Thanks for your feedback and testing this!

Yes, my mistake. ddisplay doesn’t appear to expose the necessary information, but nvidia-smi does. Here’s what it looks like on my machine:

name                            pci.bus_id        pci.device_id
NVIDIA RTX 4000 Ada Generation  00000000:02:00.0  0x27B210DE
NVIDIA RTX 4000 Ada Generation  00000000:03:00.0  0x27B210DE
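
For reference, that table comes from a query along these lines (the --query-gpu fields are standard nvidia-smi options; wrapping the call in subprocess makes it easy to run from TD’s Python):

```python
import subprocess

# List each GPU's name, PCI bus ID, and PCI device ID via nvidia-smi.
out = subprocess.check_output(
    ['nvidia-smi',
     '--query-gpu=name,pci.bus_id,pci.device_id',
     '--format=csv,noheader'],
    text=True,
)
for line in out.strip().splitlines():
    name, bus_id, device_id = (field.strip() for field in line.split(','))
    print(name, bus_id, device_id)
```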

And Vulkan:

[two screenshots of the Vulkan device listing, captured 2024-12-17]

PyTorch does not have a direct way to list the bus index; we’ve been starting TD with os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID' to work around this (sketch below).
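
Concretely, the workaround looks like this; the one subtlety is that the environment variable has to be set before torch initializes CUDA:

```python
import os

# CUDA_DEVICE_ORDER must be set before CUDA initializes (i.e. before
# importing torch). With PCI_BUS_ID, cuda:0, cuda:1, ... follow PCI bus
# order instead of CUDA's default fastest-first ordering.
os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'

import torch

# Device indices now line up with the bus IDs from nvidia-smi above.
for i in range(torch.cuda.device_count()):
    print(f'cuda:{i} ->', torch.cuda.get_device_name(i))
```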

I hadn’t even considered AMD, which obviously makes things much more complicated… We’d be happy with anything here, even if it’s just the enumeration index as reported by Vulkan, since that’s effectively what we’re already using for CUDA.

Here is a new build to try out:
https://www.dropbox.com/scl/fi/sqpelllqmp0tl6ubonem2/TouchDesigner.2023.12127.exe?rlkey=7d9s8ppk9z3phpjuayvb0zdql&dl=0

This adds a new -gpubusid startup option; you can see the expected format for the bus ID in the Monitors DAT (rightmost column).
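
As a sketch, startup could look something like this (the bus IDs are the two from your nvidia-smi output; the install path and the exact argument form here are illustrative):

```python
import subprocess

# One instance per GPU, each pinned by PCI bus ID as shown in the
# Monitors DAT. The path and argument form are illustrative.
TD = r'C:\Program Files\Derivative\TouchDesigner\bin\TouchDesigner.exe'

subprocess.Popen([TD, '-gpubusid', '00000000:02:00.0', r'C:\projects\out_a.toe'])
subprocess.Popen([TD, '-gpubusid', '00000000:03:00.0', r'C:\projects\out_b.toe'])
```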

I also fixed a bug that was triggering the node’s error reporting too easily and putting it into an error state.

Thanks, Malcolm. I’ve been running this build for the past few days, and the -gpubusid solution is very helpful. I’d also noticed the spurious error on the node and can confirm that it appears to be fixed. I’ve implemented your suggestion re: perform mode, and overall I’m quite pleased with how well this setup is working!