Nvidia Background TOP full body?

jeffcrouse · September 8, 2021, 7:10pm

I would love to use the NVIDIA background TOP instead of a Kinect for background removal, but all of the videos I’ve seen suggest that it’s designed mostly for people sitting at desks. Does anyone know if it (NVIDIA Maxine Video Effects, I suppose) is capable of segmenting a full human? I don’t have an RTX card to test with, and I’d like to understand the capabilities before I drop $2k+.

This post would seem to suggest that it wasn’t possible 5 months ago: https://www.nvidia.com/en-us/geforce/forums/broadcasting/18/424938/add-support-for-full-body-tracking-background-remo/ But maybe something has changed?

Even if it doesn’t work out of the box, would it be possible to use a different model?

thanks in advance!

raganmd · September 8, 2021, 9:51pm

Full body segmentation does work, but I wouldn’t say that it’s a depth replacement just yet. Here are some examples pulled from stills, which perform differently than video. Part of the challenge here is also the model that you’re working with. There are probably some models built off of standing figures that might yield better results here.

Standing bodies:

People at desks:

The other reminder here is that the output from the NVIDIA background TOP is several frames behind realtime, so you need to use a cache to delay your feed by 3-6 frames to get a clean alignment between key and video. That’s a little different than working with a depth sensor, and if it’s a deal breaker it’s worth knowing about.
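
To make the delay idea concrete, here is a minimal Python sketch of a fixed frame delay. It is not TouchDesigner code: the `FrameDelay` class and its `push` method are hypothetical names made up for illustration, and inside TouchDesigner you would do this with a cache on the video feed rather than a script. The 3-frame delay is just one value from the 3-6 range above and would need tuning by eye.

```python
from collections import deque


class FrameDelay:
    """Hold a stream back by a fixed number of frames (illustrative helper)."""

    def __init__(self, delay_frames):
        if delay_frames < 0:
            raise ValueError("delay_frames must be >= 0")
        self.delay_frames = delay_frames
        # Keep the current frame plus `delay_frames` older ones.
        self._buffer = deque(maxlen=delay_frames + 1)

    def push(self, frame):
        """Store the newest frame and return the one from `delay_frames` ago.

        Returns None until enough frames have been buffered.
        """
        self._buffer.append(frame)
        if len(self._buffer) <= self.delay_frames:
            return None
        return self._buffer[0]


# Assumed scenario: the key (matte) for video frame N arrives 3 frames late,
# so the video is delayed by 3 frames before the two are combined.
video_delay = FrameDelay(3)
pairs = []
for n in range(6):
    video_frame = f"video{n}"
    key_frame = f"key{n - 3}" if n >= 3 else None  # stand-in for the late key
    delayed_video = video_delay.push(video_frame)
    if delayed_video is not None and key_frame is not None:
        pairs.append((delayed_video, key_frame))
```

Each entry in `pairs` matches a video frame with the key computed from that same frame, which is the alignment the cache is there to restore. If the key lags by a different amount, only the number passed to `FrameDelay` changes.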

jeffcrouse · September 9, 2021, 3:54pm

Thanks, @raganmd. I totally agree that it doesn’t seem ready for prime time yet. Still exciting though.