I have 7 cameras lined up one after the other, and each camera is connected to a different PC.
I am using the Nvidia Body Track CHOP to get the body tracking information of the users.
I have one more PC that receives the information from all 7 cameras.
I show a red box on each user detected by a camera. I want to know how I can sync all the body tracking information so that if a user starts walking in front of camera 1 and ends up at camera 7, the red box follows the user.
The Nvidia Body Track CHOP uses any TOP as its input to detect and track skeletons. So, as I see it, there are 2 ways you can do it:
Try to blend & warp the camera views into one big picture and see if it gets you there and the track is stable enough. Imho, this is the most stable way to do it.
Try to match the start and end of the detection views (I assume bounding boxes) for each view and switch views as soon as a person is detected, provided only 1 person is tracked at a time. When multiple people are tracked, it becomes a tricky business indeed.
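A rough sketch of that hand-off idea in Python, assuming each view reports a detection flag plus a normalized bounding-box center; all names, thresholds and the overlap rule here are hypothetical, not anything built into the Body Track CHOP:

```python
# Minimal hand-off sketch (hypothetical names): pick which camera "owns" the
# user as they walk past 7 side-by-side views. Assumes each view reports
# whether it currently detects a person and the bounding-box center x
# in normalized 0..1 coordinates within that view.

NUM_CAMS = 7

def pick_active_camera(detections):
    """detections: list of (is_tracking, center_x) per camera, left to right.
    Returns (camera_index, global_x in 0..1 across the whole wall)."""
    candidates = [(i, x) for i, (tracking, x) in enumerate(detections) if tracking]
    if not candidates:
        return None, None
    # In the overlap zone two cameras may fire at once; prefer the one whose
    # detection is closest to its own image center (most reliable view).
    cam, x = min(candidates, key=lambda c: abs(c[1] - 0.5))
    global_x = (cam + x) / NUM_CAMS  # stitch 7 views onto one 0..1 axis
    return cam, global_x

# Example: person leaving camera 2 (x near right edge) and entering camera 3.
dets = [(False, 0)] * NUM_CAMS
dets[2] = (True, 0.95)
dets[3] = (True, 0.10)
print(pick_active_camera(dets))  # camera 3 wins; global_x is roughly 0.44
```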
This can be used for real-time video stitching. Take a snapshot of the inputs from all the cameras, send it through PTGui to create a .pts file, and then load the .pts file. You can then feed the same cameras into the component and a live stitch will be performed.
I think it could be a way worth exploring, probably with more panoramic settings.
I have tried the Nvidia Body Track CHOP; maybe I am wrong, but it only tracks 8 users max.
I used the Fit CHOP to combine all my camera views and tested it, but it only tracked 8 users.
I still don't understand what you are trying to achieve:
How many people do you want to track?
Is consistency something you're after or not?
You can use other methods like MediaPipe, which will track as many people as it can see (as far as I know). You need to test it with extra-large images, though.
Have you considered another approach? Something like a 4° lens camera with an ultrawide view?
If the results of your body tracking + Fit CHOP are satisfying for your installation and you just need to track more people, you can try MediaPipe. I don't see exactly how it can work, though, because of the overlap: the same person can be far from the camera in two or even more views, depending on the space geometry.
I just went into the Body Track CHOP, though, and you can actually set how many people you want to track, plus there are a couple of settings to help filter out unwanted results.
I have a room with a long wall and 7 cameras on the wall to detect users. There will be up to 50 people in the room at a time, so I want to detect them and attach an effect to each user.
Consistency is important.
I have Orbbec cameras right now, and also the ZED 2.
I have tested the Body Track CHOP; it does mention that it can track more, but it only tracks 8. I tested it out.
The issue with MediaPipe is that it does not assign an ID to the object being detected, so it's difficult to track; it only creates a box around a user.
Not sure what Body Track's limitation is. The graphics card? I never wanted to track more than a couple of people at a time. Perhaps the camera position? If there are a lot of people one behind another (like at a concert), maybe it cannot figure out that each is a person?
The Body Track CHOP is internally clamped at 8 people based on current limits in the Nvidia Maxine SDK.
It’s not as precise as skeletal body tracking, but I’ve seen users merge depth data from multiple cameras into a single point cloud and then use a top-down blob track on the whole scene to identify people.
I'm curious how you'd start to merge multiple cameras into a point cloud. I'm interested in doing this for a current project but don't quite know where to start with the merging idea.
If the cameras can output a point cloud image like the Kinect, Orbbec, etc., then you can use a Point Transform TOP to align the points from one camera into the space of the second camera. You can then use the pointMerge component from the palette, or just a Layout TOP, to combine both point cloud images into a single image.
If your camera can only produce a depth map, then you can use the depthProjection component from the palette to convert it to a point cloud image.
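For anyone curious what those two steps amount to mathematically, here is a minimal NumPy sketch of depth-map back-projection plus a rigid transform into a shared space; the intrinsics and camera offsets are made-up placeholders, and the components above do all of this for you on the GPU:

```python
# Sketch of the math behind depthProjection + Point Transform, in NumPy.
# The intrinsics (fx, fy, cx, cy) and the 4x4 camera-to-world matrices below
# are made-up placeholders; real values come from your camera calibration.
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an HxW depth map (meters) to an Nx3 point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def transform_points(points, cam_to_world):
    """Apply a 4x4 rigid transform so both cameras share one world space."""
    homo = np.hstack([points, np.ones((len(points), 1))])
    return (homo @ cam_to_world.T)[:, :3]

# Merge two cameras: transform each cloud into the shared space, concatenate.
depth_a = np.full((480, 640), 2.0)   # placeholder depth maps
depth_b = np.full((480, 640), 2.5)
A = np.eye(4)                        # camera A defines the world space
B = np.eye(4); B[0, 3] = 1.5         # camera B sits 1.5 m to the right
cloud = np.vstack([
    transform_points(depth_to_points(depth_a, 525, 525, 320, 240), A),
    transform_points(depth_to_points(depth_b, 525, 525, 320, 240), B),
])
print(cloud.shape)  # (614400, 3): one merged cloud
```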
This is a technique I’ve used quite a bit. I use the Point Transform TOP to align my cameras and then I use my PointCloudClipper tool to clip off any walls/floor/things I don’t want to track.
You can get that here:
You can then render that point cloud at a fairly low resolution and do blob tracking on it from any direction you wish.
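A minimal sketch of that last blob-tracking step, assuming the top-down render reaches Python as an 8-bit grayscale NumPy array (e.g. via a Script TOP's numpyArray()); the threshold and minimum-area values are placeholders to tune per scene:

```python
# Sketch of top-down blob detection on a low-res occupancy image.
import cv2
import numpy as np

def find_people(occupancy, min_area=20):
    """occupancy: 8-bit grayscale top-down render. Returns blob centroids."""
    _, mask = cv2.threshold(occupancy, 40, 255, cv2.THRESH_BINARY)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
    people = []
    for i in range(1, n):  # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            people.append(tuple(centroids[i]))
    return people

# Fake frame with two "people" blobs for demonstration.
frame = np.zeros((120, 320), np.uint8)
cv2.circle(frame, (60, 50), 6, 255, -1)
cv2.circle(frame, (250, 70), 6, 255, -1)
print(find_people(frame))  # two centroids, roughly (60, 50) and (250, 70)
```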
Ah, that all makes sense. But the Nvidia Body Track CHOP outputs positional body data, so I don't think either of these options works. I am personally just using a USB webcam.
The Nvidia body tracking only works on a single image and I’m not sure if there is any good way to combine multiple camera sources into a single workspace.
In theory you could do a 3D transform on the skeleton points using a Transform XYZ CHOP so that the skeletons from 2 different cameras are in the same world space. But I don't know if the Nvidia 3D projection is reliable enough for that to be consistent.
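In case it helps, a small NumPy sketch of that idea, done outside the Transform XYZ CHOP; the camera offset, angle and joint values are made up for illustration:

```python
# Sketch of the "same world space" idea: apply each camera's extrinsics to
# its skeleton joints so two Body Track skeletons line up. The joint data
# and camera placement below are made-up placeholders.
import numpy as np

def camera_to_world(tx, ty, tz, yaw_deg):
    """4x4 transform for a camera translated by (tx,ty,tz), rotated by yaw."""
    a = np.radians(yaw_deg)
    m = np.eye(4)
    m[0, 0], m[0, 2] = np.cos(a), np.sin(a)
    m[2, 0], m[2, 2] = -np.sin(a), np.cos(a)
    m[:3, 3] = (tx, ty, tz)
    return m

# Camera 2 sits 5 m to the right of camera 1, angled 10 degrees inward.
cam2 = camera_to_world(5.0, 0.0, 0.0, -10.0)

# One joint (e.g. a head position) reported by camera 2, in its local space.
joint_local = np.array([0.2, 1.6, 2.0, 1.0])
joint_world = cam2 @ joint_local
print(joint_world[:3])  # the same joint expressed in camera 1's world space
```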
I know people have been interested in a body tracking solution from a merged point cloud from multiple sources, but I’m not aware of any options for that right now.
The Stereolabs SDK has a “Fusion” module which seems to allow multi-camera body tracking. It's not supported in TD, but you might be able to set this up in external Python and stream the skeletons to TD.
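A minimal sketch of the external-Python side of that, assuming a plain UDP/JSON stream that a UDP In DAT in TD can pick up; get_fused_bodies() is a hypothetical stand-in for the actual Stereolabs Fusion calls, which live in the ZED SDK samples:

```python
# Sketch of streaming skeletons from an external Python process into TD.
# get_fused_bodies() is a placeholder for whatever the Stereolabs Fusion
# API returns in your SDK version. TD can receive these packets with a
# UDP In DAT and parse the JSON in its callbacks DAT.
import json
import socket
import time

TD_ADDR = ("127.0.0.1", 7000)  # match the port on your UDP In DAT
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def get_fused_bodies():
    """Placeholder: return [{'id': int, 'joints': [[x, y, z], ...]}, ...]."""
    return [{"id": 1, "joints": [[0.0, 1.6, 2.0]]}]

while True:
    bodies = get_fused_bodies()
    sock.sendto(json.dumps({"bodies": bodies}).encode(), TD_ADDR)
    time.sleep(1 / 30)  # roughly 30 packets per second
```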
I haven't worked with the ZED cam much myself, but I will ask the team about this. We're actually working on a few small updates to the ZED TOP right now, so it may be a decent time to at least check out what is involved with this.
The ZED cam needs some serious updates haha, amazing that the team is working on that.
For now I came up with a solution: I got the 7 camera feeds into TouchDesigner, merged them all into one video feed, then passed the video into OpenCV and did person detection.
In my use case I didn't need skeleton information, just user position, which I got from OpenCV.
Using the positions of the users, I was able to place an effect where each user was standing.
I am also getting the approximate size of each user, so I am using that to adjust the size of my effect.
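For anyone wanting to try the same route, here is one way that OpenCV step could look; the post doesn't say which detector was used, so this sketch uses OpenCV's built-in HOG people detector, and 'merged.mp4' stands in for the stitched feed:

```python
# One way to do the person detection described above: OpenCV's built-in
# HOG + SVM people detector. Each detection rectangle gives both a user
# position (box center) and an approximate size, as used for the effect.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture("merged.mp4")  # placeholder for the stitched feed
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rects, weights = hog.detectMultiScale(frame, winStride=(8, 8), scale=1.05)
    for (x, y, w, h) in rects:
        cx, cy = x + w // 2, y + h // 2   # user position; (w, h) = size
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)
    cv2.imshow("detections", frame)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break
cap.release()
```

A DNN-based detector would likely hold up better with 50 people in frame, but the position-plus-size output is the same idea.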
@shahanhyperspace I was going to try this option, and it's good to know it's working. My setup is a long walkway, 6 m × 30 m, so I was thinking of placing 6 cameras every 5 meters in a top-down setup. I can stitch the 7 cameras and create one long video, but my concern is: can Body Track or OpenCV detect people from a top view? Since it doesn't have a good front view of the person from a top-down angle.