Hi,
I am asking here since I have not dabbled much with the Azure Kinect in TD. Basically, I'm trying to mesh four Azure Kinects around a room in order to track several people at once from different directions. The goal is to infer each person's gaze direction from their head position and rotation data, then cast a vector from the head position towards the walls to calculate the area a person is looking at at a particular time. If there is more than one person, I also plan to check whether their gaze vectors intersect. Mainly, I want to know the rough world-space coordinates (area) of where each person is looking in the room, without the use of eye trackers. The geometry I have in mind looks roughly like the sketch below.
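For context, here is a rough Python sketch of the two bits of geometry involved: intersecting a gaze ray with a wall, and finding the closest approach between two people's gaze rays. All the names are placeholders, and I'm treating each wall as an infinite plane for simplicity:

```python
# Rough sketch of the gaze-ray geometry (plain Python, no TD specifics).
# Vectors are (x, y, z) tuples; walls are modelled as infinite planes.

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def ray_plane_hit(origin, direction, plane_point, plane_normal):
    """Return the point where the gaze ray hits the wall plane,
    or None if the ray is parallel to the wall or points away from it."""
    denom = dot(plane_normal, direction)
    if abs(denom) < 1e-6:
        return None  # ray runs parallel to the wall
    to_plane = (plane_point[0] - origin[0],
                plane_point[1] - origin[1],
                plane_point[2] - origin[2])
    t = dot(plane_normal, to_plane) / denom
    if t < 0:
        return None  # wall is behind the person
    return (origin[0] + t * direction[0],
            origin[1] + t * direction[1],
            origin[2] + t * direction[2])

def closest_approach(p1, d1, p2, d2):
    """Closest points between two gaze rays; if the distance between
    them is small, two people are looking at roughly the same spot."""
    w0 = (p1[0] - p2[0], p1[1] - p2[1], p1[2] - p2[2])
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-6:
        return None  # rays are (nearly) parallel
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    c1 = (p1[0] + s * d1[0], p1[1] + s * d1[1], p1[2] + s * d1[2])
    c2 = (p2[0] + t * d2[0], p2[1] + t * d2[1], p2[2] + t * d2[2])
    return c1, c2
```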
Before I go deeper into the problem, I was wondering what the best approach would be. Do I even need this many Kinect Azures? If so, would you recommend merging them into one view using Azure Pointcloud Merger | Derivative and working on that, or should I work on each Kinect separately?
The idea is to extract the quaternion for the head joint, convert it to a direction vector, and then use the line tool to visualise the vector (a rough sketch of that step is below). My further question then is: am I overcomplicating things? Am I missing some out-of-the-box implementation that I can use? Do I even need the depth information for such a problem, or would just using the colour camera with, say, MediaPipe head detection be enough?
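This is roughly what I had in mind for the quaternion-to-vector step. It's only a sketch: the operator and channel names (`kinectazure1`, `p1/head:qw`, etc.) are guesses at what the Kinect Azure CHOP outputs in my network, and which local axis counts as "forward" for the head joint depends on the body-tracking SDK's joint-orientation convention, so I'd expect to have to experiment:

```python
def quat_rotate(q, v):
    """Rotate vector v = (x, y, z) by unit quaternion q = (w, x, y, z),
    using v' = v + w*t + cross(q.xyz, t) where t = 2*cross(q.xyz, v)."""
    w, qx, qy, qz = q
    vx, vy, vz = v
    tx = 2.0 * (qy * vz - qz * vy)
    ty = 2.0 * (qz * vx - qx * vz)
    tz = 2.0 * (qx * vy - qy * vx)
    return (
        vx + w * tx + (qy * tz - qz * ty),
        vy + w * ty + (qz * tx - qx * tz),
        vz + w * tz + (qx * ty - qy * tx),
    )

# Which local axis is 'forward' for the head joint depends on the
# joint-orientation convention -- try (0, 0, 1) or (-1, 0, 0) and
# visualise the result with the line tool to confirm.
FORWARD = (0.0, 0.0, 1.0)

kinect = op('kinectazure1')  # placeholder operator name
q = (
    kinect['p1/head:qw'].eval(),  # assumed channel names
    kinect['p1/head:qx'].eval(),
    kinect['p1/head:qy'].eval(),
    kinect['p1/head:qz'].eval(),
)
gaze_dir = quat_rotate(q, FORWARD)
```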