Gaze direction with Kinect Azure

Hi,
I am asking here since I have not dabbled much with the Azure Kinect in TD. Basically, I'm trying to combine 4 Azure Kinects around a room in order to track several people at once from different directions. The goal is to infer each person's gaze direction from their head position and rotation data, then cast a ray from the head position towards the walls to work out the area a person is looking at at a particular time. I am also planning to check for intersections between gaze vectors (if there is more than one person). Mainly I want to know the rough world-space coordinates (area) of where each person is looking in the room (without the use of eye trackers).
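For reference, the geometry I have in mind is just a ray-plane intersection (head to wall) plus a closest-point test between two rays (do two gazes roughly cross). A rough sketch in Python/numpy, where the wall plane and the head/gaze values are placeholder numbers:

```python
import numpy as np

def ray_plane_hit(origin, direction, plane_point, plane_normal):
    """Intersect a gaze ray with a wall plane; returns the hit point or None."""
    direction = direction / np.linalg.norm(direction)
    denom = np.dot(plane_normal, direction)
    if abs(denom) < 1e-6:          # ray is parallel to the wall
        return None
    t = np.dot(plane_normal, plane_point - origin) / denom
    return origin + t * direction if t > 0 else None   # only count hits in front of the head

def closest_points_between_rays(o1, d1, o2, d2):
    """Closest points on two gaze rays; their distance tells you whether two gazes 'cross'."""
    d1, d2 = d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)
    w0 = o1 - o2
    a, b, c = np.dot(d1, d1), np.dot(d1, d2), np.dot(d2, d2)
    d, e = np.dot(d1, w0), np.dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-6:          # rays are (nearly) parallel
        return None
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    return o1 + s * d1, o2 + t * d2

# example: a head at 1.6 m looking roughly at the +Z wall 3 m away
head = np.array([0.0, 1.6, 0.0])
gaze = np.array([0.1, -0.05, 1.0])
hit = ray_plane_hit(head, gaze,
                    plane_point=np.array([0.0, 0.0, 3.0]),
                    plane_normal=np.array([0.0, 0.0, -1.0]))
```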

Before I go deeper into the problem, I was wondering what the best approach would be? Do I even need so many Kinect Azures? If so, would you recommend merging everything into one view using Azure Pointcloud Merger | Derivative and then working on that - or should I work on each Kinect separately?

The idea is to extract the quaternion for the head joint and then convert it to a direction vector. From there, use the line tool to visualise the vector. My further question then is - is my approach overcomplicating things? Am I missing some out-of-the-box implementation that I can use? Do I even need the depth information for such a problem - or would just using the colour camera with, say, MediaPipe head detection be enough?
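For the quaternion part, this is roughly what I mean by converting the head rotation into a direction vector (plain numpy sketch; which local axis counts as "forward" for the head joint, and the exact CHOP channel names to read qx/qy/qz/qw from, are assumptions I'd still need to verify against the Kinect Azure CHOP output):

```python
import numpy as np

def quat_to_forward(qx, qy, qz, qw, local_forward=(0.0, 0.0, 1.0)):
    """Rotate a chosen local 'forward' axis by the head quaternion to get a world-space gaze direction."""
    x, y, z, w = qx, qy, qz, qw
    # standard quaternion -> rotation matrix (assumes a unit quaternion)
    R = np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])
    v = R @ np.array(local_forward, dtype=float)
    return v / np.linalg.norm(v)

# in TD this would be fed from the head joint's rotation channels on the Kinect Azure CHOP
gaze_dir = quat_to_forward(0.0, 0.0, 0.0, 1.0)
```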

Other people may have some different suggestions, but regarding the Kinect approach: this should work in general, but just note that there isn't really a good way of automatically matching up users between multiple Kinect Azure cameras. All of the body tracking is done separately, so user 1 on one camera might be user 2 on a different camera. I've seen users merge the point clouds together and then do blob detection on the results, but this won't work if you need skeleton data. I'm not sure if there's a good way to infer direction from just the point cloud blob.
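If you do end up needing to correlate skeletons across cameras, one rough workaround (not a built-in feature, and only a heuristic) is to transform each camera's head positions into a shared world space after calibrating/aligning the cameras and then match users by proximity - something like this sketch:

```python
import numpy as np
from itertools import product

def match_users_across_cameras(heads_a, heads_b, max_dist=0.3):
    """Greedy nearest-neighbour matching of users from two cameras by head position.

    heads_a / heads_b: dicts of {user_id: np.array([x, y, z])}, already transformed
    into a shared world space. Returns (id_a, id_b) pairs within max_dist metres.
    """
    pairs = sorted(
        ((np.linalg.norm(pa - pb), ia, ib)
         for (ia, pa), (ib, pb) in product(heads_a.items(), heads_b.items())),
        key=lambda p: p[0],
    )
    matched, used_a, used_b = [], set(), set()
    for dist, ia, ib in pairs:
        if dist <= max_dist and ia not in used_a and ib not in used_b:
            matched.append((ia, ib))
            used_a.add(ia)
            used_b.add(ib)
    return matched
```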