TD Kinect Tutorials and Best Practices?

Hi folks. Some time ago I created a Kinect-driven TD installation for an art show, and I have been asked to revisit it. In the end it worked, but it was not as polished as I would like. Are there any good tutorials on creating Kinect-driven experiences in TD? Any best practices? I’m looking at creating an interface, switching between multiple interfaces, painting with gestures, etc.

Some of the problems I had were…

Having the on-screen cursor match where the user was pointing. I was using joint positions (the vertical and horizontal position of the hand), but I believe I would need to do some vector math to determine where the cursor should be. I’m not familiar with this math; are there any tutorials?

Switching from one interface screen to the next. Once the user selected artwork on the first screen, selecting something on the second screen would actually select a different piece of art on the first screen. I believe I had a layering problem where one screen was layered over the other and the buttons from the first layer were still active. I got around this by disabling the first layer, but I suspect there is a much better solution.

The experience often tracked a person in the background instead of the person trying to interact with it. Sometimes that person would walk away and a new person would come in, but the experience would keep tracking the original person as long as he or she was in the field of view. I think the way around this would be to set up detection zones, but maybe there is a better approach?

Any help would be greatly appreciated. :slight_smile:

You could calculate the orientation of the forearm and the intersection of a ray cast along that vector with the screen, but it is probably easier to just scale the hand translation to a range that comfortably covers the screen. If you don’t want people to have to walk around to reach the sides of the screen, calculate the hand translation relative to the body (scaled with some leeway so people with shorter arms can reach the edges of the frame too).
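The "hand translation relative to the body" idea can be sketched in plain Python. The function name, the reach ranges, and the screen resolution below are all assumptions to tune; in TD you would feed it the hand and torso channels from the Kinect CHOP.

```python
def hand_to_cursor(hand_x, hand_y, torso_x, torso_y,
                   reach_x=0.4, reach_y=0.3,
                   screen_w=1920, screen_h=1080):
    """Map a hand position (Kinect space, metres) to screen pixels,
    relative to the torso so the user can stand anywhere in frame.
    reach_x/reach_y are assumed comfortable arm-reach half-ranges."""
    # Offset of hand from torso, normalized to -1..1 over the reach range
    nx = max(-1.0, min(1.0, (hand_x - torso_x) / reach_x))
    ny = max(-1.0, min(1.0, (hand_y - torso_y) / reach_y))
    # Map -1..1 to pixel coordinates (y flipped: up in space = up on screen)
    px = (nx + 1.0) / 2.0 * screen_w
    py = (1.0 - (ny + 1.0) / 2.0) * screen_h
    return px, py
```

Clamping to the reach range means a fully extended arm always lands exactly on the screen edge, which gives shorter-armed users the leeway mentioned above.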

Not enough information on your multi-screen interface, but in general it makes sense to have some sort of enable/disable logic. Are you using Button COMPs (or similar TD GUI objects), or just a visual representation of interactive regions on screen?
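The enable/disable logic can be as simple as making screen activation exclusive, so a covered layer's buttons can never be hit. A minimal stand-in sketch (in TD the enabled flags would map to each container's panel enable parameter; the class and names here are assumptions):

```python
class ScreenManager:
    """Minimal sketch of exclusive screen enabling: only the current
    screen's interactive layer accepts input, so buttons on a layer
    underneath can't be clicked through."""

    def __init__(self, names):
        # One enabled flag per screen, all off until one is shown
        self.enabled = {n: False for n in names}
        self.current = None

    def show(self, name):
        # Enabling is exclusive: showing one screen disables all others
        for n in self.enabled:
            self.enabled[n] = (n == name)
        self.current = name
```

Because every transition goes through `show()`, there is no path that leaves two layers interactive at once, which is the bug described in the original post.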

Crowded rooms are usually tough for the Kinect, especially if you need to pay attention to only a single user. Detection zones seem like a good idea if it is obvious to users where the zone is. You could also pick whoever is closest to the screen, or whoever is most “active” (perhaps whoever has the most hand motion over some duration of time), or weigh several parameters to determine who the active user is. If the problem is more that users don’t know who is active, you could use the Kinect TOP to show an overlay of the active user somewhere on screen.
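The "weigh several parameters" approach might look like this sketch, which scores each tracked skeleton by proximity and recent hand activity. The keys and weights are assumptions; in TD the inputs would come from the Kinect CHOP's per-player channels.

```python
def pick_active_user(players):
    """Choose the 'active' player from a list of tracked skeletons.
    players: list of dicts with assumed keys:
      'id'          - player identifier
      'depth'       - distance from the sensor in metres
      'hand_motion' - accumulated hand movement over a recent window
    Weights are starting guesses to tune for the space."""
    W_NEAR, W_ACTIVE = 1.0, 2.0
    best, best_score = None, float('-inf')
    for p in players:
        # Closer players and busier hands both raise the score
        score = W_NEAR / max(p['depth'], 0.1) + W_ACTIVE * p['hand_motion']
        if score > best_score:
            best, best_score = p, score
    return best['id'] if best else None
```

Adding a small hysteresis (only switch when a challenger's score beats the current user's by some margin) would keep the cursor from flickering between two similar candidates.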

To get the cursor tracking a person’s joint overlaid on an image, did you try Image Space Positions? Just turn on that toggle in the Kinect CHOP to get positional values in image space instead of 3D space. Depending on what you are doing, they can be easier to work with.
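Once you have image-space values, overlaying a cursor is just a scale to your output resolution. A sketch, assuming the values are normalized 0–1 with v measured bottom-up (check the orientation your setup actually produces):

```python
def uv_to_pixels(u, v, res_w=1280, res_h=720):
    """Convert assumed normalized 0-1 image-space u/v to pixel
    coordinates, flipping v so row 0 is the top of the image."""
    return u * res_w, (1.0 - v) * res_h
```

This skips the forearm vector math entirely: the joint already lands where the camera sees it, which is usually what you want for an overlay.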

Is there a way to control the range that the Kinect sees in TD? I can test whether a user is within a certain distance of the Kinect, but I’d like to prevent tracking users outside that range at all if possible.

Assuming the answer to the above is no: I have been able to gather a list of players that are within the tracking area and place them in a table. The table currently contains the player number (p1, p2, etc.), the player ID, and the tracking status. Is there a way to use the values in the player-number column in a Select CHOP so that I only select the channels associated with those players, ignoring the players outside the tracking area? I hope that makes sense.



Yes, a Python expression in the Select CHOP(s) can do that for you. But you could also select each player in its own Select CHOP using a pattern like p1*, and then use the tracking channel (0 or 1) to include/exclude the channels, zero them out, control the rendering, etc. For example, use a Math CHOP to multiply the tracking channel by the other channels: when tracking is 0 they are all 0, and when it is 1 the actual values pass through. Or drive a Switch CHOP with the tracking channel to include/exclude the channels like an on/off switch.
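The Math CHOP multiply trick can be emulated in plain Python to see the effect. The channel-naming scheme below (a per-player prefix before a slash) is an assumption; substitute whatever naming your Kinect CHOP produces.

```python
def mask_by_tracking(channels, tracking):
    """Zero out a player's channels when their tracking flag is 0,
    mimicking a Math CHOP multiply against the tracking channel.
    channels: dict like {'p1/hand_tx': 0.4, ...}
    tracking: dict like {'p1': 1, 'p2': 0} (1 = tracked)."""
    out = {}
    for name, value in channels.items():
        player = name.split('/')[0]          # channel prefix, e.g. 'p1'
        # Untracked (or unknown) players multiply to zero
        out[name] = value * tracking.get(player, 0)
    return out
```

The nice property of multiplying rather than dropping channels is that the channel layout stays constant, so everything downstream keeps cooking without reconnecting.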

Thanks Ben,

What I ended up doing was defining a tracking “zone” and determining which players, if any, were in it. Once I knew that, I used a replicator to create a Select CHOP for each detected player (keyed by player ID), then merged the resulting channels and used that data for tracking. Is this an efficient approach, or would what you suggested be better?
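The zone test itself can be a simple rectangular bound on each player's floor position. A sketch, with the zone dimensions as placeholder assumptions; in TD the positions would come from each player's tx/tz channels:

```python
def players_in_zone(positions, x_range=(-0.75, 0.75), z_range=(1.0, 2.5)):
    """Filter player IDs down to those inside an assumed rectangular
    floor zone in front of the sensor.
    positions: dict {player_id: (tx, tz)} in metres, sensor-relative."""
    (x0, x1), (z0, z1) = x_range, z_range
    return [pid for pid, (tx, tz) in positions.items()
            if x0 <= tx <= x1 and z0 <= tz <= z1]
```

Feeding this list to the replicator (or to a Select CHOP pattern) keeps the zone definition in one place, which makes it easy to retune on site.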

Ideally I would like to prevent background people from being tracked by the Kinect at all, so that I don’t end up wasting any of the six available skeletons on people standing in the background.

Thanks again.


I’m not sure you can do that; the Kinect decides who it wants to track by itself before it passes the data to TouchDesigner.

Not a high-tech solution: sometimes you can get away with aiming your Kinect at a weird angle and putting some cardboard/tape over part of the depth lens to block the area you want to ignore, then re-ranging the Kinect CHOP channels to correct for the odd camera angle. I had to do that recently at a crowded event, and it worked OK.

I’ve been looking into this exact issue. I created a multiplayer installation at a museum, and every now and then people outside my ‘playzone’ are detected in the background and take up one of the six skeleton slots.

I’m trying to find a way to get at the Kinect data at a lower level, before the skeletons are established, to prevent this, but it is obviously quite complex and I’m struggling to find good leads.

Physical options are helpful to a certain extent, but weird angles can also affect the stability of the Kinect’s skeleton tracking.