Hey everyone!
I’m Vittorio from SUPERBELLO, a video production & live events agency based in Milan. We’re exploring an idea that combines TouchDesigner with real-time AI video generation for conference installations, and I’d love to get some input from the community.
The Concept
We want to create a reactive visual system where:
- A speaker’s voice is captured live via microphone
- The audio signal (amplitude, frequency, envelope) drives visual parameters in real-time
- Specific keywords trigger scene changes or switch between different AI-generated animations
- Everything outputs to a small LED wall as a dynamic background
TL;DR: speaker talks → audio transforms into AI visuals → LED wall reacts, with trigger words changing the mood/scene.
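To make the "audio drives visual parameters" part more concrete, here is a rough sketch of the kind of mapping we have in mind, written as plain Python outside of TD. The class and function names, the attack/release coefficients, and the gain value are all our own placeholders, not anything from TouchDesigner.

```python
import math


def rms(block):
    """Root-mean-square amplitude of one block of samples in [-1, 1]."""
    if not block:
        return 0.0
    return math.sqrt(sum(s * s for s in block) / len(block))


class EnvelopeFollower:
    """Smooths block amplitudes with separate attack and release rates."""

    def __init__(self, attack=0.5, release=0.05):
        # Hypothetical smoothing coefficients in (0, 1]; to be tuned by eye.
        self.attack = attack
        self.release = release
        self.value = 0.0

    def update(self, level):
        # Rise quickly when the speaker gets louder, fall slowly on pauses.
        coeff = self.attack if level > self.value else self.release
        self.value += coeff * (level - self.value)
        return self.value


def to_param(envelope, lo=0.0, hi=1.0, gain=4.0):
    """Scale an envelope value by an assumed gain and clamp it into [lo, hi]."""
    x = min(max(envelope * gain, 0.0), 1.0)
    return lo + x * (hi - lo)


follower = EnvelopeFollower()
for block in ([0.0, 0.0], [0.2, -0.2], [0.0, 0.0]):
    print(round(to_param(follower.update(rms(block))), 3))
```

We suspect TD's built-in CHOPs (Analyze and Lag, for example) cover most of this without any Python, so please treat it only as a description of the behavior we are after.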
Questions for the Community
1. AI API Integration
Has anyone successfully connected AI video generation services (Runway, Stability, ComfyUI, local models) to TouchDesigner for live use? What’s the best approach — REST API, WebSocket, local inference? Any TOX components out there for this?
2. Speech-to-Text / Keyword Detection
For detecting spoken keywords in real-time to trigger scene changes: is it better to handle this inside TD (Python, Script CHOP) or use an external service (Whisper, Google Speech API, etc.) and pipe data back in via OSC/WebSocket?
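For reference, this is roughly the logic we imagine sitting between the speech-to-text output and TD, wherever it ends up running. It assumes some recognizer hands us chunks of transcribed text; the keyword list, scene names, and cooldown value are made up for illustration.

```python
import re
import time

# Hypothetical keyword-to-scene mapping; names are placeholders.
SCENES = {"future": "scene_future", "ocean": "scene_ocean", "energy": "scene_energy"}


class KeywordTrigger:
    """Fires at most one scene change per cooldown window."""

    def __init__(self, scenes, cooldown_s=5.0, clock=time.monotonic):
        self.scenes = {k.lower(): v for k, v in scenes.items()}
        self.cooldown_s = cooldown_s
        self.clock = clock  # injectable so the logic can be tested offline
        self._last_fired = None

    def feed(self, text):
        """Return a scene name if the text contains a keyword, else None."""
        now = self.clock()
        if self._last_fired is not None and now - self._last_fired < self.cooldown_s:
            return None  # still cooling down, ignore this chunk
        # \w+ also matches accented letters, which matters for Italian speakers.
        for word in re.findall(r"\w+", text.lower()):
            scene = self.scenes.get(word)
            if scene is not None:
                self._last_fired = now
                return scene
        return None


trigger = KeywordTrigger(SCENES)
print(trigger.feed("Let's talk about the future of energy"))  # first match wins
```

The returned scene name is what we would then send into TD (over OSC or WebSocket if the recognizer runs externally). The cooldown is there so that a speaker repeating a word does not make the wall flicker between scenes.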
3. Dealing with AI Latency
Since AI video generation is not real-time and typically takes 5-30+ seconds per clip, we’re thinking:
- Pre-generate a library of clips mapped to keywords
- Use AI style transfer on existing video loops for faster response
- Blend/morph between cached outputs during transitions
Anyone tackled this “real-time feel with non-real-time generation” problem before? What worked?
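Here is a minimal sketch of the "blend between cached outputs" idea as we currently picture it: a pre-generated clip is selected, and a mix value ramps from 0 to 1 over a fixed fade time. The clip file names and the 2-second duration are hypothetical, and the sketch only computes the mix value; the actual blending would happen in TD.

```python
def smoothstep(t):
    """Ease-in/ease-out curve on [0, 1] so fades do not start or stop abruptly."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


class Crossfader:
    """Tracks a fade from the current clip to a target clip."""

    def __init__(self, initial_clip, duration_s=2.0):
        self.current = initial_clip
        self.target = initial_clip
        self.duration_s = duration_s
        self._start = 0.0

    def go_to(self, clip, now):
        if clip == self.target:
            return
        # Simplification: an interrupted fade snaps to the previous target.
        self.current = self.target
        self.target = clip
        self._start = now

    def weights(self, now):
        """Return (current_clip, target_clip, mix) with mix in [0, 1]."""
        if self.current == self.target:
            return self.current, self.target, 1.0
        mix = smoothstep((now - self._start) / self.duration_s)
        if mix >= 1.0:
            self.current = self.target  # fade finished
        return self.current, self.target, mix


fader = Crossfader("calm_loop.mp4")
fader.go_to("storm_loop.mp4", now=10.0)
for now in (10.0, 11.0, 12.0):
    print(fader.weights(now))
```

Because every clip is generated ahead of time, the only latency on stage would be the fade itself, which is the "real-time feel" we are hoping for.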
4. Hardware Recommendations
What specs would you suggest for stable 90-minute live sessions outputting 1080p-4K at 25-60fps to an LED wall?
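We realize that range is wide. As a back-of-the-envelope check, the snippet below compares the two extremes in raw pixel throughput, assuming uncompressed 8-bit RGBA frames (4 bytes per pixel). That pixel format is our assumption, and the numbers say nothing about the cost of decoding or AI processing.

```python
def pixel_rate(width, height, fps):
    """Pixels that must be produced per second."""
    return width * height * fps


def uncompressed_mb_per_s(width, height, fps, bytes_per_pixel=4):
    # Assumes 8-bit RGBA (4 bytes per pixel); 1 MB = 1,000,000 bytes.
    return pixel_rate(width, height, fps) * bytes_per_pixel / 1e6


low = uncompressed_mb_per_s(1920, 1080, 25)   # 1080p at 25 fps
high = uncompressed_mb_per_s(3840, 2160, 60)  # 4K UHD at 60 fps
print(f"{low:.1f} MB/s to {high:.1f} MB/s, ratio {high / low:.1f}x")
```

So the top of our range is roughly ten times the bottom, and we are happy to narrow it down if that changes the recommendation.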
Our Situation
We’re TD beginners (minimal experience so far) and trying to figure out if this workflow is realistic before investing in Pro licenses and building it in-house. Any insights on complexity, learning curve, or whether we should bring in a specialist would be super helpful.