Hi everyone,
I’m working on a project in TouchDesigner where I want to create visuals that change dynamically based on specific words spoken in real time. My goal is to process live audio input (e.g., from a microphone), transcribe it to text, and then trigger specific visuals depending on the detected words or topics.
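For context, here’s a rough sketch of the transcription step I’m attempting in a Script DAT. Everything here is an assumption on my part: the operator name `resample1`, an upstream Resample CHOP delivering 16 kHz mono audio, and the open-source `whisper` package being installed in TD’s Python environment.

```python
import numpy as np
import whisper

# Smaller models trade accuracy for latency; 'base' seemed like a
# reasonable starting point.
model = whisper.load_model('base')

def transcribe_chunk():
    # 'resample1' is a placeholder for a CHOP holding the last few
    # seconds of mic audio, resampled to the 16 kHz mono that Whisper
    # expects (e.g., Audio Device In CHOP -> Resample CHOP).
    audio = op('resample1').numpyArray()[0].astype(np.float32)
    result = model.transcribe(audio, fp16=False, language='en')
    return result['text']
```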
What I’d Like to Know:
- Has anyone successfully implemented a similar setup in TouchDesigner, where visuals react to spoken words or topics?
- Are there better ways to manage live audio chunks in TouchDesigner (without using a Record CHOP) for external APIs like Whisper?
- Any tips on optimizing the transcription and word-detection pipeline to minimize latency? (Rough sketch of my keyword-matching idea just below.)
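For mapping detected words to visuals, I was picturing something simple like this (the keyword table and the `switch1` Switch TOP are hypothetical placeholders):

```python
# Hypothetical mapping from keywords to inputs on a Switch TOP that
# selects between pre-built looks.
KEYWORDS = {'fire': 0, 'water': 1, 'wind': 2}

def trigger_visuals(transcript):
    text = transcript.lower()
    for word, index in KEYWORDS.items():
        if word in text:
            # 'switch1' is a placeholder Switch TOP; setting its index
            # cuts to the matching visual.
            op('switch1').par.index = index
            break
```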
I’d love to hear your thoughts or approaches to solving this problem! Any examples or advice would be greatly appreciated.
Thank you in advance!