Hi all,
I’m troubleshooting StreamDiffusionTD 0.3.0 running locally in TouchDesigner and I’m stuck. I’m looking for guidance on known-good settings / versions and how to resolve a recurring dtype error.
1) Environment
- TouchDesigner: 2025.32050 (Non-Commercial)
- OS: Windows 11
- GPU: NVIDIA RTX 3090 (24 GB)
- NVIDIA Driver: Studio Driver [PLEASE FILL: version]
- StreamDiffusionTD: StreamDiffusionTD_0_3_0.tox (local mode)
- Internet works; HuggingFace downloads begin normally.
2) What I’m doing (minimal repro)
- New empty .toe
- Load StreamDiffusionTD_0_3_0.tox into a COMP
- Feed a simple TOP (MovieFileIn → Transform) into the tox input
- In the tox UI:
  - Model: stabilityai/sd-turbo
  - Resolution: 512 × 512
  - IP Adapter: OFF
  - ControlNet: enabled but Weight = 0 (effectively off)
  - Acceleration: TensorRT
- Click Start Server / Start Stream (local)
3) Problem A — “Loading Models” stalls (text encoder)
Often, the first run gets stuck for a long time:
- text_encoder/model.safetensors is stuck at 99% (or later 100%)
- TD shows “Loading Models…” indefinitely
- A cmd window shows HF downloads / progress, then no more activity.
What I tried for this:
- Verified HF cache location is the default: C:\Users\<me>\.cache\huggingface\hub\...
- Found leftovers:
  - .incomplete file ~1.26 GB under hub\models--stabilityai--sd-turbo\blobs\...
  - .lock under hub\.locks\models--stabilityai--sd-turbo\...
- Closed TouchDesigner completely
- Deleted the related .incomplete and .lock
- Relaunched
This helped it move from 99% → 100%, but then I hit Problem B.
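In case it helps anyone reproduce the cleanup, here is a minimal Python sketch of how the leftovers can be listed. It assumes the default cache location mentioned above; the function name find_stale_downloads is just something I made up for illustration, and the delete line is commented out on purpose.

```python
from pathlib import Path


def find_stale_downloads(hub_dir):
    """Return leftover *.incomplete and *.lock files under a Hugging Face hub cache folder."""
    hub = Path(hub_dir)
    stale = []
    for pattern in ("*.incomplete", "*.lock"):
        # rglob walks all subfolders, including dot-folders such as ".locks"
        stale.extend(p for p in hub.rglob(pattern) if p.is_file())
    return sorted(stale)


if __name__ == "__main__":
    # Assumption: default cache location, i.e. C:\Users\<me>\.cache\huggingface\hub
    hub = Path.home() / ".cache" / "huggingface" / "hub"
    for path in find_stale_downloads(hub):
        size_mb = path.stat().st_size / 1e6
        print(f"{size_mb:10.1f} MB  {path}")
        # path.unlink()  # uncomment to delete; only with TouchDesigner fully closed
```

Running it with TouchDesigner closed only prints what it finds; nothing is removed unless the unlink line is enabled.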
4) Problem B — streaming loop error (HalfTensor mismatch)
After “Loading Models” completes, streaming fails with repeating errors in the cmd log:
ERROR - Error in streaming loop: Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same
I also see lines like:
IPAdapter disabled in config (expected, since I turned it off)
Net result: TD status stays something like “Server connected, not streaming” / “Stream offline”.
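For what it's worth, this is how I read the two type names in that error. As far as I know, PyTorch's legacy type names encode both device and precision, so torch.cuda.HalfTensor is float16 on the GPU and torch.HalfTensor is float16 on the CPU. The sketch below only parses the strings from the log (the helper names are hypothetical and it does not touch StreamDiffusionTD itself), so treat the conclusion as my interpretation, not a confirmed diagnosis.

```python
# Legacy PyTorch tensor type names: the last part gives the precision,
# and a "cuda" part in the middle means the tensor lives on the GPU.
LEGACY_DTYPES = {
    "HalfTensor": "float16",
    "BFloat16Tensor": "bfloat16",
    "FloatTensor": "float32",
    "DoubleTensor": "float64",
}


def describe_legacy_type(name):
    """Split e.g. 'torch.cuda.HalfTensor' into ('cuda', 'float16')."""
    parts = name.split(".")
    if parts[0] != "torch" or parts[-1] not in LEGACY_DTYPES:
        raise ValueError(f"unrecognised tensor type name: {name}")
    device = "cuda" if "cuda" in parts[1:-1] else "cpu"
    return device, LEGACY_DTYPES[parts[-1]]


def diagnose(input_type, weight_type):
    """List the ways the input and weight types differ."""
    in_dev, in_dtype = describe_legacy_type(input_type)
    w_dev, w_dtype = describe_legacy_type(weight_type)
    problems = []
    if in_dev != w_dev:
        problems.append(f"device: input on {in_dev}, weights on {w_dev}")
    if in_dtype != w_dtype:
        problems.append(f"precision: input is {in_dtype}, weights are {w_dtype}")
    return problems


if __name__ == "__main__":
    # The two type names copied from the error line in my cmd log
    for problem in diagnose("torch.cuda.HalfTensor", "torch.HalfTensor"):
        print(problem)
```

If that reading is right, both sides are already FP16 and the difference is only where they live: some module's weights appear to still be on the CPU while the input is on the GPU.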
5) Key questions for the community
- Is TensorRT acceleration known to work reliably in StreamDiffusionTD 0.3.0 on Windows + TD 2025?
  - If yes: what are the known-good versions (Torch / CUDA / TensorRT / driver)?
- Does the dtype error typically indicate:
  - a Torch/CUDA mismatch, or
  - TensorRT engine built with a different precision/device, or
  - a config flag I’m missing (force FP16/FP32, move weights to CUDA, etc.)?
- Is there a recommended model for local StreamDiffusionTD 0.3.0 that is known-good?
  - I have sd-turbo, sdxl-turbo, openjourney-v4 available in the tox UI.
- Is there a “clean reset” procedure you recommend besides deleting HF .incomplete / .lock?
  - e.g., clearing a TensorRT engine/cache folder created by the tox?
- Any known compatibility issues with TouchDesigner Non-Commercial specifically, or should this behave the same as Commercial for local inference?
- Is there a compatibility problem with my current TouchDesigner Non-Commercial version and the StreamDiffusionTD 0.3.0 tox?
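To make the version comparison easier, here is a small Python sketch I can run to report what is installed. It assumes it is run with the same Python environment the tox uses for local mode, and the package names (torch, tensorrt, diffusers, transformers) are my guess at what is relevant, not a list taken from the StreamDiffusionTD docs.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Map each package name to its installed version, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    print("python", platform.python_version())
    # Assumed package list; adjust to whatever the tox environment actually contains
    report = collect_versions(["torch", "tensorrt", "diffusers", "transformers"])
    for name, version in report.items():
        print(name, version)
```

If someone with a working setup can post the same output, I can diff it against mine.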
Thanks in advance—any pointers to a known-good setup or the likely cause of the HalfTensor mismatch would help a lot.