StreamDiffusionTD 0.3.0 (Local) stuck on “Loading Models” / then “HalfTensor mismatch” error — Windows 11 + RTX 3090

Hi all,
I’m troubleshooting StreamDiffusionTD 0.3.0 running locally in TouchDesigner and I’m stuck. I’m looking for guidance on known-good settings / versions and how to resolve a recurring dtype error.


1) Environment

  • TouchDesigner: 2025.32050 (Non-Commercial)

  • OS: Windows 11

  • GPU: NVIDIA RTX 3090 (24 GB)

  • NVIDIA Driver: Studio Driver (exact version not recorded)

  • StreamDiffusionTD: StreamDiffusionTD_0_3_0.tox (local mode)

  • Internet works; HuggingFace downloads begin normally.
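To pin down the Python-side versions for the list above, something like the following could be run with the Python interpreter that the StreamDiffusionTD install actually uses (not TouchDesigner's built-in Python). It is only a sketch: the package names in `PACKAGES` are my guess at what the local install depends on, and any of them may be absent.

```python
from importlib import metadata

# Assumed package names; adjust to whatever the local install really contains.
PACKAGES = ("torch", "torchvision", "diffusers", "transformers",
            "tensorrt", "huggingface_hub")


def installed_versions(names=PACKAGES):
    """Return {package: version string or None} for the current environment."""
    report = {}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this interpreter's environment.
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

The driver version is not visible from here; `nvidia-smi` reports it separately.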


2) What I’m doing (minimal repro)

  1. New empty .toe

  2. Load StreamDiffusionTD_0_3_0.tox into a COMP

  3. Feed a simple TOP (MovieFileIn → Transform) into the tox input

  4. In the tox UI:

    • Model: stabilityai/sd-turbo

    • Resolution: 512 × 512

    • IP Adapter: OFF

    • ControlNet: enabled but Weight = 0 (effectively off)

    • Acceleration: TensorRT

  5. Click Start Server / Start Stream (local)


3) Problem A — “Loading Models” stalls (text encoder)

Often, the first run gets stuck for a long time at:

  • text_encoder/model.safetensors stuck at 99% (or later 100%)

  • TD shows “Loading Models…” indefinitely

  • A cmd window shows HF downloads / progress, then no more activity.

What I tried for this:

  • Verified HF cache location is the default:

    • C:\Users\<me>\.cache\huggingface\hub\...
  • Found leftovers:

    • .incomplete file ~1.26 GB under hub\models--stabilityai--sd-turbo\blobs\...

    • .lock under hub\.locks\models--stabilityai--sd-turbo\...

  • Closed TouchDesigner completely

  • Deleted the related .incomplete and .lock

  • Relaunched

This helped it move from 99% → 100%, but then I hit Problem B.
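For anyone repeating the cleanup above, here is a rough sketch that only lists the leftover files instead of hunting for them by hand. It assumes the default cache layout described above (`blobs\*.incomplete` under the repo folder, `*.lock` under `hub\.locks\<repo folder>`); `find_stale_downloads` is just a helper name I made up, and the delete line is left commented out on purpose.

```python
from pathlib import Path


def find_stale_downloads(hub_dir, repo_folder="models--stabilityai--sd-turbo"):
    """List leftover .incomplete blobs and .lock files for one cached repo."""
    hub = Path(hub_dir)
    blobs = hub / repo_folder / "blobs"
    locks = hub / ".locks" / repo_folder
    stale = []
    if blobs.is_dir():
        stale += sorted(blobs.glob("*.incomplete"))
    if locks.is_dir():
        stale += sorted(locks.glob("*.lock"))
    return stale


if __name__ == "__main__":
    # Default Hugging Face cache location, as in the post.
    hub = Path.home() / ".cache" / "huggingface" / "hub"
    for path in find_stale_downloads(hub):
        print(f"{path}  ({path.stat().st_size} bytes)")
        # path.unlink()  # uncomment only after closing TouchDesigner completely
```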


4) Problem B — streaming loop error (HalfTensor mismatch)

After “Loading Models” completes, streaming fails with repeating errors in the cmd log:

ERROR - Error in streaming loop: Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same

I also see lines like:

  • IPAdapter disabled in config (expected, since I turned it off)

Net result: the TD status stays at something like “Server connected, not streaming” / “Stream offline”.
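One note on reading the error itself: in PyTorch's legacy type names, `torch.cuda.HalfTensor` is an fp16 tensor on the GPU, while `torch.HalfTensor` (no `cuda`) is an fp16 tensor on the CPU. So the message seems to point at weights that were never moved to the GPU, not at a precision mismatch. The small sketch below just decodes such a message; `parse_tensor_type` and `diagnose` are hypothetical helper names, not part of PyTorch or StreamDiffusionTD.

```python
import re

# Legacy names look like torch.<device>.<Dtype>Tensor; no device part means CPU.
_TYPE = re.compile(r"torch\.(?:(\w+)\.)?(\w+)Tensor")


def parse_tensor_type(name):
    """Split e.g. 'torch.cuda.HalfTensor' into ('cuda', 'Half')."""
    match = _TYPE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a legacy tensor type name: {name!r}")
    device, dtype = match.groups()
    return (device or "cpu", dtype)


def diagnose(message):
    """Report whether input and weight differ by device, dtype, or both."""
    names = re.findall(r"\((torch\.[\w.]+)\)", message)
    if len(names) != 2:
        raise ValueError("expected exactly two tensor type names in the message")
    (in_dev, in_dtype), (w_dev, w_dtype) = map(parse_tensor_type, names)
    problems = []
    if in_dev != w_dev:
        problems.append(f"device mismatch: input on {in_dev}, weights on {w_dev}")
    if in_dtype != w_dtype:
        problems.append(f"dtype mismatch: input {in_dtype}, weights {w_dtype}")
    return problems


if __name__ == "__main__":
    log_line = ("Input type (torch.cuda.HalfTensor) and weight type "
                "(torch.HalfTensor) should be the same")
    print(diagnose(log_line))
```

For the log line above this reports a device mismatch only (input on cuda, weights on cpu), with both sides already in half precision.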


5) Key questions for the community

  1. Is TensorRT acceleration known to work reliably in StreamDiffusionTD 0.3.0 on Windows + TD 2025?

    • If yes: what are the known-good versions (Torch / CUDA / TensorRT / driver)?
  2. Does the dtype error typically indicate:

    • a Torch/CUDA mismatch, or

    • TensorRT engine built with a different precision/device, or

    • a config flag I’m missing (force FP16/FP32, move weights to CUDA, etc.)?

  3. Is there a recommended model for local StreamDiffusionTD 0.3.0 that is known-good?

    • I have sd-turbo, sdxl-turbo, openjourney-v4 available in the tox UI.
  4. Is there a “clean reset” procedure you recommend besides deleting HF .incomplete/.lock?

    • e.g., clearing a TensorRT engine/cache folder created by the tox?
  5. Any known compatibility issues with TouchDesigner Non-Commercial specifically, or should this behave the same as Commercial for local inference?

  6. Is there a known incompatibility between my TouchDesigner build (2025.32050) and the StreamDiffusionTD 0.3.0 tox?

Thanks in advance—any pointers to a known-good setup or the likely cause of the HalfTensor mismatch would help a lot.

Your best bet for StreamDiffusionTD support is the DOTSimulates Discord.

I’m already in it, but nobody has answered yet. Anyway, ChatGPT 5.2 is my friend.