A simple Q&A bot built from the TouchDesigner offline wiki (OpenAI, Pinecone, LangChain)

UPDATE[ 5/10/2023 ]: Link is down. I’m working on a beta of this tool, which will be more performant and enjoyable to use. I’m working through better ways to process/prepare wiki data for language model ingestion, and also a new and improved chat interface/experience… stay tuned :slight_smile:


We’re seeing a lot of really wild services and startups based on what’s beginning to be possible with connecting LLM based AI to various forms of data you already own.

I wanted to try to set up a very crude prototype of a question-answering AI bot that can communicate with the vast TouchDesigner wiki / documentation.

This does cost a small amount of money to run, so if you’re finding this post later on down the road it may or may not still be live, but I’ll try to relay any updates to its location or status as time goes on :slight_smile:

Do let me know what your impression is - the hardest part of setting something like this up is preparing/cleaning the input data so it can be broken up into logical chunks. Feedback will help me rethink how I’m parsing the HTML down the road.

TD Q&A Bot

9 Likes

Not bad!

How does instancing work, and how do I build a network which instances 1000 spheres moving in a circle?

Response:
Instancing in TouchDesigner works by creating copies of a geometry object which can be transformed independently. To build a network which instances 1000 spheres moving in a circle, you would need to create a Geometry COMP with an Instance CHOP, and supply it with X, Y, and Z channels. You would then need to create a SOP with 1000 points, and use this to drive the Instance CHOP. Finally, you would need to create a CHOP network to animate the points in a circular motion.

2 Likes

Question:
Is TouchDesigner a push- or a pull-based system, and how does that influence the networks I design?

Response:
TouchDesigner is a pull-based programming environment. This means that an operator only cooks (processes information) when it is requested by another operator in the network. Moving wires help us see at a glance that something is cooking.

3 Likes

It’s always nice to ask it for something and get RFEs back:

To create a render setup with particles from a pointcloud, you will need to use a Particle SOP to create the particles from the pointcloud, and then use a Convert SOP to convert the points into particles. You can then use a Point Render node to render the particles, and adjust the camera using the internal viewport camera component.

3 Likes

hahaha yeah I’m also searching where to find the Instance CHOP, please add to RFE list @snaut

It’s so interesting seeing these reactions. @nettoyeur your first Q was a good example use case, where the question really needs an answer that pulls from several parts of the wiki at once to be accurate, and could use a bit more context/length in the response.

At present, the wiki page chunk (pages are broken up because of the LLM’s token limits) that is deemed most relevant by a cosine similarity lookup (a dot product on normalized vectors) is retrieved, then fed to OpenAI as context, and OpenAI attempts to pull the answer to your question out of that much smaller text block.
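For the curious, that lookup is conceptually just this - a minimal sketch with placeholder names (in my actual setup the vector search goes through Pinecone, but it’s the same dot-product idea):

```python
import numpy as np
import openai  # pre-1.0 openai SDK, as used around the time of this post

def most_relevant_chunk(question_vec, chunk_vecs, chunk_texts):
    # With L2-normalized embeddings, cosine similarity reduces to a dot product.
    sims = chunk_vecs @ question_vec
    return chunk_texts[int(np.argmax(sims))]

def answer(question, question_vec, chunk_vecs, chunk_texts):
    context = most_relevant_chunk(question_vec, chunk_vecs, chunk_texts)
    prompt = (
        "Answer the question using only the context below. "
        "If the answer isn't there, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response["choices"][0]["message"]["content"]
```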

I can already get back a list of all docs ranked by their relevance to the question; I think I need to test some ways to aggregate multiple contexts to answer questions instead of just using the “most” relevant one. I’ll keep thinking on it :slight_smile:
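That aggregation would only be a small change on top of the sketch above - again just placeholder names, and the value of k and the separator are arbitrary:

```python
import numpy as np

def top_k_context(question_vec, chunk_vecs, chunk_texts, k=3):
    # Keep the k most similar chunks instead of only the single best,
    # and join them into one larger context block for the prompt.
    sims = chunk_vecs @ question_vec
    best = np.argsort(sims)[::-1][:k]
    return "\n\n---\n\n".join(chunk_texts[i] for i in best)
```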

1 Like

Also, to reiterate what is stated in the link above: yes, the questions and answers are saved. I am hoping to use some of the answers that come back wrong to battle-test the system and measure improvements in following iterations.

The only information kept is the query and the answer. No record of who asked, IP address, etc. is kept.

Here’s an example of what is appended to the log.txt:

Query:
What is TouchDesigner?
Answer:
TouchDesigner is a software product from Derivative (Toronto and Los Angeles) which is used to build interactive 3D and 2D applications. It is “procedural”, “node-based”, real-time and is considered a visual programming language. It is designed to give its users enormous flexibility in building applications without needing to program in a conventional way.
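The logging itself is nothing fancy - conceptually it’s just an append in that format. A sketch (the real path and exact wording may differ):

```python
def log_interaction(query, answer, path="log.txt"):
    # Only the query and the answer are written - no user info of any kind.
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"Query:\n{query}\nAnswer:\n{answer}\n\n")
```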

1 Like

Question
How do I create a state machine in TouchDesigner?

Response
Initialize Start is an approach in TouchDesigner to prepare and run timed processes. It applies to any system in TouchDesigner where there first needs to be an initialization (the “initializing” state), like reading files into memory, querying servers, pre-simulating dynamics or starting external processes. When initializing is complete, it goes into the “ready” state. Then it is signaled to start via a “Start” pulse parameter. Upon Start, it goes into the “running” state, and when complete (when the timer reaches its length and it’s not looping, or when Go to Done is pulsed) it goes into a “done” state.

Question
What is the best GPU for a TouchDesigner installation?

Response
For TouchDesigner video servers built for frame accurate professional video applications, the Quadro line should be used because it has a range of features specific to high resolution multiple screen mapping as well as particular frame sync technologies. For most personal TouchDesigner development systems the Geforce line is preferable if for no other reason than price.

Question
When should I store data in an extension member or in an operator’s storage?

Response
You should store data in an extension member when you want to reference the data in Parameter Expressions and have those expression values update automatically when the property value changes. You should store data in an operator’s storage when you want the data to be persistent and saved with the operator.


I do love an honest response

Question
How do I build a multiplayer game with TouchDesigner?

Response
I’m sorry, I don’t know.

4 Likes

It sees extra commercial opportunities where I had not thought of them yet

Question
suppose we are writing the script for a James Bond movie. How could TouchDesigner be used as the main weapon of the movie’s villain to conquer the world?

Response
TouchDesigner could be used to create interactive 3D and 2D applications that could be used to control the villain’s weapons and other tools. It could also be used to create visual effects and animations for the movie.

6 Likes

Just a quick update - I have collected enough Q&As from this initial alpha phase to see a few patterns of questions that did not get answered properly, and that has led me down a path of better data preparation/manipulation.

The raw wiki format is ideal for human reading, but because of the way embeddings are generated from text blobs it is really not ideal for naive processing, so carefully scripted prep of that data into a more LLM-friendly format is what I’m working on now.

There’s a lot of intentional repetition across the wiki pages - e.g. when any CHOP page is shown, the common parameters and Info CHOP sections are displayed on each - so embeddings are muddied a bit by this copypasta.
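To give a sense of the kind of prep I mean, something along these lines - purely illustrative, the section names here are made up and the real ones come from inspecting the parsed pages:

```python
# Illustrative only: drop the sections that are transcluded onto every operator
# page (common parameters, Info CHOP tables, etc.) before chunking, so those
# repeated blocks don't dominate the embeddings.
REPEATED_SECTIONS = {"Common Parameters", "Info CHOP Channels"}  # made-up names

def strip_repeated_sections(sections):
    """sections: list of (heading, body) tuples parsed from one wiki page."""
    return [(heading, body) for heading, body in sections
            if heading not in REPEATED_SECTIONS]
```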

Anyways, to keep it brief, I’ll update everyone here again when the BETA is live - you can expect a nicer-looking interface, an actual chat-style interaction with the bot with a history of previous messages you can rely on, and much better integration with the TD documentation.

I’m working only with the core node family docs, the ones you get when you click the question mark for a node. Once that’s performing much better it should be possible to extend that to python classes as well, then beyond.

more on this soon :slight_smile:

1 Like

Hey @lucasm,

For parsing the wiki I use something called “mwparserfromhell”, which is included as a Python lib in TD. There you can parse the pages and remove transclusions where necessary. Might help automate that a bit?
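Something along these lines, for example - just a sketch, assuming you already have the raw wikitext of a page:

```python
import mwparserfromhell

def wikitext_to_plain(raw_wikitext):
    # Parse the raw wikitext, drop top-level templates (the transclusions that
    # repeat across pages), then strip the remaining markup down to plain text.
    wikicode = mwparserfromhell.parse(raw_wikitext)
    for template in wikicode.filter_templates(recursive=False):
        wikicode.remove(template)
    return wikicode.strip_code()
```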

cheers
Markus

@snaut thank you for that suggestion! Yeah, I came across that some weeks ago and it is currently the backbone of my more meticulous parsing efforts. The MediaWiki API is pretty stellar too - it’s making it possible to access the page categories, page names, etc. and bucketize things a bit more logically.
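For reference, the kind of call I mean - standard MediaWiki API usage, with the endpoint and category name here only as examples:

```python
import requests

API_URL = "https://docs.derivative.ca/api.php"  # assumed standard api.php endpoint

def pages_in_category(category):
    # List all page titles in a wiki category (e.g. "CHOPs"), following
    # the API's continuation mechanism for large categories.
    params = {
        "action": "query", "format": "json", "list": "categorymembers",
        "cmtitle": f"Category:{category}", "cmlimit": "500",
    }
    titles, cont = [], {}
    while True:
        data = requests.get(API_URL, params={**params, **cont}).json()
        titles += [m["title"] for m in data["query"]["categorymembers"]]
        cont = data.get("continue")
        if not cont:
            return titles
```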

Yeah - the combination really makes it work nicely. Let me know if you need more insights; I’ve developed some tools to do automatic page generation and parsing…

cheers
Markus

1 Like

A little update on my end - I’ve been spending more and more time over the last few weeks trying to visualize/understand how certain fairly easy questions get answered incorrectly, and one avenue of investigation has led me down the road of visualizing the wiki embeddings on a 2D graph using dimensionality reduction (t-SNE).
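If anyone wants to poke at their own embeddings the same way, the projection itself is only a few lines - a sketch assuming the chunk vectors are already stacked in a numpy array:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_embeddings(vectors, labels):
    # vectors: (n_chunks, dim) array of wiki-chunk embeddings; labels: page names.
    xy = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(np.asarray(vectors))
    plt.scatter(xy[:, 0], xy[:, 1], s=4)
    for (x, y), name in zip(xy, labels):
        plt.annotate(name, (x, y), fontsize=5)
    plt.show()
```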

The results are interesting to look at so I wanted to share a couple examples here:


Essentially what I’m finding is that a question about a specific TOP might end up retrieving one chunk about that TOP but another chunk about a different, similar TOP - and seeing these plots, that is starting to make sense.

I am using the Instructor embeddings model (Facebook Research) and it does an excellent job of clustering similar types, but some of these groups are really tightly packed, which makes similarity search really fickle.
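For anyone who wants to try it, generating embeddings with Instructor looks roughly like this - a minimal sketch, where the checkpoint size and instruction wording are just an example rather than exactly what the live pipeline uses:

```python
from InstructorEmbedding import INSTRUCTOR

# Illustrative chunks; in practice these are the parsed wiki-page sections.
wiki_chunks = ["The Noise TOP generates a variety of noise patterns...",
               "The Geometry COMP is used to render and instance geometry..."]

model = INSTRUCTOR("hkunlp/instructor-large")  # checkpoint name assumed
instruction = "Represent the TouchDesigner documentation passage for retrieval:"
chunk_vectors = model.encode([[instruction, chunk] for chunk in wiki_chunks])
```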

One path is capturing more of the nearest documents and doing Q&A over a larger subset, but that incurs more API cost and the final response takes longer to generate, so I am investigating several different embedding libraries, as well as other ways of generating these embedding vectors.

Anyways, interesting stuff. More later :slight_smile:

3 Likes

This is amazing :exploding_head:

And some of the bad answers are quite funny