I’m building a pretty complex system where I’m going to need to keep track of, recall, and load a lot of data (floats, ints and strings). I still haven’t 100% decided on the way I would like to do this.
I’ve started to build a hierarchy of Base COMPs and DATs because I assumed that accessing data from a DAT would be faster than from a custom Python object stored in a COMP. Then today I realized, though, that I would be using a Python script anyway to store and fetch this data, so I’m not sure my assumptions were correct.
What would be faster (less cook time)? Some of this data would be attributes of various COMPs (not set/copied very often); other data would be parameter values that could change every frame, depending on whether the parameter was animated in some way.
I thought it might be a hard question to answer without some definitive testing, so I did a test.
3 Execute DATs (rough sketches of the two Python ones are below):
The first execute sets each cell of a 5000-row table (with the value of the index of a for loop) every frame.
The second stores 5000 keys, each with an integer value (the index of the for loop), in a Base COMP.
And just for thoroughness I did a tscript test writing to a table.
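Roughly, the two Python tests look something like this (the operator names 'table1' and 'base1' are just placeholders, and each function lives in its own Execute DAT):

# Execute DAT 1 - write the loop index into every row of a 5000-row Table DAT
def onFrameStart(frame):
    table = op('table1')            # placeholder Table DAT
    for i in range(5000):
        table[i, 0] = i             # set the cell in row i, column 0
    return

# Execute DAT 2 - store 5000 individual keys on a Base COMP
def onFrameStart(frame):
    target = op('base1')            # placeholder Base COMP
    for i in range(5000):
        target.store('key' + str(i), i)
    return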
Looks like my original assumption was correct.
The Performance Monitor reads:
1.5 ms cook time for the execute writing to the table
2.5 ms cook time for the execute setting the storage keys
56 ms frame time for the tscript loop (it didn’t return as one cook but as many table propagation changes; looks like Python is way faster than tscript for setting values…)
I was also wondering about reading. Although I forgot about dictionaries. New results. Also, my writing times to the table weren’t accurate before because there were fewer operations in the for loop; I was just writing a number, not building a name like I was with the storage.
Each of the scripts has a similar for loop with nearly the same operations.
Here are more accurate results:
Table (referenced by row index):
read - 1.6 ms
write - 1.6 ms (2.6 ms for writing a concatenated string, a similar operation to creating the storage key name)
Table (referenced by row name):
read - 55 ms
write - 55 ms
Storage, individual keys:
read - 2.5 ms
write - 3.0 ms
Storage, dictionary:
read - 1.6 ms
write - 2.6 ms
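For reference, the four read patterns above look roughly like this (operator and key names are placeholders, and the table is assumed to have row names like 'item0' in column 0 and values in column 1):

table = op('table1')
base = op('base1')
d = base.fetch('myDictionary')           # the stored dictionary

for i in range(5000):
    a = table[i, 1].val                  # table referenced by row index
    b = table['item' + str(i), 1].val    # table referenced by row name
    c = base.fetch('key' + str(i))       # storage, individual keys
    e = d['key' + str(i)]                # storage dictionary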
So it looks like Python storage destroys DAT tables for speed when referencing by name (which in most cases is much more practical).
It looks like dictionaries can be considerably faster than recalling individual stored values.
All these tests were done using a loop iterated 5000 times. A bit extreme for any practical case, but I think it gives a good indication of the overall speed. I did just do a test with only 5 iterations and the differences were negligible…
One question regarding your dictionary storage method, how exactly are you accessing it? Are you iterating over the keys() of the stored dictionary? Are you accessing the stored dictionary every time in the for loop, or are you aliasing it with a variable prior to the loop? Do you have an ordered list of key names that you use so that you can also utilize an index?
I’ve been under the impression that creating alias or copied variables for different data types in Execute scripts ahead of any of the functions (def myFunc():) is faster, since the data is already loaded into the script’s memory, as opposed to calling it within the function, and certainly as opposed to calling it on every iteration. Also I typically will do a copy operation on stored objects because I haven’t quite figured out whether making an alias/variable to something like
myTargetDictionary = myOp.fetch('myDictionary')
means that that fetch operation is used every time I use myTargetDictionary, but I know that something like
myTargetDictionary = myOp.fetch('myDictionary').copy()
will give me a discrete copy of that dictionary, the main problem being that it won’t update with the storage unless you “run” the exec script to refresh that data in its internal memory…
Hi Peter, yeah, I was accessing the dictionary only once before the function, assigned it to a variable and then iterated over each key using the variable. I figured it would be quite a bit slower to access it each time, but then didn’t realize that it would be reading from the stored object each time anyway because it wasn’t being copied into memory… good call
Although I just did a quick test and assigned the variable using the copy() method instead. It actually slowed the script down by just a bit, from approx 1.6 to 1.75 ms.
Not 100% sure on this, but I think the copy operation might be taking more time because it is copying the data to a new memory location, while just assigning an alias (or variable) to the dict in storage sets up a pointer to the original memory location. Either way it will have to read from some memory…
Also I just checked, for reference, what happens when fetching inside the loop. That is considerably slower: 2.6 ms.
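For comparison, the three variants I timed were along these lines (names are placeholders again):

base = op('base1')

# approx 1.6 ms: alias the stored dict once before the loop (points at the stored object)
d = base.fetch('myDictionary')
for i in range(5000):
    v = d['key' + str(i)]

# approx 1.75 ms: take a discrete copy first (new memory, detached from storage)
dCopy = base.fetch('myDictionary').copy()
for i in range(5000):
    v = dCopy['key' + str(i)]

# approx 2.6 ms: fetch from storage inside the loop, every iteration
for i in range(5000):
    v = base.fetch('myDictionary')['key' + str(i)]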
So the snag is that you won’t be able to see (or manually tweak) your data, as you can with a Table DAT, if it’s in storage?
Is there a way to get the contents of a Table DAT as a dictionary? If you have many operations on the same data and the dictionary were cached, that would be the best of both worlds?
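Something like this sketch, maybe, assuming a Table DAT ('table1' here) with names in column 0 and values in column 1, cached onto a Base COMP ('base1'):

def cacheTableAsDict():
    table = op('table1')                       # placeholder Table DAT
    d = {table[r, 0].val: table[r, 1].val
         for r in range(table.numRows)}
    op('base1').store('tableCache', d)         # cache the dict in storage
    return d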
Actually I have been able to get and set individual elements of a dictionary stored in a COMP (and of an object stored in a COMP composed of multiple dictionaries whose keys hold dictionaries themselves…).
The only problem is that whatever is looking at those values needs to cook in order to update (it doesn’t get cooked automatically when the value is changed).
Also, if an op is cooking, the value will update when the storage changes, but the value in the UI parameter field doesn’t until a network view change or a right-click force cook. i.e. a Constant CHOP’s chan1 channel will change value but its value0 parameter field doesn’t.
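A rough illustration of what I mean (names and keys are made up):

base = op('base1')
base.store('settings', {'layerA': {'opacity': 0.5, 'label': 'front'}})

# fetch() hands back the stored dict itself, so nested reads/writes go
# straight to storage...
settings = base.fetch('settings')
settings['layerA']['opacity'] = 0.75
# ...but nothing referencing that value cooks automatically; it only
# updates the next time the op looking at it happens to cook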
I’m still trying to devise the best (most efficient) method now to use (export or recall to parameters).
You can use an Examine DAT to check if your storage keys/values have changed; it gives a live view of your storage memory in a nice table. You can set filters on certain keys/values, and together with a DAT Execute you can act on storage changes.
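The DAT Execute watching the Examine DAT can be as simple as this sketch:

# DAT Execute DAT whose DAT parameter points at the Examine DAT
def onTableChange(dat):
    # called whenever the Examine DAT (your filtered view of storage) changes
    print('storage changed, now showing', dat.numRows, 'rows')
    return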
Yeah, looks like numpy is pretty awesome. For what I’m doing now I don’t know if it’s useful because I’m storing all kinds of data: floats, strings, class instances, etc… So far I’ve found dictionaries are working really well for this, as well as classes.
Soon though I want to look into some image processing using Python (maybe numpy). Also it occurred to me the other day that I could probably use Python to make a 32-bit TIFF or EXR exporter. I looked into the possibility of .tiff export and it looks like there are some Python libraries, so at some point in the near future we might have 32-bit image export…!!!
I haven’t spent much time using numpy structured arrays (or recarrays). I’m pretty sure they can handle all sorts of data types, and they are accessible by all sorts of indexing methods. I haven’t done it yet, but there are ways to convert existing Python dicts to recarrays pretty easily:
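Roughly along these lines (a sketch; the field names and sample values are made up, and I haven’t tested it inside TouchDesigner):

import numpy as np

# hypothetical dict of named float values
data = {'alpha': 1.0, 'beta': 2.5, 'gamma': 0.25}

# build a structured array with a string field and a float field
arr = np.array(list(data.items()),
               dtype=[('name', 'U16'), ('value', 'f4')])

rec = arr.view(np.recarray)                   # attribute access: rec.name, rec.value
vals = arr[arr['name'] == 'beta']['value']    # lookup by name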
I’m still a newbie to numpy, but my general understanding is that, as a general C library, it should be much faster than pure Python in most if not all cases (for handling arrays and matrices).
I would love to try exporting 32 bit images. Here’s a module I found that may be a great head start: