I’m trying to play back either 7 HD videos OR a single 1920x7560 video which must be in huffyuv encoding (made with ffmpeg) or another lossless format. It MUST be lossless because it has specially encoded autostereoscopic content which would get messed up by lossy formats.
When I play individual streams I’m hitting 5-6ms per moviein TOP, which adds up to too much time (trying for 30fps). With the full video file I get huge variation (20ms-140ms) in timings.
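To make the budget problem concrete, here is a rough sketch of the arithmetic. It assumes the seven moviein TOP cooks simply add up serially, as described above; the constant names are just for illustration.

```python
# Frame-budget check for the numbers quoted above.
# Assumption: the per-stream cook times add up serially within one frame.
TARGET_FPS = 30
STREAMS = 7
PER_STREAM_MS = (5.0, 6.0)  # observed cook time range per moviein TOP


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds."""
    return 1000.0 / fps


def total_cook_ms(streams, per_stream_ms):
    """Total movie cook time if every stream costs the same."""
    return streams * per_stream_ms


budget = frame_budget_ms(TARGET_FPS)
low, high = (total_cook_ms(STREAMS, ms) for ms in PER_STREAM_MS)
print(f"budget {budget:.1f} ms, movies alone {low:.0f}-{high:.0f} ms")
```

So the movies alone would already be over the roughly 33 ms available per frame, before anything else in the network cooks.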
I’m wondering if there are any tips for HW, drivers, Touch parameter tweaks, or CODEC options that might help with this situation.
How are you using the GPUs? Are you just using one TD instance spanning the outputs on all the GPUs? This can cause issues, so check out this article: derivative.ca/wiki088/index. … phic_Cards
I’m not sure what the decode performance of huffyuv is. What does the Info CHOP say for the ‘last_frame_decode_time’ for one of these movies? I find that lossless H264 is always an option (compressible with ffmpeg).
1 TD instance for all outputs - correct. Running in Quadro Mosaic mode. I’m not on the machine now so I can’t check the Info CHOP values but I will tomorrow.
1920x7560 movie:
I’ve tried h264 lossless, which works as a codec, but with a 1920x7560 movie I get anywhere from 7fps to 50-60fps (it seems to depend on how much is moving).
But I need to understand WHY it drops to 7fps and how to get it higher.
1920x3240 movie:
I’ve tried taking a smaller (3screen) h264 file and duplicating it to fill 6 screens and I get fps of
What I also see when the fps is low is ‘Waiting for frame for #### - filename from harddrive’ in Performance Monitor. Can I assume then that this is really a DISK IO issue?
I see the moviein cook at 2.073ms and movie of 108.389ms as an example. In raw FPS I see no difference between the 7screen input and 3screen input movies.
I’m trying to figure out if I need to:
get faster disk io (seems odd with 3xRAID0 SSD but…)
run separate instances of touch with gpu affinity if I’m being bound by decode on GPU?
get more CPU and/or instances if I’m being bound by decode in CPU
Any thoughts on how to progress with figuring out a solution to my bottleneck?
Your sentence was cut off for the FPS you get with the 1920x3240 movie. Do you get ok FPS with the smaller file?
You are most likely running into decode time issues here, not SSD/HDD read speed (sorry, that error message isn’t entirely clear; it covers both the read from the SSD and the decode time).
You can use the “disk_read_bit_rate” Info CHOP channel to look at how much data is being read for your file. This number is in megabits/s. If it gets anywhere near your SSD’s read speed, then you are probably running into issues with SSD read speed.
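One thing to watch is the units: the channel is in megabits/s, while drive specs are usually quoted in megabytes/s. A small sketch of the comparison follows; the reading and the drive rating in it are made-up example values, so plug in your own.

```python
def megabits_to_megabytes(mbit_per_s):
    """Convert megabits/s (the Info CHOP unit) to megabytes/s."""
    return mbit_per_s / 8.0


def disk_usage_fraction(read_mbit_per_s, disk_mbyte_per_s):
    """Fraction of the drive's rated sequential read used by the movie."""
    return megabits_to_megabytes(read_mbit_per_s) / disk_mbyte_per_s


# Example values only (assumptions, not measurements):
peak_reading = 255   # disk_read_bit_rate, megabits/s
ssd_rating = 500     # rated sequential read, megabytes/s

usage = disk_usage_fraction(peak_reading, ssd_rating)
print(f"{megabits_to_megabytes(peak_reading):.1f} MB/s "
      f"is {usage:.0%} of {ssd_rating} MB/s")
```

If that fraction stays small, the drive is unlikely to be the bottleneck and decode time is the more likely culprit.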
You should split your content into smaller files for sure, as I don’t think H264 can decode such a high resolution in real time. For high resolutions I’d also suggest looking into HAP Q as a codec. You may need to split your work across more SSDs with this codec since it’s not very compressed, but it’s super fast to decompress.
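As a rough sketch of the splitting step, the helper below builds one ffmpeg command per 1920x1080 strip using the crop filter and libx264 in lossless mode (-qp 0). The input and output file names are hypothetical, and it only prints the commands rather than running them. Check the pixel format of your source before trusting the result: if ffmpeg has to convert the pixel format, the output is no longer bit-exact.

```python
def tile_commands(src, width=1920, full_height=7560, tile_height=1080):
    """Build one ffmpeg argument list per horizontal strip of the source."""
    if full_height % tile_height:
        raise ValueError("tile height must divide the full height evenly")
    commands = []
    for index in range(full_height // tile_height):
        y = index * tile_height  # top edge of this strip in the source
        commands.append([
            "ffmpeg", "-i", src,
            "-vf", f"crop={width}:{tile_height}:0:{y}",
            "-c:v", "libx264", "-qp", "0",  # qp 0 selects lossless mode
            "-preset", "ultrafast",
            f"tile_{index}.mp4",            # hypothetical output name
        ])
    return commands


# "input.avi" is a placeholder for the huffyuv source file.
for cmd in tile_commands("input.avi"):
    print(" ".join(cmd))
```

For the 1920x7560 file this gives seven commands, one per screen-sized strip.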
Oops, I get the same 7-50 fps range on 1920x3240 as 1920x7560!
last_frame_decode_time is 2-3 (ms?) up to 100-200 (when FPS drops)
last_gpu_upload is 0.1-0.4
disk_read_bit_rate is 1-255 (but a single SSD is rated up to 500 for seq. read and here I have 3 in RAID0)
I must use a lossless codec - is HAP Q lossless?
I brought up Task Manager and looked at the CPU as best I could while this was happening and at the low FPS I see serious CPU spiking (this is an 8core Xeon) - 4 cores up close to 100% (overall around 50%).
I have another CPU coming but I’m not sure if/how to exploit it best.
To use gpu affinity do I need to disable Mosaic? Is tearing likely then? If I run sep. instances will that help the CPU load?
HAP Q isn’t lossless, but it’s very hard to artifact with it. If you do a diff between the source image and the HAP encoded frame you can see the compression losses, but you’d be hard pressed to find those errors without the diff, in my experience. May be worth a test.
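If you want to run that test outside of Touch, a minimal sketch of the diff is below. It assumes you have already dumped the source frame and the decoded frame as raw byte buffers of the same size and layout (how you export them is up to you); the sample values are made up.

```python
def frame_diff_stats(source, decoded):
    """Compare two raw frames sample by sample."""
    if len(source) != len(decoded):
        raise ValueError("frames must have the same size and layout")
    diffs = [abs(a - b) for a, b in zip(source, decoded)]
    return {
        "max_error": max(diffs, default=0),          # worst single sample
        "changed_samples": sum(1 for d in diffs if d),
        "total_samples": len(diffs),
    }


# Tiny made-up buffers standing in for real frame data.
source = bytes([10, 20, 30, 40, 50, 60])
decoded = bytes([10, 21, 30, 38, 50, 60])
stats = frame_diff_stats(source, decoded)
print(stats)
```

A truly lossless round trip would report zero changed samples. Whether a small non-zero error is acceptable depends on how sensitive the autostereoscopic encoding is.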
I would try going down to 1920x1080 movies so you are playing standard HD resolution. Then add more movies at this resolution and see if that plays ok.
I would focus on this right now, since other issues like mosaic/affinity would help a constant low frame rate, not the variable frame rate you are seeing right now.
I don’t think gpu affinity will help in your case, but yes, you can have affinity and mosaic at the same time, and you need to, in order to avoid tearing. You just have to set up multiple mosaic groups, so you have one mosaic group (with a size matching the number of monitors connected to the gpu) per gpu. Requires latest drivers, the configuremosaic.exe … but it works nicely here on a (4)x4x1 mosaic
I was thinking of gpu affinity ONLY if I’m hitting an h264 decode issue with Mosaic.
If it can’t use HW decode across cards and that is a bottleneck (dropping back to CPU which then drops fps) then gpu affinity might help (only 2-3 screens per card).
All H264 decodes are done on the CPU, so your GPU config isn’t the issue here. Nvidia’s HW H264 decode wouldn’t support the lossless H264 profile anyways.
If I have multiple files, will Touch make more use of the other cores/cpus in 1 instance (more than the 4 it seems to use for 1 video right now) or must I set up multiple instances tied to specific cpus to maximize that?
It depends on the codec. Some codecs have threading supported such as H264. H264 does frame level decoding (1 thread per frame), and by default it uses 4 threads (so 4 frames at a time). You can change this with the TOUCH_MAX_DECODE_THREADS env var, but I find 4 is a good number. If you have loads of movies open your memory usage/thread count will explode if you make this too high. Also, having a higher number of threads will increase your seek/loop times since it’s essentially creating a larger buffer queue.
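To get a feel for how this scales, here is a back-of-the-envelope sketch. It assumes one in-flight decoded frame per thread and 4 bytes per pixel for a decoded frame; both are assumptions for illustration, not a description of what the decoder actually allocates.

```python
def decode_footprint(movies, threads_per_movie=4,
                     width=1920, height=1080, bytes_per_pixel=4):
    """Rough thread count and buffered-frame memory (MiB) for open movies."""
    threads = movies * threads_per_movie
    frame_bytes = width * height * bytes_per_pixel  # assumed decoded size
    buffered_mib = threads * frame_bytes / 2**20    # one frame per thread
    return threads, buffered_mib


threads, mib = decode_footprint(movies=7)
print(f"7 movies at 4 threads: {threads} threads, about {mib:.0f} MiB")

# Doubling the per-movie thread count doubles both numbers.
threads_8, mib_8 = decode_footprint(movies=7, threads_per_movie=8)
print(f"7 movies at 8 threads: {threads_8} threads, about {mib_8:.0f} MiB")
```

With only 8 cores available, 28 decode threads are already more than the machine can run at once, which is part of why raising the value is unlikely to help here.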