My understanding is that the master decodes the source file and sends
out frames to the slaves in raw (uncompressed) format. The bit depth and
colour subsampling (444, 422 etc.) of the source material are maintained
in what's sent across the network. Therefore network bandwidth depends
on resolution, bit depth and colour sub-sampling of source material.
I don't know if colour space conversion is performed on the slaves or
the master.
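
To put rough numbers on that, here is a minimal sketch in Python. It assumes the samples go over the wire tightly packed at the source bit depth, which I have not verified against the DCP-o-matic code, and it ignores protocol overhead. The function names and the 10-bit figure for ProRes HQ are my own choices for illustration.

```python
# Samples per pixel (Y + Cb + Cr) for common chroma subsampling schemes.
SAMPLES_PER_PIXEL = {"444": 3.0, "422": 2.0, "420": 1.5}


def raw_frame_bytes(width, height, bit_depth, subsampling):
    """Size of one uncompressed frame, ASSUMING tightly packed samples."""
    bits = width * height * SAMPLES_PER_PIXEL[subsampling] * bit_depth
    return bits / 8


def max_fps(frame_bytes, link_bits_per_second=1_000_000_000):
    """Upper bound on frames per second over a link, ignoring overhead."""
    return link_bits_per_second / (frame_bytes * 8)


# 1920x1080, 4:2:2, assumed 10 bits per sample
hd_422 = raw_frame_bytes(1920, 1080, 10, "422")
print(f"{hd_422 / 1e6:.2f} MB per frame, "
      f"at most {max_fps(hd_422):.1f} fps on gigabit")
```

If frames are actually padded to 16 bits per sample on the wire, the frame size goes up and the ceiling drops accordingly.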
I've seen network speed become the limiting factor when I had 4 x 4-core
iMacs on a gigabit network, encoding from ProResHQ 1920x1080 (422 colour
sub-sampling). With a larger number of slaves, I've also hit the limit of
how fast the master can decode the source file (about 60fps).
Of course it'd be more efficient if the source file was streamed to
slaves over the network in its native codec (compressed), but I imagine
that'd be hard to implement with codecs like H264 where frames are not
independent of each other (long-GOP).
In your case, Wolfgang, my guess is that there are two things going on:
1. Network saturation due to 12-bit colour and no subsampling
2. Frames being corrupted/lost in transit (the async_read errors)
The two issues may be related. This is all just a guess though.
Maybe try reducing number of encoding threads on the slaves and see if
that reduces the errors?
Jim
On 11/05/2018 17:21, Manuel AC via DCPomatic wrote:
I always thought the slaves received a transformed frame in a
stable format (unknown to me), and that this is why the master gets easily
overloaded when converting unfriendly formats.
Manuel AC
On Fri, May 11, 2018 at 8:50 AM, Wolfgang Woehl via DCPomatic
<dcpomatic(a)carlh.net> wrote:
All systems are Windows 10; dcpomatic version
2.12.4 git 001a7047cb on all systems (current version).
Yes, master is encoding as well (and carrying the brunt of the load).
Hm, network throughput ... the tiffs are ~ 9 MB (75 Mbit). That should kind of work for 3
slaves, no?
Wolfgang
On 11.05.2018 at 14:24, Carsten Kurz via DCPomatic <dcpomatic(a)carlh.net> wrote:
Source material is 12-bit TIFFs of 9 MB each.
We're seeing a lot less throughput than expected: ~ 6 fps in total. On the slaves,
the status shows ~ 0.1 to 0.2 fps each.
Hi Wolfgang,
which version of DCP-o-matic are you using? I assume Linux for both master and slaves?
Do you allow the master to encode as well?
I guess 6 * 9MB could already be pushing it for a gigabit network. Hard to say whether
there is a code-related issue. Carl will comment.
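
As a back-of-envelope check on that guess, a small Python sketch. It assumes every frame crosses the network once at roughly its TIFF size (9 MB) and uses the nominal gigabit line rate. Neither assumption is confirmed: the master encodes some frames itself, the wire format may differ from the file size, and usable throughput is below nominal. The function name is made up for illustration.

```python
GIGABIT = 1_000_000_000  # nominal line rate in bits per second


def link_utilisation(frame_megabytes, frames_per_second, link_bps=GIGABIT):
    """Fraction of the nominal link rate used by a given frame stream."""
    bits_per_second = frame_megabytes * 1_000_000 * 8 * frames_per_second
    return bits_per_second / link_bps


# 9 MB frames at the observed ~6 fps
print(f"utilisation at 6 fps: {link_utilisation(9, 6):.0%}")

# Frame rate at which 9 MB frames would fill the nominal link
print(f"ceiling: {GIGABIT / (9 * 1_000_000 * 8):.1f} fps")
```

Under these assumptions, 6 fps sits below the nominal ceiling, so the wire alone may not explain the figures; real-world overhead and the master's decode speed would need measuring.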
With smaller source files, we have seen network encoding up to 50fps. I have done a
Sintel test at 2048*858/8Bit at 17fps on a couple of older OS X machines.
Sometimes 'strange' TIFF versions could also cause slowdowns.
I guess I would try to gain some insight by adding one machine at a time. I think a while
ago Carl added code to support encoding servers on multiple network segments. Maybe adding
more network ports/cards will help.
I love benchmarking, but I don't have access to such a configuration currently.
- Carsten
_______________________________________________
DCPomatic mailing list
DCPomatic(a)carlh.net
http://main.carlh.net/cgi-bin/mailman/listinfo/dcpomatic