Hi Tom.

Thanks for your reply. I believe DoM does frame-level hashing when it sends frames over the network to remote encoding nodes (to catch any data corruption in transit), but I don't know whether it does this for local encodes.

In general terms, it is possible to hash a file *while* it is being created, as long as the file is written sequentially (i.e. no random-access writes). Rather than waiting for the whole file to be written and then running the hash algorithm over it, you feed the data into the hash algorithm as it is produced, so hashing happens concurrently with creation of the file and the hash reflects the data before it reaches the disc.
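
As a rough sketch of what I mean (in Python, and nothing to do with how DoM is actually implemented): the function names `write_and_hash` and `hash_file` are made up for illustration, and SHA-1 is used only as an example algorithm.

```python
import hashlib


def write_and_hash(chunks, path):
    """Write chunks to path in order, hashing each chunk in memory
    before it is written. Returns the hex digest of everything written."""
    hasher = hashlib.sha1()
    with open(path, "wb") as f:
        for chunk in chunks:
            hasher.update(chunk)  # hash the in-memory data...
            f.write(chunk)        # ...then write the same bytes out
    return hasher.hexdigest()


def hash_file(path, block_size=1024 * 1024):
    """Hash a file by reading it back from disc (the 'afterwards' approach)."""
    hasher = hashlib.sha1()
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            hasher.update(block)
    return hasher.hexdigest()
```

If both digests were kept, a mismatch between `write_and_hash(...)` and a later `hash_file(...)` on the same path would indicate that the data changed somewhere between being generated and being read back.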

But I don't know whether DoM works like this. Because creation of the file can be interrupted and later resumed in DoM, this would be a bit more complicated to implement, but it is still possible.

Jim


On 16/10/2018 00:40, Tom Haines wrote:
I don't know if DoM does frame-level hashing while encoding, but the hashes stored in the DCP and read by the cinema servers are hashes of the full MXF files, so they can't be calculated until all frames are encoded.

On Mon, Oct 15, 2018 at 4:34 PM Jim Dummett via DCPomatic <dcpomatic@carlh.net> wrote:
Hi Carl.

I have a rather obscure question: At what point in the process of DCP
encoding are the hashes calculated?

Reason I am asking is that I have been making a large batch of DCPs but
now I suspect that one of the drives I've been using is faulty. Some
other files on the drive have become corrupt.

I've checked that all the DCPs on that drive pass a hash-check, but...
are the hashes calculated:

1. As the MXF files are created *before* they are written to disc?
or
2. Files written to disc first, and then read back from disc afterwards
to calculate hashes?

Because DCP-o-matic has the lengthy "calculating checksums" phase at the
end of the encode, I suspect it might be the latter.

If it is, then I am wondering if it's possible some of the DCPs I've
made are corrupt.

I'm thinking that if any of the data got corrupted between DCP-o-matic
writing it to disc and then reading it back again to calculate the
hashes, the hashes would reflect the corrupted data. And so the fact
that my hash-check passes wouldn't guarantee the DCP is as it should be.
Or is there some other integrity checking process which would make this
impossible?

Sorry for the rather random question.

Many thanks,

Jim




