Different CRC32 values from Linux and Windows with Python3

I've been pulling my hair out over some mismatched CRC-32 values. I originally thought my own C# implementation was at fault, but I have ruled out the common problems (binary vs. text file mode, etc.) and looked through the Python and zlib implementations. Now I think it is something subtle in the hardware or file system. Both results below come from the following Python script:

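import binascii

# Read the whole file into memory in binary mode
with open("my_sample_file.zip", "rb") as f:
    buf = f.read()
print(len(buf))
# Mask the result to an unsigned 32-bit value
print(binascii.crc32(buf) & 0xffffffff)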

First machine: Ubuntu 18.04.2 LTS. Intel 64 bit. Running Python 3.7.3, 64 bit (bit size confirmed with sys.maxsize). This gives a length of 6239603497, and a CRC of 720283681.

Second machine: Windows 10 PC, Intel 64 bit. Running Python 3.7.2, 64 bit. The same script gives a length of 6239603497 (good), and a CRC of 2907064737!

The zip file unzips fine, and the contents are good.

So why the difference? As far as I can tell, both are using the same underlying zlib library. The only clue is that this is a very large file (around 6 GB). Smaller hand-typed "hello world" tests give the same CRC on both machines.
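One way to check whether the single 6 GB buffer is what matters would be to feed the file to binascii.crc32 in smaller chunks, passing the previous result as its optional second argument so the CRC carries across calls. A minimal sketch, where the helper name crc32_of_file and the 1 MiB chunk size are assumptions chosen for illustration:

```python
import binascii


def crc32_of_file(path, chunk_size=1 << 20):
    """Compute a CRC-32 by feeding the file to binascii.crc32 chunk by chunk.

    crc32_of_file and chunk_size are illustrative names, not part of any library.
    """
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # The second argument is the running CRC from the previous chunks
            crc = binascii.crc32(chunk, crc)
    return crc & 0xffffffff
```

If the chunked result agrees on both machines, that would suggest the discrepancy lies in how the one-shot call handles a very large buffer, rather than in the hardware or file system.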

I'm currently thinking of dropping the CRC check and replacing it with a simpler checksum or XOR that I can code myself and that has other advantages (e.g. checksums for download chunks can be pre-calculated and combined at the end for speed).
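As a rough sketch of that idea, a byte-wise XOR checksum can be computed per chunk and the per-chunk values XORed together at the end. The function names xor_checksum and combine_checksums are hypothetical, and this kind of checksum is much weaker than CRC-32: it cannot detect reordered bytes or chunks, for example.

```python
from functools import reduce
from operator import xor


def xor_checksum(data: bytes) -> int:
    """XOR every byte of data together; the result is in the range 0..255."""
    return reduce(xor, data, 0)


def combine_checksums(checksums) -> int:
    """Combine per-chunk XOR checksums; chunk order does not matter."""
    return reduce(xor, checksums, 0)


# Checksums computed per chunk combine to the checksum of the whole buffer
data = b"hello world, this is a sample payload"
chunks = [data[i:i + 8] for i in range(0, len(data), 8)]
combined = combine_checksums(xor_checksum(c) for c in chunks)
print(combined == xor_checksum(data))
```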

python
linux
python-3.x
windows
crc32
asked on Stack Overflow Jun 23, 2019 by winwaed • edited Jun 23, 2019 by winwaed

0 Answers

Nobody has answered this question yet.


User contributions licensed under CC BY-SA 3.0