Use ARMv8 CRC Extension to calculate CRC-32/MPEG-2 Checksum


I am currently trying to use the ARMv8 CRC instructions to accelerate the calculation of CRC-32/MPEG-2 checksums.

The only examples I have found that use these instructions calculate regular CRC-32 checksums:

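#include <arm_acle.h>   /* __crc32b, __crc32d; build with CRC enabled, e.g. -march=armv8-a+crc */
#include <stdint.h>

/* ZLIB_INTERNAL and z_size_t come from zlib's own headers. */
uint32_t ZLIB_INTERNAL armv8_crc32_little(unsigned long crc,
                                          const unsigned char *buf,
                                          z_size_t len)
{
    /* Regular CRC-32 inverts the incoming value before processing. */
    uint32_t c = (uint32_t) ~crc;

    /* Consume single bytes until buf is 8-byte aligned. */
    while (len && ((uintptr_t)buf & 7)) {
        c = __crc32b(c, *buf++);
        --len;
    }

    const uint64_t *buf8 = (const uint64_t *)buf;

    /* Main loop, unrolled: 8 x 8 bytes = 64 bytes per iteration. */
    while (len >= 64) {
        c = __crc32d(c, *buf8++);
        c = __crc32d(c, *buf8++);
        c = __crc32d(c, *buf8++);
        c = __crc32d(c, *buf8++);

        c = __crc32d(c, *buf8++);
        c = __crc32d(c, *buf8++);
        c = __crc32d(c, *buf8++);
        c = __crc32d(c, *buf8++);
        len -= 64;
    }

    /* Remaining whole 8-byte words. */
    while (len >= 8) {
        c = __crc32d(c, *buf8++);
        len -= 8;
    }

    buf = (const unsigned char *)buf8;

    /* Trailing bytes (fewer than 8). */
    while (len--) {
        c = __crc32b(c, *buf++);
    }

    /* Regular CRC-32 inverts the result again (final XOR with 0xFFFFFFFF). */
    return ~c;
}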

The difference, according to this list, seems to be that for CRC-32/MPEG-2 the input and output are not reflected and the output is not XORed with 0xFFFFFFFF.
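
For reference, the parameters usually listed for CRC-32/MPEG-2 (polynomial 0x04C11DB7, initial value 0xFFFFFFFF, no reflection, no final XOR) can be written as a bit-at-a-time routine. This is only a sketch for cross-checking results, not an optimized implementation, and the name crc32_mpeg2_bitwise is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine: CRC-32/MPEG-2, one bit at a time.
 * Assumed parameters: poly 0x04C11DB7, init 0xFFFFFFFF,
 * input/output not reflected, no final XOR. */
uint32_t crc32_mpeg2_bitwise(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        /* Non-reflected: the next byte enters at the top of the register. */
        crc ^= (uint32_t)(*buf++) << 24;

        /* Shift left, applying the polynomial whenever a 1 falls out. */
        for (int k = 0; k < 8; k++)
            crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u : (crc << 1);
    }

    return crc; /* no final XOR */
}
```

For the single byte 0x00 this returns 0x4E08BFB4, which matches the CRC32/MPEG2 value in the example below.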

I tried using the algorithm above to calculate CRC-32/MPEG-2 checksums by first reversing every byte of the data (for now I used a single byte of 0 as data, so this step can be skipped). I then XORed the resulting CRC-32 with 0xFFFFFFFF and reversed it, but the result does not match.

For example:
DATA = 0x00
CRC32 = 0xD202EF8D
CRC32/MPEG2 = 0x4E08BFB4

0xD202EF8D xor 0xFFFFFFFF = 0x2DFD1072
reverse( 0x2DFD1072 ) = 0x13822FED

I am afraid I am in over my head here, mathematically speaking. Is it even possible to convert the result of a CRC-32 calculation to CRC-32/MPEG-2? Is there a way to modify the calculation routine above?

Thanks

c++
checksum
crc
crc32
asked on Stack Overflow Nov 16, 2020 by JohannesW

1 Answer


"Reflected" is referring to bits. For a non-reflected CRC with the same polynomial, you would need to reverse the bits of every byte you feed to it. You could use a table to do that. You would then need to reverse the bits of the resulting CRC.

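A minimal sketch of that approach, assuming polynomial 0x04C11DB7 and initial value 0xFFFFFFFF for CRC-32/MPEG-2. The helper names (reverse8, reverse32, crc32_step, crc32_mpeg2_via_reflected) are invented here. `__crc32b` is used only when the compiler defines `__ARM_FEATURE_CRC32`; otherwise a bitwise fallback stands in for it so the code can be checked on other machines. The register starts at 0xFFFFFFFF as in the zlib routine, and the final inversion is simply left out because CRC-32/MPEG-2 has no final XOR.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>
#endif

/* Reverse the bit order within one byte (a 256-entry table would also do). */
static uint8_t reverse8(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4));
    b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2));
    b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1));
    return b;
}

/* Reverse the bit order of a whole 32-bit word (bit 0 <-> bit 31, ...). */
static uint32_t reverse32(uint32_t x)
{
    x = ((x & 0xFFFF0000u) >> 16) | ((x & 0x0000FFFFu) << 16);
    x = ((x & 0xFF00FF00u) >> 8)  | ((x & 0x00FF00FFu) << 8);
    x = ((x & 0xF0F0F0F0u) >> 4)  | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x & 0xCCCCCCCCu) >> 2)  | ((x & 0x33333333u) << 2);
    x = ((x & 0xAAAAAAAAu) >> 1)  | ((x & 0x55555555u) << 1);
    return x;
}

/* One byte of the reflected CRC-32 update, without any inversions. */
static uint32_t crc32_step(uint32_t c, uint8_t b)
{
#if defined(__ARM_FEATURE_CRC32)
    return __crc32b(c, b);              /* hardware instruction */
#else
    c ^= b;                             /* software stand-in, same result */
    for (int k = 0; k < 8; k++)
        c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
    return c;
#endif
}

/* CRC-32/MPEG-2 computed through the reflected CRC-32 update. */
uint32_t crc32_mpeg2_via_reflected(const unsigned char *buf, size_t len)
{
    uint32_t c = 0xFFFFFFFFu;           /* bit reversal of the init value is itself */

    while (len--)
        c = crc32_step(c, reverse8(*buf++));

    return reverse32(c);                /* no final XOR, only the bit reversal */
}
```

With the single byte 0x00 from the question, the register ends at 0x2DFD1072, and reversing all 32 bits of that word gives 0x4E08BFB4. So the numbers in the question do line up once the last step reverses the whole 32-bit value bit by bit.
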
It would be interesting to know if that plus the hardware instructions is faster than a software implementation of the CRC.
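
As a baseline for such a comparison, a byte-wise table-driven software version might look like the sketch below. It assumes the same parameters (polynomial 0x04C11DB7, initial value 0xFFFFFFFF, no reflection, no final XOR); the names are hypothetical, and the lazy table setup is not thread-safe. No timing results are claimed here.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t mpeg2_table[256];
static int mpeg2_table_ready = 0;

/* Build the 256-entry table for the non-reflected polynomial 0x04C11DB7. */
static void mpeg2_table_init(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t r = n << 24;
        for (int k = 0; k < 8; k++)
            r = (r & 0x80000000u) ? (r << 1) ^ 0x04C11DB7u : (r << 1);
        mpeg2_table[n] = r;
    }
    mpeg2_table_ready = 1;
}

/* Hypothetical software baseline: CRC-32/MPEG-2, one table lookup per byte. */
uint32_t crc32_mpeg2_table(const unsigned char *buf, size_t len)
{
    if (!mpeg2_table_ready)
        mpeg2_table_init();             /* not thread-safe; fine for a sketch */

    uint32_t crc = 0xFFFFFFFFu;

    while (len--)
        crc = (crc << 8) ^ mpeg2_table[(crc >> 24) ^ *buf++];

    return crc;                         /* no final XOR */
}
```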

answered on Stack Overflow Nov 16, 2020 by Mark Adler

User contributions licensed under CC BY-SA 3.0