_mm_crc32_u8 gives different result than reference code

3

I've been struggling with the intrinsics. In particular I don't get the same results using the standard CRC calculation and the supposedly equivalent intel intrinsics. I'd like to move to using _mm_crc32_u16, and _mm_crc32_u32 but if I can't get the 8 bit operation to work there's no point.

static UINT32               g_ui32CRC32Table[256] =
{
    0x00000000L, 0x77073096L, 0xEE0E612CL, 0x990951BAL,
    0x076DC419L, 0x706AF48FL, 0xE963A535L, 0x9E6495A3L,
    0x0EDB8832L, 0x79DCB8A4L, 0xE0D5E91EL, 0x97D2D988L,
....

// Your basic 32-bit CRC calculator
// NOTE: this code cannot be changed
UINT32 CalcCRC32(unsigned char *pucBuff, int iLen)
{
    UINT32 crc = 0xFFFFFFFF;

    for (int x = 0; x < iLen; x++)
    {
        crc = g_ui32CRC32Table[(crc ^ *pucBuff++) & 0xFFL] ^ (crc >> 8);
    }

    return crc ^ 0xFFFFFFFF;
}


UINT32 CalcCRC32_Intrinsic(unsigned char *pucBuff, int iLen)
{
    UINT32 crc = 0xFFFFFFFF;

    for (int x = 0; x < iLen; x++)
    {
        crc = _mm_crc32_u8(crc, *pucBuff++);
    }
    return crc ^ 0xFFFFFFFF;
}
c
crc
asked on Stack Overflow Mar 20, 2015 by David Larson • edited Mar 20, 2015 by mmgross

2 Answers

3

That table is for a different CRC polynomial than the one used by the Intel instruction. The table is for the Ethernet/ZIP/etc. CRC, often referred to as CRC-32. The Intel instruction uses the iSCSI (Castagnoli) polynomial, for the CRC often referred to as CRC-32C.

This short example code can calculate either, by uncommenting the desired polynomial:

#include <stddef.h>
#include <stdint.h>

/* CRC-32 (Ethernet, ZIP, etc.) polynomial in reversed bit order. */
#define POLY 0xedb88320

/* CRC-32C (iSCSI) polynomial in reversed bit order. */
/* #define POLY 0x82f63b78 */

/* Compute CRC of buf[0..len-1] with initial CRC crc.  This permits the
   computation of a CRC by feeding this routine a chunk of the input data at a
   time.  The value of crc for the first chunk should be zero. */
uint32_t crc32c(uint32_t crc, const unsigned char *buf, size_t len)
{
    int k;

    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
    }
    return ~crc;
}

You can use this code to generate a replacement table for your code by simply computing the CRC-32C of each of the one-byte messages 0, 1, 2, ..., 255.

answered on Stack Overflow Mar 20, 2015 by Mark Adler
0

FWIW, I've obtained SW code that demonstrably matches the Intel crc32c instruction, but it uses a different polynomial: 0x82f63b78 The function definitely doesn't match any of the iSCSI test examples here: https://tools.ietf.org/html/rfc3720#appendix-B.4

What's frustrating in all this is every implementation I've tried for CRC-32C comes out with different hashes from all the others. Is there a true piece of reference code out there?

answered on Stack Overflow Oct 2, 2019 by user2465201

User contributions licensed under CC BY-SA 3.0