Work on an array of signed int as if it contained unsigned values

I've inherited some old code that assumes that an int can store values from -2^31 to 2^31-1, that overflow just wraps around, and that the sign bit is the high-order bit. In other words, that code should have used uint32_t, but it didn't. I would like to fix this code to use uint32_t.

The difficulty is that the code is distributed as source code and I'm not allowed to change the external interface. I have a function that works on an array of int. What it does internally is its own business, but int is exposed in the interface. In a nutshell, the interface is:

struct data {
    int a[10];
};
void frobnicate(struct data *param);

I'd like to change int a[10] to uint32_t a[10], but I'm not allowed to modify the definition of struct data.

I can make the code work on uint32_t or unsigned internally:

struct internal_data {
    unsigned a[10];
};
void frobnicate(struct data *param) {
    struct internal_data *internal = (struct internal_data *)param;
    // ... work with internal ...
}

However this is not actually correct C since it's casting between pointers to different types.

Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build? If int is less than 32 bits, the code has never worked anyway. For the vast majority of users, the code should build, and in a way that tells the compiler not to do “weird” things with overflowing int calculations.

I distribute the source code and people may use it with whatever compiler they choose, so compiler-specific tricks are not relevant.

I'm at least going to add

#if INT_MIN + 1 != -0x7fffffff
#error "This code only works with 32-bit two's complement int"
#endif

With this guard, what can go wrong with the cast above? Is there a reliable way of manipulating the int array as if its elements were unsigned, without copying the array?

In summary:

  • I can't change the function prototype. It references an array of int.
  • The code should manipulate the array (not a copy of the array) as an array of unsigned.
  • The code should build on platforms where it worked before (at least with sufficiently friendly compilers) and should not build on platforms where it can't work.
  • I have no control over which compiler is used and with which settings.
c
arrays
casting
signed

1 Answer

  • However this is not actually correct C since it's casting between pointers to different types.

    Indeed, you cannot do such casts, because the two structure types are not compatible. You could however use a work-around such as this:

    typedef union
     {
       struct data data;      /* the public layout, unchanged */
       uint32_t    array[10]; /* the same storage viewed as unsigned */
     } internal_t;
    
     ...
    
     void frobnicate(struct data *param) {
         internal_t* internal = (internal_t*)param;
         ...
     }
    

    Another option, if you can change the original struct declaration but not its member names, is to use a C11 anonymous union:

     struct data {
       union {
         int      a[10];
         uint32_t u32[10];
       };
     };
    

    This means that user code accessing foo.a won't break. But you'd need C11 or newer.

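    Alternatively, you could use a uint32_t* to access the int[10] directly. This is also well-defined, provided uint32_t is a typedef for unsigned int (the usual case when int is 32 bits), since it is then the unsigned equivalent of the effective type int. Using a plain unsigned* avoids depending on how uint32_t is defined.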
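
    As a minimal sketch of that last option, assuming 32-bit two's complement int: the loop body below is a hypothetical stand-in for the real work, and the UINT_MAX guard is an assumption added for the example. The array is modified in place through an unsigned pointer, so the arithmetic wraps instead of overflowing.

```c
#include <limits.h>
#include <stddef.h>

#if UINT_MAX != 0xFFFFFFFF
#error "This sketch assumes a 32-bit unsigned int"
#endif

struct data {
    int a[10];
};

/* Hypothetical body: increments every element with unsigned wrap-around. */
void frobnicate(struct data *param) {
    /* unsigned is the unsigned type corresponding to int, so accessing
       the int objects through this pointer is permitted aliasing. */
    unsigned *u = (unsigned *)param->a;
    for (size_t i = 0; i < 10; i++) {
        u[i] += 1u; /* wraps modulo 2^32, no signed overflow */
    }
}
```

    Callers keep reading the results through the original int member, so the interface is unchanged.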


  • Is there a way I can add compile-time guards so that, for the rare people for whom int isn't “old-school” 32-bit, the code doesn't build?

    The obvious is static_assert(sizeof(int) == 4, "int is not 32 bits"); (static_assert comes from assert.h, and the check assumes 8-bit bytes), but again this requires C11. If backwards compatibility with older C is needed, you can invent some dirty "poor man's static assert":

    #define stat_assert(expr) typedef int dummy_t [(expr) ? 1 : -1];
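
    A possible variant of that trick, sketched here with a hypothetical STAT_ASSERT macro that takes a typedef name so it can be used more than once in a file. A negative array size is a constraint violation, so a false condition stops the build on any conforming compiler.

```c
#include <limits.h>

/* Array size is 1 when expr is true and -1 (invalid) when it is false. */
#define STAT_ASSERT(expr, name) typedef char name[(expr) ? 1 : -1]

/* Both names are hypothetical; they only serve to label the failing check. */
STAT_ASSERT(sizeof(int) * CHAR_BIT == 32, int_must_be_32_bits);
STAT_ASSERT(sizeof(int) == sizeof(unsigned), int_and_unsigned_same_size);
```

    Multiplying by CHAR_BIT checks the width in bits rather than assuming 8-bit bytes.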
    

  • #if INT_MIN != -0x80000000

    Depending on how picky you are, this isn't 100% portable. int could in theory be 64 bits, but probably portability to such fictional systems isn't desired either.

    You could also test the unsigned range instead. Casts are not allowed in #if expressions, so (unsigned int)-1 cannot be used there; take UINT_MAX from limits.h:

    #if UINT_MAX != 0xFFFFFFFF
    

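    It's a better macro regardless, since it doesn't have any hidden implicit promotion gems. Note that in ordinary code on a system with 32-bit int, 0x80000000 has type unsigned int, so -0x80000000 is 100% equivalent to 0x80000000 there. Inside #if, arithmetic is done in intmax_t instead, so the negation does produce a negative value.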
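
    To illustrate the difference between the two contexts, here is a small sketch assuming 32-bit int. The function name negation_stays_positive is made up for the example.

```c
#include <limits.h>

/* Preprocessor view: constants are evaluated as intmax_t, so
   -0x80000000 is a genuinely negative value here. */
#if INT_MIN != -0x80000000
#error "This code only works with 32-bit two's complement int"
#endif

/* Ordinary-code view, assuming 32-bit int: 0x80000000 does not fit in
   int, so it has type unsigned int and negating it wraps back to itself. */
int negation_stays_positive(void) {
    return -0x80000000 == 0x80000000;
}
```

    The guard compiles cleanly on such a platform, while the function returns 1, which shows why the same constant is risky outside of #if.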

answered on Stack Overflow Jun 14, 2019 by Lundin

User contributions licensed under CC BY-SA 3.0