What's the difference between casting a long to int versus using a bitwise AND in order to get the 4 least significant bytes?

Question

I know that in order to get the 4 least significant bytes of a number of type long I can cast it to int/unsigned int or use a bitwise AND (& 0xFFFFFFFF).

This code produces the following output:

#include <stdio.h>

int main()
{
  long n = 0x8899AABBCCDDEEFF;
  printf("0x%016lX\n", n);
  printf("0x%016X\n", (int)n);
  printf("0x%016X\n", (unsigned int)n);
  printf("0x%016lX\n", n & 0xFFFFFFFF);
}

Output:

0x8899AABBCCDDEEFF
0x00000000CCDDEEFF
0x00000000CCDDEEFF
0x00000000CCDDEEFF

Does that mean that the two methods used are equivalent? If so, do they always produce the same output regardless of the platform/compiler?
Also, is there any catch or pitfall while casting to unsigned int rather than int for the purpose of this question?
Finally, why is the output the same if you change the number n to be an unsigned long instead?

c++

casting

bit-manipulation

bitwise-and

asked on Stack Overflow May 27, 2019 by

Gilgames • edited May 27, 2019 by

Gilgames

3 Answers

For your first question, as others have pointed out, the sizes of int and long depend on the platform, so the two methods are not equivalent. In the C data type definitions, note that each type is specified as being "at least XX bits in size".

For the second question, it comes down to this: long and int are signed, meaning that one bit is reserved for the sign (also take a look at two's complement). If you were the compiler, what would you do with negative values (especially the long ones)? As Stephan Lechner mentioned, this is implementation-defined (that is, it is up to the compiler).

Finally, in the spirit of "your code must do what it says it does", the best thing to do if you need masks is to use masks (and, if you use masks, use unsigned types). Don't try to use clever answers. Believe me, they always bite you in the rear. I've dealt with enough legacy code to know that by heart.

answered on Stack Overflow May 27, 2019 by

JACH

The methods are definitely different.

According to the integral conversion rules (cf., for example, this online C++11 standard), a conversion (e.g. through an explicit cast) from one integral type to another depends on whether the destination type is signed or unsigned. If the destination type is unsigned, one can rely on a "modulo 2^n" truncation, whereas with a signed destination type one could run into implementation-defined behaviour:

4.7 Integral conversions [conv.integral]

2 If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]

3 If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

answered on Stack Overflow May 27, 2019 by

Stephan Lechner

What's the difference between casting a long to int versus using a bitwise AND in order to get the 4 least significant bytes?

Type. Casting makes the value an int. ANDing does not change the type.

Range. Depending on the ranges of int and long, a cast may not change the value at all.

IDB and UB. Implementation-defined behavior and undefined behavior come into play when mixing signedness.

To "get" the 4 LSBytes, use & 0xFFFFFFFFu or cast to uint32_t.

OP's question is unnecessarily convoluted.

long n = 0x8899AABBCCDDEEFF; --> Converting a value outside the range of a signed integer type is implementation-defined.

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
C11 §6.3.1.3 3

printf("0x%016lX\n", n); --> Printing a long with "%lX" when its value is outside the common range of long/unsigned long is undefined behavior.

Let's go forward with unsigned long:

  unsigned long n = 0x8899AABBCCDDEEFF;  // no problem,
  printf("0x%016lX\n", n);               // no problem,
  printf("0x%016X\n", (int)n);           // problem, C11 6.3.1.3 3
  printf("0x%016X\n", (unsigned int)n);  // no problem,
  printf("0x%016lX\n", n & 0xFFFFFFFF);  // no problem,

The "no problem" lines are OK whether unsigned long is 32-bit or 64-bit. The output will differ, yet that is OK.

Recall that int and long are not always 32 and 64 bits. The (int, long) widths (16,32), (32,32), and (32,64) are all common.

int is at least 16 bits.
long is at least as wide as int and at least 32 bits.

answered on Stack Overflow May 27, 2019 by

chux - Reinstate Monica • edited May 27, 2019 by

chux - Reinstate Monica

User contributions licensed under CC BY-SA 3.0