C: What happens (in detail) in x=~x if x is of type char?

4

If we have the following code:

char x = -1;
x =~x;

On an x86 platform with MS VS compiler (which partly supports C99) - what happens in detail when it is running?

To my knowledge, the following happens (please correct me if I am wrong):

  • x is assigned the value -1, which is represented by the bit pattern 0xff since a char is represented by one byte.
  • The ~ operator promotes x to an int, that is, it internally works with the bit pattern 0xffffffff.
  • The ~ operator's result is 0x00000000 (of type int).
  • To perform the assignment, the integer promotions apply (principally). Since in our case the operand on the right hand side is an int, no conversion occurs. The operand on the left hand side is converted to int. The assignment's result is 0x00000000.
  • As a side effect, the left hand side of the assignment is assigned the value 0x00000000. Since x is of type char, there is another implicit conversion, which converts 0x00000000 to 0x00.

There are so many things that actually happen - I find it somehow confusing. In particular: Is my understanding of the last implicit conversion (of int to char) correct? What would happen if the assignment's result could not be stored in a char?

c
type-conversion
operators
asked on Stack Overflow Jul 2, 2018 by maya • edited Jul 2, 2018 by maya

3 Answers

7

Indeed ~x is an int type.

The conversion back to char is well-defined if char is unsigned. It's also well-defined, of course, if the value is in the range supported by char.

If char is signed and the value is out of range, then the conversion of ~x to char is implementation-defined, with the possibility that an implementation-defined signal is raised.

In your case, you have a platform with a 2's complement int and a 2's complement char, and so ~x is observed as 0.
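As a small sketch of the well-defined case, the conversion to an unsigned character type can be shown in isolation. The function name narrow_to_uchar is hypothetical and not part of the question; unsigned char is used explicitly so the result does not depend on the signedness of plain char.

```c
#include <limits.h>

/* Hypothetical helper: narrow an int to unsigned char.
   The conversion is well-defined by the C standard: the value is
   reduced modulo UCHAR_MAX + 1 (256 for an 8-bit char). */
unsigned char narrow_to_uchar(int value)
{
    return (unsigned char)value;
}
```

For example, narrow_to_uchar(-1) yields UCHAR_MAX, and narrow_to_uchar(UCHAR_MAX + 1) yields 0.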

Note that MSVC doesn't fully support any C standard, and neither does it claim to.

answered on Stack Overflow Jul 2, 2018 by Bathsheba • edited Jul 2, 2018 by Bathsheba
4

You are almost correct, but you are missing that char has implementation-defined signedness. It can be either signed or unsigned, depending on the compiler.

In either case, the bit pattern of an 8-bit 2's complement char is indeed 0xFF, regardless of its signedness. If char is signed, integer promotion preserves the value, so you still have -1, which is the bit pattern 0xFFFFFFFF with a 32-bit int. If char is unsigned, -1 is converted to 255 upon assignment, and integer promotion gives 255 (0x000000FF). So you'd get a different result.
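The difference can be sketched by spelling out the signedness instead of relying on plain char. The function names are hypothetical, and the values in the comments assume an 8-bit char and a 2's complement int.

```c
#include <limits.h>

/* signed char -1 is promoted to int -1 (all bits set); ~ gives 0. */
int invert_signed_char(void)
{
    signed char x = -1;
    return ~x;                          /* type int, value 0 */
}

/* Converting -1 to unsigned char wraps to UCHAR_MAX (255).
   That is promoted to int 255 (0x000000FF); ~ gives 0xFFFFFF00. */
int invert_unsigned_char(void)
{
    unsigned char x = (unsigned char)-1;
    return ~x;                          /* type int, value -256 */
}
```

Assigning either result back to an 8-bit char type happens to give 0 on typical platforms, which is why the difference is easy to overlook.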

Regarding the integer promotion done by ~, it has only one operand, to its right, and that operand is promoted.

Finally, you assign the result back to char, and the outcome again depends on signedness. The assignment implicitly converts the int value to the type of the left operand, char. If char is unsigned, the value is reduced modulo 256 (for an 8-bit char). If char is signed and the value does not fit, the result is implementation-defined; most likely you get the least significant byte of the int.


From this we can learn:

  • Never use char for storing integer values or for arithmetic; use uint8_t instead. Use char for storing characters only.
  • Never perform bitwise arithmetic on operands that are potentially signed, or that were made signed silently through implicit promotion.
  • The ~ operator is particularly dangerous unless the operand is unsigned int or a larger unsigned type.
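One possible way to follow these rules when inverting a byte is sketched below. The name invert8 is hypothetical; the idea is to widen to unsigned int before applying ~, so that no negative intermediate value appears, and then mask back down to 8 bits.

```c
#include <stdint.h>

/* Hypothetical helper: invert the 8 bits of v using unsigned math only. */
uint8_t invert8(uint8_t v)
{
    /* Cast to unsigned int first so ~ operates on an unsigned type,
       then mask to the low 8 bits before narrowing to uint8_t. */
    return (uint8_t)(~(unsigned int)v & 0xFFu);
}
```

With this, invert8(0xFF) is 0x00 and invert8(0x0F) is 0xF0, independent of whether plain char is signed.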
answered on Stack Overflow Jul 2, 2018 by Lundin • edited Jul 2, 2018 by Lundin
1

To my knowledge, the following happens (please correct me if I am wrong):

x is assigned the value -1, which is represented by the bit pattern 0xff since a char is represented by one byte.

1 is an integer constant of type int. The unary - negates it to -1, which remains an int. That -1 is then assigned to the char x. If char is signed, x takes on the value -1. If char is unsigned, x takes on the value CHAR_MAX, which is then also UCHAR_MAX. The "bit pattern 0xff" is not relevant here, yet.

The ~ operator promotes x to an int, that is, it internally works with the bit pattern 0xffffffff.

x is promoted to int (or to unsigned on rare machines where CHAR_MAX == UINT_MAX; we will ignore that). An int is at least 16 bits. The value -1, when encoded in the overwhelmingly common 2's complement, is an all-1-bits pattern. (Other encodings are possible; we will ignore that too.) If x has the value UCHAR_MAX, then the promoted value has the bit pattern 00...00 1111 1111, assuming an 8-bit char. Other widths are possible; another thing we will ignore.

The ~ operator's result is 0x00000000 (of type int).

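Yes, if char is signed. If char is unsigned and 8 bits wide, x is 255 and ~x is an int with the bit pattern 11...11 0000 0000 (the value -256 in 2's complement). (On the rare machines where CHAR_MAX == UINT_MAX, the result has type unsigned and the value 0.)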

To perform the assignment, the integer promotions apply (principally). Since in our case the operand on the right hand side is an int, no conversion occurs. The operand on the left hand side is converted to int. The assignment's result is 0x00000000.

The assignment itself causes no integer promotions; those already occurred because of ~. A type conversion does occur, from int to char, but that is not a promotion. The result is of type char. The value 0 is in range, so the narrowing yields the value 0 with type char. The value 11...11 0000 0000 arises only when char is unsigned, so its conversion is well-defined (reduction modulo UCHAR_MAX + 1) and also yields 0, with type char. Implementation-defined behavior applies only when an out-of-range value is converted to a signed char.

Had the code been (x =~x) + 0, the char result of (x =~x) would have been promoted to int before the addition.
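The types involved can be inspected with a C11 _Generic selection, as in the sketch below. The macro TYPE_NAME and the function names are hypothetical, and a compiler with C11 support is assumed. The controlling expression of _Generic is not evaluated, so x is never actually modified here.

```c
#include <string.h>

/* Hypothetical macro: map the type of an expression to a string. */
#define TYPE_NAME(e) _Generic((e),          \
        char: "char",                       \
        signed char: "signed char",         \
        unsigned char: "unsigned char",     \
        int: "int",                         \
        unsigned int: "unsigned int",       \
        default: "other")

/* ~x: the operand is promoted, so the result is not a char. */
const char *type_of_inversion(void)
{
    char x = -1;
    (void)x;
    return TYPE_NAME(~x);
}

/* x = ~x: an assignment expression has the type of its left operand. */
const char *type_of_assignment(void)
{
    char x = -1;
    (void)x;
    return TYPE_NAME(x = ~x);
}

/* (x = ~x) + 0: the char result is promoted again for the addition. */
const char *type_of_sum(void)
{
    char x = -1;
    (void)x;
    return TYPE_NAME((x = ~x) + 0);
}
```

On common platforms, where int can represent every char value, these report "int", "char", and "int" respectively.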

As a side effect, the left hand side of the assignment is assigned the value 0x00000000. Since x is of type char, there is another implicit conversion, which converts 0x00000000 to 0x00.

Addressed in previous.

What would happen if the assignment's result could not be stored in a char?

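If char is signed, which value is saved is implementation-defined behavior. It could include (rarely) raising an implementation-defined signal. If char is unsigned, the value is reduced modulo UCHAR_MAX + 1, which is well-defined.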
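One way to avoid depending on the implementation-defined result is to check the range before narrowing. The sketch below assumes a hypothetical helper, store_if_fits, and uses signed char explicitly.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: store value in *out only if it is representable
   as a signed char. Returns false and leaves *out untouched otherwise. */
bool store_if_fits(int value, signed char *out)
{
    if (value < SCHAR_MIN || value > SCHAR_MAX)
        return false;
    *out = (signed char)value;   /* in range, so the value is preserved */
    return true;
}
```

A caller can then decide what to do with an out-of-range value instead of letting the compiler's choice decide silently.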


Bit masking and manipulation are best handled using unsigned types and math.


User contributions licensed under CC BY-SA 3.0