Assigning a character with an acute accent to a C int variable

Given this C statement:

int c = 'é';

At runtime, after this statement has executed, the value of c is 0xffffffe9, whereas I expected 0x000000e9.

Can you explain why it behaves like this and what I should do to get the expected result?

Note that if I use this statement instead:

int c = 0xE9;

I get 0x000000e9 as a value for c.

Thank you for helping me.

c
utf-8
character-encoding
asked on Stack Overflow Apr 6, 2015 by Léa Massiot

1 Answer

char is signed on your platform. A character constant whose single-byte value is above 127 therefore has a negative int value: 0xE9 as a signed char is -23, which shows up as 0xffffffe9 in a 32-bit int. Cast the non-ASCII character constant to (unsigned char), or, if you use gcc or clang, add -funsigned-char as a command-line option.
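As a minimal sketch of the cast, the helper below converts a char to its non-negative code. The names char_code and show_char_code are made up for illustration, and the constant is written as '\xE9' so that the example does not depend on the encoding of the source file:

```c
#include <stdio.h>

/* Hypothetical helper: return the code of ch as a non-negative int.
   Going through unsigned char prevents sign extension when char is signed. */
static int char_code(char ch)
{
    return (unsigned char)ch;
}

/* Hypothetical helper: print the plain conversion next to the cast one.
   On a platform where char is signed and int is 32 bits, the first line
   shows sign extension for codes above 127; the second never does. */
static void show_char_code(char ch)
{
    int plain = ch;             /* may be negative if char is signed */
    int fixed = char_code(ch);  /* never negative */

    printf("plain = 0x%08x\n", (unsigned int)plain);
    printf("fixed = 0x%08x\n", (unsigned int)fixed);
}
```

Calling show_char_code('\xE9') should print the same "fixed" value whether or not the compiler treats char as signed, which is the point of the cast.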

Using non-ASCII characters in your source files is asking for trouble. Depending on the encoding used by your text editor and the encoding the compiler expects, you may end up with even more surprising behaviour than what you saw here.

It is better to avoid them completely... Too bad.
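One way to avoid them, sketched here on the assumption that you know which encoding your program's text uses, is to spell out the bytes with hexadecimal escapes. The identifiers below are illustrative only:

```c
#include <stddef.h>

/* Assumption: Latin-1 (ISO 8859-1) text, where é is the single byte 0xE9. */
static const unsigned char E_ACUTE_LATIN1 = 0xE9;

/* Assumption: UTF-8 text, where é (U+00E9) is the two bytes 0xC3 0xA9. */
static const char E_ACUTE_UTF8[] = "\xC3\xA9";

/* Number of bytes in the UTF-8 form, not counting the terminating NUL. */
static size_t e_acute_utf8_length(void)
{
    return sizeof E_ACUTE_UTF8 - 1;
}
```

With this approach the source file stays pure ASCII, so the result no longer depends on how the editor saved the file or on what input encoding the compiler assumes.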

answered on Stack Overflow Apr 6, 2015 by chqrlie • edited Apr 6, 2015 by chqrlie

User contributions licensed under CC BY-SA 3.0