confusion about int, char, and EOF in C

4

I'm working through the 2nd edition of K&R's classic C programming book. Here's an example from page 17:

#include <stdio.h>
/* copy input to output*/
main()
{
    int c; 
    // char c works as well!!
    while ((c = getchar()) != EOF)
        putchar(c);
}

The book states that int c is used so that it can hold EOF, which turns out to be -1 on my Windows machine with GCC and can't be represented by char. However, when I tried char c, it worked with no problem. Out of curiosity, I tried some more:

int  a = EOF;
char b = EOF;
char e = -1;
printf("%d %d %d %c %c %c \n", a, b, e, a, b, e);

The output is -1 -1 -1 with no character displayed (according to the extended ASCII table, each %c here should display an nbsp (no-break space), but it's invisible).

So how can EOF be assigned to a char without any compiler error?

Moreover, given that EOF is -1, are both b and e above stored as 0xFF in memory? They shouldn't be; otherwise, how can the compiler distinguish EOF from nbsp?

Update:

Most likely EOF (0xFFFFFFFF) is converted to the char 0xFF, but in (c = getchar()) != EOF the LHS 0xFF is promoted to the int 0xFFFFFFFF before the comparison, so the type of c can be either int or char.

In this case EOF happens to be 0xFFFFFFFF, but in theory EOF can be any value that needs more than 8 bits to represent correctly, with the leftmost bytes not necessarily being 0xFFFFFF, in which case the char c approach will fail.

Reference: K&R The C Programming Language 2e


c
char
int
eof
asked on Stack Overflow Sep 22, 2015 by mzoz • edited Oct 19, 2019 by mzoz

2 Answers

2

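This code works because you're using signed chars. If you look at an ASCII table you'll find two things. First, there are only 128 values (0 through 127), which fit in seven bits, so the top bit of a signed char is left over as the sign bit. Second, EOF is not in this table, so the implementation is free to define it as it sees fit, as long as it is a negative int.

The assignment from char to int is allowed by the compiler because you're assigning from a smaller type to a larger type: int is guaranteed to be able to represent any value a char can represent. The opposite assignment, from int to char, also compiles because C converts between integer types implicitly; the value is preserved only if it fits in the narrower type.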

Note also that 0xFF is equal to 255 when interpreted as an unsigned char and -1 when interpreted as a signed char:

0b11111111

However, when represented as 32-bit integers, the two values look very different:

255 : 0b00000000000000000000000011111111
-1  : 0b11111111111111111111111111111111
answered on Stack Overflow Sep 22, 2015 by Alex • edited Sep 22, 2015 by WedaPashi
2

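EOF and 0xFF are not the same, so the program has to be able to distinguish between them. If you read the man page for getchar(), you'll see that it returns the character read as an unsigned char cast to an int, or EOF on end of file or error. Every valid byte therefore comes back as a non-negative value, and the negative EOF can never collide with one as long as c is an int.

With char c, the comparison in your while ((c = getchar()) != EOF) is carried out in int: c is promoted before being compared, as if you had written

((int)c != EOF)

so a signed char holding 0xFF becomes -1 and is mistaken for EOF, while an unsigned char becomes 255 and never matches EOF at all.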
answered on Stack Overflow Sep 22, 2015 by WedaPashi • edited Sep 22, 2015 by WedaPashi

User contributions licensed under CC BY-SA 3.0