What is gcc doing under a bitwise complement, and a negative left shift count when constrained by integer size?

2

I'm trying to eliminate as many gcc warnings as possible in some old code, trying to get "cleaner" code by compiling it with a more recent toolchain. The code writes and reads registers (or memory) on ARMv6 hardware, I can't say I completely understand what it actually does but that's the gist of it for this particular line of the code in question. Side note, all the storage types are uint32.

When looking at it on the C source, it's just a bunch of macros with only 1 value being passed on, for example:

writel(readl(ADDR_GPIO_IOTR1)&(~(3<<IOTR_GPIO(26))),ADDR_GPIO_IOTR1);

That line and many others where that 26 is replaced by other values (30, 58, 59, which I presume are GPIO "pins") are generating the warnings left shift count is negative.

When looking at the preprocessed code, the bit (~(3<<IOTR_GPIO(26))) turns out to be:

(~(3<<(~(3<<((26%16)<<1)))))

That is clearly a negative left shift, no matter the value passed to the macro, that bitwise complement operator is going to turn the result of the shift 3<<anything into a negative number.

Considering that all those 3 numbers are inferred to be of type "int" (they are signed), the result of that operation should always be 0xffffffff, right?

So, IOTR_GPIO(GPIO) is defined as (~(3<<((GPIO%16)<<1))). I wrote a testcase to see what the compiler will do in each step for any value of GPIO I pass on the commandline, this is what I get for a run with 26 as the value of GPIO:

          26%16=0x0000000a     [0b1010]
       0xa << 1=0x00000014     [0b10100]
      3 << 0x14=0x00300000     [0b1100000000000000000000]
    ~(0x300000)=0xffcfffff     [0b11111111110011111111111111111111]

So far, a negative int32, as expected.

3 << 0xffcfffff=0x80000000     [0b10000000000000000000000000000000]

Now what is going on here? I'm pretty sure that shift should have zeroed out everything.

  ~(0x80000000)=0x7fffffff     [0b1111111111111111111111111111111]

So, no, I'm not getting 0xffffffff after all, regardless, I still get 0x7fffffff for almost all values (it changes when 0 > GPIO < 3.

However, here is what happens when I print the result of the whole preprocessed code with a fixed value:

(~(3<<(~(3<<((26%16)<<1)))))=
                0xffffffff     [0b11111111111111111111111111111111]

The clear difference is that for my step-by-step test the compiler does not know the value of GPIO beforehand, as I'm passing that as an argument to my test program. When printing the result of the preprocessed code the compiler has optimized out the value at compile time and returns what I had expected.

So why isn't that negative shift returning all zeros for my testcase?, besides the fact that negative shifts are undefined behavior?

A question to myself is "how the heck is this actually working?" I truly don't expect an answer to that.

But I would like at least an opinion, considering:

  • I have replicated the compilation of this bit of code on a testcase 1:1 (same toolchain, same gcc arguments) of the running code.
  • I even ran the testcase on the ARMv6 hardware in question and I got the exact same results as on a modern gcc-5.3.0 on x86_64 (with or without -m32, as I'm storing everything in uint32_t).
  • There are no other versions of these lines anywhere to be found in history, as far as I can deduce, they were added to "support a new chip" (guessing from the couple #ifdef around this).

What would the intention of the programmer in this case could have been? Even the original toolchain spits the exact same warning, I don't think it was ignored.

What I may be really asking is "how was this intentional"?.

Might it be that at some other point (linking perhaps?) something changes and a different result is being used? Kind of hard to duplicate/testcase/inspect that I think. But I'm going to put a printf there somewhere and run it just to make sure I'm not going crazy.

The testcase I made: negative_shift_test.c

The original, unmodified messed up code: starts here

The complete, indented preprocessed line (#L3093 in the linked code above):

({ 
    do { 
        __asm__ __volatile__ ("mcr p15, 0, %0, c7, c10, 4" : : "r" (0) : "memory"); 
        outer_sync(); 
    } while (0); 
    (
        (void)(
            (void)0,
            *(volatile unsigned int *)(((((0x088CE000) - 0x00000000 + 0xf0000000) + 0x004))) = 
            (( u32) (( __le32)(__u32)(
                ({ 
                    u32 __v = ({ 
                        u32 __v = (( __u32)(__le32)(( __le32) ((void)0, *(volatile unsigned int *)(((((0x088CE000) - 0x00000000 + 0xf0000000) + 0x004)))))); 
                        __v;
                    });
                    __asm__ __volatile__ ("mcr p15, 0, %0, c7, c10, 4" : : "r" (0) : "memory");
                    __v; 
                }) & (~(3<<(~(3<<((26%16)<<1))))) /* sits here unmolested */
            )))
        )
    ); 
});

(read address, bitwise & (AND) the result of the read and write that back to the same address, if I understood it correctly).

c
gcc
asked on Stack Overflow Jan 11, 2016 by Javier

2 Answers

3

One side of the problem is just this:

I wrote a testcase

As you said yourself, you wrote a testcase for an unreadable piece of code that happens to work despite hitting undefined behavior. That's really not a big surprise that your testcase does something unexpected, and different. Touching anything in that code can dispel the magic, you can't just deconstruct it and run it bit by bit. Changing the toolchain can also break it, BTW.

Without digging further into what gcc might do with the code, assuming that the facts are all true, this question is unanswerable because it contains a contradiction:

So why isn't that negative shift returning all zeros for my testcase?, besides the fact that negative shifts are undefined behavior?

You seem to expect undefined behavior to have some defined behavior...

OTOH, the question below is easy to answer:

A question to myself is "how the heck is this actually working?" I truly don't expect an answer to that.

The answer is "why not ?". UB can be the behavior that the author expects, as it's "defined" (hum) as any behavior.

So the actual problem is this:

The code writes and reads registers (or memory) on ARMv6 hardware, I can't say I completely understand what it actually does

You can't refactor it without understanding it. That involves finding the author, and torturing him (or her) if necessary. No need to torture other innocent people.

PS oh and that question is easy too:

What would the intention of the programmer in this case could have been? Even the original toolchain spits the exact same warning, I don't think it was ignored.

That's called an evil programmer. One more reason to find him.

PPS I'm betting on a bug, the author forgot that IOTR_GPIO already does the ~(3<< shift and does it twice. The to-infinity-and-beyond 2nd shift doesn't make sense.

answered on Stack Overflow Jan 11, 2016 by Ilya • edited Jan 11, 2016 by Ilya
2

First and foremost, the source in question belongs to the opensourced kernel for a specific Samsung Galaxy model (the GT-S5367).

That being said, that model belongs to the bcm21553 family of boards, for which there were many source code zip packages released by Samsung.

In that family the S5360 is a whole "sub" family with many variants, the Totoro board. The S5367 is also a Totoro board.

When looking for different versions of the same file to spot differences in these lines that performed the negative left shifts, I restricted my search to the S5360 alone, suffice to say I found no differences, every single source had the same bug.

After a while testing with many printk() in the kernel and looking at the generated output, I decided to search on github for the dubious macro itself IOTR_GPIO.

In doing so I found many duplicates of the macro definition on source derived from my own sub family (plenty of board-totoro.c).

But then, to my surprise, a different board, the Torino (still based on the bcm21553 had the same macro definition but without the extra negative shift.

So (I'm assuming) this ended up being just a copypasta bug. I believe the intention was to move the mask to (or from) the macro definition, but the programmer forgot to remove the code off the other side.

The code worked fine because all it does is read a value that is flattened against a mask (created by this macro) and then write it back on the same spot.

Since the actual, working, mask just positions two zero bits across depending on the GPIO pin, when the value being read (and bitwise &'ded) has those two bits also cleared then the full, bogus, 0xffffffff mask results in no difference against the proper mask, and thus the code works fine, even with such a nasty bug in place.

TL;DR, as @ilya pointed out in a comment, the correct macro definition is:

#define IOTR_GPIO(GPIO) ((GPIO%16)<<1)

No negative shift there to worry about, the bitwise complement is done afterwards to create the actual mask and not to shift bits again.

With that change the code compiles without warnings and works just fine as before.

PS Thanks @ilya for helping the brainstorming out :+).

answered on Stack Overflow Jan 12, 2016 by Javier

User contributions licensed under CC BY-SA 3.0