Experiencing odd behavior with bitshifting/masking

1

Backstory: I'm working on a toy compiler that takes some simplified assembly-like text and converts it into 32-bit instructions.

The conversion into 32-bit instructions is working correctly, but when I try to output the results, I run into some issues. Specifically, the macros I wrote to pull out the various segments seem to be mangling values in ways I can't explain.

The following loop is where I run into issues (bit manipulations are stored in macros, but I expanded them out for simpler debugging):

while(pc < num_insts)
{
   printf("full: %x | op: %x | rs: %x | rt: %x | rd: %x | imm: %x\n",
      inst_mem[pc],
      ((inst_mem[pc] >> 26) & 0x0000003F),
      ((inst_mem[pc] >> 21) & 0x0000003F),
      ((inst_mem[pc] >> 16) & 0x0000003F),
      ((inst_mem[pc] >> 11) & 0x0000F800),
      (inst_mem[pc++] & 0x0000FFFF));
}

Which prints out the following:

full: 20450008 | op: 8 | rs: 2 | rt: 5 | rd: 0 | imm: 10
full: 0 | op: 0 | rs: 0 | rt: 0 | rd: 0 | imm: 8

The correct values for those lines would be:

full: 20450008 | op: 8 | rs: 1 | rt: 4 | rd: 0 | imm: 16
full: 20240010 | op: 8 | rs: 2 | rt: 5 | rd: 0 | imm: 8

If I replace that with the simpler

while(pc < num_insts)
{
   printf("full: %x", inst_mem[pc++]);
}

then each full instruction is output as I would expect. This implies to me that the parsing is all working as it should, and that my macros to pull values out of the instruction after parsing are messing with something they shouldn't be. I'm just not sure what that something would be.

Since inst_mem[] contains int32_t values and all shifts are less than 32, it shouldn't be an issue of undefined shift behavior.

If anyone can provide a nudge in the right direction, I would be more than grateful. I'm running out of ideas.

c
bit-manipulation
undefined-behavior
bit-shift
asked on Stack Overflow Sep 22, 2013 by Dan • edited Oct 30, 2013 by Shafik Yaghmour

2 Answers

3

The order of evaluation of function arguments doesn't have to be "first argument first". Therefore, it is very dangerous to have an increment like pc++ in one argument when other arguments of the same call use pc. Change pc++ to pc and increment pc afterwards:

while(pc < num_insts)
{
   printf("full: %x | op: %x | rs: %x | rt: %x | rd: %x | imm: %x\n",
      inst_mem[pc],
      ((inst_mem[pc] >> 26) & 0x0000003F),
      ((inst_mem[pc] >> 21) & 0x0000003F),
      ((inst_mem[pc] >> 16) & 0x0000003F),
      ((inst_mem[pc] >> 11) & 0x0000F800),
      (inst_mem[pc] & 0x0000FFFF));
   pc++;
}
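
A minimal sketch of the same idea, pushed one step further: read each word once, decode it in a helper, and let the for loop own the increment. The names decode, dump and struct fields are made up for illustration, and the field layout (6-bit op, 5-bit rs/rt/rd, 16-bit imm) is an assumption based on the expected output in the question, not something the question states. Note that under this assumption the masks differ from the ones above: rs and rt use 0x1F, and rd is masked with 0x1F after the shift.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed MIPS-like layout (not stated in the question):
   op = bits 31..26, rs = 25..21, rt = 20..16, rd = 15..11, imm = 15..0. */
struct fields {
    uint32_t op, rs, rt, rd, imm;
};

/* Split one instruction word into its fields. Using an unsigned type
   keeps the right shifts free of sign-extension surprises. */
static struct fields decode(uint32_t inst)
{
    struct fields f;
    f.op  = (inst >> 26) & 0x3Fu;   /* 6 bits */
    f.rs  = (inst >> 21) & 0x1Fu;   /* 5 bits */
    f.rt  = (inst >> 16) & 0x1Fu;   /* 5 bits */
    f.rd  = (inst >> 11) & 0x1Fu;   /* 5 bits */
    f.imm = inst & 0xFFFFu;         /* low 16 bits */
    return f;
}

/* Print every instruction. pc is only read inside the loop body and is
   modified only in the for loop's increment clause, never inside the
   printf call. */
static void dump(const uint32_t *inst_mem, size_t num_insts)
{
    assert(inst_mem != NULL || num_insts == 0);
    for (size_t pc = 0; pc < num_insts; pc++) {
        const struct fields f = decode(inst_mem[pc]);
        printf("full: %" PRIx32 " | op: %" PRIx32 " | rs: %" PRIx32
               " | rt: %" PRIx32 " | rd: %" PRIx32 " | imm: %" PRIx32 "\n",
               inst_mem[pc], f.op, f.rs, f.rt, f.rd, f.imm);
    }
}
```

With these masks, decode(0x20240010) gives op 8, rs 1, rt 4, rd 0 and imm 0x10, which printf with %x shows as 10.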
answered on Stack Overflow Sep 22, 2013 by us2012
1

The behavior of the program is both unspecified and undefined. The order of evaluation of function arguments is unspecified: the C99 draft standard, section 6.5.2.2 Function calls, paragraph 10, says (emphasis mine):

The order of evaluation of the function designator, the actual arguments, and subexpressions within the actual arguments is unspecified, but there is a sequence point before the actual call.

So we cannot determine when the sub-expression inst_mem[pc++] & 0x0000FFFF will be evaluated with respect to the other arguments, and therefore we don't know when pc will be incremented.
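
If you want to see which order your own compiler happens to pick, here is a small sketch. The names note, call_order and show_order are hypothetical helpers, not anything from the question. Each argument is a function call that logs its own number, so the program stays well defined (the calls cannot overlap each other) while the order in which they run remains up to the compiler.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_CALLS 8

static int call_order[MAX_CALLS];  /* argument numbers, in evaluation order */
static int num_calls = 0;

/* Record that argument number `id` was just evaluated, then return it. */
static int note(int id)
{
    assert(num_calls < MAX_CALLS);
    call_order[num_calls++] = id;
    return id;
}

/* Pass three logged arguments to printf, then report the order in which
   they were actually evaluated. The order is unspecified, so it may
   differ between compilers or optimization levels. */
static void show_order(void)
{
    num_calls = 0;
    printf("args: %d %d %d\n", note(1), note(2), note(3));
    printf("evaluated in order:");
    for (int i = 0; i < num_calls; i++) {
        printf(" %d", call_order[i]);
    }
    printf("\n");
}
```

The first line always shows the arguments in their written positions; only the second line depends on the compiler. The output in the question (imm taken from the first instruction, every other field from the second) is consistent with the last argument having been evaluated first.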

It is undefined behavior because section 6.5 Expressions, paragraph 2, says (emphasis mine):

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression.72) Furthermore, the prior value shall be read only to determine the value to be stored.73)

and footnote 73 provides the following examples of undefined behavior:

i = ++i + 1;
a[i++] = i;

So between two sequence points, if an object is modified, its prior value may be read only to determine the value to be stored. In your code pc is modified by the pc++ expression and is also read several other times to compute array indexes, which invokes undefined behavior.
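
For contrast, the same footnote lists i = i + 1; and a[i] = i; as allowed. A minimal sketch using just those two forms, with fill_identity as a made-up name for illustration:

```c
#include <assert.h>

/* Fill a[0..len-1] with the values 0..len-1 using only the two statement
   forms the footnote permits. Returns the final value of i. */
static int fill_identity(int *a, int len)
{
    int i = 0;
    assert(len >= 0);
    while (i < len) {
        a[i] = i;    /* i is only read in this statement */
        i = i + 1;   /* i is modified once; its old value is read only
                        to compute the new one */
    }
    return i;
}
```

Each statement ends in a sequence point, so every read and every modification of i is cleanly ordered. The printf loop needs the same separation between reading pc and incrementing it.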

The fix is straightforward: move the pc++ out of the printf call.

answered on Stack Overflow Sep 22, 2013 by Shafik Yaghmour • edited Sep 22, 2013 by Shafik Yaghmour

User contributions licensed under CC BY-SA 3.0