GCC Jump Table initialization code generating movsxd and add?

2

When I compile a switch statement with optimization in GCC, it sets up a jump table like this:

(fcn) sym.foo 148
  sym.foo (unsigned int arg1);
; arg unsigned int arg1 @ rdi
0x000006e0      83ff06         cmp edi, 6                              ; arg1
0x000006e3      0f87a7000000   ja case.default.0x790
0x000006e9      488d156c0100.  lea rdx, [0x0000085c]
0x000006f0      89ff           mov edi, edi
0x000006f2      4883ec08       sub rsp, 8
0x000006f6      486304ba       movsxd rax, dword [rdx + rdi*4]
0x000006fa      4801d0         add rax, rdx                            ; '('
;-- switch.0x000006fd:
0x000006fd      ffe0           jmp rax                                 ; switch table (7 cases) at 0x85c

Is MOVSXD followed by ADD the best way to do that?

movsxd rax, dword [rdx + rdi*4]
add rax, rdx

Isn't that the same as using LEA with a displacement?

lea rax, [rdx + rdi*4 + rdx]

It occurs to me that I probably don't understand what's going on here. RDX seems to be the start of the jump table. RDI is the incoming argument to the switch statement. Why are we adding RDX twice though?

This is the switch statement I was compiling with -O3:

#include <stdio.h>   /* for puts */

int foo (int x) {
  switch(x) {
    //case 0: puts("\nzero"); break;
    case 1: puts("\none"); break;
    case 2: puts("\ntwo"); break;
    case 3: puts("\nthree"); break;
    case 4: puts("\nfour"); break;
    case 5: puts("\nfive"); break;
    case 6: puts("\nsix"); break;
  }
  return 0;
}
gcc
x86
switch-statement
x86-64
jump-table
asked on Stack Overflow Sep 5, 2018 by Evan Carroll • edited Sep 5, 2018 by Evan Carroll

2 Answers

3

GCC is using relative displacements in its jump table (relative to the base of the table), instead of absolute addresses. So the jump table itself is position-independent, and doesn't need fixups when it's relocated, e.g. as part of loading a PIE executable or a PIC shared library.
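To make the position-independence point concrete, here is a minimal C sketch. It is an illustration under assumptions, not GCC's actual output: the blob layout, `build_blob`, and `resolve` are hypothetical names, and the "targets" are just marker bytes rather than code. Each table entry stores the distance from the table base to its target, so the same bytes resolve correctly wherever the blob ends up in memory.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a table of 32-bit offsets at the start of a blob,
   followed by one marker byte per "target". */
enum {
    N_ENTRIES   = 4,
    TABLE_BYTES = N_ENTRIES * sizeof(int32_t),
    BLOB_BYTES  = TABLE_BYTES + N_ENTRIES
};

/* Entry i holds (address of target i) - (address of table base). */
void build_blob(unsigned char *blob) {
    for (int i = 0; i < N_ENTRIES; i++) {
        int32_t rel = (int32_t)(TABLE_BYTES + i);
        memcpy(blob + i * sizeof rel, &rel, sizeof rel);
        blob[TABLE_BYTES + i] = (unsigned char)('A' + i); /* marker for target i */
    }
}

/* Load the 32-bit entry, sign-extend it, and add the table base. */
const unsigned char *resolve(const unsigned char *base, uint32_t index) {
    int32_t rel;
    memcpy(&rel, base + (size_t)index * sizeof rel, sizeof rel);
    return base + (int64_t)rel;
}
```

Copying the blob to a different buffer with `memcpy` and calling `resolve` on the copy finds the same markers without touching any table entry. A table of absolute pointers would need every entry rewritten after such a move.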

If you compile with -fno-pie -no-pie, gcc might choose to use a table of absolute jump targets with jmp [table + rdi*8].

Targets like x86-64 Linux do support runtime data fixups, so a simple jump table of absolute addresses would be possible. But some targets don't support fixups at all, which is why gcc -fPIC / -fpie avoids them entirely. This potential optimization is gcc bug 84011. See discussion there for more.


It's unfortunate gcc is using a jump table instead of realizing that the only difference between the cases is the data, not the code. So really it just needs a table lookup of string pointers. (Which could be done with relative displacements if it wanted to.)

That's a separate missed optimization, which I reported as bug 85585. (That reminds me, I have a followup to that half-written which I should finish and post.)
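As a rough sketch of the string-pointer lookup described above, the question's function could be written by hand like this. The names `case_text`, `foo_text`, and `foo_lookup` are hypothetical, and this is a manual rewrite for illustration, not something GCC is claimed to produce.

```c
#include <stdio.h>
#include <stddef.h>

/* Strings for cases 1..6 of the question's switch. Index 0 is NULL because
   case 0 is commented out in the original. */
static const char *const case_text[] = {
    NULL, "\none", "\ntwo", "\nthree", "\nfour", "\nfive", "\nsix"
};

/* Returns the string for x, or NULL when x has no matching case. */
const char *foo_text(int x) {
    /* One unsigned compare rejects both negative x and x > 6,
       like the `cmp edi, 6` / `ja` pair in the question's disassembly. */
    if ((unsigned)x > 6u)
        return NULL;
    return case_text[x];
}

int foo_lookup(int x) {
    const char *s = foo_text(x);
    if (s)
        puts(s);
    return 0;
}
```

Note that this table holds absolute pointers, so in a PIE or shared library it would need load-time fixups. Storing offsets relative to a base instead would avoid that, at the cost of one extra add.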

answered on Stack Overflow Sep 5, 2018 by Peter Cordes
1

Is MOVSXD followed by ADD the best way to do that?

It could be done with just an add with a qword memory operand. Of course the downside is that it makes the table twice as big.
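The size trade-off can be sketched in C. The type names and `resolve64` are hypothetical, and `N_CASES = 7` is taken from the "7 cases" annotation in the question's disassembly. With 64-bit entries no sign extension is needed, so the base can be added to the loaded entry directly.

```c
#include <stdint.h>

enum { N_CASES = 7 };  /* indices 0..6, as in the question's table */

/* 32-bit entries: each load must be sign-extended before the add. */
typedef int32_t rel32_table[N_CASES];

/* 64-bit entries: the entry can be added to the base as-is,
   but the table occupies twice the space. */
typedef int64_t rel64_table[N_CASES];

/* Base of the table plus the 64-bit relative entry. */
uint64_t resolve64(const int64_t *table, uint32_t index) {
    return (uint64_t)(uintptr_t)table + (uint64_t)table[index];
}
```

Here `sizeof(rel32_table)` is 28 bytes and `sizeof(rel64_table)` is 56 bytes.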

Isn't that the same as using LEA with a displacement?

No, lea does not access memory. It only does arithmetic on the register values, so it would never load the entry from the table.

Why are we adding RDX twice though?

The first time it is used as the base of the table to index into it. The table holds addresses relative to itself, so adding RDX to the value from the table creates an absolute address.
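The difference between the two sequences can be modelled in C. This is a sketch with hypothetical function names, where `rdx` and `rdi` stand for the register values. As an aside, x86 addressing allows only one base and one scaled index register, so the two-base form from the question could not be encoded as written anyway.

```c
#include <stdint.h>

/* What the proposed lea would compute: arithmetic on register values only.
   The table contents never enter the result. */
uint64_t lea_style(uint64_t rdx, uint64_t rdi) {
    return rdx + rdi * 4 + rdx;
}

/* What the compiler's sequence computes:
   movsxd rax, dword [rdx + rdi*4]  -> load 32 bits from the table, sign-extend
   add    rax, rdx                  -> add the table base */
uint64_t movsxd_add_style(const int32_t *table, uint32_t index) {
    int64_t rel = table[index];
    return (uint64_t)(uintptr_t)table + (uint64_t)rel;
}
```

In the second function the base is used once to find the entry and once more to turn the relative entry into an absolute address. The first function never reads memory, so its result depends only on the base and the index.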

By the way this could easily be improved:

mov edi, edi     ; truncate rdi to 32bit

A self-mov cannot be mov-eliminated on current architectures, so it would be better to mov to some other register.

answered on Stack Overflow Sep 5, 2018 by harold

User contributions licensed under CC BY-SA 3.0