So I've been working on a simple format string exploit and for the past 3 hours or so I have been bashing my head against the table wondering why my hex values weren't appearing on the stack.
If anyone can enlighten me, I would appreciate it a lot.
1. Initially I was using Python for scripting when doing these challenges, and for this example in particular:
python -c 'print "AAAAA\xcc\xd5\xff\x4f"' > a
And subsequently viewing the stack in GDB:
format string>
0xffffd550: 0xffffd584 0xf7ffdab8 0x41f95300 0x41414141
0xffffd560: 0x95c38cc3 0x0a4fbfc3 0xf7e2ec00 0xf7f8f820
The bytes I expected (0x4fffd5cc) do not appear after the "AAAAA" (I used five A's because the input is not word-aligned).
2. However, when I use another address that I had been previously working with:
python -c 'print "AAAAA\x5c\x57\x55\x56"' > a
I get:
format string>
0xffffd550: 0xffffd584 0xf7ffdab8 0x41f95300 0x41414141
0xffffd560: 0x5655575c 0x0000000a 0xf7e2ec69 0xf7f8f820
And it seems perfectly fine?
3. Also, when I use something like:
echo -en "AAAAA\xcc\xd5\xff\x4f" > b
I am able to properly set the value into the stack as so:
format string>
0xffffd550: 0xffffd584 0xf7ffdab8 0x41f95300 0x41414141
0xffffd560: 0x4fffd5cc 0x00000000 0xf7e2ec69 0xf7f8f820
Below are the outputs of the files a and b respectively:
AAAAA���O
AAAAAÌÕÿO
The problem with the first example is that your string contains values greater than 0x7F. When Python outputs the string, it decides (based on your system and language settings) that it should write out the characters in UTF-8 format.
UTF-8 expresses characters 0x7F and lower as themselves, so the A and \x4f characters are written out unchanged. However, UTF-8 expresses characters with values above 0x7F as a sequence of multiple bytes. In this case the characters greater than 0x7F are \xcc, \xd5 and \xff. The UTF-8 encodings for those characters are 0xC3 0x8C, 0xC3 0x95 and 0xC3 0xBF respectively. Those are the values that show up in your memory dump: read as little-endian words, the bytes C3 8C C3 95 form 0x95c38cc3, and C3 BF 4F 0A form 0x0a4fbfc3.
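To check the encoding claim yourself, here is a minimal sketch, assuming Python 3 (where a plain string literal is text and has to be encoded explicitly). The variable names are only illustrative. It encodes the same payload both ways and reads the four bytes after the padding as a little-endian word, which is how the dump on a 32-bit x86 target displays them.

```python
import struct

# The payload as a text string: code points U+00CC, U+00D5, U+00FF, then "O" (0x4F).
text = "AAAAA\xcc\xd5\xff\x4f"

utf8_bytes = text.encode("utf-8")      # multi-byte sequences for code points above 0x7F
latin1_bytes = text.encode("latin-1")  # one byte per code point, values unchanged

print(utf8_bytes.hex())    # 4141414141c38cc395c3bf4f
print(latin1_bytes.hex())  # 4141414141ccd5ff4f

# Skip the five "A" bytes and interpret the next four bytes as a
# little-endian 32-bit word, matching the word layout of the memory dump.
(word_utf8,) = struct.unpack("<I", utf8_bytes[5:9])
(word_latin1,) = struct.unpack("<I", latin1_bytes[5:9])

print(hex(word_utf8))    # 0x95c38cc3
print(hex(word_latin1))  # 0x4fffd5cc
```

The UTF-8 word matches the 0x95c38cc3 seen in the first dump, and the latin-1 word matches the 0x4fffd5cc produced by the echo version.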
You could get around this by forcing Python to emit the string using an encoding that handles values above 0x7F by passing them as themselves, without transformation. "latin1" is such an encoding, so you could use this command:
python -c 'print u"AAAAA\xcc\xd5\xff\x4f".encode("latin1")'
but that's ugly.
Also, the Python versions always emit a newline character (0x0A) at the end of the string. It shows up in your memory dump in the word after the values you intended to deliver. You can get around that by writing:
python -c 'import sys; sys.stdout.write(u"AAAAA\xcc\xd5\xff\x4f".encode("latin1"))'
but that's even uglier.
I'd forget trying to use a Python one-liner for this and stick with the echo -ne approach.
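If a script is still preferred over echo, one option is to build the payload as bytes from the start so that no text encoding is involved. This is a sketch assuming Python 3. The helper name build_payload and the script name payload.py are hypothetical, and the address is the one from the question.

```python
import struct
import sys


def build_payload(target_addr, padding=b"AAAAA"):
    """Return the padding followed by target_addr as a 32-bit little-endian word."""
    return padding + struct.pack("<I", target_addr)


payload = build_payload(0x4FFFD5CC)

if __name__ == "__main__":
    # sys.stdout.buffer is the binary layer beneath sys.stdout: the bytes are
    # written as-is, with no encoding step and no trailing newline.
    sys.stdout.buffer.write(payload)
```

Saved as payload.py, this could be run as python3 payload.py > a to produce the same nine bytes as the echo -en command.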