I am reading Hacking: The Art of Exploitation, 2nd Edition, and I'm on the chapter about format string vulnerabilities. I've read the chapter several times but still can't clearly understand it, even after some googling.
So, in the book there is this vulnerable code:
char text[1024];
...
strcpy(text, argv[1]);
printf("The right way to print user-controlled input:\n");
printf("%s", text);
printf("\nThe wrong way to print user-controlled input:\n");
printf(text);
Then after compiling,
reader@hacking:~/booksrc $ ./fmt_vuln $(perl -e 'print "%08x."x40')
The right way to print user-controlled input:
%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.
The wrong way to print user-controlled input:
bffff320.b7fe75fc.00000000.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.
The bytes 0x25, 0x30, 0x38, 0x78, and 0x2e seem to be repeating a lot.
reader@hacking:~/booksrc $ printf "\x25\x30\x38\x78\x2e\n"
%08x.
First, why is that value repeating itself?
The book says: "As you can see, they’re the memory for the format string itself. Because the format function will always be on the highest stack frame, as long as the format string has been stored anywhere on the stack, it will be located below the current frame pointer (at a higher memory address)."
But it seems to me this contradicts what the author wrote earlier, and the way stack frames are organized: "When this printf() function is called (as with any function), the arguments are pushed to the stack in reverse order."
So, shouldn't the format string be at a lower memory address since it is the first argument? And where is the format string stored?
reader@hacking:~/booksrc $ ./fmt_vuln AAAA%08x.%08x.%08x.%08x
The right way to print user-controlled input:
AAAA%08x.%08x.%08x.%08x
The wrong way to print user-controlled input:
AAAAbffff3d0.b7fe75fc.00000000.41414141
Here again, why is AAAA repeated in 41414141? From what I understand, the printf function prints AAAA first, then when it sees the first %08x, it gets a value from a memory address in the preceding stack frame, then does the same with the second %08x, thus the value of the second is located in a memory address higher than the first one, and finally returns to the value of AAAA located in a lower memory address, in the stack frame of
I debugged the first example with $(perl -e 'print "%08x."x40') as the argument. I'm running Linux 5.3.0-40-generic, 18.04.1-Ubuntu, x86_64:
(gdb) run $(perl -e 'print "%08x." x 40')
Starting program: /home/kuro/fmt_vuln $(perl -e 'print "%08x." x 40')
The right way to print user-controlled input:
%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.
The wrong way to print user-controlled input:
07a51260.4b3eb8c0.4b10e154.00000000.4b16c3a0.9d357fc8.9d357b10.78383025.30252e78.2e783830.3830252e.252e7838.78383025.30252e78.2e783830.3830252e.252e7838.78383025.30252e78.2e783830.3830252e.252e7838.78383025.30252e78.2e783830.3830252e.252e7838.78383025.30252e78.2e783830.3830252e.252e7838.4b618d00.4b5fd000.00000000.9d357c80.00000000.00000000.00000000.4b3ef6f0.

Breakpoint 1, main (argc=2, argv=0x7ffd9d357fc8) at fmt_vuln.c:19
19        printf("[*] test_val @ 0x%08x = %d 0x%08x\n", &test_val, test_val, test_val);
(gdb) x/-100xw $rsp
0x7ffd9d357940: 0x00000400 0x00000000 0x4b07c1aa 0x00007fb8
0x7ffd9d357950: 0x00000016 0x00000000 0x00000003 0x00000000
0x7ffd9d357960: 0x00000001 0x00000000 0x00002190 0x000003e8
0x7ffd9d357970: 0x00000005 0x00000000 0x00008800 0x00000000
0x7ffd9d357980: 0x00000000 0x00000000 0x00000400 0x00000000
0x7ffd9d357990: 0x00000000 0x00000000 0x5e970730 0x00000000
0x7ffd9d3579a0: 0x65336234 0x30663666 0x90890300 0x79e57be9
0x7ffd9d3579b0: 0x1cd79dbf 0x00000000 0x00000000 0x00000000
0x7ffd9d3579c0: 0x05cec660 0x000055ef 0x9d357fc0 0x00007ffd
0x7ffd9d3579d0: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffd9d3579e0: 0x9d357ee0 0x00007ffd 0x4b062f26 0x00007fb8
0x7ffd9d3579f0: 0x00000030 0x00000030 0x9d357be8 0x00007ffd
0x7ffd9d357a00: 0x9d357a10 0x00007ffd 0x90890300 0x79e57be9
0x7ffd9d357a10: 0x4b3ea760 0x00007fb8 0x07a51260 0x000055ef
0x7ffd9d357a20: 0x4b3eb8c0 0x00007fb8 0x4b0891bd 0x00007fb8
0x7ffd9d357a30: 0x00000000 0x00000000 0x4b3ea760 0x00007fb8
0x7ffd9d357a40: 0x00000d68 0x00000000 0x00000169 0x00000000
0x7ffd9d357a50: 0x07a51260 0x000055ef 0x4b08af51 0x00007fb8
0x7ffd9d357a60: 0x4b3e62a0 0x00007fb8 0x4b3ea760 0x00007fb8
0x7ffd9d357a70: 0x0000000a 0x00000000 0x05cec660 0x000055ef
0x7ffd9d357a80: 0x9d357fc0 0x00007ffd 0x00000000 0x00000000
0x7ffd9d357a90: 0x00000000 0x00000000 0x4b08b403 0x00007fb8
0x7ffd9d357aa0: 0x4b3ea760 0x00007fb8 0x9d357ee0 0x00007ffd
0x7ffd9d357ab0: 0x05cec660 0x000055ef 0x4b0808f5 0x00007fb8
0x7ffd9d357ac0: 0x00000000 0x00000000 0x05cec824 0x000055ef
(gdb) x/100xw $rsp
0x7ffd9d357ad0: 0x9d357fc8 0x00007ffd 0x9d357b10 0x00000002
0x7ffd9d357ae0: 0x78383025 0x3830252e 0x30252e78 0x252e7838
0x7ffd9d357af0: 0x2e783830 0x78383025 0x3830252e 0x30252e78
0x7ffd9d357b00: 0x252e7838 0x2e783830 0x78383025 0x3830252e
0x7ffd9d357b10: 0x30252e78 0x252e7838 0x2e783830 0x78383025
0x7ffd9d357b20: 0x3830252e 0x30252e78 0x252e7838 0x2e783830
0x7ffd9d357b30: 0x78383025 0x3830252e 0x30252e78 0x252e7838
0x7ffd9d357b40: 0x2e783830 0x78383025 0x3830252e 0x30252e78
0x7ffd9d357b50: 0x252e7838 0x2e783830 0x78383025 0x3830252e
0x7ffd9d357b60: 0x30252e78 0x252e7838 0x2e783830 0x78383025
0x7ffd9d357b70: 0x3830252e 0x30252e78 0x252e7838 0x2e783830
0x7ffd9d357b80: 0x78383025 0x3830252e 0x30252e78 0x252e7838
0x7ffd9d357b90: 0x2e783830 0x78383025 0x3830252e 0x30252e78
0x7ffd9d357ba0: 0x252e7838 0x2e783830 0x4b618d00 0x00007fb8
0x7ffd9d357bb0: 0x4b5fd000 0x00007fb8 0x00000000 0x00000000
0x7ffd9d357bc0: 0x9d357c80 0x00007ffd 0x00000000 0x00000000
0x7ffd9d357bd0: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffd9d357be0: 0x4b3ef6f0 0x00007fb8 0x4b6184c8 0x00007fb8
0x7ffd9d357bf0: 0x9d357c80 0x00007ffd 0x4b3ef000 0x00007fb8
0x7ffd9d357c00: 0x4b3ef914 0x00007fb8 0x4b3ef3c0 0x00007fb8
0x7ffd9d357c10: 0x4b617048 0x00007fb8 0x00000000 0x00000000
0x7ffd9d357c20: 0x00000000 0x00000000 0x4b6179f0 0x00007fb8
0x7ffd9d357c30: 0x4b0030e8 0x00007fb8 0x00000000 0x00000000
0x7ffd9d357c40: 0x4b3efa00 0x00007fb8 0x00000480 0x00000000
0x7ffd9d357c50: 0x00000027 0x00000000 0x00000000 0x00000000
The values that appear before the "%08x." bytes in the wrong-way output sit at lower addresses than the "%08x." bytes. Why? The format string is supposed to be at the top of the stack.
The values that appear after the "%08x." bytes in the wrong-way output sit at higher addresses than the "%08x." bytes, so in the preceding stack frame.
Why is it like this? Shouldn't the output begin from the format string values, or after?
Also, in the book, no values are printed after the "%08x." values, but some are in my case. And some values in the output don't even appear in the stack dump, like 4b16c3a0.
I have to recommend against what you're doing. You're focussing on security vulnerabilities in C without a strong understanding of the language itself. That's an exercise in frustration. As evidence, I offer that every question you're posing about the exercise is answered by understanding printf(3), not stack vulnerabilities.
The output of your perl line (the contents of argv) starts with %08x.%08x.%08x.%08x.%08x. That's a format string. Each %08x is looking for a further printf argument, an integer to print in hex representation. Normally, you might do something like,
int a = 'B';
printf( "%02x\n", a );
which produces 42 much faster than the computer in the Hitchhiker's Guide to the Galaxy.
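What you've done is pass a long format string with zero arguments. printf(3) can't know how many arguments it was passed; it has to infer them from the format string. Your format string tells printf to print a long list of integers. Since none were provided, it reads whatever happens to sit where those arguments would have been. On 32-bit x86, as in the book, that means the stack slots directly above the format-string pointer, which is how it eventually walks into the text buffer in main's frame. On x86-64, as in your gdb session, the first five integer arguments after the format string are taken from registers and only the rest from the stack, which would explain why a value like 4b16c3a0 doesn't show up in your stack dump. You print nonsense because the contents of those locations are unpredictable, or at any rate weren't defined by you.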
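To make the "infer them from the format string" part concrete, here is a minimal sketch of a printf-like function. It is not how glibc implements printf; tiny_format is a hypothetical helper that understands only %x, and it exists purely to show that the number of va_arg fetches is driven by the format string, not by what the caller passed.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a libc function): formats into out, handling
 * only "%x". Returns how many variadic arguments it fetched. */
static int tiny_format(char *out, size_t cap, const char *fmt, ...)
{
    va_list ap;
    size_t used = 0;
    int fetched = 0;

    if (cap == 0)
        return 0;

    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0' && used + 1 < cap; p++) {
        if (p[0] == '%' && p[1] == 'x') {
            /* The format string asked for an argument, so fetch one.
             * Nothing here can check that the caller supplied it. */
            unsigned int value = va_arg(ap, unsigned int);
            int n = snprintf(out + used, cap - used, "%x", value);
            if (n < 0)
                break;
            size_t room = cap - used - 1;
            used += ((size_t)n < room) ? (size_t)n : room;
            fetched++;
            p++; /* skip the 'x' */
        } else {
            out[used++] = *p; /* ordinary character: copy it through */
        }
    }
    va_end(ap);
    out[used] = '\0';
    return fetched;
}
```

Called as tiny_format(buf, sizeof buf, "%x.%x", 0x42u, 0xffu) it fetches two arguments and fills buf with 42.ff. Called with the same format string and no extra arguments it would still run va_arg twice; that is undefined behaviour in C, and in practice it hands back whatever is in the corresponding register or stack slot.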
In the "good" case, the format string is "%s", declaring one argument of type string, which you provided. That works much better, yes.
Most compilers nowadays take special care with printf. They can produce warnings if the format string isn't a compile-time constant, and they can verify that each argument is of the correct type for its corresponding format specifier. The whole chapter in your book can thus be made moot simply by using the compiler's capabilities and paying attention to its diagnostics.
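As a rough sketch of what that looks like in your own code: GCC and Clang let you mark a wrapper as printf-like with the format attribute, so calls to it get the same checking as printf itself. The names format_into and PRINTF_LIKE below are made up for illustration, and which warnings you actually see depends on your compiler and flags (with GCC, -Wformat is part of -Wall, while -Wformat-security has to be requested separately).

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical macro: expands to the GCC/Clang format attribute when it is
 * available, and to nothing on other compilers. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_index, first_arg) \
    __attribute__((format(printf, fmt_index, first_arg)))
#else
#define PRINTF_LIKE(fmt_index, first_arg)
#endif

/* Hypothetical wrapper: argument 3 is the format string and the variadic
 * arguments start at position 4, so the compiler can check each call. */
static int format_into(char *out, size_t cap, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

static int format_into(char *out, size_t cap, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int written = vsnprintf(out, cap, fmt, ap);
    va_end(ap);
    return written;
}
```

With this in place, format_into(buf, sizeof buf, "%s", text) treats text purely as data, so any %08x inside it comes out literally. Writing format_into(buf, sizeof buf, text) instead is the same mistake as printf(text), and it is the pattern that -Wformat-security is meant to flag.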