I am trying to understand how the syscall instruction works on Windows 7 x64 (SP1), so I wrote a 64-bit example with GCC (MinGW64). As far as I know, on Windows every syscall entry point lives in ntdll.dll or ntdll32.dll (in this case, only ntdll.dll matters).
Status = NtCreateFile(&FileHandle, // returned file handle
(GENERIC_WRITE | SYNCHRONIZE), // desired access
&ObjectAttributes, // ptr to object attributes
&Iosb, // ptr to I/O status block
0, // allocation size
FILE_ATTRIBUTE_NORMAL, // file attributes
0, // share access
FILE_SUPERSEDE, // create disposition
FILE_SYNCHRONOUS_IO_NONALERT, // create options
NULL, // ptr to extended attributes
0); // length of ea buffer
This is the original part of the source code, written in C. I then rewrote it as GAS inline assembly:
asm volatile
(
"leaq %4, %%r9\n\t"
"leaq %3, %%r8\n\t"
"movq %2, %%rdx\n\t"
"leaq %1, %%rcx\n\t"
"movq %11,0x50(%%rsp)\n\t"
"movq %10,0x48(%%rsp)\n\t"
"movq %9, 0x40(%%rsp)\n\t"
"movq %8, 0x38(%%rsp)\n\t"
"movq %7, 0x30(%%rsp)\n\t"
"movq %6, 0x28(%%rsp)\n\t"
"movq %5, 0x20(%%rsp)\n\t"
"movq %%r9, 0x18(%%rsp)\n\t"
"movq %%r8, 0x10(%%rsp)\n\t"
"movq %%rdx, 0x8(%%rsp)\n\t"
"movq %%rcx, (%%rsp)\n\t"
"movq __imp_NtCreateFile(%%rip), %%rax\n\t"
"call *%%rax\n\t"
: "=a"(Status)
: "m"(FileHandle), "g"(GENERIC_WRITE | SYNCHRONIZE),"m"(ObjectAttributes),"m"(Iosb),"g"(0),"g"(FILE_ATTRIBUTE_NORMAL),"g"(0),"g"(FILE_SUPERSEDE),"g"(FILE_SYNCHRONOUS_IO_NONALERT),"g"(NULL),"g"(0)
: "%rcx", "%rdx", "%r8", "%r9", "%r10","%r11"
);
So far the program works as expected: it creates a text file and writes something to it.
I used WinDbg to disassemble ntdll!NtCreateFile and saw only the following (rewritten in GAS AT&T syntax):
"movq $0x52, %%rax\n\t"
"movq %%rcx, %%r10\n\t"
"syscall\n\t"
"ret\n\t"
I added this code to my program as follows:
asm volatile
(
"leaq %4, %%r9\n\t"
"leaq %3, %%r8\n\t"
"movq %2, %%rdx\n\t"
"leaq %1, %%rcx\n\t"
"movq %11,0x50(%%rsp)\n\t"
"movq %10,0x48(%%rsp)\n\t"
"movq %9, 0x40(%%rsp)\n\t"
"movq %8, 0x38(%%rsp)\n\t"
"movq %7, 0x30(%%rsp)\n\t"
"movq %6, 0x28(%%rsp)\n\t"
"movq %5, 0x20(%%rsp)\n\t"
"movq %%r9, 0x18(%%rsp)\n\t"
"movq %%r8, 0x10(%%rsp)\n\t"
"movq %%rdx, 0x8(%%rsp)\n\t"
"movq %%rcx, (%%rsp)\n\t"
"movq $0x52, %%rax\n\t"
"movq %%rcx, %%r10\n\t"
"syscall\n\t"
: "=a"(Status)
: "m"(FileHandle), "g"(GENERIC_WRITE | SYNCHRONIZE),"m"(ObjectAttributes),"m"(Iosb),"g"(0),"g"(FILE_ATTRIBUTE_NORMAL),"g"(0),"g"(FILE_SUPERSEDE),"g"(FILE_SYNCHRONOUS_IO_NONALERT),"g"(NULL),"g"(0)
: "%rcx", "%rdx", "%r8", "%r9", "%r10","%r11"
);
Now Status always comes back as 0xc000000d and the program fails. I have several questions:
How are the parameters stored on the user-mode stack passed into kernel mode? I see nothing being done for that within ntdll!NtCreateFile.
How does the correct return value get assigned back to %%rax? That part is also missing from the disassembly.
How can I make my code work as expected when performing a direct syscall?
Thanks a lot for your help.
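For reference, 0xc000000d is the NTSTATUS value STATUS_INVALID_PARAMETER. The following is a minimal sketch in C of how such a value splits into its fields, assuming the documented NTSTATUS layout (severity in bits 31-30, customer flag in bit 29, facility in bits 27-16, code in bits 15-0). The helper names are made up for illustration and are not part of any Windows header.

```c
#include <stdint.h>

/* Hypothetical helpers (not from any Windows header) that split an
   NTSTATUS value into its fields. */

/* Bits 31-30: 0 = success, 1 = informational, 2 = warning, 3 = error. */
unsigned nt_severity(uint32_t status) { return status >> 30; }

/* Bit 29: set for customer-defined codes, clear for Microsoft-defined ones. */
unsigned nt_customer(uint32_t status) { return (status >> 29) & 1u; }

/* Bits 27-16: facility that produced the status. */
unsigned nt_facility(uint32_t status) { return (status >> 16) & 0xFFFu; }

/* Bits 15-0: the status code within that facility. */
unsigned nt_code(uint32_t status) { return status & 0xFFFFu; }

/* Human-readable name for the severity field. */
const char *nt_severity_name(uint32_t status)
{
    static const char *const names[] = {
        "success", "informational", "warning", "error"
    };
    return names[nt_severity(status)];
}
```

Applied to 0xc000000d, these helpers give severity 3 (error), facility 0 and code 0xd, so the kernel rejected one of the arguments rather than failing inside the file system.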
OK, here is the working code:
asm volatile
(
"leaq %4, %%r9\n\t"
"leaq %3, %%r8\n\t"
"movq %2, %%rdx\n\t"
"leaq %1, %%rcx\n\t"
"movq %11,0x50(%%rsp)\n\t"
"movq %10,0x48(%%rsp)\n\t"
"movq %9, 0x40(%%rsp)\n\t"
"movq %8, 0x38(%%rsp)\n\t"
"movq %7, 0x30(%%rsp)\n\t"
"movq %6, 0x28(%%rsp)\n\t"
"movq %5, 0x20(%%rsp)\n\t"
"push $_end \n\t"
"movq %%rcx,%%r10\n\t"
"movq $0x52,%%rax\n\t"
"syscall\n\t"
"ret\n\t"
"_end:\n\t"
: "=a"(Status)
: "m"(FileHandle), "g"(GENERIC_WRITE | SYNCHRONIZE),"m"(ObjectAttributes),"m"(Iosb),"g"(0),"g"(FILE_ATTRIBUTE_NORMAL),"g"(0),"g"(FILE_SUPERSEDE),"g"(FILE_SYNCHRONOUS_IO_NONALERT),"g"(NULL),"g"(0)
: "%rcx", "%rdx", "%r8", "%r9", "%r10","%r11"
);
It is not really painful to simulate the call/ret. Here I used a workaround that Linus once used in Linux 0.11.
I think you are wrong concerning the depth of the stack. Many of the arguments are passed via the stack. The syscall expects them exactly where they would be if the library call were in between.
If you skip the library call and do the syscall yourself (which you should only do for experimenting, not in production code!), there is one item missing on the stack.
So either push a dummy value onto the stack or adjust the offsets.
In detail, the following happens in the original code:
You store the arguments on the stack (the last store being movq %%rcx, (%%rsp)).
You call __imp_NtCreateFile. This pushes the return address onto the stack and transfers %rip to the library function.
If you do the syscall yourself, you have to put another item on the stack to compensate for this return address, which shifts the kernel's view of the stack.
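The shift described above can be sketched as a small offset model in C. It is only an illustration of the explanation in this answer: it assumes 8-byte stack slots, arguments numbered from 1 as in the inline assembly (argument 5 at 0x20(%rsp), argument 11 at 0x50(%rsp)), and a kernel that expects a return address at the top of the user stack. The function names are hypothetical.

```c
#include <assert.h>

enum { SLOT = 8 };  /* size of one stack slot on x64, in bytes */

/* Offset from %rsp at which the caller stores argument n (1-based),
   matching the inline assembly: 0x20 for argument 5, 0x50 for 11. */
long caller_offset(int n)
{
    assert(n >= 1);
    return (long)(n - 1) * SLOT;
}

/* Offset from %rsp of the same argument after `pushed` more bytes have
   been pushed (8 for the return address pushed by `call`). */
long offset_after_push(int n, long pushed)
{
    return caller_offset(n) + pushed;
}

/* Index of the argument the kernel actually reads when it looks for
   argument n, assuming it expects the layout left by a `call`
   while only `pushed` bytes were really pushed before `syscall`. */
int arg_seen_by_kernel(int n, long pushed)
{
    long expected = offset_after_push(n, SLOT); /* where the kernel looks */
    return (int)((expected - pushed) / SLOT) + 1;
}
```

With nothing pushed, arg_seen_by_kernel(5, 0) is 6: every stack argument is read one slot too high, which would explain the failing status. With one extra slot pushed (the return address, or a dummy value), arg_seen_by_kernel(5, 8) is 5 and the arguments line up again.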