OS resets on far jump after disabling paging

I'm modifying a routine that switches to real mode and back to perform a BIOS interrupt, but I'm running into issues with paging. It worked before I enabled paging, but now that my OS uses paging, I need to disable it before entering real mode (and re-enable it afterwards).

My issue is that when performing the far jump to cause the page disabling to take effect, something goes terribly wrong and I get a reboot.

The code shown below first creates an identity mapping using the page table boot_page_table1, which is simply a page table that identity maps the first 4 MiB. This has to be done because I'm currently using paging to run my kernel from higher memory: all kernel code is addressed starting at 0xC0100000 while being loaded starting at 0x00100000. I then flush the TLB and jump to a nearby label, but this time using its address in lower memory. My instruction pointer should now point at identity-mapped code, so it should be safe to disable paging. The paging bit is then cleared in cr0, cr3 is reloaded again because I'm paranoid, and the code to switch modes continues.

The code works by copying itself to low memory at 0x7c00 and then jumping there so it can run in 16-bit real mode.

If I do NOT disable the paging bit and leave everything else the same, the jmpw CODE16:REBASE(p_mode16) works and the infinite loop after the jump is entered, which leads me to think that the problem is caused by how I disable paging. Am I missing something when disabling paging? I saw on other posts that "because what you're doing is very unusual you may run into bugs and compatibility problems with your emulator", but I'm not yet sure whether it's just my code that's wrong.

The code is written in Intel syntax for the GAS assembler.

.intel_syntax noprefix

.code32

.global int32, _int32

#define regs16_t_size                          13*2
#define INT32_BASE                             0x00007C00
#define REBASE(x)                              (((x) - reloc) + INT32_BASE)
#define GDTENTRY(x)                            ((x) << 3)
#define CODE32                                 0x08
#define DATA32                                 0x10
#define CODE16                                 0x18
#define DATA16                                 0x20
#define STACK16                                (INT32_BASE - regs16_t_size)

.global reloc
.global int32_end

.section .text
    int32: .code32                             # by Napalm
    _int32:
        cli                                    # disable interrupts
        pusha                                  # save register state to 32bit stack

        # Identity map the first 4 MiB, jump to low memory, then disable paging
        push [boot_page_directory] # Push first page directory entry to restore it after

        mov eax, (offset boot_page_table1) - 0xC0000000 + 0x003
        mov [boot_page_directory], eax

        mov ecx, cr3 # Reload cr3 to force a TLB flush so the changes take effect.
        mov cr3, ecx

        mov eax, (offset napalm_switch_disable_paging) - 0xC0000000
        jmp eax
        napalm_switch_disable_paging:

        # Code is now running with the instruction pointer in lower memory,
        # but the code is still assembled as though it's in higher memory. Because
        # of this, something like jmp INT32_BASE would fail since it would
        # assemble as a relative jump from an address around 0xC0100000 to 0x7C00
        # but will be running at an address around 0x00100000 causing it to jump to
        # 0x40007C00.

        # Disable paging bit
        mov eax, cr0
        and eax, ~0x80000000
        mov cr0, eax

        mov ecx, cr3 # Reload cr3 to force a TLB flush so the changes take effect.
        mov cr3, ecx

        mov  esi, (offset reloc) - 0xC0000000  # set source to code below
        mov  edi, INT32_BASE                   # set destination to new base address
        mov  ecx, int32_end - reloc            # set copy size to our codes size
        cld                                    # clear direction flag (so we copy forward)
        rep  movsb                             # do the actual copy (relocate code to low 16bit space)
        mov eax, INT32_BASE
        jmp eax                         # jump to new code location
    reloc: .code32                             # by Napalm
        mov  [REBASE(stack32_ptr)], esp        # save 32bit stack pointer
        sidt [idt_ptr]               # save 32bit idt pointer
        sgdt [gdt_ptr]               # save 32bit gdt pointer
        lgdt [REBASE(gdt16_ptr)]               # load 16bit gdt pointer
        lea  esi, [esp+0x24]                   # set position of intnum on 32bit stack
        lodsd                                  # read intnum into eax
        mov  [REBASE(ib)], al                  # set interrupt immediate byte from our arguments
        mov  esi, [esi]                        # read regs pointer in esi as source
        mov  edi, STACK16                      # set destination to 16bit stack
        mov  ecx, regs16_t_size                # set copy size to our struct size
        mov  esp, edi                          # use destination as 16bit stack offset
        rep  movsb                             # do the actual copy (32bit stack to 16bit stack)

        jmpw CODE16:REBASE(p_mode16)      # switch to 16bit selector (16bit protected mode)


    p_mode16: .code16
        jmp .-2

... 
more of the routine that's not run due to the bug
...

    stack32_ptr:                               # address in 32bit stack after we
        .4byte 0x00000000                          #   save all general purpose registers

    idt16_ptr:                                 # IDT table pointer for 16bit access
        .2byte 0x03FF                              # table limit (size)
        .4byte 0x00000000                          # table base address

    gdt16_base:                                # GDT descriptor table
        .null:                                 # 0x00 - null segment descriptor
            .4byte 0x00000000                      # must be left zero'd
            .4byte 0x00000000                      # must be left zero'd

        .code32:                               # 0x01 - 32bit code segment descriptor 0xFFFFFFFF
            .2byte 0xFFFF                          # limit  0:15
            .2byte 0x0000                          # base   0:15
            .byte 0x00                            # base  16:23
            .byte 0x9A                            # present, iopl/0, code, execute/read
            .byte 0xCF                            # 4Kbyte granularity, 32bit selector; limit 16:19
            .byte 0x00                            # base  24:31

        .data32:                               # 0x02 - 32bit data segment descriptor 0xFFFFFFFF
            .2byte 0xFFFF                          # limit  0:15
            .2byte 0x0000                          # base   0:15
            .byte 0x00                            # base  16:23
            .byte 0x92                            # present, iopl/0, data, read/write
            .byte 0xCF                            # 4Kbyte granularity, 32bit selector; limit 16:19
            .byte 0x00                            # base  24:31

        .code16:                               # 0x03 - 16bit code segment descriptor 0x000FFFFF
            .2byte 0xFFFF                          # limit  0:15
            .2byte 0x0000                          # base   0:15
            .byte 0x00                            # base  16:23
            .byte 0x9A                            # present, iopl/0, code, execute/read
            .byte 0x0F                            # 1Byte granularity, 16bit selector; limit 16:19
            .byte 0x00                            # base  24:31

        .data16:                               # 0x04 - 16bit data segment descriptor 0x000FFFFF
            .2byte 0xFFFF                          # limit  0:15
            .2byte 0x0000                          # base   0:15
            .byte 0x00                            # base  16:23
            .byte 0x92                            # present, iopl/0, data, read/write
            .byte 0x0F                            # 1Byte granularity, 16bit selector; limit 16:19
            .byte 0x00                            # base  24:31

    gdt16_ptr:                                 # GDT table pointer for 16bit access
        .2byte gdt16_ptr - gdt16_base - 1          # table limit (size)
        .4byte gdt16_base                          # table base address

    int32_end:                                 # end marker (so we can copy the code)
        .byte 0x00

The line with jmp .-2 at the p_mode16 label is never reached and a reboot happens instead. If the jmp .-2 is put right before the jmpw, the OS enters an infinite loop as expected. I'm running this on QEMU version 2.11.1 with qemu-system-i386.

x86
paging
gnu-assembler
osdev
intel-syntax
asked on Stack Overflow May 20, 2020 by 23scurtu • edited May 21, 2020 by 23scurtu

1 Answer

The problem is this:

gdt16_ptr:                             # GDT table pointer for 16bit access
    .2byte gdt16_ptr - gdt16_base - 1  # table limit (size)
    .4byte gdt16_base                  # GDT base in higher half that WILL NOT WORK WHEN PAGING IS DISABLED

Because you told the CPU that the GDT is in the higher half, it can't access GDT entries properly once paging is disabled. It probably accesses a physical address at 0xC000???? and reads who-knows-what instead (e.g. maybe a PCI device's registers, maybe "not RAM or device", etc). The far jump then crashes when it tries to load CODE16 into CS, because "who-knows-what" isn't a valid code descriptor.

To fix the problem you'll need to modify the value for the GDT's base before the lgdt [REBASE(gdt16_ptr)] is executed (e.g. maybe just use .4byte REBASE(gdt16_base) instead of .4byte gdt16_base if gdt16_ptr isn't used elsewhere).
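
A minimal sketch of that change, assuming gdt16_ptr is only ever loaded from the relocated copy at INT32_BASE (as the lgdt [REBASE(gdt16_ptr)] in the question does) and that the file is still run through the C preprocessor so the question's REBASE macro is available:

```asm
gdt16_ptr:                                 # GDT table pointer for 16bit access
    .2byte gdt16_ptr - gdt16_base - 1      # table limit (size), unchanged
    .4byte REBASE(gdt16_base)              # base of the relocated copy: (gdt16_base - reloc) + INT32_BASE
```

Since gdt16_base and reloc live in the same section, their difference should be an assembly-time constant, so the stored base becomes a low physical address just above 0x7C00 that stays valid with paging disabled.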

answered on Stack Overflow May 21, 2020 by Brendan

User contributions licensed under CC BY-SA 3.0