General Protection Fault when trying to `sti`

I am trying to implement hardware interrupts in a test bootloader. Exception handlers are working (which is how I found out that this is a GPF). When I execute `sti`, a GPF occurs. Here is my main code:

    cli
    lgdt [gdt_desc]
    lidt [idt_desc]
    mov eax, cr0
    or eax, 1
    mov cr0, eax
    jmp 0x8:bit_32
bit_32:
[bits 32]
    mov ax, 0x10
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax
    mov eax, 0x8000
    mov esp, eax
    mov ebp, esp
    sti                              ; exception raised

This is what my GDT looks like:

start_gdt:

null:
    dd 0x0
    dd 0x0
code:
    dw 0xffff
    dw 0x0
    db 0x0
    db 10011010b
    db 01000000b
    db 0x0
data:
    dw 0xffff
    dw 0
    db 0x0
    db 10010010b
    db 01001011b
    db 0x0

gdt_desc:
    dw gdt_desc-start_gdt-1
    dd start_gdt
    

And this is what my IDT looks like:

start_idt:

i0:
    dw genroutine
    dw 0x8
    db 0
    db 10001110b
    dw 0
    
i1: dw genroutine
    dw 0x8
    db 0
    db 10001111b
    dw 0
    
i2: dw genroutine
    dw 0x8
    db 0
    db 10001110b
    dw 0
    
i3: dw genroutine
    dw 0x8
    db 0
    db 10001111b
    dw 0
    
i5: dw genroutine
    dw 0x8
    db 0
    db 10001111b
    dw 0
.
.
;around 50 times, with some modification like for keyboard, GPF etc.

My PIC setup code:

    mov al, 0x11
    out 0x20, al
    jmp $+2
    jmp $+2
    out 0xA0, al
    jmp $+2
    jmp $+2
    mov al, 0x20
    out 0x21, al
    jmp $+2
    jmp $+2
    mov al, 0x28
    out 0xA1, al
    jmp $+2
    jmp $+2
    mov al, 4
    out 0x21, al
    mov al, 2
    jmp $+2
    jmp $+2
    out 0xA1, al
    jmp $+2
    jmp $+2
    mov al, 11111101b
    out 0x20, al
    mov al , 11111101b
    jmp $+2
    jmp $+2
    out 0x21, al
    ret
    

I tried enabling interrupts with `sti` to test the keyboard interrupt after modifying its entry in the IDT, but found that `sti` causes a GPF exception. QEMU log:

check_exception old: 0xffffffff new 0xd
     1: v=0d e=07c2 i=0 cpl=0 IP=0008:0000000000007c74 pc=0000000000007c74 SP=0010:0000000000008000 env->regs[R_EAX]=0000000000008000
EAX=00008000 EBX=00007e15 ECX=00000022 EDX=00002080
ESI=00007e00 EDI=00000800 EBP=00008000 ESP=00008000
EIP=00007c74 EFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 000bffff 004b9300 DPL=0 DS   [-WA]
CS =0008 00000000 0000ffff 00409a00 DPL=0 CS32 [-R-]
SS =0010 00000000 000bffff 004b9300 DPL=0 DS   [-WA]
DS =0010 00000000 000bffff 004b9300 DPL=0 DS   [-WA]
FS =0010 00000000 000bffff 004b9300 DPL=0 DS   [-WA]
GS =0010 00000000 000bffff 004b9300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     00007d4c 00000017
IDT=     00007e15 0000038f
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000044 CCD=00008000 CCO=EFLAGS  
EFER=0000000000000000

I have no idea why this happens, and I don't know enough about it to find out myself. Please help.

x86
interrupt
qemu
bootloader
osdev
asked on Stack Overflow Jul 10, 2020 by Pranav appu • edited Jul 10, 2020 by Michael Petch

1 Answer

In your error output you get this exception:

check_exception old: 0xffffffff new 0xd
     1: v=0d e=07c2 i=0 cpl=0 IP=0008:0000000000007c74 ...

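The important part is that this exception is a #GP (General Protection Fault) with an error code of 0x7c2. The OSDev Wiki has a synopsis of the exceptions and describes how to interpret the selector error code that accompanies a #GP: bit 0 is the external-event flag, bits 1-2 select the descriptor table (GDT, IDT or LDT), and bits 3-15 hold the index into that table.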

The error code 0x7c2 is binary 11111000 01 0, read as index, table, and external flag. Bit 0 being clear means that the exception was not caused by an event external to the program. Bits 2-1 are 01, which means the index refers to a gate descriptor in the IDT. The remaining bits, 11111000, are the index, so the interrupt vector is 0xF8. This is a red flag. Your PIC remapping code appears to remap the master PIC to 0x20-0x27 and the slave PIC to 0x28-0x2f, so interrupt 0xF8 makes no sense unless the PIC remapping code is wrong.
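The decoding above can be checked mechanically. The following is a minimal C sketch, not part of the original answer; the struct and function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Fields of an x86 selector error code (hypothetical struct name). */
struct selector_error {
    unsigned ext;    /* bit 0: 1 = caused by an event external to the program */
    unsigned table;  /* bits 1-2: 0 = GDT, 1 or 3 = IDT, 2 = LDT */
    unsigned index;  /* bits 3-15: index into the selected table */
};

/* Split a selector error code into its three fields. */
static struct selector_error decode_selector_error(uint32_t code)
{
    struct selector_error e;
    e.ext   = code & 0x1u;
    e.table = (code >> 1) & 0x3u;
    e.index = (code >> 3) & 0x1FFFu;
    return e;
}
```

For the error code from the log, `decode_selector_error(0x7c2)` yields `ext = 0`, `table = 1` (IDT) and `index = 0xF8`.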

Upon reviewing the PIC remapping code I noticed an issue:

mov al, 0x11
out 0x20, al
out 0xA0, al
mov al, 0x20
out 0x21, al
mov al, 0x28
out 0xA1, al
mov al, 4
out 0x21, al
mov al, 2
out 0xA1, al
mov al, 11111101b
out 0x20, al
mov al , 11111101b
out 0x21, al
ret

I have removed the `jmp $+2` instructions for clarity and because they aren't needed. If you alternate between updating the master and slave PIC ports, the `out` instructions themselves act as the needed delay. The OSDev Wiki has a section on PIC remapping and initialization. Your code differs here:

mov al, 4
out 0x21, al          ; This is Correct
mov al, 2
out 0xA1, al          ; This is Correct
mov al, 11111101b
out 0x20, al          ; This is Wrong
mov al , 11111101b
out 0x21, al          ; This is Wrong
ret
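One way to see why those last two writes matter is to model how a single 8259A interprets port writes during initialization. The sketch below is a simplified, hypothetical model in C (all names are made up for illustration, and it is not a full emulation). It assumes the documented 8259A behaviour that a command-port write with bit 4 set is taken as ICW1, that the next data-port write is ICW2, and that ICW2 supplies the vector base in its upper five bits.

```c
#include <stdint.h>

/* Simplified, hypothetical model of how one 8259A interprets port writes.
   Only ICW1-ICW4 and OCW1 (the mask) are modelled. */
struct pic_model {
    int     next_icw;     /* ICW expected on the data port: 2, 3, 4, or 0 for none */
    int     needs_icw4;   /* ICW1 bit 0 */
    int     single;       /* ICW1 bit 1: no cascade, so no ICW3 */
    uint8_t vector_base;  /* from ICW2, low 3 bits ignored */
    uint8_t mask;         /* interrupt mask register (OCW1) */
};

/* Write to the command port (0x20 master, 0xA0 slave). */
static void pic_write_command(struct pic_model *p, uint8_t value)
{
    if (value & 0x10) {                 /* bit 4 set: ICW1, (re)start init */
        p->needs_icw4 = (value & 0x01) != 0;
        p->single     = (value & 0x02) != 0;
        p->mask       = 0;              /* ICW1 clears the mask register */
        p->next_icw   = 2;
    }
    /* OCW2 and OCW3 are not modelled. */
}

/* Write to the data port (0x21 master, 0xA1 slave). */
static void pic_write_data(struct pic_model *p, uint8_t value)
{
    switch (p->next_icw) {
    case 2:
        p->vector_base = value & 0xF8;
        p->next_icw = p->single ? (p->needs_icw4 ? 4 : 0) : 3;
        break;
    case 3:
        p->next_icw = p->needs_icw4 ? 4 : 0;
        break;
    case 4:
        p->next_icw = 0;
        break;
    default:
        p->mask = value;                /* init finished: OCW1 */
        break;
    }
}

/* Master-PIC writes as they appear in the question. */
static struct pic_model replay_question_master(void)
{
    struct pic_model p = {0};
    pic_write_command(&p, 0x11);  /* ICW1 */
    pic_write_data(&p, 0x20);     /* ICW2: base 0x20 */
    pic_write_data(&p, 0x04);     /* ICW3 */
    pic_write_command(&p, 0xFD);  /* bit 4 set: taken as a new ICW1 */
    pic_write_data(&p, 0xFD);     /* taken as ICW2: base 0xF8 */
    return p;
}

/* Master-PIC writes with ICW4 sent before the mask. */
static struct pic_model replay_fixed_master(void)
{
    struct pic_model p = {0};
    pic_write_command(&p, 0x11);  /* ICW1 */
    pic_write_data(&p, 0x20);     /* ICW2: base 0x20 */
    pic_write_data(&p, 0x04);     /* ICW3 */
    pic_write_data(&p, 0x01);     /* ICW4: 8086 mode */
    pic_write_data(&p, 0xFC);     /* OCW1: timer and keyboard enabled */
    return p;
}
```

In this model the question's sequence leaves the master with a vector base of 0xF8, which would be consistent with the timer arriving on vector 0xF8, while the corrected sequence keeps base 0x20 and ends with the intended mask.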

After writing 4 to port 0x21 and 2 to port 0xA1 (ICW3), you need to write 1 to port 0x21 and 1 to port 0xA1 (ICW4, selecting 8086 mode). Only then are you free to write the interrupt masks to port 0x21 and port 0xA1 to enable and disable the needed interrupts. The correct code could look something like:

mov al, 4
out 0x21, al          ; ICW3 (master): slave attached on IRQ2
mov al, 2
out 0xA1, al          ; ICW3 (slave): cascade identity 2
mov al, 1
out 0xA1, al          ; ICW4 (slave): 8086/88 mode
out 0x21, al          ; ICW4 (master): 8086/88 mode

; Now set the PIC masks. Each bit in the mask is 0=enabled interrupt, 1=disabled.
mov al, 0
out 0x21, al          ; Enable all interrupts on Master
out 0xA1, al          ; Enable all interrupts on Slave

; Alternative masks: enable only the timer and keyboard.
; mov al, 0xfc
; out 0x21, al          ; Disable all interrupts on Master except timer and keyboard
                        ; 0xfc = 0b11111100
; mov al, 0xff
; out 0xA1, al          ; Disable all interrupts on Slave

ret
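The mask bytes follow directly from the rule that a 0 bit enables an IRQ line and a 1 bit disables it. As a small illustration in C (the macro and function names are hypothetical), the mask for a set of enabled lines is the bitwise complement of that set:

```c
#include <stdint.h>

/* Bit for IRQ line n (0-7) of one PIC. Hypothetical helper macro. */
#define IRQ_BIT(n) (1u << (n))

/* Return the mask byte to write to a PIC data port (0x21 or 0xA1).
   enabled_lines has bit n set for every IRQ line that should be enabled;
   the mask has 0 = enabled and 1 = disabled, so it is the complement. */
static uint8_t pic_mask_enabling(uint8_t enabled_lines)
{
    return (uint8_t)~enabled_lines;
}
```

Enabling the timer (IRQ0) and keyboard (IRQ1) gives `pic_mask_enabling(IRQ_BIT(0) | IRQ_BIT(1)) == 0xFC`. The value 11111101b (0xFD) used in the question corresponds to the keyboard alone.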

I was able to reproduce your QEMU exceptions and interrupts by using your incorrect initialization code. I would get an interrupt on 0xF8:

 0: v=f8 e=0000 i=0 cpl=0 IP=0008:00007c51 pc=00007c51 ...

This was followed by a #GP exception, since vector 0xF8 was unhandled and outside my IDT:

1: v=0d e=07c2 i=0 cpl=0 IP=0008:00007c51 pc=00007c51 ...
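Whether a vector falls inside the IDT depends only on the IDT limit. As a rough sketch in C (the function name is hypothetical), assuming 8-byte protected-mode gate descriptors and the rule that the whole descriptor must lie within the limit:

```c
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of one 32-bit protected-mode IDT gate descriptor. */
#define IDT_GATE_SIZE 8u

/* True if the descriptor for `vector` lies entirely within an IDT
   whose limit (last valid byte offset) is `idt_limit`. */
static bool idt_has_vector(uint16_t idt_limit, uint8_t vector)
{
    uint32_t last_byte = (uint32_t)vector * IDT_GATE_SIZE + (IDT_GATE_SIZE - 1u);
    return last_byte <= idt_limit;
}
```

With the limit 0x38f shown in the question's QEMU log, the table holds 114 entries (vectors 0x00 to 0x71), so `idt_has_vector(0x38f, 0x20)` is true and `idt_has_vector(0x38f, 0xF8)` is false. The offset 0xF8 * 8 = 0x7c0 with the IDT bit set also gives the reported error code 0x7c2.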

After the fix, I start getting interrupts such as the timer on the expected vectors, with entries similar to:

0: v=20 e=0000 i=0 cpl=0 IP=0008:00007c4f pc=00007c4f ...
answered on Stack Overflow Jul 10, 2020 by Michael Petch • edited Nov 19, 2020 by Michael Petch

User contributions licensed under CC BY-SA 3.0