I'm writing a Linux kernel device driver for a custom PCIe device. A user space application mmaps this device's memory and accesses it frequently (reads and writes). The PCIe device is powered by an external power supply, which may be turned off at runtime.
Whenever the device is reset, all memory reads from my user application return 0xFFFFFFFF.
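To make the symptom concrete, this is roughly how the all-ones condition can be checked against the mapped memory. It is only a minimal sketch: `mydevice_looks_gone` and `PCI_ALL_ONES_32` are hypothetical names, and it assumes `reg` points at a register whose valid contents can never be 0xFFFFFFFF (for example an ID or version register), since otherwise a legitimate value would be mistaken for a dead device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value observed on every read while the device is reset or powered off. */
#define PCI_ALL_ONES_32 0xFFFFFFFFu

/*
 * Hypothetical helper: returns true if a 32-bit read from a mapped
 * register yields all ones.
 *
 * Assumption: `reg` points at a register that never legitimately
 * contains 0xFFFFFFFF, so an all-ones value means the read failed.
 */
static bool mydevice_looks_gone(const volatile uint32_t *reg)
{
    assert(reg != NULL);
    return *reg == PCI_ALL_ONES_32;
}
```

A check like this only polls for the symptom from user space; what I am after is a notification in the kernel driver.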
I want to detect device resets as soon as possible in the kernel driver, so I implemented an error_detected callback function according to https://www.kernel.org/doc/html/latest/PCI/pci-error-recovery.html.

    static pci_ers_result_t mydevice_error_detected(struct pci_dev* dev, pci_channel_state_t state) {
        printk(KERN_ALERT "mydevice PCI error detected");
        return PCI_ERS_RESULT_DISCONNECT;
    }

    static struct pci_error_handlers mydevice_error_handlers = {
        .error_detected = mydevice_error_detected,
        .slot_reset = mydevice_slot_reset,
        .resume = mydevice_resume
    };

    static struct pci_driver mydevice_driver = {
        .name = "mydevice",
        .id_table = mydevice_ids,
        .probe = mydevice_probe,
        .remove = mydevice_remove,
        .suspend = mydevice_suspend,
        .resume = mydevice_resume,
        .err_handler = &mydevice_error_handlers
    };

However, mydevice_error_detected is never called during a device reset, even though the user space application keeps trying to read device memory and gets 0xFFFFFFFF as the result every time.
Also, lspci still lists the device after a PCI rescan, even when it is turned off:
01:00.0 Unassigned class [ff00]: MyVendorId Device 5a00 (rev ff)
The only difference is that "rev ff" appears at the end of the line when the device is turned off. Otherwise, lspci returns:
01:00.0 Unassigned class [ff00]: MyVendorId Device 5a00
I'm pretty sure the device is completely turned off, since its configuration space cannot be accessed during the reset. I'd expect the kernel to call the error detection callback as soon as the first memory read request to the device fails or times out. Is my assumption correct?