My program crashes with a segmentation fault at seemingly random points after running for a few hours. My environment is Ubuntu (Linux).
When I print the data structure that was being accessed at the time of the crash, the pointer always points to invalid memory.
(gdb) p *xxx_info[8]
Cannot access memory at address 0x7fd200000000
(gdb)
To detect memory corruption, I added two fence variables that are hardcoded to a well-known value. My earlier logs show that the fence variables of the linked list node that caused the crash had already been violated.
(gdb) p xxx_info
$2 = {0x7fd248000c30, 0x7fd248001050, 0x7fd248000b30, 0x7fd248000d40, 0x7fd248000f50, 0x0,
0x7fd248001160, 0x7fd248001280, 0x7fd200000000, 0x7fd2480008c0,
0x7fd248003000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
(gdb) p xxx_info[8]
$3 = (xxx_info_t *) 0x7fd200000000
(gdb) p *xxx_info[8]
Cannot access memory at address 0x7fd200000000
(gdb)
The two fence variables sit at the start and the end of the data structure:
(gdb) pt xxx_info_t
type = struct xxx_info_t {
uint32_t begin_fence;
char *str;
int cluster_id;
uint32_t end_fence;
}
Under normal circumstances the fence variables MUST always be 0xdeaddead as shown below:
(gdb) p/x *xxx_info[7]
$4 = {begin_fence= 0xdeaddead, str= 0x7fd248002f10, cluster_id= 0x1f2, end_fence= 0xdeaddead}
(gdb)
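For reference, the fence scheme can be sketched in C roughly as follows. This is a minimal sketch, not my exact code: the struct layout is the one printed by gdb above, but FENCE_VALUE, xxx_info_new, xxx_info_fences_ok and XXX_INFO_ASSERT are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FENCE_VALUE 0xdeaddeadu  /* hypothetical name for the 0xdeaddead marker */

typedef struct xxx_info_t {
    uint32_t begin_fence;  /* must always be FENCE_VALUE */
    char *str;
    int cluster_id;
    uint32_t end_fence;    /* must always be FENCE_VALUE */
} xxx_info_t;

/* Hypothetical allocator: returns a heap node with both fences set,
 * or NULL if either allocation fails. */
static xxx_info_t *xxx_info_new(const char *str, int cluster_id)
{
    xxx_info_t *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;

    node->str = malloc(strlen(str) + 1);
    if (node->str == NULL) {
        free(node);
        return NULL;
    }
    strcpy(node->str, str);

    node->begin_fence = FENCE_VALUE;
    node->cluster_id = cluster_id;
    node->end_fence = FENCE_VALUE;
    return node;
}

/* Returns 1 if both fences still hold the expected value, 0 otherwise. */
static int xxx_info_fences_ok(const xxx_info_t *node)
{
    return node->begin_fence == FENCE_VALUE &&
           node->end_fence == FENCE_VALUE;
}

/* Abort at the point of detection (compiled out when NDEBUG is defined). */
#define XXX_INFO_ASSERT(node) assert(xxx_info_fences_ok(node))
```

A check like this only tells me that a node was damaged at some earlier point; it does not identify the code that did the write.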
Whenever I access the array of pointers xxx_info, I check the fence values at each index. I started getting error messages saying that the fence variables for index 8 looked corrupted, and later the code crashed when accessing index 8.
This means that somewhere I am overwriting the memory that index 8 of the xxx_info array points to.
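The per-index check is roughly equivalent to the sketch below. The names xxx_info_scan, FENCE_VALUE and XXX_INFO_MAX are hypothetical; the 20 slots match the array printed by gdb above. Note that the scan dereferences each non-NULL pointer, so it can only report damaged fences while the pointer itself is still readable. Once the slot holds something like 0x7fd200000000, the check itself faults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define FENCE_VALUE  0xdeaddeadu  /* hypothetical name for the fence marker */
#define XXX_INFO_MAX 20           /* slot count seen in the gdb output */

typedef struct xxx_info_t {
    uint32_t begin_fence;
    char *str;
    int cluster_id;
    uint32_t end_fence;
} xxx_info_t;

/* Walks the table, skipping empty (NULL) slots. Logs and returns the index
 * of the first entry whose fences are damaged, or -1 if all entries are
 * intact. Callers may wrap this as assert(xxx_info_scan(...) == -1). */
static int xxx_info_scan(xxx_info_t *const table[], size_t count)
{
    for (size_t i = 0; i < count; i++) {
        const xxx_info_t *node = table[i];
        if (node == NULL)
            continue;

        if (node->begin_fence != FENCE_VALUE ||
            node->end_fence != FENCE_VALUE) {
            fprintf(stderr,
                    "xxx_info[%zu] at %p corrupted: "
                    "begin_fence=0x%x end_fence=0x%x\n",
                    i, (const void *)node,
                    (unsigned)node->begin_fence,
                    (unsigned)node->end_fence);
            return (int)i;
        }
    }
    return -1;
}
```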
My question is:
How can I debug this kind of error? Can I set a breakpoint dynamically so that whenever something overwrites that memory address, I can complain and assert? If this were a global variable being corrupted, I could have set a hardware breakpoint. Since this is a list created on the heap (I use malloc), the addresses are dynamic, so I don't think I can use memory breakpoints or watchpoints.
Any ideas on what I can do?