Segmentation fault in pthread_cond_signal()

2

First, some background. The production code has two threads, synchronized with a condition wait and signal. The basic structure of the code is given below. main.c creates the thread and also calls funca(), which signals the other thread. The mutex and condition variable are declared and initialized in a.c, which also defines funca() and thread_func(). thread_func() waits on the condition variable and, upon being signaled, unlocks the mutex and does some work.

main.c

pthread_create(thread_id, thread_func)

funca();

a.c

pthread_mutex_t     renotify_signal_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t      renotify_signal_cond = PTHREAD_COND_INITIALIZER;

thread_func() {
        pthread_mutex_lock(&renotify_signal_mutex);
        pthread_cond_wait(&renotify_signal_cond, &renotify_signal_mutex);
        pthread_mutex_unlock(&renotify_signal_mutex);

        <<<<< Does some work here
}

funca() {

    pthread_mutex_lock(&renotify_signal_mutex);
    pthread_cond_signal(&renotify_signal_cond);
    pthread_mutex_unlock(&renotify_signal_mutex);

}

The segmentation fault occurs in pthread_cond_signal(). Examining the state in gdb, I can see that the mutex the condition variable is bound to is corrupted: the address should be that of renotify_signal_mutex, but it actually points to invalid memory. Please see the gdb output below:

(gdb) x/40 0x85084a0
0x85084a0 <renotify_signal_mutex>:      0x00000001      0x00000000      0x00003b1a      0x00000000
0x85084b0 <renotify_signal_mutex+16>:   0x00000002      0x00000000      0x00000000      0x00000000
0x85084c0 <renotify_signal_cond>:       0x00000001      0x00000008      0x00000004      0x00000000
0x85084d0 <renotify_signal_cond+16>:    0x00000004      0x00000000      0x00000003      0x00000000
0x85084e0 <renotify_signal_cond+32>:    0x0200a084      0x00005008      0x00000000      0x00000000
0x85084f0 <_breakpoint_target_>:        0x00000000      0x00000000      0x00000000      0x00000000
0x8508500 <bgp_asn_buffer>:     0x00000000      0x00000000      0x00000000      0x00000000
0x8508510 <bgp_asn_buffer+16>:  0x00000000      0x00000000      0x00000000      0x00000000
0x8508520 <bgp_asn_buffer+32>:  0x00000000      0x00000000      0x00000000      0x00000000
0x8508530 <bgp_asn_buffer+48>:  0x00000000      0x00000000      0x00000000      0x00000000
(gdb) p renotify_signal_cond
$51 = {
  __data = {
    __lock = 1,
    __futex = 8,
    __total_seq = 4,
    __wakeup_seq = 4,
    __woken_seq = 3,
    __mutex = 0x200a084,
    __nwaiters = 20488,
    __broadcast_seq = 0
  },
  __size = "\001\000\000\000\b\000\000\000\004\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\003\000\000\000\000\000\000\000\204\240\000\002\bP\000\000\000\000\000\000\000\000\000",
  __align = 34359738369
}
(gdb) x 0x200a084
0x200a084:      Cannot access memory at address 0x200a084
(gdb)


(gdb) p &renotify_signal_mutex
$53 = (pthread_mutex_t *) 0x85084a0 <renotify_signal_mutex>

As the gdb output shows, the __mutex field in the pthread_cond_t structure points to invalid memory instead of renotify_signal_mutex. The value __nwaiters = 20488 also looks wrong.

From the memory dump I don't see any sign of a memory overwrite. I also don't see any possibility of an uninitialized mutex or condition variable being used, which might have led to this. Can someone please help me with this?

Thanks

c
linux
pthreads
asked on Stack Overflow Sep 28, 2018 by NeilB

2 Answers

2

It may be a simplification in your code example, but the call to pthread_create does not look correct. The prototype for pthread_create is:

int pthread_create( pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);

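If the call is really written as shown, it may corrupt memory. Also, thread_func must match the start_routine type; it can be passed either as thread_func or as &thread_func, since a function name converts to a pointer to that function.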
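
To make the prototype above concrete, here is a minimal sketch of a call that supplies all four arguments and checks the return code. The names demo_start_routine and create_and_join_demo are hypothetical and not taken from the question; the value 42 is only there so the result can be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical start routine; it has the void *(*)(void *) type that
 * pthread_create requires. */
static void *demo_start_routine(void *arg)
{
    int *out = arg;            /* argument forwarded by pthread_create */
    assert(out != NULL);
    *out = 42;
    return NULL;
}

/* Creates one thread, waits for it, and returns the value it stored,
 * or -1 if creating or joining the thread failed. */
int create_and_join_demo(void)
{
    pthread_t thread_id;
    int value = 0;

    /* Four arguments: thread handle, attributes (NULL = defaults),
     * start routine, and the argument handed to the start routine. */
    int rc = pthread_create(&thread_id, NULL, demo_start_routine, &value);
    if (rc != 0) {
        /* pthread functions return the error number instead of setting errno */
        fprintf(stderr, "pthread_create: %s\n", strerror(rc));
        return -1;
    }

    rc = pthread_join(thread_id, NULL);
    if (rc != 0) {
        fprintf(stderr, "pthread_join: %s\n", strerror(rc));
        return -1;
    }
    return value;
}
```

Calling create_and_join_demo() from main() and building with -pthread should be enough to try it. With a correct prototype in scope, the two-argument call from the question would normally be rejected by the compiler, which suggests the question's snippet is shorthand rather than the real code.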

answered on Stack Overflow Sep 28, 2018 by JonBelanger
0

There can be many reasons for memory corruption.

  • The corruption may be caused by a write through some other variable, for example writing past the end of an array, writing to a const string, or something similar. The stack trace does not always show the exact point of corruption.
  • The mutex appears to be bound to the condition variable at the time of pthread_cond_wait(). Another thread may be using the same condition variable with a different mutex. However, the mutex address here cannot be accessed, so the corrupt mutex is not a global variable.
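
As a point of comparison for the second bullet, the following is a minimal sketch in which every wait and every signal on the condition variable goes through the same mutex. It reuses the names from the question; the renotify_pending flag, the work_done variable, and run_once() are assumptions added for illustration. The flag is not in the original code. It guards against spurious wakeups and against a signal sent before the waiter has started waiting. This sketch does not explain the corrupted __mutex field; it only shows the pairing the bullet refers to.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t renotify_signal_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  renotify_signal_cond  = PTHREAD_COND_INITIALIZER;
static bool renotify_pending = false;  /* hypothetical predicate, guarded by the mutex */
static int  work_done = 0;             /* hypothetical marker for "did some work" */

static void *thread_func(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&renotify_signal_mutex);
    /* Always wait with the same mutex, and re-check the predicate in a loop. */
    while (!renotify_pending)
        pthread_cond_wait(&renotify_signal_cond, &renotify_signal_mutex);
    renotify_pending = false;
    pthread_mutex_unlock(&renotify_signal_mutex);

    work_done = 1;                     /* stands in for the real work */
    return NULL;
}

static void funca(void)
{
    pthread_mutex_lock(&renotify_signal_mutex);
    renotify_pending = true;           /* set the predicate before signaling */
    pthread_cond_signal(&renotify_signal_cond);
    pthread_mutex_unlock(&renotify_signal_mutex);
}

/* Starts the waiter, signals it once, and returns 1 if the work ran. */
int run_once(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, thread_func, NULL);
    assert(rc == 0);
    funca();
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;                          /* silence unused warning under NDEBUG */
    return work_done;
}
```

Because the predicate is checked under the mutex, run_once() completes whether funca() runs before or after the worker reaches pthread_cond_wait().
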
answered on Stack Overflow Oct 1, 2018 by Preeti

User contributions licensed under CC BY-SA 3.0