Python program crashing on RPI compute - syslog contains message "INFO: task python3:724 blocked for more than 120 seconds."

0

I am working on a project with a PI compute module set up on a custom board. The project has to reliably run ideally indefinitely while logging sensor data and taking video and audio when the device is triggered. It is pretty much done, except that sometimes it crashes without leaving any clue as to why in the logs that I have set up

The program is written in Python3. It uses multiprocessing and has 3 processes, each with a few threads, and a number of threads that run from the main thread. I have tried writing in a system that polls all threads and processes once a minute and if one crashes, automatically restarts the whole system (the program is set to run on boot), so that anything does go wrong it can recover automatically.

The most recent crash occured after about 3.5 days of running smoothly, and it was not caught by the program itself. The logs from the program itself did not indicate that anything was wrong. Useful information might be that the system uses about 20% of CPU when not recording, and about 40% CPU when it is. It records to a USB.

The following was found in /var/log/syslog:

May 19 12:28:02 raspberrypi kernel: [322680.792629] INFO: task scsi_eh_0:69 blocked for more than 120 seconds.

May 19 12:28:02 raspberrypi kernel: [322680.792893] INFO: task usb-storage:71 blocked for more than 120 seconds.

May 19 12:28:02 raspberrypi kernel: [322680.793216] INFO: task python3:724 blocked for more than 120 seconds.

These are timestamped about 4 minutes after the last log of the python program. The python logs indicate that the system was recording audio and video when the system crashed.

This is the thread that writes video. self.frames is a list that contains 5 lists. When recording video, the main thread writes 5 seconds of video into a list at a specific index and sends that index to the below thread through self.BufferQueue for it to be written to a file.

    def WriteFrames(self):
        process = 1
        buffer = None
        out = None

        while True:
            buffer = self.BufferQueue.get()

            if(buffer != 5):
                try:            
                    if process == 1:
                        out = cv2.VideoWriter('/media/pi/%s/%s.avi'%(self.file,self.text),self.fourcc,FPS,(640,480))
                        logging.debug("videowriter defined")
                        process = 0
                    for frame in range(0,(FPS*RECORD_TIME)+10):
                        if self.USBInserted == 1:
                            out.write(self.frames[buffer][frame])
                except Exception as e:
                    logging.error("error in writeframes")
                    logging.exception(e)
                self.frames[buffer] = []
            else:
                if(out is not None):    #will only be none if writeframes error occurs on the first record. Should not happen
                    out.release()
                process = 1
                logging.debug("videowriter released") 

Audio is written by the following thread. It works the same way as writeFrames above:

def WriteAudio(self):
        buffer = 0
        wf = None
        while True:
            buffer = self.AudioBufferQueue.get()
            if buffer != 5 and self.USBInserted == 1:# and not wferror:
                try:
                    wf = wave.open("/media/pi/%s/%s.wav"%(self.file,self.text),'wb')
                    wf.setnchannels(CHANNELS)
                    wf.setsampwidth(self.p.get_sample_size(FORMAT))
                    wf.setframerate(RATE)    
                    #wferror = False
                except Exception as e:
                    logging.exception(e)
                    ##print("error defining wf")
                    #wferror = True  #redefine wf if error occurs

                while buffer != 5 and self.USBInserted == 1:# and not wferror:
                    try:
                        wf.writeframes(b''.join(self.AudioSamples[buffer]))
                    except wave.Error as e:
                        logging.exception(e)

                    self.AudioSamples[buffer] = []
                    #logging.debug("audio buffer " + str(buffer) + " cleared WriteAudio")
                    buffer = self.AudioBufferQueue.get()
                if(wf is not None):
                    wf.close()

It won't let me include the whole /var/log/syslog from the day of the crash but here are (what seem to be) the important bits:

May 19 12:28:02 raspberrypi kernel: [322680.792893] INFO: task usb-storage:71 blocked for more than 120 seconds.
May 19 12:28:02 raspberrypi kernel: [322680.792900]       Tainted: G         C O    4.14.34-v7+ #1110
May 19 12:28:02 raspberrypi kernel: [322680.792905] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 19 12:28:02 raspberrypi kernel: [322680.792910] usb-storage     D    0    71      2 0x00000000
May 19 12:28:02 raspberrypi kernel: [322680.792932] [<8079a0b8>] (__schedule) from [<8079a730>] (schedule+0x50/0xa8)
May 19 12:28:02 raspberrypi kernel: [322680.792948] [<8079a730>] (schedule) from [<8079e314>] (schedule_timeout+0x1d0/0x3f4)
May 19 12:28:02 raspberrypi kernel: [322680.792964] [<8079e314>] (schedule_timeout) from [<8079b3b4>] (wait_for_common+0xc0/0x184)
May 19 12:28:02 raspberrypi kernel: [322680.792979] [<8079b3b4>] (wait_for_common) from [<8079b498>] (wait_for_completion+0x20/0x24)
May 19 12:28:02 raspberrypi kernel: [322680.792996] [<8079b498>] (wait_for_completion) from [<805b3314>] (usb_sg_wait+0x174/0x188)
May 19 12:28:02 raspberrypi kernel: [322680.793014] [<805b3314>] (usb_sg_wait) from [<805eaffc>] (usb_stor_bulk_transfer_sglist.part.2+0x88/0xec)
May 19 12:28:02 raspberrypi kernel: [322680.793032] [<805eaffc>] (usb_stor_bulk_transfer_sglist.part.2) from [<805eb0bc>] (usb_stor_bulk_srb+0x5c/0x64)
May 19 12:28:02 raspberrypi kernel: [322680.793050] [<805eb0bc>] (usb_stor_bulk_srb) from [<805eb1e8>] (usb_stor_Bulk_transport+0x124/0x370)
May 19 12:28:02 raspberrypi kernel: [322680.793067] [<805eb1e8>] (usb_stor_Bulk_transport) from [<805ebaac>] (usb_stor_invoke_transport+0x30/0x4bc)
May 19 12:28:02 raspberrypi kernel: [322680.793085] [<805ebaac>] (usb_stor_invoke_transport) from [<805ea7b0>] (usb_stor_transparent_scsi_command+0x18/0x1c)
May 19 12:28:02 raspberrypi kernel: [322680.793104] [<805ea7b0>] (usb_stor_transparent_scsi_command) from [<805ecf7c>] (usb_stor_control_thread+0x198/0x2a0)
May 19 12:28:02 raspberrypi kernel: [322680.793120] [<805ecf7c>] (usb_stor_control_thread) from [<8013d860>] (kthread+0x13c/0x16c)
May 19 12:28:02 raspberrypi kernel: [322680.793136] [<8013d860>] (kthread) from [<8010810c>] (ret_from_fork+0x14/0x28)
May 19 12:28:02 raspberrypi kernel: [322680.793216] INFO: task python3:724 blocked for more than 120 seconds.
May 19 12:28:02 raspberrypi kernel: [322680.793223]       Tainted: G         C O    4.14.34-v7+ #1110
May 19 12:28:02 raspberrypi kernel: [322680.793227] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 19 12:28:02 raspberrypi kernel: [322680.793232] python3         D    0   724    522 0x00000000
May 19 12:28:02 raspberrypi kernel: [322680.793254] [<8079a0b8>] (__schedule) from [<8079a730>] (schedule+0x50/0xa8)
May 19 12:28:02 raspberrypi kernel: [322680.793269] [<8079a730>] (schedule) from [<8014b0b4>] (io_schedule+0x20/0x40)
May 19 12:28:02 raspberrypi kernel: [322680.793284] [<8014b0b4>] (io_schedule) from [<8079b0a0>] (bit_wait_io+0x1c/0x70)
May 19 12:28:02 raspberrypi kernel: [322680.793298] [<8079b0a0>] (bit_wait_io) from [<8079ad30>] (__wait_on_bit+0x94/0xd0)
May 19 12:28:02 raspberrypi kernel: [322680.793313] [<8079ad30>] (__wait_on_bit) from [<8079adf4>] (out_of_line_wait_on_bit+0x88/0x90)
May 19 12:28:02 raspberrypi kernel: [322680.793332] [<8079adf4>] (out_of_line_wait_on_bit) from [<802c26f4>] (__wait_on_buffer+0x40/0x44)
May 19 12:28:02 raspberrypi kernel: [322680.793351] [<802c26f4>] (__wait_on_buffer) from [<802c690c>] (__block_write_begin_int+0x510/0x850)
May 19 12:28:02 raspberrypi kernel: [322680.793366] [<802c690c>] (__block_write_begin_int) from [<802c6ce4>] (block_write_begin+0x5c/0xd4)
May 19 12:28:02 raspberrypi kernel: [322680.793380] [<802c6ce4>] (block_write_begin) from [<802c6fb8>] (cont_write_begin+0x25c/0x3c0)
May 19 12:28:02 raspberrypi kernel: [322680.793396] [<802c6fb8>] (cont_write_begin) from [<80389260>] (fat_write_begin+0x64/0x94)
May 19 12:28:02 raspberrypi kernel: [322680.793414] [<80389260>] (fat_write_begin) from [<8021e2a4>] (generic_perform_write+0xb8/0x1b0)
May 19 12:28:02 raspberrypi kernel: [322680.793430] [<8021e2a4>] (generic_perform_write) from [<80220b90>] (__generic_file_write_iter+0x19c/0x1f8)
May 19 12:28:02 raspberrypi kernel: [322680.793446] [<80220b90>] (__generic_file_write_iter) from [<80220d70>] (generic_file_write_iter+0x184/0x298)
May 19 12:28:02 raspberrypi kernel: [322680.793463] [<80220d70>] (generic_file_write_iter) from [<8028a9a8>] (__vfs_write+0xe0/0x138)
May 19 12:28:02 raspberrypi kernel: [322680.793480] [<8028a9a8>] (__vfs_write) from [<8028abc8>] (vfs_write+0xb4/0x1bc)
May 19 12:28:02 raspberrypi kernel: [322680.793495] [<8028abc8>] (vfs_write) from [<8028ae18>] (SyS_write+0x54/0xb0)
May 19 12:28:02 raspberrypi kernel: [322680.793511] [<8028ae18>] (SyS_write) from [<80108060>] (ret_fast_syscall+0x0/0x28)
May 19 12:30:05 raspberrypi kernel: [322803.672673] INFO: task scsi_eh_0:69 blocked for more than 120 seconds.
May 19 12:30:05 raspberrypi kernel: [322803.672690]       Tainted: G         C O    4.14.34-v7+ #1110
May 19 12:30:05 raspberrypi kernel: [322803.672695] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 19 12:30:05 raspberrypi kernel: [322803.672704] scsi_eh_0       D    0    69      2 0x00000000
May 19 12:30:05 raspberrypi kernel: [322803.672752] [<8079a0b8>] (__schedule) from [<8079a730>] (schedule+0x50/0xa8)
May 19 12:30:05 raspberrypi kernel: [322803.672769] [<8079a730>] (schedule) from [<8079aba8>] (schedule_preempt_disabled+0x18/0x1c)
May 19 12:30:05 raspberrypi kernel: [322803.672787] [<8079aba8>] (schedule_preempt_disabled) from [<8079c4b0>] (__mutex_lock.constprop.3+0x190/0x58c)
May 19 12:30:05 raspberrypi kernel: [322803.672805] [<8079c4b0>] (__mutex_lock.constprop.3) from [<8079c9c8>] (__mutex_lock_slowpath+0x1c/0x20)
May 19 12:30:05 raspberrypi kernel: [322803.672821] [<8079c9c8>] (__mutex_lock_slowpath) from [<8079ca28>] (mutex_lock+0x5c/0x60)
May 19 12:30:05 raspberrypi kernel: [322803.672839] [<8079ca28>] (mutex_lock) from [<805ea52c>] (device_reset+0x28/0x54)
May 19 12:30:05 raspberrypi kernel: [322803.672859] [<805ea52c>] (device_reset) from [<80571424>] (scsi_try_bus_device_reset+0x34/0x58)
May 19 12:30:05 raspberrypi kernel: [322803.672875] [<80571424>] (scsi_try_bus_device_reset) from [<80572be8>] (scsi_eh_ready_devs+0x25c/0x88c)
May 19 12:30:05 raspberrypi kernel: [322803.672889] [<80572be8>] (scsi_eh_ready_devs) from [<805741dc>] (scsi_error_handler+0x3a0/0x3a4)
May 19 12:30:05 raspberrypi kernel: [322803.672908] [<805741dc>] (scsi_error_handler) from [<8013d860>] (kthread+0x13c/0x16c)
May 19 12:30:05 raspberrypi kernel: [322803.672928] [<8013d860>] (kthread) from [<8010810c>] (ret_from_fork+0x14/0x28)
-
-
(Repeats a few times from here)
-
-
May 19 12:32:08 raspberrypi kernel: [322926.553061]       Tainted: G         C O    4.14.34-v7+ #1110
May 19 12:32:08 raspberrypi kernel: [322926.553064] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 19 12:32:08 raspberrypi kernel: [322926.553066] python3         D    0   729    522 0x00000000
May 19 12:32:08 raspberrypi kernel: [322926.553077] [<8079a0b8>] (__schedule) from [<8079a730>] (schedule+0x50/0xa8)
May 19 12:32:08 raspberrypi kernel: [322926.553084] [<8079a730>] (schedule) from [<8014b0b4>] (io_schedule+0x20/0x40)
May 19 12:32:08 raspberrypi kernel: [322926.553091] [<8014b0b4>] (io_schedule) from [<8021c9b4>] (wait_on_page_bit+0x110/0x130)
May 19 12:32:08 raspberrypi kernel: [322926.553101] [<8021c9b4>] (wait_on_page_bit) from [<8022c43c>] (write_cache_pages+0x3f8/0x4d4)
May 19 12:32:08 raspberrypi kernel: [322926.553111] [<8022c43c>] (write_cache_pages) from [<802d09a0>] (mpage_writepages+0x6c/0xd8)
May 19 12:32:08 raspberrypi kernel: [322926.553120] [<802d09a0>] (mpage_writepages) from [<803892e0>] (fat_writepages+0x20/0x24)
May 19 12:32:08 raspberrypi kernel: [322926.553127] [<803892e0>] (fat_writepages) from [<8022ea5c>] (do_writepages+0x30/0x8c)
May 19 12:32:08 raspberrypi kernel: [322926.553135] [<8022ea5c>] (do_writepages) from [<8021eda0>] (__filemap_fdatawrite_range+0xc0/0xfc)
May 19 12:32:08 raspberrypi kernel: [322926.553142] [<8021eda0>] (__filemap_fdatawrite_range) from [<8021ee14>] (filemap_fdatawrite+0x38/0x40)
May 19 12:32:08 raspberrypi kernel: [322926.553150] [<8021ee14>] (filemap_fdatawrite) from [<8038a10c>] (writeback_inode+0x30/0x34)
May 19 12:32:08 raspberrypi kernel: [322926.553158] [<8038a10c>] (writeback_inode) from [<8038a14c>] (fat_flush_inodes+0x3c/0x94)
May 19 12:32:08 raspberrypi kernel: [322926.553165] [<8038a14c>] (fat_flush_inodes) from [<803882b4>] (fat_file_release+0x48/0x5c)
May 19 12:32:08 raspberrypi kernel: [322926.553173] [<803882b4>] (fat_file_release) from [<8028bab0>] (__fput+0x9c/0x1e8)
May 19 12:32:08 raspberrypi kernel: [322926.553182] [<8028bab0>] (__fput) from [<8028bc6c>] (____fput+0x18/0x1c)
May 19 12:32:08 raspberrypi kernel: [322926.553190] [<8028bc6c>] (____fput) from [<8013bbf8>] (task_work_run+0xbc/0xe0)
May 19 12:32:08 raspberrypi kernel: [322926.553199] [<8013bbf8>] (task_work_run) from [<8010b810>] (do_work_pending+0xcc/0xd0)
May 19 12:32:08 raspberrypi kernel: [322926.553208] [<8010b810>] (do_work_pending) from [<80108094>] (slow_work_pending+0xc/0x20)
May 19 13:07:36 raspberrypi systemd[1]: Stopping Session c1 of user pi.
May 19 13:07:36 raspberrypi systemd[1]: Stopping Session c2 of user pi.
May 19 13:07:36 raspberrypi systemd[1]: Stopping Disk Manager...
May 19 13:07:36 raspberrypi systemd[1]: Stopped target Sound Card.
May 19 13:07:36 raspberrypi systemd[1]: Stopped target Timers.
May 19 13:07:36 raspberrypi systemd[1]: Stopped Daily apt upgrade and clean activities.
May 19 13:07:36 raspberrypi systemd[1]: Stopped Daily Cleanup of Temporary Directories.
May 19 13:07:36 raspberrypi systemd[1]: Stopping User Manager for UID 1000...
May 19 13:07:36 raspberrypi systemd[1]: Stopping Save/Restore Sound Card State...
May 19 13:07:36 raspberrypi systemd[1]: Stopping Authorization Manager...
May 19 13:07:36 raspberrypi systemd[1]: Stopped target Graphical Interface.
May 19 13:07:36 raspberrypi systemd[1]: Stopping Light Display Manager...
May 19 13:07:36 raspberrypi systemd[1]: Stopped Daily apt download activities.
May 19 13:07:36 raspberrypi systemd[1]: Unmounting RPC Pipe File System...
May 19 13:07:36 raspberrypi systemd[1]: Stopped target System Time Synchronized.
May 19 13:07:36 raspberrypi systemd[1]: Closed Load/Save RF Kill Switch Status /dev/rfkill Watch.
May 19 13:07:36 raspberrypi systemd[1]: Stopped target Multi-User System.
May 19 13:07:36 raspberrypi systemd[1]: Stopped Check for v3d driver.
May 19 13:07:36 raspberrypi systemd[1]: Stopping triggerhappy global hotkey daemon...
May 19 13:07:36 raspberrypi systemd[1]: Stopped target Login Prompts.
May 19 13:07:36 raspberrypi systemd[1]: Stopping Getty on tty1...
May 19 13:07:36 raspberrypi systemd[1]: Stopping LSB: Autogenerate and use a swap file...
May 19 13:07:36 raspberrypi systemd[1]: Stopping OpenBSD Secure Shell server...
May 19 13:07:36 raspberrypi systemd[1]: Stopping Avahi mDNS/DNS-SD Stack...
May 19 13:07:36 raspberrypi systemd[1]: Stopping LSB: Switch to ondemand cpu governor (unless shift key is pressed)...
May 19 13:07:36 raspberrypi systemd[1]: Stopping System Logging Service...
May 19 13:07:36 raspberrypi systemd[1]: Stopping D-Bus System Message Bus...
May 19 13:07:36 raspberrypi systemd[1]: Stopping Regular background program processing daemon...
May 19 13:07:36 raspberrypi systemd[1]: Stopped Regular background program processing daemon.
May 19 13:07:36 raspberrypi systemd[1]: Stopped Getty on tty1.
May 19 13:07:36 raspberrypi systemd[1]: Stopped Authorization Manager.
May 19 13:07:36 raspberrypi systemd[1]: Stopped /etc/rc.local Compatibility.
May 19 13:07:36 raspberrypi systemd[1]: Removed slice system-autologin.slice.

Is this error in syslog a result of the PI itself or could it be caused by my code that writes audio and video? (I do believe that it could be improved) Any help is appreciated. This is difficult to debug since it takes so long for the error to occur.

python-3.x
multiprocessing
raspberry-pi3
asked on Stack Overflow May 21, 2019 by J. Long

0 Answers

Nobody has answered this question yet.


User contributions licensed under CC BY-SA 3.0