ffmpeg timestamp information using fps filter isn't aligned with ffprobe


I'm using ffmpeg to extract images (thumbnails) from a video with the fps filter so that I get one image every 0.5 seconds. This is the command I use:

ffmpeg -i video.mp4 -f image2 -filter:v fps=1/0.5 -y out_%3d.png

I want to know the timestamp of each of these images, and I've found that ffmpeg behaves differently from ffprobe.

First of all, I haven't found a way to get the timestamps as metadata (log files or similar), but I did manage to overlay the timestamp on the images themselves using ffmpeg:

ffmpeg -i video.mp4 \
-vf "fps=fps=1/5,drawtext=fontfile=/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf: text='%{pts\:hms}': x=(w-tw)/2: y=h-(2*lh): fontcolor=white: box=1: boxcolor=0x00000000@1" \
out_with_timestamp_%03d.png

However, ffprobe can apply the same fps filter and report per-frame information. This command is supposed to reproduce the ffmpeg one and prints metadata from which the timestamps can be extracted:

ffprobe -hide_banner \
-i "movie=video.mp4,fps=fps=1/0.5[out0]" \
-f lavfi -show_frames -show_entries frame=pkt_pts_time -of csv=p=0

The problem is that the timestamps ffmpeg prints onto the images differ from the ones ffprobe reports: the ffprobe ones are wrong, while the ffmpeg ones are right.
The timestamps given by ffmpeg fall in the middle of each time window, while the ones given by ffprobe fall at the very beginning of the window.

Is there any way to extract thumbnails and their timestamps from a video using ffmpeg?

You can find all the steps to be able to reproduce this behaviour here: https://www.joseoc.com/en/video/ffmpeg/extract-images-from-video/#getting-the-timestamp-for-the-images

video
ffmpeg
thumbnails
asked on Stack Overflow Jul 13, 2018 by Jose.OC

1 Answer


ffmpeg and ffprobe do yield the same timestamps.

However, what you want are the source timestamps, and to get those you have to read or draw them before the fps filter is applied. The fps filter drops and/or duplicates frames from the source stream to generate a constant frame rate output, and it also regenerates smooth timestamps at the given rate. So you have to place drawtext earlier in the filter chain, ahead of fps, to imprint the source TS.
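
As a sketch of that reordering, the drawtext settings from the question can be moved ahead of fps so that the text is rendered while each frame still carries its source pts. The VF variable and the output name out_src_ts_%03d.png are illustrative choices, not from the original post; the drawtext options are copied from the question unchanged.

```shell
# Illustrative sketch: drawtext comes first, so %{pts\:hms} is evaluated on
# the source frames; fps then only picks which of those frames to keep.
FONT=/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf
VF="drawtext=fontfile=${FONT}: text='%{pts\:hms}': x=(w-tw)/2: y=h-(2*lh): fontcolor=white: box=1: boxcolor=0x00000000@1,fps=fps=1/5"

# Skipped when ffmpeg or video.mp4 is not available.
if command -v ffmpeg >/dev/null 2>&1 && [ -f video.mp4 ]; then
  ffmpeg -i video.mp4 -vf "$VF" out_src_ts_%03d.png
fi
```

With this order the printed time is that of the source frame that fps selected, not the regenerated output timestamp.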


Here's a crude workaround to get the source TS into a file.

Run:

ffmpeg -i video.mp4 -vf "showinfo,fps=fps=1/5,showinfo" out_%03d.png 2> showinfo.log

showinfo.log will contain lines like these:

[Parsed_showinfo_0 @ 000000000051fe40] n:  56 pts:  28672 pts_time:2.33333 pos:   159155 fmt:yuv420p sar:0/1 s:1280x720 i:P iskey:0 type:P checksum:E16BA92B plane_checksum:[DDD39061 7EA5EE0E B9642AAD] mean:[188 113 134 ] stdev:[35.2 24.8 6.3 ]
[Parsed_showinfo_0 @ 000000000051fe40] n:  57 pts:  29184 pts_time:2.375   pos:   161402 fmt:yuv420p sar:0/1 s:1280x720 i:P iskey:0 type:P checksum:A363FB5E plane_checksum:[9BBF72F4 87165D47 ADB32B23] mean:[188 113 134 ] stdev:[35.3 24.7 6.3 ]
[Parsed_showinfo_0 @ 000000000051fe40] n:  58 pts:  29696 pts_time:2.41667 pos:   173158 fmt:yuv420p sar:0/1 s:1280x720 i:P iskey:0 type:B checksum:5EE879C4 plane_checksum:[55EA6029 A578E90A 4F463082] mean:[188 113 134 ] stdev:[35.3 24.6 6.3 ]
[Parsed_showinfo_0 @ 000000000051fe40] n:  59 pts:  30208 pts_time:2.45833 pos:   170104 fmt:yuv420p sar:0/1 s:1280x720 i:P iskey:0 type:B checksum:EE8A0C3A plane_checksum:[A90276D4 620C6744 F5012E13] mean:[188 114 134 ] stdev:[35.4 24.5 6.3 ]
[Parsed_showinfo_0 @ 000000000051fe40] n:  60 pts:  30720 pts_time:2.5     pos:   175138 fmt:yuv420p sar:0/1 s:1280x720 i:P iskey:0 type:B checksum:8E112769 plane_checksum:[9A02FB68 1FF2EC88 1E573F5B] mean:[188 114 134 ] stdev:[35.5 24.4 6.3 ]
[Parsed_showinfo_2 @ 0000000004947540] n:   0 pts:      0 pts_time:0       pos:   170104 fmt:yuv420p sar:0/1 s:1280x720 i:P iskey:0 type:B checksum:EE8A0C3A plane_checksum:[A90276D4 620C6744 F5012E13] mean:[188 114 134 ] stdev:[35.4 24.5 6.3 ]
[Parsed_showinfo_0 @ 000000000051fe40] n:  61 pts:  31232 pts_time:2.54167 pos:   164026 fmt:yuv420p sar:0/1 s:1280x720 i:P iskey:0 type:P checksum:BB465233 plane_checksum:[EAEAE802 57E310BE 62275964] mean:[187 114 134 ] stdev:[35.7 24.4 6.3 ]
[Parsed_showinfo_0 @ 000000000051fe40] n:  62 pts:  31744 pts_time:2.58333 pos:   187333 fmt:yuv420p sar:0/1 s:1280x720 i:P iskey:0 type:B checksum:9D4E3B24 plane_checksum:[D5D19959 05A4FDBE 88B7A3EF] mean:[187 114 134 ] stdev:[35.8 24.4 6.3 ]

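For each Parsed_showinfo_2 line (a frame emitted after fps), find the Parsed_showinfo_0 line (a source frame) with the same pos value. The pts_time in that Parsed_showinfo_0 line is your source TS. In the excerpt above, the Parsed_showinfo_2 frame n:0 has pos 170104, which matches Parsed_showinfo_0 frame n:59, so the first image comes from source time 2.45833.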
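
The matching can be automated with a small awk script. This is a minimal sketch, assuming the log has exactly the layout shown above (including a pos field, and the filter labels Parsed_showinfo_0 and Parsed_showinfo_2); the function name match_source_ts is made up for illustration. It will not give useful results if pos is missing or not unique in your input.

```shell
# match_source_ts LOGFILE
# For every frame logged by the second showinfo (Parsed_showinfo_2), print
# "<output frame n> <source pts_time>", where the source pts_time is taken
# from the Parsed_showinfo_0 line that has the same pos value.
match_source_ts() {
  awk '
    # Return the value that follows "key:" in the line, skipping padding.
    function field(line, key,    rest) {
      if (!match(line, " " key ": *[^ ]+")) return ""
      rest = substr(line, RSTART, RLENGTH)
      sub("^ " key ": *", "", rest)
      return rest
    }
    # Source frames: remember pts_time indexed by pos.
    /Parsed_showinfo_0/ { src[field($0, "pos")] = field($0, "pts_time") }
    # Output frames: look up the source pts_time by pos.
    /Parsed_showinfo_2/ { print field($0, "n"), src[field($0, "pos")] }
  ' "$1"
}
```

Usage: match_source_ts showinfo.log prints one line per extracted image, with the output frame number followed by the source timestamp in seconds.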

answered on Stack Overflow Jul 13, 2018 by Gyan • edited Jul 16, 2018 by Gyan

User contributions licensed under CC BY-SA 3.0