且构网

Extract multiple images from a video with ffmpeg and get the timestamps of the extracted images

Updated: 2023-09-21 23:05:40


I'm using ffmpeg to extract one frame (as a JPEG) every five minutes from videos, and redirecting the console output to a text file in order to get the exact timestamps of the extracted frames.

The command I'm using is:

ffmpeg -i input.avi -ss 00:10:00 -vframes 10 -vf showinfo,fps=fps=1/300 %03d.jpg &> output.txt

Here -ss 00:10:00 skips the first 10 minutes of the video, fps=fps=1/300 keeps one frame every 5 minutes, and -vframes 10 limits the output to the first 10 of those frames.

This almost works, except that the command logs information for every frame, including those that were not written as a JPEG. Here's a three-line sample of the output:

[Parsed_showinfo_0 @ 0x2219020] n:11427 pts:11429 pts_time:599.979 pos:48892180 fmt:yuv420p sar:1/1 s:640x480 i:P iskey:0 type:P checksum:6309A75D plane_checksum:[15A29007 1617E1FE D93A3549] mean:[146 125 153 ] stdev:[17.6 1.0 2.1 ]
[Parsed_showinfo_0 @ 0x2219020] n:11428 pts:11430 pts_time:600.031 pos:48898094 fmt:yuv420p sar:1/1 s:640x480 i:P iskey:0 type:B checksum:815D031A plane_checksum:[E004E973 E28CE2D5 F56636B4] mean:[146 125 153 ] stdev:[17.6 1.0 2.1 ]
[Parsed_showinfo_0 @ 0x2219020] n:11429 pts:11431 pts_time:600.084 pos:48892448 fmt:yuv420p sar:1/1 s:640x480 i:P iskey:0 type:P checksum:6CE2D3C5 plane_checksum:[E983BD86 38B9E198 93B13498] mean:[146 125 153 ] stdev:[17.6 1.0 2.1 ]

I would expect the middle line, with pts_time:600.031, to be the first frame extracted as an image, but I have no way to distinguish it from the frames on either side, for which no image was extracted.

Does anyone know of a way to resolve this?

Thank you!

In answer to my own question, I've now found a workaround, though I'm not exactly sure how it works: define a select filter within -vf, place showinfo after it, and add a -vsync 0 parameter, like so:

ffmpeg -i input.avi -vframes 10 -vf '[in]select=not(mod(n\,300*19.05))[s1];[s1]showinfo[out]' -vsync 0 %02d.jpg >& output.txt

...the command now reports only the desired 10 frames. Here's a sample of the stderr output for the first two frames:

[Parsed_showinfo_1 @ 0x21b1b60] n:0 pts:2 pts_time:0.104992 pos:10248 fmt:yuv420p sar:1/1 s:640x480 i:P iskey:1 type:I checksum:F6FDCFBF plane_checksum:[5FB6331C 9D9D7F99 44FB1D0A] mean:[183 126 155 ] stdev:[19.6 0.8 2.8 ]
frame=    1 fps=0.0 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
frame=    1 fps=1.0 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
frame=    1 fps=0.7 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
frame=    1 fps=0.5 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
frame=    1 fps=0.4 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
frame=    1 fps=0.3 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
frame=    1 fps=0.3 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
frame=    1 fps=0.2 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
frame=    1 fps=0.2 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
[Parsed_showinfo_1 @ 0x21b1b60] n:1 pts:5717 pts_time:300.121 pos:24474150 fmt:yuv420p sar:1/1 s:640x480 i:P iskey:0 type:P checksum:BAECD030 plane_checksum:[F609470E 45F694CE 4BFCF445] mean:[148 126 152 ] stdev:[17.7 0.8 2.3 ]
frame=    2 fps=0.4 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
frame=    2 fps=0.4 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
frame=    2 fps=0.3 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
frame=    2 fps=0.3 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
frame=    2 fps=0.3 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
frame=    2 fps=0.3 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
frame=    2 fps=0.2 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
frame=    2 fps=0.2 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
frame=    2 fps=0.2 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A   
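
The timestamps can be pulled out of a log like the one above with standard text tools. The following is a minimal sketch, not a definitive solution. It assumes the showinfo lines keep the `pts_time:` field shown in the sample, and that with -vsync 0 each logged frame corresponds to exactly one image, written in the same order with the %02d.jpg pattern. The function name `extract_pts_times` and the temporary demo log are made up for illustration.

```shell
#!/bin/sh
# Print "<image file> <pts_time>" for every showinfo frame line in an ffmpeg log.
# Assumption: the Nth showinfo line belongs to the Nth image (01.jpg, 02.jpg, ...).
extract_pts_times() {
    # Keep only showinfo lines; the frame= progress lines are dropped here.
    grep 'Parsed_showinfo' "$1" |
        # Reduce each line to the value that follows "pts_time:".
        sed -n 's/.*pts_time:\([0-9.]*\).*/\1/p' |
        # Number the values to match the %02d.jpg file names.
        awk '{ printf "%02d.jpg %s\n", NR, $1 }'
}

# Demo on lines shortened from the sample output above.
log=$(mktemp)
cat > "$log" <<'EOF'
[Parsed_showinfo_1 @ 0x21b1b60] n:0 pts:2 pts_time:0.104992 pos:10248 fmt:yuv420p
frame=    1 fps=0.0 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A
[Parsed_showinfo_1 @ 0x21b1b60] n:1 pts:5717 pts_time:300.121 pos:24474150 fmt:yuv420p
EOF
extract_pts_times "$log"
rm -f "$log"
```

On the demo lines this prints `01.jpg 0.104992` and `02.jpg 300.121`. For a real run, call `extract_pts_times output.txt` instead of the demo.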

I'm still unsure exactly why this works, or why each frame= ... line is repeated so many times in the output, but it seems to do the job!
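
A likely explanation, offered as a reading of the two commands rather than a definitive account:

- In the original command showinfo sits before fps in the filter chain, so it logs every decoded frame before fps discards most of them. In the workaround showinfo comes after select, so it only sees the frames that select lets through.
- -vsync 0 passes frames through with their own timestamps. Without it, ffmpeg may duplicate frames to keep a constant output frame rate.
- The repeated frame= lines look like ffmpeg's periodic progress status, which is refreshed while the frames in between are decoded and ends up as separate entries in the redirected file.
- The 19.05 in the select expression appears to be the video's frame rate (the first log shows n:11428 at pts_time:600.031), so 300*19.05 would be the number of frames in five minutes.

The sketch below checks that arithmetic. The variable names are illustrative, and the frame rate is an assumption inferred from the logs, not a value stated in the post.

```shell
#!/bin/sh
# Assumed values: 19.05 fps (inferred from the logs), one image every 300 s.
fps=19.05
interval_s=300

# Rough frame-rate estimate from the first log: frame index / timestamp in seconds.
est_fps=$(awk 'BEGIN { printf "%.2f", 11428 / 600.031 }')
echo "estimated fps: $est_fps"

# Number of frames between two selected frames, rounded to the nearest integer.
step=$(awk -v fps="$fps" -v s="$interval_s" 'BEGIN { printf "%.0f", fps * s }')
echo "frames per interval: $step"

# The same select expression written with the rounded step.
printf 'select=not(mod(n\\,%s))\n' "$step"
```

The rounded step of 5715 is consistent with the second showinfo line in the answer's log, which reports pts:5717 for the second image while the first one has pts:2.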