Beware! There are easier ways to do this ;-) But I’ve long wanted to
learn how to use some of the more advanced features of ffmpeg, and
putting together a slideshow (with an inset video) for an online course
during Covid-19 was the opportunity. Now that I have a pipeline, this
should not take so long next time.
1. Write the talk
...as you normally would for a live presentation. Make a PDF version of the slides.
2. Record the talk
Either with your laptop webcam, or using a digital camera. I’ll be using a resolution of 1280 x 720 pixels, at 25 frames per second. Since the video inset will be shrunk, a smaller resolution (800x600) would be fine.
If you’re like me, you can’t do this in a single take. No problem. Also, be careful about the sound quality. One could record sound separately from the moving images and combine them later, but that is beyond the scope of this HOWTO.
3. Process the talk video
Using ffplay, determine the start and stop time in each segment.
(ffprobe is invaluable too.) Trim the videos and transcode to MP4. In
my limited experience, working with MP4s in ffmpeg is most successful.
ffmpeg -i PA131753.MOV -ss 7 -t 509 part1.mp4
ffmpeg -i PA131754.MOV -ss 6 -t 449 part2.mp4
ffmpeg -i PA131755.MOV -ss 6 -t 284 part3.mp4
ffmpeg -i PA131758.MOV -ss 80 -t 557 part4.mp4
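Note that -t takes a duration, not a stop time, so each value above is the stop time minus the start time read off in ffplay. A minimal sketch of that arithmetic, assuming whole-second times; trim_cmd is a hypothetical helper that only prints the command, and the stop time of 516 s is simply 7 + 509 from the first command above:

```shell
#!/bin/sh
# trim_cmd (hypothetical helper): print an ffmpeg trim command given the
# start and stop times in whole seconds, as read off in ffplay.
# Usage: trim_cmd INPUT START STOP OUTPUT
trim_cmd() {
    # -t takes a duration, so subtract the start time from the stop time
    printf 'ffmpeg -i %s -ss %d -t %d %s\n' "$1" "$2" "$(($3 - $2))" "$4"
}

trim_cmd PA131753.MOV 7 516 part1.mp4
```

The printed line can be checked by eye and then pasted into the terminal, or collected in a script file and run with sh.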
Concatenate the segments:
ffmpeg -i part1.mp4 -i part2.mp4 -i part3.mp4 -i part4.mp4 \
-filter_complex \
"[0:v:0][0:a:0] [1:v:0][1:a:0] [2:v:0][2:a:0] [3:v:0][3:a:0] \
concat=n=4:v=1:a=1 [outv][outa]" \
-map "[outv]" -map "[outa]" me.mp4
(I couldn’t get the simpler concat function to work.) In words:
“take the first video stream and the first audio stream of input 0,
then those of input 1, and so on; concatenate the four segments into
one video and one audio stream, then map those streams to the video
and audio of the output.”
Then add a fade-in and fade-out (not necessary, of course):
ffmpeg -y \
-i me.mp4 \
-filter_complex \
"color=black : 1280x720 : d=1720 [base] ; \
[0:v] fade=in : st=0 : d=2 : alpha=1 [v1] ; \
[v1] fade=out : st=1718 : d=2 : alpha=1 [v2] ; \
[base][v2] overlay [v3] ; \
[0:a] afade=t=in : st=0 : d=2 [a1] ; \
[a1] afade=t=out : st=1718 : d=2 [a2] " \
-map "[v3]" -map "[a2]" \
-t 1720 \
me_fade.mp4
In words: “make a 1280x720 black screen for 1720 s; take the video from the first input and fade in over 2 s, then take that stream and fade out for 2 s starting at second 1718; overlay the faded stream on the black; fade the audio in and out similarly, and combine; trim the final video to 1720 s.”
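The fade-out start time (1718) is just the clip length minus the fade length. Here is a sketch that rebuilds the filter graph above from those two numbers, so it can be reused for a talk of a different length. fade_graph is a hypothetical helper, whole-second values are assumed, and the 1280x720 size is hard-coded as in the command above:

```shell
#!/bin/sh
# fade_graph (hypothetical helper): print the -filter_complex string used
# above for a clip of DURATION seconds with FADE-second fades at both ends.
# Usage: fade_graph DURATION FADE
fade_graph() {
    dur=$1
    fade=$2
    out_start=$((dur - fade))   # the fade-out begins FADE seconds before the end
    # printf reuses the %s format for each argument, joining the pieces
    printf '%s' \
        "color=black:1280x720:d=${dur} [base]; " \
        "[0:v] fade=in:st=0:d=${fade}:alpha=1 [v1]; " \
        "[v1] fade=out:st=${out_start}:d=${fade}:alpha=1 [v2]; " \
        "[base][v2] overlay [v3]; " \
        "[0:a] afade=t=in:st=0:d=${fade} [a1]; " \
        "[a1] afade=t=out:st=${out_start}:d=${fade} [a2]"
    printf '\n'
}

fade_graph 1720 2
```

The result could then be passed to ffmpeg as -filter_complex "$(fade_graph 1720 2)", with the -map and -t options unchanged.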
4. Make the slide video
Watch the talk video and jot down the start time of each slide, and
the end time of the last slide. I tried to use the concat demuxer
with a script listing slides and durations, but the resultant video
was faulty, skipping some slides. In the end, I made a video of each
slide separately. Some awk and some shell script:
echo "
1 0
2 65
3 108
...
17 1644
18 1720" | \
awk '{if ($1>1) {print "ffmpeg -loop 1 -i img" ($1-1) \
".jpg -vf \"fps=25,format=yuvj420p\" -t " ($2-last) \
" img" ($1-1) ".mp4"} last = $2}' > img2mp4.sh
cat img2mp4.sh
ffmpeg -loop 1 -i img1.jpg -vf "fps=25,format=yuvj420p" -t 65 img1.mp4
ffmpeg -loop 1 -i img2.jpg -vf "fps=25,format=yuvj420p" -t 43 img2.mp4
...
sh img2mp4.sh
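Before rendering, it may be worth checking that the per-slide durations add up to the length of the talk video, since the two are overlaid later. A sketch of that check, using the same "slide number, start time" layout as the list above; slide_durations is a hypothetical helper, and the timings below are a made-up three-slide example rather than the real list:

```shell
#!/bin/sh
# slide_durations (hypothetical helper): read "marker start_time" lines on
# stdin, where the last line holds the end time of the final slide, and
# print "slide duration" pairs followed by the total.
slide_durations() {
    awk 'NF == 2 {
             if (seen) {                      # every line after the first closes a slide
                 printf "%d %d\n", $1 - 1, $2 - last
                 total += $2 - last
             }
             last = $2; seen = 1
         }
         END { printf "total %d\n", total }'
}

# Made-up timings for a three-slide talk ending at 150 s
printf '1 0\n2 65\n3 108\n4 150\n' | slide_durations
```

With the real list, the total should match the -t value used for the talk video (1720 here).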
Then combine them:
ffmpeg \
-i img1.mp4 -i img2.mp4 -i img3.mp4 -i img4.mp4 \
-i img5.mp4 -i img6.mp4 -i img7.mp4 -i img8.mp4 \
-i img9.mp4 -i img10.mp4 -i img11.mp4 -i img12.mp4 \
-i img13.mp4 -i img14.mp4 -i img15.mp4 -i img16.mp4 \
-i img17.mp4 \
-filter_complex \
"[0:v:0] \
[1:v:0] \
...
[16:v:0] \
concat=n=17:v=1 [outv]" \
-map "[outv]" slides.mp4
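Typing seventeen inputs and seventeen stream labels by hand is error-prone, so the command can also be generated, in the same spirit as img2mp4.sh. A minimal sketch, assuming the slide videos are named img1.mp4 to imgN.mp4 as above; slides_concat_cmd is a hypothetical helper that only prints the command:

```shell
#!/bin/sh
# slides_concat_cmd (hypothetical helper): print an ffmpeg command that
# concatenates img1.mp4 ... imgN.mp4 (video only) into slides.mp4.
# Usage: slides_concat_cmd N
slides_concat_cmd() {
    n=$1
    inputs=""
    labels=""
    i=1
    while [ "$i" -le "$n" ]; do
        inputs="${inputs} -i img${i}.mp4"
        labels="${labels}[$((i - 1)):v:0]"    # ffmpeg numbers inputs from 0
        i=$((i + 1))
    done
    printf 'ffmpeg%s -filter_complex "%s concat=n=%d:v=1 [outv]" -map "[outv]" slides.mp4\n' \
        "$inputs" "$labels" "$n"
}

slides_concat_cmd 3
```

For the talk here, the output of slides_concat_cmd 17 could be redirected to a file and run with sh.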
5. Overlay the video inset on the slides
ffmpeg \
-y \
-i slides.mp4 \
-i me_fade.mp4 \
-filter_complex \
"[1] scale=w=320 : h=180 [2] ; \
[0][2] overlay=920 : 240 [3] " \
-map "[3]" -map 1:a \
slideshow.mp4
In words: “Scale the second input to 320 x 180, then overlay that on the first input, locating the inset video at 920 pixels across and 240 down; use the audio from the second input (the talk video).”
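The inset numbers can be derived rather than guessed. The sketch below computes the inset height that preserves the aspect ratio of the talk video, and an x offset that leaves a fixed margin at the right edge. inset_filter is a hypothetical helper; the slide width of 1280 and the 40-pixel margin are assumptions chosen to reproduce the 920 used above, and the labels [inset] and [out] are illustrative names:

```shell
#!/bin/sh
# inset_filter (hypothetical helper): print scale and overlay filters for an
# inset INSET_W pixels wide, keeping the source aspect ratio, placed with its
# right edge MARGIN pixels from the right edge of a MAIN_W-wide slide video.
# Usage: inset_filter SRC_W SRC_H INSET_W MAIN_W MARGIN Y
inset_filter() {
    inset_h=$(( $2 * $3 / $1 ))       # integer height at the same aspect ratio
    x=$(( $4 - $3 - $5 ))             # left edge of the inset
    printf '[1] scale=w=%d:h=%d [inset]; [0][inset] overlay=%d:%d [out]\n' \
        "$3" "$inset_h" "$x" "$6"
}

# Assumed: 1280x720 talk video, 320-wide inset, 1280-wide slides, 40 px margin
inset_filter 1280 720 320 1280 40 240
```

If this string is used as the -filter_complex argument, the video map becomes -map "[out]".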
Yay! ffmpeg is pretty handy, when you finally get it. And the final
file is only 58 MB for a 28-minute slideshow (see snippet).