MKV Files with HDMV/PGS (bitmapped) to 'burned in' subtitles
by TB0ne from LinuxQuestions.org on (#5JXK1)
Recently ripped a movie that was in a language I do not speak, but it had English subtitles. Not a problem...until the Roku media player wouldn't display the subtitles. It was a Matroska (.MKV) file, and instead of text subtitles in .srt format, it had HDMV/PGS subtitles. Not words, per se, but bitmapped images OF text.
There are processes that you can use in conjunction with Tesseract to OCR the bitmaps and generate an srt, but this seemed a bit much. After some experimentation, I came up with:
Code:
ffmpeg -i <INPUT MKV FILE NAME> -filter_complex "[0:v][0:s]overlay[v]" -map "[v]" -map 0:a -c:a libmp3lame <OUTPUT FILE NAME.MP4>
...which transcoded the file into an MP4, with burned-in subtitles. The video gets re-encoded, so it's not lossless, but it still looks pretty good, and it was only a single-step process.
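If the file carries more than one subtitle track, the [0:s] label should (as far as I can tell) grab the first one, which may not be the English one. Here is a rough sketch of the same command wrapped in a shell function that takes a track number. The function names overlay_filter and burn_subs are made up for illustration, and the track number is assumed to be zero-based among the subtitle streams (0:s:0 is the first, 0:s:1 the second, and so on).

```shell
#!/bin/sh
# Hypothetical helper (not part of ffmpeg): build the filter graph that
# overlays subtitle track N onto the video. N counts subtitle streams
# only, starting at 0, and defaults to 0 when omitted.
overlay_filter() {
    printf '[0:v][0:s:%s]overlay[v]' "${1:-0}"
}

# Hypothetical wrapper around the command above.
# Arguments: input file, output file, subtitle track number (optional).
burn_subs() {
    ffmpeg -i "$1" \
        -filter_complex "$(overlay_filter "${3:-0}")" \
        -map "[v]" -map 0:a -c:a libmp3lame \
        "$2"
}
```

Calling burn_subs movie.mkv movie.mp4 1 would then burn in the second subtitle track (the file names are placeholders). Running ffprobe on the input file lists its streams, which helps in working out which track is which.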