Given a hypothetical folder structure like this:
Star.Trek.Discovery.S04E06.German.DL.1080p.BluRay.x264-iNTENTiON/
├── star.trek.discovery.s04e06.german.dl.1080p.bluray.x264-intention.mkv
├── star.trek.discovery.s04e06.german.dl.1080p.bluray.x264-intention.nfo
└── Subs
    ├── star.trek.discovery.s04e06.german.dl.1080p.bluray.x264-intention-eng.idx
    ├── star.trek.discovery.s04e06.german.dl.1080p.bluray.x264-intention-eng.sub
    ├── star.trek.discovery.s04e06.german.dl.1080p.bluray.x264-intention.idx
    └── star.trek.discovery.s04e06.german.dl.1080p.bluray.x264-intention.sub
Star.Trek.Discovery.S04E07.German.DL.1080p.BluRay.x264-iNTENTiON/
├── star.trek.discovery.s04e07.german.dl.1080p.bluray.x264-intention.mkv
├── star.trek.discovery.s04e07.german.dl.1080p.bluray.x264-intention.nfo
└── Subs
    ├── star.trek.discovery.s04e07.german.dl.1080p.bluray.x264-intention-eng.idx
    ├── star.trek.discovery.s04e07.german.dl.1080p.bluray.x264-intention-eng.sub
    ├── star.trek.discovery.s04e07.german.dl.1080p.bluray.x264-intention.idx
    └── star.trek.discovery.s04e07.german.dl.1080p.bluray.x264-intention.sub
4 directories, 12 files
What’s the best way to integrate all the subtitles into the corresponding MKV file?
If you aren’t afraid of remuxing: use mkvtoolnix. It optionally has a GUI.
Just so you know: if the subtitle file has the same name as the mkv plus a language suffix such as .de.sub, some media libraries automatically detect it and you can use it as is without altering the media file.
Jellyfin for example lists the file as “Ger Sub - External” in the subtitle list.
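A minimal sketch of that renaming for the folder layout in the question, in case it helps. The function names are made up for illustration, and it assumes the untagged pair is German (de) and the "-eng" pair is English (en). Whether a given media server picks up .idx/.sub pairs named this way is not verified here beyond what the comment above reports.

```shell
#!/usr/bin/env bash
# Build the sidecar name <video stem>.<lang>.<ext> for a given video path.
sidecar_name() {
    local video=$1 lang=$2 ext=$3
    printf '%s.%s.%s\n' "${video%.*}" "$lang" "$ext"
}

# Copy the VobSub pairs from <release>/Subs next to the video under sidecar names.
# Assumption: the untagged pair is German (de), the "-eng" pair is English (en).
# Existing sidecar files are never overwritten.
copy_sidecars() {
    local dir=${1%/} video name ext src dst
    for video in "$dir"/*.mkv; do
        [ -e "$video" ] || continue
        name=$(basename "${video%.*}")
        for ext in idx sub; do
            src=$dir/Subs/$name.$ext
            dst=$(sidecar_name "$video" de "$ext")
            if [ -e "$src" ] && [ ! -e "$dst" ]; then cp "$src" "$dst"; fi
            src=$dir/Subs/$name-eng.$ext
            dst=$(sidecar_name "$video" en "$ext")
            if [ -e "$src" ] && [ ! -e "$dst" ]; then cp "$src" "$dst"; fi
        done
    done
}
```

Run it once per release folder, for example: for d in ./*/; do copy_sidecars "$d"; done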
Just make sure to stay consistent or use sonarr/bazarr instead.
A console batch file for mkvmerge should do that as long as the subs are named the same as the video files (or something consistent, like that name + subs).
You’d want to target the sub files, not the idx. If that fails, I’d just do an intermediate step of merging the sub and idx files into an mks file and then merging that into the mkv. The GUI only needs the sub files added. I’ll try and come back to this, though someone else already posted a script that could be adapted to do this.
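A rough sketch of such a batch step for the layout in the question. Assumptions that are not from the thread: the function name, the "muxed" output directory, the MKVMERGE override variable, and the language tags (untagged pair German, "-eng" pair English). One caveat: mkvmerge's list of supported file types names VobSub under the .idx extension, so this sketch passes the .idx and expects the .sub to sit beside it. If the .sub works better on your install, swap the extension.

```shell
#!/usr/bin/env bash
# Mux the VobSub tracks found in <release>/Subs into a copy of the release's MKV.
# Assumptions: untagged pair = German, "-eng" pair = English,
# output goes to a hypothetical "muxed" directory unless a second argument is given.
MKVMERGE=${MKVMERGE:-mkvmerge}   # set MKVMERGE=echo to print the commands instead

mux_release() {
    local dir=${1%/} outdir=${2:-muxed} video name
    local -a args
    mkdir -p "$outdir"
    for video in "$dir"/*.mkv; do
        [ -e "$video" ] || continue
        name=$(basename "${video%.*}")
        args=(-o "$outdir/$name.mkv" "$video")
        # --language and --track-name apply to the input file that follows them.
        if [ -e "$dir/Subs/$name.idx" ]; then
            args+=(--language 0:ger --track-name 0:German "$dir/Subs/$name.idx")
        fi
        if [ -e "$dir/Subs/$name-eng.idx" ]; then
            args+=(--language 0:eng --track-name 0:English "$dir/Subs/$name-eng.idx")
        fi
        "$MKVMERGE" "${args[@]}"
    done
}
```

Usage from the directory that holds the release folders: for d in ./*/; do mux_release "$d"; done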
EDIT: https://github.com/gwen-lg/subtile-ocr is better than https://github.com/ruediger/VobSub2SRT. subtile-ocr had no issue recognising “I” as “I”, while vobsub2srt constantly sees “I” as “|”. Blacklisting “|” causes “I” to be recognised as “]”, and blacklisting “|”, “[” and “]” just leaves a blank space. So thanks for the recommendation! I’ll be using that to convert VOBSUB to srt instead:
for f in ./*.idx; do subtile-ocr -l ger -o "${f%.*}.srt" "$f"; done
And then https://mkvtoolnix.download/ to remove any unwanted subtitles from the mkv files
outdir=nosubs; mkdir -p "$outdir"; for f in ./*.mkv; do mkvmerge -o "$outdir/${f##*/}" --no-subtitles "$f"; done
Finally, I merge the srt files with the mkv files
outdir=addedsubs; mkdir -p "$outdir"; for f in ./nosubs/*.mkv; do g="${f##*/}"; mkvmerge -o "$outdir/$g" "$f" --language 0:ger --track-name 0:German "${g%.*}.srt"; done
Once the files in the addedsubs directory look good, I delete all of the other files. You can add it all to a bash file, make an alias, or both, and run it wherever.
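For the bash file, one possible sketch. It folds the strip step and the merge step into a single mkvmerge call, which works because --no-subtitles only applies to the input file that follows it. The function name and the SUBTILE_OCR / MKVMERGE override variables are hypothetical. The -l ger value is copied from the command above; depending on the Tesseract language data installed, German may need to be given as deu instead.

```shell
#!/usr/bin/env bash
# OCR each .idx in the current directory to .srt, then write a copy of the
# matching MKV with its old subtitle tracks dropped and the new SRT added.
# Assumptions: .idx/.sub/.mkv share a directory and a base name;
# SUBTILE_OCR and MKVMERGE are hypothetical hooks for dry runs.
SUBTILE_OCR=${SUBTILE_OCR:-subtile-ocr}
MKVMERGE=${MKVMERGE:-mkvmerge}

convert_and_merge() {
    local outdir=${1:-addedsubs} idx stem
    mkdir -p "$outdir"
    for idx in ./*.idx; do
        [ -e "$idx" ] || continue
        stem=${idx%.*}
        [ -e "$stem.mkv" ] || continue          # no matching video, skip
        # Skip the merge if OCR fails for this file.
        "$SUBTILE_OCR" -l ger -o "$stem.srt" "$idx" || continue
        # --no-subtitles affects only the MKV; the SRT is added as a new track.
        "$MKVMERGE" -o "$outdir/${stem##*/}.mkv" \
            --no-subtitles "$stem.mkv" \
            --language 0:ger --track-name 0:German "$stem.srt"
    done
}
```

Calling convert_and_merge with no argument writes to ./addedsubs, so the originals stay untouched until the results have been checked.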
I’m pretty sure MKV can handle VOBSUB. Why do you convert them to .srt before merging them?
Edit:
I’ve also just found this: https://github.com/elizagamedev/vobsubocr
The most comparable tool to vobsubocr is VobSub2SRT, but vobsubocr has significantly better output, especially for non-English languages, mainly because VobSub2SRT does not do much preprocessing of the image at all before sending it to Tesseract. For example, Tesseract 4.0 expects black text on a white background, which VobSub2SRT does not guarantee, but vobsubocr does. Additionally, vobsubocr splits each line into separate images to take advantage of page segmentation method 7, which greatly improves accuracy of non-English languages in particular.
Edit 2:
And a fork of it, of course: https://github.com/gwen-lg/subtile-ocr
As you seems to not update this project anymore, I have done a fork to continue the project. With subtile-ocr I have use subtile; subtile is a fork no longer maintained vobsub crate. With this I was able to :
- modernise the code by :
  - update dependencies, especially nom who need a lot of code modification.
  - migrate to thiserror and anyhow for error management
- do some small optim (by reducing a lot the memory allocation count)
And it could be a better start to add functionality (like managing .sup: blue-ray subtitle format).
Yeah, mkv can handle VOBSUB. I just prefer text-based formats like srt or ass, since you can really easily edit the subtitles to get better timings, change the font/colour or fix spellings. I also find VOBSUB a bit blurry around the edges of the text.
If you’re happy with the VOBSUB, then the last bit of code above will merge them with the mkv file and they should be automatically on when using vlc.
Thanks for the links, I was thinking about how outdated vobsub2srt was and definitely want to try these instead!
Iirc vobsub is not text so while you can add it to the container it will always require a transcode on plex/jelly/etc to burn in.
I wasn’t aware of the transcoding requirement, thank you. So I guess converting the subtitles is a best practice I should adopt.
What they told you is misleading.
Transcoding and burning in subtitles for Plex and similar only happens in some cases if your streaming device doesn’t support image based subtitles. Plex themselves could fix this on a lot more devices but don’t.
10 years ago it was the case that there were a LOT of issues with anything but text subtitles. These days it depends. If you’re running it directly off a smart tv (bad experience anyways, not recommended) it’s likely to be an issue. If you’re using an Android streaming device or Apple TV or gaming console there’s a good chance the subs just work.
Truth is, lots of things can force transcoding with Plex, including using certain audio formats in certain media containers. These days picture subs mostly work. If you can get text subs it’s not a bad thing, but I wouldn’t go through the hassle of doing flawed OCR unless you can confirm it’s an issue you’re experiencing with your setup.
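To see what a file actually carries before deciding, something like the sketch below could help. It classifies Matroska subtitle codec IDs as text or image. The function names are made up, the codec list is not exhaustive, and report_subs assumes both mkvmerge and jq are installed.

```shell
#!/usr/bin/env bash
# Classify a Matroska subtitle codec ID as text-based or image-based.
sub_kind() {
    case $1 in
        S_TEXT/*) echo text ;;                          # SRT, ASS/SSA, WebVTT, ...
        S_VOBSUB|S_HDMV/PGS|S_DVBSUB) echo image ;;     # DVD, Blu-ray, DVB bitmaps
        *) echo unknown ;;
    esac
}

# Print "<codec id><TAB><kind>" for every subtitle track in an MKV.
# Relies on mkvmerge's JSON identification output (-J) and on jq.
report_subs() {
    local file=$1 codec
    mkvmerge -J "$file" |
        jq -r '.tracks[] | select(.type == "subtitles") | .properties.codec_id' |
        while IFS= read -r codec; do
            printf '%s\t%s\n' "$codec" "$(sub_kind "$codec")"
        done
}
```

A track reported as image is the kind that may trigger burn-in on clients without picture-subtitle support; whether it actually does depends on the player, as described above.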
If vobsub is image based (pretty sure it is) then it needs an OCR converter. Most streaming setups will burn the image based subs in. Honestly for 1080p and lower a modern cpu/gpu won’t miss a beat.
MKVToolNix Batch Tool should be able to do it automatically, assuming the subtitle format is supported.
MKVToolNix Batch Tool works on Windows 32-bit (x86) and Windows 64-bit (x64) operating systems,
Unfortunately I’m a Linux user.
Works fine on my Fedora install with the gui, haven’t tried just the batch tool.
MakeMKV can do that for you.