A tool that uses Meta's MMS model to create interlinear audio bibles, in potentially 1,000+ different languages.
The idea is explained in depth in this blog post
git clone git@github.com:theJoshMuller/audio-interlinearify.git

I've found working in a virtual environment to be really helpful:
python3.11 -m venv venv
source venv/bin/activate

Once your environment is in place, install the requirements:
pip install -r requirements.txt

You'll also likely need to make sure that whatever system you're using has ffmpeg and sox installed.
On Arch Linux that's:
sudo pacman -S ffmpeg sox

On macOS (using Homebrew), you'll also need rubberband:
brew install ffmpeg sox rubberband

Your installation command will depend on your system.
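If you're not sure whether the tools are already installed, a quick check like the one below can help. This is only a sketch and not part of the repo: `missing_tools` is a hypothetical helper that looks for each program name on your PATH using Python's standard `shutil.which`.

```python
import shutil


def missing_tools(tools=("ffmpeg", "sox")):
    """Return the names of the tools that cannot be found on PATH.

    Hypothetical helper for illustration; it is not part of this repo.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("ffmpeg and sox were both found.")
```

On macOS you could pass `("ffmpeg", "sox", "rubberband")` instead, assuming the Homebrew package installs an executable with that name.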
To make an interlinear audio bible, you'll need one text file (txt) and one audio file (mp3) per language for each chapter you want to turn into an interlinear audio file.
Each text file needs to have one "segment" (verse, sentence, etc.) per line, and the txt files for a given chapter need to have the same number of lines.
Example files are provided in ./sample_data
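Since the txt files have to line up segment by segment, it can be worth checking the line counts before running anything. Here's a minimal sketch of that check; `count_lines` and `same_line_count` are hypothetical helpers, not part of this repo, and the sketch assumes every line (including a blank one) counts as a segment, which may not match how the tool itself treats blank lines.

```python
from pathlib import Path


def count_lines(path):
    """Count the lines in a UTF-8 text file (a trailing newline adds no extra line)."""
    return len(Path(path).read_text(encoding="utf-8").splitlines())


def same_line_count(txt1, txt2):
    """Return (match, n1, n2) so that a mismatch is easy to report.

    Hypothetical helper for illustration; it is not part of this repo.
    """
    n1, n2 = count_lines(txt1), count_lines(txt2)
    return n1 == n2, n1, n2
```

For example, `same_line_count("./sample_data/ISA_061.eng.txt", "./sample_data/ISA_061.heb.txt")` returns a tuple whose first element says whether the two files line up.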
Here's the syntax for the command:
python interlinearify.py \
--audio1 "./sample_data/ISA_061.eng.mp3" \
--txt1 "./sample_data/ISA_061.eng.txt" \
--audio2 "./sample_data/ISA_061.heb.mp3" \
--txt2 "./sample_data/ISA_061.heb.txt" \
--language1 "eng" \
--language2 "heb" \
--output "ISA_061.eng-heb-interlinear.mp3"

Language options can be found in ./data/mms_languages.json
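If you have several chapters to process, you could wrap the command above in a small script. The sketch below is not part of the repo: `build_command` and `run_chapters` are hypothetical helpers, and they assume your files follow the same naming pattern as the sample data (`<chapter>.<language>.mp3` and `<chapter>.<language>.txt`). It only uses the flags shown in the command above.

```python
import subprocess
import sys


def build_command(chapter, lang1, lang2, data_dir="./sample_data"):
    """Build the interlinearify.py argument list for one chapter.

    Assumes files are named like the sample data, e.g. ISA_061.eng.mp3.
    """
    return [
        sys.executable, "interlinearify.py",
        "--audio1", f"{data_dir}/{chapter}.{lang1}.mp3",
        "--txt1", f"{data_dir}/{chapter}.{lang1}.txt",
        "--audio2", f"{data_dir}/{chapter}.{lang2}.mp3",
        "--txt2", f"{data_dir}/{chapter}.{lang2}.txt",
        "--language1", lang1,
        "--language2", lang2,
        "--output", f"{chapter}.{lang1}-{lang2}-interlinear.mp3",
    ]


def run_chapters(chapters, lang1, lang2, data_dir="./sample_data"):
    """Run interlinearify.py once per chapter, stopping at the first failure."""
    for chapter in chapters:
        subprocess.run(build_command(chapter, lang1, lang2, data_dir), check=True)
```

Calling `run_chapters(["ISA_061"], "eng", "heb")` from the repo root would run the same command as the example above, using whichever Python interpreter is running the script.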
This is just a fun side project. If you want to contribute, feel free!
Thanks to Trent Cowden for building out TimeStampAudio CLI, and thanks to Meta for releasing their MMS model.