I'm using Whisper to add English subtitles to some Japanese instructional videos. I've extracted the audio track with ffmpeg, and Whisper does a great job transcribing it. However, when I add the '--task translate' option, the output is still Japanese rather than English, as if the option were being ignored. I've tried various command line options, most recently:
whisper --verbose True --fp16 False -f srt --task translate --language ja japan-video-sample.mp3
The verbose feature also doesn't seem to give me any insight into the problem. What am I doing wrong?
4 Answers
I believe I found a fix for this. Switching from the 'turbo' model to the 'large' model helped with translating Spanish, and I think it should work for Japanese as well. When I used this command:
whisper --verbose True --model large --language es --task translate .\spanish-sample.mp3
It produced English text successfully, although the verbose option still doesn't seem to do much.
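If you'd rather script this than retype the command, here is a minimal Python sketch that builds the same kind of CLI call with the model set explicitly. As far as I know, the Whisper README notes that the 'turbo' model returns the original language even with '--task translate' and recommends 'medium' or 'large' for translation, which matches what I saw. The helper names build_whisper_command and run_translation are my own, not part of Whisper, and the file name is just the one from the question.

```python
import subprocess


def build_whisper_command(audio_path, language, model="large", output_format="srt"):
    """Return the argument list for a Whisper CLI run that translates to English.

    The model is passed explicitly so the run does not fall back to a default
    model that may only transcribe.
    """
    return [
        "whisper",
        audio_path,
        "--model", model,
        "--language", language,
        "--task", "translate",
        "--output_format", output_format,
    ]


def run_translation(audio_path, language, model="large"):
    """Run the Whisper CLI (must be installed and on PATH); raise if it fails."""
    subprocess.run(build_whisper_command(audio_path, language, model), check=True)


# Example (not executed here):
# run_translation("japan-video-sample.mp3", "ja")
```

The large model is slow without a GPU, so 'medium' may be worth trying first by passing model="medium".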
Whisper can translate audio into English, but the 'turbo' model doesn't seem to be set up for it, which is probably why you're just getting the transcription in Japanese if that's the model being used. As an alternative, consider using a GPT-based model like 'gpt-4o-mini' after obtaining the text from Whisper: pass the transcribed text to it and ask for an English translation.
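If you go the route of translating the transcript with a separate model, the one thing to be careful about is keeping the subtitle timestamps intact. Below is a rough Python sketch that walks an SRT file cue by cue and only hands the text lines to a translator. The translate argument is a hypothetical callable you would supply yourself (for example, a wrapper around whatever translation service you use); nothing here calls a real API.

```python
import re


def translate_srt(srt_text, translate):
    """Translate only the text of each SRT cue, keeping index and timing lines.

    `translate` is a caller-supplied function (hypothetical here) that takes a
    string in the source language and returns the English text.
    """
    # Cues in an SRT file are separated by blank lines.
    blocks = re.split(r"\n\s*\n", srt_text.strip())
    translated_blocks = []
    for block in blocks:
        lines = block.splitlines()
        if len(lines) < 3:
            # Not a normal cue (index, timing, text); pass it through unchanged.
            translated_blocks.append(block)
            continue
        index_line, timing_line = lines[0], lines[1]
        text = "\n".join(lines[2:])
        translated_blocks.append("\n".join([index_line, timing_line, translate(text)]))
    return "\n\n".join(translated_blocks) + "\n"


# Example with a stand-in translator that just upper-cases the text:
# print(translate_srt(open("japan-video-sample.srt", encoding="utf-8").read(), str.upper))
```

Sending one cue at a time loses context between sentences, so in practice you may get better results by batching several cues per request and splitting the reply back up.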
For a quick solution, you could just copy the transcript you get and paste it into something like Gemini 2.5 Pro. That tool handles translations well and can also work with subtitles.
I had a similar issue with a clean Spanish audio sample, and even with a capable machine, Whisper still wouldn't translate for me. I used the command:
whisper --verbose True --language es --task translate .\spanish-sample.mp3
But it just gave me the transcription without the translation. You're definitely not alone with this problem.