Hi everyone! I'm working on a project that converts audio to subtitles in over 70 languages. To produce high-quality subtitles, I use machine learning models to analyze the grammar of the transcribed text, so my software can decide intelligently where to break subtitle lines. I'm using Python with Stanza, an NLP library that needs a GPU for good performance.
I'm facing some challenges because I have a limited budget and user traffic may vary unpredictably. I was thinking of using a pay-per-use, scale-to-zero GPU service, and after testing, I confirmed that the cold start of such a service is acceptable for my project.
However, Stanza requires downloading a separate, large model for each language, which complicates things. To keep cold starts short, I considered setting up 70 different containerized services, one per language, each with its model already included.
I'm confident I can implement this with a parameterized Dockerfile and CI/CD for deployments, but hosting and operating that many containers seems daunting. I'm not a DevOps expert, and I could really use some advice or feedback on how to move forward!
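For readers picturing the per-language build, here is a minimal sketch of how the build commands might be generated. It assumes a single Dockerfile that takes the language as a build argument; the `STANZA_LANG` argument name, the language list, and the registry prefix are hypothetical placeholders, not part of the original setup.

```python
import shlex

# Hypothetical values for illustration only.
LANGUAGES = ["en", "fr", "de"]  # would grow to the full list of 70+ codes
IMAGE_PREFIX = "registry.example.com/subtitles"  # placeholder registry/repo


def build_command(lang: str, prefix: str = IMAGE_PREFIX) -> list[str]:
    """Return the `docker build` argv for one language-specific image."""
    return [
        "docker", "build",
        # Assumes the Dockerfile declares `ARG STANZA_LANG` and uses it
        # to fetch that language's model at build time.
        "--build-arg", f"STANZA_LANG={lang}",
        "-t", f"{prefix}-{lang}:latest",
        ".",
    ]


if __name__ == "__main__":
    # Print one build command per language; a CI job could run these instead.
    for lang in LANGUAGES:
        print(shlex.join(build_command(lang)))
```

Generating the commands from one list keeps the 70 images identical apart from the model they contain, which is most of what makes this approach manageable in CI.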
2 Answers
Just to add to that, the models required for deep linguistic analysis are specific to each language, which is why handling them separately can seem necessary. Stanza's documentation covers how models are downloaded and loaded, and it may give you better ideas for managing them efficiently.
You might want to reconsider having a separate container for each language. Some models can process multiple languages without being reconfigured for each one, which could save you a lot on hosting costs. If you do end up with that many containers, a managed Kubernetes service might be a good fit, but keep in mind that those can be pricey too.
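A related way to avoid 70 containers, which is not the same as using one multilingual model, is to run a single service that loads per-language Stanza pipelines on demand and keeps only a few in memory. Here is a minimal sketch, assuming the models were already downloaded into a `/models` directory when the image was built; the `PipelineCache` class, the path, and the `max_loaded` limit are all hypothetical.

```python
from collections import OrderedDict
from typing import Any, Callable


def load_stanza_pipeline(lang: str) -> Any:
    """Load a Stanza pipeline from models already on disk (no download)."""
    import stanza  # imported lazily so the cache itself has no hard dependency

    # Assumption: models were fetched into /models at image build time.
    return stanza.Pipeline(lang, dir="/models", download_method=None)


class PipelineCache:
    """Keep at most `max_loaded` pipelines in memory, evicting the least recently used."""

    def __init__(
        self,
        loader: Callable[[str], Any] = load_stanza_pipeline,
        max_loaded: int = 4,
    ) -> None:
        self._loader = loader
        self._max_loaded = max_loaded
        self._pipelines: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, lang: str) -> Any:
        """Return the pipeline for `lang`, loading it on first use."""
        if lang in self._pipelines:
            self._pipelines.move_to_end(lang)  # mark as most recently used
            return self._pipelines[lang]
        pipeline = self._loader(lang)
        self._pipelines[lang] = pipeline
        if len(self._pipelines) > self._max_loaded:
            self._pipelines.popitem(last=False)  # drop the least recently used
        return pipeline

    def loaded_languages(self) -> list[str]:
        """Languages currently in memory, least recently used first."""
        return list(self._pipelines)
```

With this, a request handler would call something like `cache.get("fr")(text)`. The trade-off is a much larger image and a load delay the first time each language is requested, so whether it beats one container per language depends on how the traffic is spread across languages.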