I'm looking to add a 'Listen to the Article' button on our website and need advice on how to host the audio files for this feature. We have over 60,000 articles from the last 25 years, and if we digitize from 1958, that number goes up to over 100,000. We also add around 40-50 new articles every week. While not every article will have an audio file, managing the hosting costs is a concern. I've considered using S3, but that could get pricey. Should I host the files locally and monitor bandwidth usage instead? Are there any audio-specific hosting providers that you would recommend, similar to podcast hosting? I'd appreciate any guidance you can share!
4 Answers
Have you thought about an on-demand, client-triggered approach to generating the audio? That way, a file is only created when someone clicks the button, which saves storage and avoids generating audio up front for every article. You could call an API for your preferred audio generation model.
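A rough sketch of that generate-on-click idea, with a cache so each article is only synthesized once. Everything named here is an assumption for illustration: `get_or_create_audio`, the cache directory layout, and especially `synthesize`, which stands in for whatever text-to-speech API you end up using.

```python
import hashlib
from pathlib import Path
from typing import Callable


def get_or_create_audio(
    article_id: str,
    text: str,
    cache_dir: Path,
    synthesize: Callable[[str], bytes],
) -> Path:
    """Return the cached audio file for an article, generating it on first request.

    `synthesize` is a hypothetical callable wrapping your TTS provider; it takes
    the article text and returns encoded audio bytes.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Key the file on a hash of the text so an edited article gets fresh audio.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    audio_path = cache_dir / f"{article_id}-{digest}.mp3"

    if not audio_path.exists():
        # First click for this version of the article: generate and store it.
        audio_path.write_bytes(synthesize(text))

    return audio_path
```

With this shape you only pay for generation and storage on articles people actually listen to, which matters when most of a 100,000-article archive may never get a click.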
How will you generate the audio files? Are you thinking of using an AI solution for that?
You might want to check various storage providers for affordable options with decent speeds. S3 has different tiers, so you might find a balance between cost and reliability. Personally, I’d lean towards self-hosting, but securing offsite redundancy would be key given the volume of data you have. A good home network should manage it fine at first.
You could mount a separate storage volume on a server just for the audio files. That might help with management!
Consider Backblaze B2; it’s pretty cost-effective at $6 per TB of storage per month. Plus, if you route traffic through Cloudflare, you can avoid egress fees, since both companies are part of the Bandwidth Alliance.
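For a sense of scale, here is a back-of-the-envelope storage estimate at that $6 per TB rate. The 5 MB average file size is purely an assumption (it depends on article length and bitrate), and the function name is just for illustration, so plug in your own numbers.

```python
def monthly_storage_cost(num_files: int, avg_file_mb: float, price_per_tb_usd: float) -> float:
    """Estimate monthly storage cost in USD, using decimal units (1 TB = 1,000,000 MB)."""
    total_tb = num_files * avg_file_mb / 1_000_000
    return total_tb * price_per_tb_usd


ARTICLES = 100_000      # upper bound from the question (full archive back to 1958)
AVG_FILE_MB = 5.0       # assumption: average size of one article's audio file
PRICE_PER_TB_USD = 6.0  # the B2 storage price quoted above

cost = monthly_storage_cost(ARTICLES, AVG_FILE_MB, PRICE_PER_TB_USD)
print(f"{ARTICLES * AVG_FILE_MB / 1000:.0f} GB stored, about ${cost:.2f}/month")
```

This only covers storage. Bandwidth is a separate line item, which is where the Cloudflare routing comes in.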

Yeah, probably AI, unless there's a unique aspect to the article.