I'm developing an app that translates text between languages using OpenAI's ChatGPT model, but it responds quite slowly. My process splits the input text into sentences with spaCy, a Python library for natural language processing, then translates and aligns each sentence one by one with ChatGPT before sending the results to the frontend for display. This takes a lot of time. I'm considering lazy loading, but I'm worried it won't be fast enough. I'm also thinking about sending requests in batches of five and possibly using WebSockets for real-time communication between my frontend and backend, so that translations reach the page five sentences at a time. Since this is my first website project, I'm unsure which approach is best.
5 Answers
Are you making a separate API call for each sentence? That could really slow things down. Look into whether OpenAI allows batch processing; that could save you a lot of time. Typically, in web development, reducing the number of API calls or making parallel requests can improve performance. You've got the right idea with progressive loading; it helps scale better. WebSockets are a solid choice, but you can also opt to send multiple requests from the frontend and load responses as they come without using WebSockets.
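To make the parallel-requests idea concrete, here is a minimal sketch using asyncio. It assumes you have some async function that translates one sentence; `fake_translate` below is a hypothetical stand-in for the real OpenAI call, and `translate_concurrently` and `max_concurrency` are names made up for illustration. It caps the number of requests in flight at five and yields each result as soon as it finishes, along with the sentence index so the frontend can put it in the right place.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List, Tuple


async def translate_concurrently(
    sentences: List[str],
    translate: Callable[[str], Awaitable[str]],
    max_concurrency: int = 5,
) -> AsyncIterator[Tuple[int, str]]:
    """Yield (sentence_index, translation) pairs in completion order."""
    # The semaphore keeps at most `max_concurrency` requests in flight.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(index: int, sentence: str) -> Tuple[int, str]:
        async with semaphore:
            return index, await translate(sentence)

    tasks = [asyncio.create_task(worker(i, s)) for i, s in enumerate(sentences)]
    # as_completed hands back results as they finish, not in input order.
    for finished in asyncio.as_completed(tasks):
        yield await finished


async def fake_translate(sentence: str) -> str:
    """Hypothetical placeholder: replace with the real API call."""
    await asyncio.sleep(0.01)  # simulates network latency
    return sentence.upper()


async def collect(sentences: List[str]) -> List[Tuple[int, str]]:
    """Gather every (index, translation) pair into a list."""
    results = []
    async for pair in translate_concurrently(sentences, fake_translate):
        results.append(pair)
    return results


if __name__ == "__main__":
    for index, text in asyncio.run(collect(["one", "two", "three"])):
        print(index, text)
```

In a real backend you would send each pair to the client as it arrives (over a WebSocket or a streamed HTTP response) instead of collecting them into a list.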
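That’s not really what lazy loading is about. Lazy loading means deferring the loading of a resource until it is actually needed; what you're describing is closer to progressive loading, where results show up as they become available. You might want to rethink that part.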
Why not translate the whole text at once instead of going sentence by sentence? It might be faster overall.
If I send the entire text, it could take 1 or 2 minutes just to translate and another minute for alignment, which feels like a long wait.
Just out of curiosity, what does 'align' mean in this context?
It seems tricky, but your idea about batch processing combined with WebSockets sounds like a good plan, especially to enhance user experience while they wait for results.
I guess it all depends on how long it takes to translate and align one sentence. I'm considering either putting 5 sentences in each request or sending 5 requests at a time.
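For the option of putting five sentences into one request, a rough sketch of the grouping step might look like this. It assumes the sentences have already been split (for example, the text of each span from spaCy's `doc.sents`). The `batched` and `build_prompt` helpers and the numbered prompt format are hypothetical, not anything OpenAI prescribes.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(sentences: Iterable[str], size: int = 5) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` sentences."""
    iterator = iter(sentences)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def build_prompt(batch: List[str], target_lang: str) -> str:
    """Hypothetical prompt format: one numbered sentence per line."""
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(batch, start=1))
    return (
        f"Translate each numbered sentence into {target_lang}, "
        f"keeping the numbering:\n{numbered}"
    )


if __name__ == "__main__":
    sentences = [f"Sentence {n}." for n in range(1, 13)]
    for batch in batched(sentences):
        print(build_prompt(batch, "German"))
        print("---")
```

Numbering the sentences is one way to map the reply back onto the original sentences. It is worth timing one 5-sentence request against 5 single-sentence requests sent in parallel to see which is faster for your texts.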