I've been working with OpenSearch Serverless and I'm running into an issue with the bulk API. I have a file containing around 3000 documents, but only about 700 of them are syncing successfully before I hit a timeout. Does anyone know if there's a limit on how many documents can be processed at once, or if there are better practices for handling bulk inserts?
2 Answers
Honestly, I've found OpenSearch Serverless to be quite underwhelming. It can get pretty expensive, and it doesn't scale as well as you might hope. In a previous role, we tried to use it for logging and ended up switching back to the non-serverless version due to performance issues. Just something to keep in mind if you're exploring options!
Yes, there are limits. Whether you're running OpenSearch Serverless or a provisioned cluster, firing too many documents at once without some kind of ingestion queue can get requests throttled or partially rejected. The tricky part is that the bulk API can return HTTP 200 even when individual documents fail: failures are only reported per item in the response body (the `errors` flag and the `items` array), so they're easy to miss if you don't inspect each item. Consider building your own ingestion queue, sending smaller batches, and retrying the documents that come back failed; it can save you a lot of headaches!
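To sketch the queue-and-retry idea: a minimal Python helper that splits the document list into batches and re-sends whatever the bulk call reports as failed. `send_bulk` here is a hypothetical stand-in for your real bulk request (e.g. a call through `opensearch-py`) that returns the list of documents that failed in that batch; the batch size and retry counts are illustrative defaults, not documented limits.

```python
import time

def chunked(items, size):
    """Yield successive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def bulk_with_retry(send_bulk, docs, chunk_size=500, max_retries=3, backoff=1.0):
    """Index `docs` in batches, retrying failed items with exponential backoff.

    `send_bulk(batch)` is a stand-in for your real bulk request; it must
    return the subset of `batch` that failed (empty list on full success).
    Returns the documents that still failed after all retries.
    """
    failed = []
    for chunk in chunked(docs, chunk_size):
        pending = chunk
        for attempt in range(max_retries + 1):
            pending = send_bulk(pending)
            if not pending:
                break
            time.sleep(backoff * (2 ** attempt))  # back off before re-sending
        failed.extend(pending)  # items that never succeeded
    return failed
```

Anything in the returned `failed` list can then be logged or pushed back onto a durable queue instead of being silently dropped.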
I completely agree with you! A decent database should handle data inserts more reliably. It's frustrating when these issues pop up.
Is it common for there to be silent failures like that? I recently tried indexing documents and only a portion went through, but there were no errors logged.
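Yes — by default a bulk request returns HTTP 200 even when some items fail, so nothing shows up in your logs unless you check the response's `errors` flag and scan the `items` array yourself. A minimal sketch (the sample `response` dict below is illustrative, not captured from a real cluster):

```python
# A bulk response reports failures per item, not via the HTTP status.
# Trimmed example of a response where one of two documents failed:
response = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {"index": {"_id": "2", "status": 429,
                   "error": {"type": "circuit_breaking_exception"}}},
    ],
}

def failed_items(response):
    """Return the per-item entries whose HTTP status indicates failure."""
    if not response.get("errors"):
        return []
    return [
        item[action]
        for item in response["items"]
        for action in item            # "index", "create", "update", ...
        if item[action].get("status", 0) >= 300
    ]
```

If you never look past the top-level status code, a partial failure like the one above is completely silent.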

Thanks for the heads-up! I’m trying to get familiar with it for an upcoming interview, but it sounds like the non-serverless option might be more practical.