I'm using yauzl, a Node.js library for reading zip files, and my process crashes whenever it encounters a malformed zip file. When my pod processes a bad file, it goes through a cycle: the pod crashes, Kubernetes restarts it, and it picks up the same bad file again, which crashes it again. The pod ends up in CrashLoopBackOff. Because the problematic file sits in a queue or in persistent storage, it keeps causing crashes until someone removes it manually. Has anyone implemented crash isolation for file-parsing workloads, specifically to survive faulty inputs?
1 Answer
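You need to handle the yauzl errors directly in your application code. yauzl reports a malformed archive either through the `err` argument of the `yauzl.open` callback or as an "error" event on the zipfile object. In Node.js, an "error" event with no listener is thrown as an uncaught exception, and that is what takes the whole process (and the pod) down. Check `err` in every callback and attach an "error" listener to the zipfile and to each read stream, so a bad file becomes a failed job instead of a crash.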
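Here is a minimal sketch of what that could look like. `listEntriesSafely` is a hypothetical helper name, not part of yauzl. It assumes the opener you pass in has the same callback signature as `yauzl.open` (taken as a parameter here so the wrapper can be tried without a real archive), and it only lists entry names rather than extracting anything.

```javascript
// Sketch: turn every failure into an error passed to `done`, so one bad
// archive cannot crash the process. `listEntriesSafely` is a hypothetical
// helper; `openZip` is assumed to behave like yauzl.open.
function listEntriesSafely(openZip, filePath, done) {
  let finished = false;
  const finish = (err, names) => {
    if (finished) return; // report the outcome only once
    finished = true;
    done(err, names);
  };

  try {
    openZip(filePath, { lazyEntries: true }, (err, zipfile) => {
      if (err) return finish(err); // the archive could not be opened at all

      const names = [];
      // Without this listener, an "error" event would be thrown as an
      // uncaught exception and kill the process.
      zipfile.on("error", (e) => finish(e));
      zipfile.on("entry", (entry) => {
        names.push(entry.fileName);
        zipfile.readEntry(); // ask for the next entry
      });
      zipfile.on("end", () => finish(null, names));
      zipfile.readEntry(); // ask for the first entry
    });
  } catch (e) {
    finish(e); // the opener threw synchronously
  }
}
```

With the real library you would call it as `listEntriesSafely(require("yauzl").open, "/path/to/upload.zip", (err, names) => { ... })` and mark the job as failed when `err` is set, instead of letting the exception escape.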

Exactly. On top of that, you could move the message from the main queue to a dead letter queue (DLQ) after a few failed attempts, so the bad file stops blocking the worker and can be inspected later.
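A rough sketch of the retry-then-dead-letter logic, in case it helps. Everything here is an assumption for illustration: `handleDelivery` is a hypothetical function, the `Map` stands in for a persistent attempt counter, the array stands in for the DLQ, and messages are assumed to carry `id` and `body` fields. Some brokers can do this for you based on a delivery count, in which case you would configure it there instead.

```javascript
// Sketch: count deliveries per message and give up after maxAttempts.
// `attemptStore` (a Map) and `deadLetterQueue` (an array) are in-memory
// stand-ins for a persistent counter and a real dead letter queue.
function handleDelivery(message, processFile, attemptStore, deadLetterQueue, maxAttempts) {
  // Record the attempt BEFORE processing, so that a hard crash during
  // processing still counts against the message.
  const attempts = (attemptStore.get(message.id) || 0) + 1;
  attemptStore.set(message.id, attempts);

  if (attempts > maxAttempts) {
    // Too many failed deliveries: park the message for manual inspection.
    deadLetterQueue.push(message);
    attemptStore.delete(message.id);
    return "dead-lettered";
  }

  try {
    processFile(message.body);
    attemptStore.delete(message.id); // success: forget the counter
    return "done";
  } catch (err) {
    return "retry"; // leave the message on the main queue for redelivery
  }
}
```

The counter is written before the file is touched on purpose: in a CrashLoopBackOff the process dies mid-parse, so a counter that is only updated in a `catch` block would never advance.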