I've been frustrated with our current CI strategy, which revolves around a 'rerun until green' approach. It feels like gambling every time I push to main: the tests pass locally, but then they fail in the pipeline. I find myself hitting the rerun button repeatedly, and sometimes they pass after a few attempts, leading us to just ship the code without addressing the underlying issue. It's become so normal that no one even checks what went wrong anymore. I understand this is a crazy way to handle things, but fixing flaky tests takes time, and there always seems to be something more urgent that needs attention. I've tried extending wait times and running tests in Docker to mimic the CI environment, but those flaky tests still linger. One of my colleagues is pushing for a complete tool switch, considering options like Testim, Momentic, or even rewriting everything in Playwright. I'm at a loss and wondering—has anyone found real solutions for managing flaky tests, or is this just something we all have to deal with?
5 Answers
The first thing you need to do is tackle those flaky tests. It's essential to fix them instead of treating them as a minor inconvenience. Put a clear policy in place that treats flaky tests as high-priority fixes. Getting started costs nothing beyond some communication: get the whole team on the same page about the risks of relying on unreliable tests. Trust me, it will pay off in the long run!
At the end of the day, monitoring which tests are flaking the most is a good place to start. Log results over time so you can see trends, then replace or fix the worst culprits one at a time rather than trying to overhaul everything at once.
Right! A systematic approach will get you further than trying to resolve everything in one go. It’s about making your pipeline predictable again.
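To make the monitoring idea concrete, here's a minimal sketch in Python. It assumes you already record each CI run's per-test pass/fail outcomes somewhere (parsed from JUnit XML, your runner's JSON report, whatever you have); the `flake_rates` helper and the test names are made up for illustration.

```python
from collections import defaultdict

def flake_rates(runs):
    """Compute per-test flake rate across many CI runs.

    `runs` is a list of dicts mapping test name -> True (passed) / False (failed).
    Tests that both pass and fail across runs are the flaky candidates.
    """
    passes = defaultdict(int)
    fails = defaultdict(int)
    for run in runs:
        for name, passed in run.items():
            if passed:
                passes[name] += 1
            else:
                fails[name] += 1
    rates = {}
    for name in set(passes) | set(fails):
        # Only tests that sometimes pass AND sometimes fail are flaky;
        # a test that always fails is just broken and needs a normal bug fix.
        if passes[name] and fails[name]:
            rates[name] = fails[name] / (passes[name] + fails[name])
    return dict(sorted(rates.items(), key=lambda kv: -kv[1]))

# Example: three recorded runs (hypothetical test names)
runs = [
    {"test_login": True,  "test_checkout": True,  "test_search": False},
    {"test_login": True,  "test_checkout": False, "test_search": False},
    {"test_login": True,  "test_checkout": True,  "test_search": False},
]
print(flake_rates(runs))  # test_checkout flakes; test_search always fails
```

Sorting by flake rate gives you a ranked worklist, which makes the "fix the worst one first" conversation with the team much easier.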
It’s important to differentiate between flaky CI and flaky tests. Typically, flaky tests arise from issues like improperly managed state, misconfigured harnesses, or race conditions. Identifying the common causes is crucial for fixing them.
Totally! Once you pinpoint what's causing the flakiness, you can address those specific issues instead of just hoping they'll pass on a rerun.
You really have to invest time in fixing your flaky tests. If they're inconsistent, they should be seen as failing tests. Besides that, some teams might choose to remove them if they're not providing any real value. It's about prioritizing and getting that message across to your management.
Exactly! If you're constantly rerunning flaky tests, they're just wasting your team's time—and time is money. Engaging your directors about this issue could spark the change needed.
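A middle ground between rerunning and deleting outright is quarantining: flaky tests still run and report, but their failures stop gating merges until someone fixes or removes them. A minimal sketch of the gating logic (the quarantine list and test names are assumptions, not anyone's real API):

```python
# Quarantined tests still run and report, but their failures don't block merges.
QUARANTINE = {"test_checkout_retry", "test_flaky_upload"}  # hypothetical names

def gate_result(results):
    """Return True if the pipeline should go green.

    `results` maps test name -> passed?  Failures of quarantined tests are
    visible in reports but do not block; everything else must pass.
    """
    blocking = {name: ok for name, ok in results.items() if name not in QUARANTINE}
    return all(blocking.values())

print(gate_result({"test_login": True, "test_flaky_upload": False}))  # True
print(gate_result({"test_login": False, "test_flaky_upload": True}))  # False
```

The crucial part is that quarantine is temporary and visible: pair it with the flake-rate tracking so the list shrinks instead of becoming a dumping ground.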
When dealing with flaky tests, replacing fixed sleeps with explicit waits plus timeouts, and tightening how tests manage shared state, can really help stabilize outcomes. If it's an option, switching to a tool like Playwright might also relieve some of the race-condition headaches.
Definitely! Each testing framework has its quirks, and some are better suited for certain scenarios. Playwright tends to be more forgiving with async situations, which might be beneficial in your case.
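The "wait for the condition, not for a fixed time" idea behind Playwright's auto-waiting is a few lines in any language. A sketch with a hypothetical `wait_until` helper:

```python
import threading
import time

def wait_until(predicate, timeout=2.0, interval=0.02):
    """Poll until `predicate()` is truthy or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# A background job that finishes "eventually" - the setup where a fixed
# time.sleep(...) before the assert is a flake waiting to happen.
inbox = []
threading.Thread(target=lambda: (time.sleep(0.1), inbox.append("msg"))).start()

# Stable version: wait on the observable condition, then assert.
assert wait_until(lambda: len(inbox) == 1)
```

The timeout only bites in the genuine-failure case, so the test is both faster on the happy path and deterministic about what "failed" means.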

Absolutely! You can emphasize that a feature-rich product isn't worth much if it's half-broken. Making your product managers aware of the impact on customer satisfaction and reliability is key to getting their support.