How can I more effectively re-run terminated AWS Spot Instance jobs in CI?

Asked By CreativeCoder42 On

Hey everyone, I'm currently using a script that triggers every 15 minutes to re-run jobs that have been terminated, but I'm finding it doesn't catch all the terminated workflows. I came across an old post discussing AWS spot instances in CI jobs and I'm curious if there are any newer, better solutions available. I'd appreciate any insights or advice! Thanks!

4 Answers

Answered By DevOpsWizard On

As a general rule, any workload on spot instances should ideally be designed to be restartable. This can save you a lot of hassle in the long run.

Answered By TechGuru99 On

Before diving into solutions: why are you re-running those terminated workflows automatically? Is it a necessity because of the large number of tests you're running?

CreativeCoder42 -

That's exactly why I'm asking for advice! I have hundreds of tests running daily and can't afford to manually re-run each one.

Answered By CloudNinja On

If you're using Buildkite, it has built-in automatic retries for steps, which can re-run a step after a spot instance termination without a separate polling script.
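For reference, a minimal sketch of what that could look like in a Buildkite pipeline file. The step label and command are hypothetical placeholders, and it assumes a spot termination shows up as a lost agent, which Buildkite reports as exit status `-1`:

```yaml
steps:
  - label: "tests"            # hypothetical step name
    command: "make test"      # hypothetical command, replace with your own
    retry:
      automatic:
        # Assumption: the spot instance being reclaimed looks like a lost agent.
        - exit_status: -1
          limit: 2            # retry at most twice before the step is marked failed
```

Keeping the rule scoped to `-1` means ordinary test failures are not retried, only steps whose agent disappeared.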

Answered By CodeWhiz123 On

Have you considered setting up your workflow like this?

```yaml
on:
  workflow_run:
    workflows: ["Main Workflow"]
    types:
      - completed
```
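This way, a follow-up workflow is triggered as soon as the main one completes, so you can check whether it finished successfully or was terminated and rerun it if needed. That should be more efficient than polling every 15 minutes, and it won't miss runs that finish between polls. You could also explore whether `workflows: [all]` works instead of listing them individually, which would save you some time. Let me know if you give this a shot!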
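Here is a rough sketch of what the full follow-up workflow might look like, not a tested setup. The file name and job name are made up, and it assumes a spot termination ends the run with a `failure` conclusion (check what your runs actually report; it may be `cancelled` in some setups). It uses the `gh` CLI that ships on GitHub-hosted runners:

```yaml
# Hypothetical file: .github/workflows/rerun-terminated.yml
name: Rerun terminated jobs

on:
  workflow_run:
    workflows: ["Main Workflow"]
    types:
      - completed

permissions:
  actions: write   # required to request a rerun through the API

jobs:
  rerun:
    # Assumption: a spot termination surfaces as a 'failure' conclusion.
    # The run_attempt cap stops a genuinely broken build from looping forever.
    if: >-
      github.event.workflow_run.conclusion == 'failure' &&
      github.event.workflow_run.run_attempt < 3
    runs-on: ubuntu-latest   # GitHub-hosted, so this job is not on a spot instance itself
    steps:
      - name: Rerun only the failed jobs
        env:
          GH_TOKEN: ${{ github.token }}
          GH_REPO: ${{ github.repository }}
        run: gh run rerun ${{ github.event.workflow_run.id }} --failed
```

Note that this condition cannot tell a spot termination from a real test failure, so failing tests would also be retried up to the attempt cap. If that matters, you would need an extra check (for example, inspecting the failed job's logs or annotations) before rerunning.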
