I'm looking for a reliable way to save and manage key-value outputs from one pipeline run to the next. To be clear, I want to persist data across runs, not just pass values between jobs in the same pipeline. Currently, I store the data in S3 as JSON or YAML and read it back in future runs, but that feels too manual for something that seems like a common workflow need. I'd love to hear about more effective or maintainable solutions that you've used in real-world scenarios. Any best practices or pitfalls to watch out for? For context, I'm running a list of client names through a stepwise migration process: new clients are flagged and added to the list, and clients that finish the migration are removed. If a step fails for a client, that client stays on the list until the step succeeds. The migration steps are all idempotent, so retrying is safe. Thanks!
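To make the workflow above concrete, here is a minimal Python sketch of the state handling. It is an illustration under assumptions, not the actual setup: the state lives in a local JSON file standing in for the S3 object, and the names `load_pending`, `run_migration`, and the `migrate` callback are all hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


def load_pending(path: Path) -> Dict[str, str]:
    """Return the client -> status map left by the previous run ({} on first run)."""
    if path.exists():
        return json.loads(path.read_text())
    return {}


def run_migration(
    path: Path,
    new_clients: List[str],
    migrate: Callable[[str], None],
) -> Dict[str, str]:
    """Flag new clients, try every pending client, keep only the failures."""
    pending = load_pending(path)

    # Newly seen clients are flagged; clients already pending keep their status.
    for name in new_clients:
        pending.setdefault(name, "flagged")

    # Steps are assumed idempotent, so retrying a previously failed client is safe.
    for name in list(pending):
        try:
            migrate(name)
        except Exception as exc:
            pending[name] = f"failed: {exc}"
        else:
            del pending[name]

    # Write to a temp file and rename so a crash never leaves half-written state.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(pending, indent=2, sort_keys=True))
    tmp.replace(path)
    return pending
```

Whatever is left in the file after a run is exactly the set of clients the next run has to retry.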
5 Answers
Have you considered using a job matrix to manage this? Instead of the usual matrix dimensions, you could use client IDs or names as the matrix values, so each client gets its own job. Not sure if it's a perfect fit, but worth exploring!
You might also like the idea of using a lightweight key-value store with a RESTful interface, like Kinto. It lets you read and write individual records instead of overwriting an entire blob. This could work well as a sidecar solution!
For a similar use case, we decided to just go with MySQL. It has proved super handy, especially when we need to modify or add business logic. Sure, a static JSON file could work, but SQL gives us timestamps and auto-incrementing IDs, plus it’s great for building a status dashboard!
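As a rough sketch of what such a table could look like, here is a small Python example. It uses the standard-library `sqlite3` module as a stand-in so it runs anywhere; this is not the answerer's actual schema, and the table, column, and function names are made up for illustration. On MySQL the equivalents would be `AUTO_INCREMENT`, a `TIMESTAMP` column with `DEFAULT CURRENT_TIMESTAMP`, and `INSERT ... ON DUPLICATE KEY UPDATE`.

```python
import sqlite3
from typing import List

# Hypothetical schema: one row per client, with an auto-incrementing id
# and a timestamp that is refreshed on every status change.
SCHEMA = """
CREATE TABLE IF NOT EXISTS client_migration (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    client     TEXT NOT NULL UNIQUE,
    status     TEXT NOT NULL,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def open_state(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the state database and create the table if it is missing."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def set_status(conn: sqlite3.Connection, client: str, status: str) -> None:
    """Insert the client or update its status (upsert), inside a transaction."""
    with conn:
        conn.execute(
            """
            INSERT INTO client_migration (client, status) VALUES (?, ?)
            ON CONFLICT(client) DO UPDATE
            SET status = excluded.status, updated_at = CURRENT_TIMESTAMP
            """,
            (client, status),
        )


def pending_clients(conn: sqlite3.Connection) -> List[str]:
    """Return clients that have not reached the 'done' status, oldest first."""
    rows = conn.execute(
        "SELECT client FROM client_migration WHERE status != 'done' ORDER BY id"
    )
    return [row[0] for row in rows]
```

A job would call `set_status(conn, name, "done")` after a successful step, and a dashboard could query the same table directly.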
How are you interacting with the database in your jobs? Just standard SQL queries?
It’s tough to give specific suggestions without knowing your tools. If you're using something like Jenkins, you can archive artifacts to pull from in future runs. Alternatively, consider pushing your data to git with a meaningful commit message for traceability.
I'm using GitLab CI with a custom alpine image, so I can add any tools I need.
S3 is solid, but remember to keep your object URIs unique so that one run doesn't overwrite another run's data. It's definitely doable, though I agree it feels like there should be a more streamlined method out there. Let me know if you hit any snags with this approach!
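One way to keep keys unique is to build them from the client name plus the pipeline ID. The helper below is a hypothetical sketch: the function name, the `migration-state` prefix, and the key layout are assumptions. It reads GitLab's predefined `CI_PIPELINE_ID` variable when no ID is passed and falls back to `local` outside CI.

```python
import os
import re
from typing import Optional


def state_key(
    client: str,
    pipeline_id: Optional[str] = None,
    prefix: str = "migration-state",
) -> str:
    """Build an object key that is unique per client and per pipeline run."""
    # Fall back to the CI-provided pipeline ID, then to a fixed local marker.
    pipeline_id = pipeline_id or os.environ.get("CI_PIPELINE_ID", "local")

    # Replace anything outside a conservative character set with a dash.
    safe = re.sub(r"[^A-Za-z0-9._-]+", "-", client.strip()).strip("-").lower()
    if not safe:
        raise ValueError(f"client name {client!r} has no usable characters")

    return f"{prefix}/{safe}/{pipeline_id}.json"


print(state_key("Acme Corp", "1234"))  # migration-state/acme-corp/1234.json
```

Note that the sanitizing step can map two different names to the same key (for example "Acme Corp" and "acme-corp"), so stable client IDs are safer than display names if you have them.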
Yeah, I'm sticking with S3 for now, though I'm itching for an easier solution.
Kinto sounds interesting! I couldn't find much info on it—do you have a link?