I've been using AWS for about five years now and feel comfortable with tools like ECS, Lambda, RDS, and CloudFormation. However, I notice a significant gap between just operating these services and actually designing a system from the ground up, complete with the right service choices, understanding failure points, and managing cost tradeoffs. This gap is becoming increasingly important for me as my team is in the process of redesigning several services. I want to take on more responsibility in architectural decisions rather than merely implementing what others have decided.
My daily tasks mostly revolve around operations work, such as reviewing security groups, managing IaC drift, and debugging issues that arise under load. The design work I get to do is often fragmented. To prepare for an internal architecture review, I've started taking small scenarios from our environment and documenting my decisions: selecting ECS over Lambda, identifying failure points, determining what requires Multi-AZ support, predicting cost spikes, and deciding what I would prioritize for monitoring. I've also been experimenting with coding assistants like Claude and Beyz to test small designs and practice articulating tradeoffs, mainly due to the lack of a regular design partner to collaborate with. For those who have transitioned from hands-on cloud work to architectural ownership, what strategies helped you develop your system design skills?
5 Answers
You seem to be on the right track with your decision documentation! A technique that worked for me was continually asking, "What would break first?" during operations. It forces you to trace each problem back to the design's purpose and its flaws. Taking real incidents or outages and redesigning the system around them can help solidify your understanding. It's not just theory; it's active learning, and it makes a big difference to how well those design instincts stick over the long term.
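If it helps to make the documentation habit concrete, here is a minimal sketch of one way to record a scenario, covering the items you listed (service choice, failure point, Multi-AZ, cost spikes, monitoring) plus the "what would break first" question. The class name, field names, and example values are all hypothetical, invented for illustration rather than taken from any standard template.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DesignDecision:
    """One small design scenario, written down so it can be reviewed later."""

    scenario: str             # what is being designed
    choice: str               # the service or pattern selected
    alternatives: list[str]   # options considered and rejected
    breaks_first: str         # answer to "what would break first?"
    needs_multi_az: bool      # whether the component must survive an AZ loss
    cost_spike_risk: str      # where a cost spike is expected to come from
    monitor_first: list[str] = field(default_factory=list)  # top monitoring priorities

    def summary(self) -> str:
        """Return a one-line summary suitable for a review agenda."""
        az = "Multi-AZ" if self.needs_multi_az else "single-AZ"
        rejected = ", ".join(self.alternatives)
        return (
            f"{self.scenario}: chose {self.choice} over {rejected} ({az}); "
            f"breaks first: {self.breaks_first}"
        )


# Hypothetical example values, not from a real environment.
record = DesignDecision(
    scenario="nightly report export",
    choice="ECS",
    alternatives=["Lambda"],
    breaks_first="task memory limit on large exports",
    needs_multi_az=False,
    cost_spike_risk="data transfer when exports are re-run",
    monitor_first=["task memory utilization", "export duration"],
)
print(record.summary())
```

Keeping the "breaks first" answer as a required field is the point of the design: you can't file the record without committing to a guess, and you can check that guess against the next real incident.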
It sounds like you’re already doing a lot of the right things, especially with your design testing. I’d also suggest looking into tools like Hindsight to retain context over multiple design discussions, which could really help you build a solid understanding of architecture principles as they relate to your environment.
I can relate; I transitioned from hands-on work to an architect role and found that time and experience made a huge difference. Using AI tools to help design better solutions is vital, but access to solid documentation and case studies is just as essential. Collaborating with more experienced colleagues can also open your eyes to new ideas and approaches.
Doing ops work really teaches you which system design patterns are effective and which ones just lead to problems. It's crucial to not just follow the standard procedures but to dive in deeper and learn from the issues you encounter. That’s how you gain valuable insights.
I think you're on the right path! Nowadays, a lot of people just describe their situation to AI and let it help them figure out the solutions. It's a great way to get started, especially with your five years of experience. Just make sure you take the time to really understand the solutions provided by discussing them further with the AI.
