I recently got hired at a major product company, and I've found the onboarding process quite challenging. The Confluence pages are outdated, the inventory is unclear, and nobody seems to know the exact number of clusters we have (except, perhaps, the CTO). The virtual machines are scattered across OCI, AWS, and Azure, and there are hundreds of build configurations in TeamCity for different purposes. As a new DevOps engineer, it's taking me months to familiarize myself with the infrastructure, and I'm still stumbling onto things I never knew existed. I'm curious whether an AI tool that can answer specific infrastructure questions, such as how many VMs are running Windows on ARM64 or which Kubernetes clusters are below version 1.30, would be useful for your team. Would it reduce operational overhead for you as it would for me?
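To make the two example questions concrete, here is a rough sketch of what answering them could look like once the inventory has been collected into flat records. Everything in it is an assumption for illustration: the `VM` and `Cluster` record shapes, the field names, and the sample data are made up, and the hard part (actually pulling this data from OCI, AWS, Azure, and the clusters) is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    # Hypothetical record shape; a real inventory would carry more fields.
    name: str
    cloud: str  # "oci", "aws", or "azure"
    os: str     # e.g. "windows", "linux"
    arch: str   # e.g. "arm64", "x86_64"


@dataclass(frozen=True)
class Cluster:
    # Hypothetical record shape for a Kubernetes cluster.
    name: str
    cloud: str
    k8s_version: str  # e.g. "1.29.4" or "v1.30.1"


def parse_version(version: str) -> tuple[int, int]:
    """Return (major, minor) from strings like '1.30' or 'v1.29.4'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def count_vms(vms: list[VM], os: str, arch: str) -> int:
    """Count VMs matching an OS and CPU architecture."""
    return sum(1 for vm in vms if vm.os == os and vm.arch == arch)


def clusters_below(clusters: list[Cluster], minimum: str) -> list[str]:
    """Names of clusters whose (major, minor) version is below `minimum`."""
    floor = parse_version(minimum)
    return [c.name for c in clusters if parse_version(c.k8s_version) < floor]


# Made-up sample data, standing in for whatever a collector would produce.
VMS = [
    VM("build-agent-01", "azure", "windows", "arm64"),
    VM("build-agent-02", "aws", "windows", "x86_64"),
    VM("api-node-01", "oci", "linux", "arm64"),
]
CLUSTERS = [
    Cluster("staging", "aws", "v1.29.4"),
    Cluster("prod-eu", "azure", "1.30.1"),
    Cluster("legacy", "oci", "1.27.9"),
]

if __name__ == "__main__":
    print(count_vms(VMS, os="windows", arch="arm64"))
    print(clusters_below(CLUSTERS, "1.30"))
```

Versions are compared as integer tuples rather than strings, since a plain string comparison would rank "1.9" above "1.30". An AI layer would sit on top of queries like these; it would not replace the need to collect the data in the first place.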
2 Answers
There are tools available for managing multi-cloud environments that you might find useful for this type of infrastructure juggling.
That's where a Configuration Management Database (CMDB) comes into play! It can help keep track of configurations and assets in your infrastructure.

CMDBs are helpful, but they can be static and quickly become outdated, requiring a lot of maintenance.