I'm running several Docker containers that rely on the Docker socket, like Portainer, Autoheal, and Watchtower. Recently, after an update to Docker CE, these containers seemed to lose their connection to the socket but didn't actually fail—they just sat idle. To tackle this, I've set up a container called docker-watchdog that runs a health check every minute and marks itself as unhealthy if a 'docker ps' command stalls. Now, I'm looking for a way to restart the other containers automatically if the docker-watchdog container goes unhealthy. I've noticed that using 'depends_on' only determines startup dependencies, but I want to mark other containers as unhealthy based on the status of docker-watchdog. Any suggestions?
2 Answers
It sounds like you might be taking the wrong approach here. Instead of monitoring with a watchdog, why not have your apps retry the connection to the Docker socket continuously? Plus, it's generally unsafe to expose the Docker socket directly to apps. A safer setup is to put a socket proxy in front of it, give the proxy a health check, and have the other containers depend on the proxy's health. That way, if the proxy goes unhealthy, the other containers will know to restart because you've set them to depend on the proxy's health.
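To make the retry idea concrete, here's a rough sketch of what "keep retrying the socket" could look like inside an app. It assumes the default socket path /var/run/docker.sock and uses the Engine API's /_ping endpoint. The function names, the attempt count, and the backoff values are made up for illustration, so adjust them to your setup.

```python
import socket
import time

SOCKET_PATH = "/var/run/docker.sock"  # assumed default path; adjust if yours differs


def ping_docker(path=SOCKET_PATH, timeout=5.0):
    """Return True if the daemon answers GET /_ping with HTTP 200 in time."""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            s.connect(path)
            s.sendall(b"GET /_ping HTTP/1.1\r\nHost: docker\r\nConnection: close\r\n\r\n")
            status_line = s.recv(1024).split(b"\r\n", 1)[0]
            return b" 200" in status_line
    except OSError:
        # Covers a missing socket, a refused connection, and timeouts.
        return False


def wait_for_docker(ping=ping_docker, attempts=5, base_delay=1.0,
                    max_delay=30.0, sleep=time.sleep):
    """Retry the ping with exponential backoff; True as soon as one succeeds."""
    delay = base_delay
    for attempt in range(attempts):
        if ping():
            return True
        if attempt < attempts - 1:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return False
```

An app would call wait_for_docker() whenever a socket operation fails and only give up (or exit, so its restart policy kicks in) when it returns False. For truly continuous retrying you could wrap the call in a loop instead of capping the attempts.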
Also, keep in mind that you shouldn't rely solely on 'docker ps' for health checks. It only tells you that the daemon is responding, which isn't a reliable indicator of your application's health.
Have you thought about implementing health checks on the individual containers themselves? If they expose ports or URLs, you could create a health check that attempts to access them. If the check keeps failing for a certain period, you could then trigger a restart for those containers. Kubernetes uses the same idea with its liveness probes, and it could work for you too.
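Here's a small sketch of the "fails for a certain period, then restart" logic, in case it helps. It assumes the container exposes an HTTP URL you can probe. The names http_ok and FailureTracker and the threshold of 3 are just illustrative choices, not anything Docker provides.

```python
import urllib.error
import urllib.request


def http_ok(url, timeout=3.0):
    """Return True if the URL answers with a 2xx or 3xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Connection errors, HTTP error statuses, timeouts, malformed URLs.
        return False


class FailureTracker:
    """Counts consecutive failed probes and signals when to restart."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    def record(self, healthy):
        """Record one probe result; return True when a restart is due."""
        if healthy:
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.threshold:
            self.failures = 0  # start counting again after the restart
            return True
        return False
```

You'd call tracker.record(http_ok(url)) once per interval and restart the container whenever it returns True. A single successful probe resets the counter, so one slow response doesn't cause a restart.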
That could work, but not all containers will support the same commands, so you might end up needing unique checks for each one!

That's true, but remember that 'depends_on' only helps with starting the containers, not restarting them if they go unhealthy—it's all about that health check!
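Since 'depends_on' won't do the restarting, something outside the affected containers has to act on the watchdog's health status. A minimal sketch, assuming it runs on the host (from cron or a systemd timer, say) where the Docker CLI still works: it reads the watchdog's status with 'docker inspect' and restarts a list of containers if the status is "unhealthy". The names in DEPENDENTS are guesses based on the question, so swap in your real container names.

```python
import subprocess

WATCHDOG = "docker-watchdog"
DEPENDENTS = ["portainer", "autoheal", "watchtower"]  # hypothetical container names


def health_status(name, run=subprocess.run, timeout=10):
    """Return the container's health status, or 'unknown' if it can't be read."""
    try:
        proc = run(
            ["docker", "inspect", "--format", "{{.State.Health.Status}}", name],
            capture_output=True, text=True, timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return "unknown"
    if proc.returncode != 0:
        return "unknown"
    return proc.stdout.strip()


def restart_if_unhealthy(watchdog=WATCHDOG, dependents=DEPENDENTS,
                         run=subprocess.run):
    """Restart dependents when the watchdog is unhealthy; return those restarted."""
    if health_status(watchdog, run=run) != "unhealthy":
        return []
    restarted = []
    for name in dependents:
        try:
            proc = run(["docker", "restart", name],
                       capture_output=True, text=True, timeout=60)
        except (subprocess.TimeoutExpired, OSError):
            continue  # skip this one and try the remaining containers
        if proc.returncode == 0:
            restarted.append(name)
    return restarted
```

Running restart_if_unhealthy() every minute would roughly match the watchdog's own check interval. Note that it treats "unknown" as "do nothing", so a missing watchdog container won't trigger restarts.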