Hey everyone! We're developing a software system for a large retailer with thousands of stores, each running its own server, which means we're dealing with around 10,000 distributed backend instances worldwide. We're facing a dilemma with two conflicting logging requirements.

On one hand, we need all logs centralized for monitoring, which is currently set up with Elastic, but we also need to keep costs manageable: at current volumes we're looking at potentially millions per year for logging alone. On the other hand, we regularly get complaints that our logs lack detail when bugs occur, and increasing the amount of logging would blow the budget.

One idea I had was a decentralized setup where each server stores its logs locally and only forwards the critical ones to Elastic for central monitoring. We'd need a way to query each store remotely without logging in to every server individually (they all run Windows). Does anyone know of a system that can manage decentralized logging while still allowing central oversight?
5 Answers
In a year, when whatever custom solution you try falls apart, you may realize it would have been smarter to invest in a managed service from the start. If you're dead set on Elastic, have you looked at its cross-cluster search feature? It lets one coordinating cluster run searches against indices that live on remote clusters, which sounds close to what you're describing.
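In case it helps, here's a rough sketch of what that looks like against the Elasticsearch REST API; the host names, credentials, and index pattern are placeholders, and you'd register one remote alias per store:

```python
import requests

CENTRAL = "https://central-elastic.example.internal:9200"  # hypothetical coordinating cluster
AUTH = ("elastic", "changeme")                             # placeholder credentials

# One-time setup: register a store's local cluster as a remote.
# The seed must point at the remote cluster's transport port.
requests.put(
    f"{CENTRAL}/_cluster/settings",
    json={
        "persistent": {
            "cluster": {
                "remote": {
                    "store_0001": {"seeds": ["store-0001.example.internal:9300"]}
                }
            }
        }
    },
    auth=AUTH,
)

# Query that store's local logs from the central cluster using the <alias>:<index> syntax.
resp = requests.post(
    f"{CENTRAL}/store_0001:app-logs-*/_search",
    json={"query": {"match": {"level": "ERROR"}}, "size": 50},
    auth=AUTH,
)
print(len(resp.json()["hits"]["hits"]))
```

Whether that scales comfortably to ~10,000 remotes is something I'd test before committing to it.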
If you don’t need to keep logs local, consider shipping them to cloud or blob storage. Keeping them distributed can be tricky unless there's someone on-site who needs access. Once they're in blob storage, you’ll have different options for querying based on the provider.
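If you go that route, shipping a compressed batch from each store is only a few lines; a sketch with boto3, where the bucket name and key layout are made up:

```python
import gzip
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")  # credentials come from the usual AWS config/environment

def ship_log_batch(store_id: str, lines: list[str]) -> None:
    """Compress a batch of log lines and upload them under a store/date prefix,
    so they can be queried later (e.g. with Athena) or re-ingested centrally."""
    now = datetime.now(timezone.utc)
    key = f"logs/store={store_id}/dt={now:%Y-%m-%d}/{now:%H%M%S}.log.gz"
    body = gzip.compress("\n".join(lines).encode("utf-8"))
    s3.put_object(Bucket="retail-logs-archive", Key=key, Body=body)  # bucket is hypothetical

ship_log_batch("0001", ["2024-05-01T10:00:00Z INFO checkout started"])
```

The same pattern works against Azure Blob Storage or GCS with their respective SDKs.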
That's a good point! I hadn't considered blob storage, but I did come across Loki from Grafana, which looks promising as a low-cost storage option. That said, our client has strict reliability requirements and wants logs stored locally anyway, since the stores have to keep working offline.
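From what I can tell, getting logs into Loki is just an HTTP POST to its push API; something like this, where the endpoint and labels are assumptions on my part:

```python
import time

import requests

LOKI_PUSH_URL = "http://loki.example.internal:3100/loki/api/v1/push"  # hypothetical endpoint

def push_to_loki(store_id: str, level: str, line: str) -> None:
    # Loki expects nanosecond timestamps as strings, grouped into labelled streams.
    payload = {
        "streams": [
            {
                "stream": {"job": "pos-backend", "store": store_id, "level": level},
                "values": [[str(time.time_ns()), line]],
            }
        ]
    }
    requests.post(LOKI_PUSH_URL, json=payload, timeout=5)

push_to_loki("0001", "error", "payment service timed out after 3 retries")
```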
I used to push around 80 TB of logs into Quickwit backed by S3, and it worked well! As long as users knew what to search for, most queries came back in under 3 seconds. You could send all logs to S3 and then ingest them into Quickwit from there.
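The ingest/search flow is just HTTP against Quickwit's REST API; a rough sketch from memory (host, index id, and field names are made up, so double-check the exact endpoints against the docs):

```python
import json

import requests

QW = "http://quickwit.example.internal:7280"  # hypothetical Quickwit endpoint
INDEX = "store-logs"                          # hypothetical index id

# Ingest: the REST ingest endpoint takes newline-delimited JSON documents.
docs = [
    {"timestamp": "2024-05-01T10:00:00Z", "store": "0001",
     "level": "ERROR", "message": "payment gateway timeout"},
]
ndjson = "\n".join(json.dumps(d) for d in docs)
requests.post(f"{QW}/api/v1/{INDEX}/ingest", data=ndjson, timeout=10)

# Search: a simple query-string search over the same index.
resp = requests.get(
    f"{QW}/api/v1/{INDEX}/search",
    params={"query": "level:ERROR AND store:0001", "max_hits": 20},
    timeout=10,
)
print(resp.json())
```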
Avoid reinventing the wheel! Check out the Grafana stack and VictoriaMetrics. Start with OpenTelemetry and limit what you log at first to get a handle on bandwidth and timing, then scale up gradually. Agree on a common log structure across services, and buffer records in a proxy so you can enrich them with context beyond what the application code emits.
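To make the buffer-and-enrich part concrete, here's a hand-rolled sketch of the idea (the forwarding endpoint and metadata fields are invented); in practice you'd more likely let something like an OpenTelemetry Collector do this:

```python
import threading
import time

import requests

FORWARD_URL = "https://central-ingest.example.internal/logs"  # hypothetical central endpoint
STORE_METADATA = {"store_id": "0001", "region": "eu-west"}    # context the proxy adds to every record

_buffer: list[dict] = []
_lock = threading.Lock()

def enqueue(record: dict) -> None:
    """Enrich a raw log record with store-level context and buffer it locally."""
    record.update(STORE_METADATA)
    with _lock:
        _buffer.append(record)

def flush_loop(interval_s: int = 10, max_batch: int = 500) -> None:
    """Periodically forward buffered records to the central store in batches."""
    while True:
        time.sleep(interval_s)
        with _lock:
            batch = _buffer[:max_batch]
            del _buffer[:max_batch]
        if batch:
            requests.post(FORWARD_URL, json=batch, timeout=10)

# In a real deployment this would run as a long-lived service next to the app.
threading.Thread(target=flush_loop, daemon=True).start()
enqueue({"level": "ERROR", "message": "inventory sync failed", "ts": time.time()})
```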
Why not run another Elastic instance on each local machine to hold the full info-level logs? You could then send just the error logs and a sample of the info logs to the centralized Elastic instance, saving some space and resources.
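Something like this for the routing; the hosts, index names, and sample rate are placeholders, and in a real setup you'd probably do it in your log shipper rather than in application code:

```python
import random

import requests

LOCAL_ES = "http://localhost:9200"                            # store-local Elasticsearch
CENTRAL_ES = "https://central-elastic.example.internal:9200"  # central cluster (hypothetical host)
INFO_SAMPLE_RATE = 0.01                                       # forward roughly 1% of INFO records

def index_log(doc: dict) -> None:
    """Write every record locally; forward errors and a sample of INFO records centrally."""
    requests.post(f"{LOCAL_ES}/store-logs/_doc", json=doc, timeout=5)

    level = doc.get("level", "INFO").upper()
    if level in ("ERROR", "FATAL") or (level == "INFO" and random.random() < INFO_SAMPLE_RATE):
        requests.post(f"{CENTRAL_ES}/central-logs/_doc", json=doc, timeout=5)

index_log({"level": "ERROR", "message": "receipt printer offline", "store": "0001"})
```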
Definitely agree! I mention that to my team all the time, but it’s tough when the decision-makers aren’t on board. Thanks for the link!