I'm part of the team at SigNoz, and we've noticed that users often look at a span in a trace that shows a duration of, say, 1.9 seconds, then open another tab to check percentile distributions and see whether that duration is actually slow or typical for that operation. To address this, we built a feature that displays the percentile right in the trace detail view. When a user clicks on a span, they see a badge like 'p78', meaning the span was slower than 78% of similar spans (same service, operation, and environment) over the past hour. Users can click to expand and view the p50, p90, and p99 durations for comparison. I'm looking for feedback on this feature: do you find it helpful, or could it complicate the UI?
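To make the badge concrete, here is a rough Python sketch of the calculation described above. It is an illustration only, not SigNoz's actual implementation: the function names (`percentile_rank`, `nearest_rank`, `badge`) are hypothetical, the peer durations are made-up sample data, and the nearest-rank method is just one of several common ways to define a percentile.

```python
from bisect import bisect_left


def percentile_rank(duration_ms, peer_durations_ms):
    """Return the percentage of peer spans strictly faster than this span."""
    if not peer_durations_ms:
        raise ValueError("need at least one peer span to compare against")
    ordered = sorted(peer_durations_ms)
    # bisect_left counts how many peers have a duration below duration_ms.
    faster = bisect_left(ordered, duration_ms)
    return 100.0 * faster / len(ordered)


def nearest_rank(peer_durations_ms, p):
    """Return the p-th percentile (integer p in 1..100) by the nearest-rank method."""
    if not peer_durations_ms:
        raise ValueError("need at least one peer span")
    ordered = sorted(peer_durations_ms)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    k = max(1, (p * len(ordered) + 99) // 100)
    return ordered[k - 1]


def badge(duration_ms, peer_durations_ms):
    """Format the rank as a badge label such as 'p75'."""
    return f"p{int(percentile_rank(duration_ms, peer_durations_ms))}"


# Hypothetical peers: 100 spans for the same service, operation, and
# environment over the past hour, with durations of 25 ms, 50 ms, ..., 2500 ms.
peers = [25 * i for i in range(1, 101)]

print(badge(1900, peers))  # a 1.9 s span against these peers
print({p: nearest_rank(peers, p) for p in (50, 90, 99)})
```

With this sample data, the 1.9-second span is slower than 75 of the 100 peers, so the badge reads `p75`. The expanded view would show the p50, p90, and p99 values next to it for comparison.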
5 Answers
I think the real metric for 'slow' should align with customer expectations. If your users think something is slow, then it is. That’s the true measure.
I see the concerns raised regarding performance and accuracy, but I believe having this percentile information could be really helpful. An alternative might be to provide quick access to similar traces, allowing users to adjust filters for a more accurate percentile calculation. I’m a Grafana user and I'm just starting with traces in a production environment.
I'm cautious about this feature potentially giving misleading information. While you might define 'similar' based on certain fields like service, env, or operation, not all services will fit that mold. The accuracy of percentiles relies on how relevant those fields are to the context of the data being analyzed. It might be wise to only show percentiles based on the user's specific query.
What exactly do you mean by spans? I’m a bit confused here.
That’s a good feature, but I wonder about the performance impact. Every time you visualize a span, aren’t you querying all similar spans? This could slow things down quite a bit in real-world use. Plus, it would be great if this feature were customizable. Some people may want to see percentiles for different timeframes, such as weekly or monthly, or based on other criteria. It might be better to have it off by default so users can choose when to enable it.
