Account Diagnostics

Visualize traffic statistics within your account by navigating to the “Account Diagnostics” submenu from the “Metrics” tab.

Live Metrics Graph

In the graph above, the orange line is the live metric limit: the number of metrics you can update in a rolling 5-minute period. A metric name might look like my.server.cpu.load. In this example the limit is set at 1,500,000, meaning that up to 1.5M metric names can be live concurrently.

The green line is the number of live metrics incoming for the account; on this graph it fluctuates between roughly 275K and 350K. When the number of live metrics exceeds the limit, some metrics will be dropped. Note: live metrics are also referred to as 'concurrent' or 'active' metrics.
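As a rough illustration of how a rolling 5-minute live metric count behaves, here is a minimal sketch. The class name LiveMetricCounter and its methods are hypothetical, and the service's actual accounting may differ; the sketch assumes that timestamps arrive in non-decreasing order and that an update to a new metric name is dropped once the limit is reached.

```python
from collections import OrderedDict

WINDOW_SECONDS = 5 * 60  # the rolling 5-minute period described above


class LiveMetricCounter:
    """Hypothetical tracker of metric names updated within a rolling window."""

    def __init__(self, limit, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._last_seen = OrderedDict()  # metric name -> time of last update

    def _expire(self, now):
        # The least recently updated names sit at the front of the dict.
        while self._last_seen:
            name, seen = next(iter(self._last_seen.items()))
            if now - seen < self.window:
                break
            self._last_seen.popitem(last=False)

    def update(self, name, now):
        """Return True if the update is accepted, False if it would be dropped."""
        self._expire(now)
        if name in self._last_seen:
            # Already live: refresh its timestamp and move it to the back.
            self._last_seen[name] = now
            self._last_seen.move_to_end(name)
            return True
        if len(self._last_seen) >= self.limit:
            return False  # a new name would exceed the live metric limit
        self._last_seen[name] = now
        return True

    def live_count(self, now):
        """Number of metric names updated within the window ending at `now`."""
        self._expire(now)
        return len(self._last_seen)
```

In this sketch, a metric that is already live can keep receiving updates while the account is at its limit; only names that would raise the live count are rejected.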

Datapoint Rates Graph

In the graph above, the dark blue line is the data point rate limit: the number of data points allowed per second. In this example it is set at 750,000, allowing the user to send up to 750K data points per second.

The other lines in the graph represent the number of data points per second hitting your account, by protocol.
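To make the per-protocol lines concrete, the sketch below computes an average data points per second figure for each protocol over one time window. The function name rates_by_protocol and the (timestamp, protocol) event shape are assumptions made for illustration, not the service's internal representation.

```python
from collections import Counter


def rates_by_protocol(events, window_start, window_seconds):
    """Average data points per second, per protocol, over one window.

    `events` is an iterable of (unix_timestamp, protocol) pairs, one pair
    per data point received (a hypothetical input shape).
    """
    window_end = window_start + window_seconds
    counts = Counter(
        protocol for ts, protocol in events if window_start <= ts < window_end
    )
    return {protocol: n / window_seconds for protocol, n in counts.items()}
```

Summing the returned values gives a combined rate that can be compared against the data point rate limit, assuming the limit applies to the account's total traffic across protocols.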

Metrics Created and Deleted Graph

The graph above shows metrics that were recently created, deleted, and expired. This is useful for tracking traffic spikes and for monitoring any configured expiry rules.

Activity by Protocol

This card provides a quick overview of your current traffic. The icon in the status column shows when data was last received on each interface. A green icon indicates that data has arrived recently. A yellow icon indicates that no data has arrived for that protocol for at least 5 minutes, and a red icon indicates no data for at least 15 minutes. A blue icon indicates that no traffic has ever been seen on that interface.
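The icon thresholds can be summarized as a small function. This is only a sketch of the rules as described in this section; the function name protocol_status and the use of None for "never seen" are illustrative choices.

```python
YELLOW_AFTER_SECONDS = 5 * 60   # no data for at least 5 minutes
RED_AFTER_SECONDS = 15 * 60     # no data for at least 15 minutes


def protocol_status(seconds_since_last_data):
    """Map the time since the last received data to an icon colour.

    Pass None when no traffic has ever been seen on the interface.
    """
    if seconds_since_last_data is None:
        return "blue"
    if seconds_since_last_data >= RED_AFTER_SECONDS:
        return "red"
    if seconds_since_last_data >= YELLOW_AFTER_SECONDS:
        return "yellow"
    return "green"
```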

Busy Metrics Being Ratelimited Graph

If a user sends a high volume of data points per second to a single metric, we apply per-metric rate limiting rules to protect our backend. These rules are defined differently from the live metric rate limiting rules, and only target individual metrics with a very high rate of data points per second. You can read more about why these rules are important, and how they work, in this blog article.

Invalid Metrics Graph

If a user sends metrics that do not match the Graphite format, they are reported as 'invalid' and cannot be ingested. The panel above lists the offending metrics, the reason each was reported as invalid, the protocol, the sending IP address, and the timestamp of the attempted ingestion.
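To catch malformed metrics before sending them, a client-side check along the following lines may help. It is a sketch only: it assumes the Graphite plaintext form of a dotted metric name, a numeric value, and a numeric timestamp (treated here as optional), and the reason strings are made up for illustration rather than taken from the Invalid Metrics panel. Any account-specific prefix on the metric name is not checked.

```python
import math


def check_metric_line(line):
    """Return None if the line looks valid, otherwise a short reason string."""
    parts = line.strip().split()
    if len(parts) not in (2, 3):
        return "expected 'name value [timestamp]'"
    name, value = parts[0], parts[1]
    # Every dot-separated segment of the metric name must be non-empty.
    if any(not segment for segment in name.split(".")):
        return "empty path segment in metric name"
    try:
        number = float(value)
    except ValueError:
        return "value is not numeric"
    if not math.isfinite(number):
        return "value is not finite"
    if len(parts) == 3:
        try:
            float(parts[2])
        except ValueError:
            return "timestamp is not numeric"
    return None
```

For example, check_metric_line("my.server.cpu.load 0.75 1700000000") returns None, while a line with a non-numeric value returns a reason string.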

To avoid heavy impact on our ingestion servers, the list is refreshed every 5 minutes, is limited to 100 metrics, and invalid metric names are stored for only 24 hours. We also include a related panel in your HG Traffic Dashboard, as well as an alert for Datapoints Dropped in every Hosted Graphite account. Note: we currently do not track invalid metrics for StatsD.

Why are there Account Limits?

TL;DR: As a preventive measure against accidents and malice.

It’s possible for a user to run a script that accidentally (or deliberately) updates millions of metrics per second. Sensible limits on the data we process ensure that one customer cannot affect the quality of service for others. Generally, we want customers to be able to send data at a high rate, and we can monitor and increase any limits as necessary. Check out this article for more details on why these limits are put in place.
