This document describes the metrics and tags emitted by gostatsd: their types, tags, and interpretation. All internal metrics are snapshotted after a flush, then queued internally and sent with the next flush. As a result, internal metrics lag regular metrics by one flush interval. See below for notes on how channels are monitored.

Metric types:

| Type | Description |
| --- | --- |
| gauge (flush) | A value sent as a gauge with the value reset / calculated / sampled every flush interval |
| gauge (time) | A single duration measured in milliseconds and sent as a gauge |
| gauge (cumulative) | An internal counter sent as a gauge with the value never resetting |
| gauge (sparse) | The same as a cumulative gauge, but data is only sent on change |
| counter | An internal counter, reset on flush |
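
The difference between a counter, a cumulative gauge, and a sparse gauge is easiest to see side by side. The following is a minimal sketch, not gostatsd code: `internalStat` and its methods are hypothetical names, and the choice not to send an initial zero for the sparse gauge is an assumption.

```go
package main

import "fmt"

// internalStat is a hypothetical model of one internal counter. It exists only
// to illustrate the reporting styles above and is not a gostatsd type.
type internalStat struct {
	total    int64 // lifetime count, never reset
	interval int64 // count since the last flush
	lastSent int64 // last value reported by the sparse gauge
}

// Add records n new occurrences.
func (s *internalStat) Add(n int64) {
	s.total += n
	s.interval += n
}

// FlushCounter models "counter": report the interval count, then reset it.
func (s *internalStat) FlushCounter() int64 {
	v := s.interval
	s.interval = 0
	return v
}

// FlushCumulative models "gauge (cumulative)": report the lifetime total.
func (s *internalStat) FlushCumulative() int64 {
	return s.total
}

// FlushSparse models "gauge (sparse)": report the lifetime total only when it
// differs from the last reported value.
// Assumption: an initial value of zero is not sent.
func (s *internalStat) FlushSparse() (value int64, send bool) {
	if s.total == s.lastSent {
		return 0, false
	}
	s.lastSent = s.total
	return s.total, true
}

func main() {
	var s internalStat
	perFlush := []int64{3, 0, 2} // events arriving in three flush intervals
	for i, n := range perFlush {
		s.Add(n)
		sparse, send := s.FlushSparse()
		fmt.Printf("flush %d: counter=%d cumulative=%d sparse=%d (sent=%t)\n",
			i+1, s.FlushCounter(), s.FlushCumulative(), sparse, send)
	}
}
```

In the second interval nothing happens, so the counter reports 0, the cumulative gauge repeats its previous value, and the sparse gauge sends nothing.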

Metrics:

| Name | Type | Tags | Description |
| --- | --- | --- | --- |
| aggregator.metricmaps_received | gauge (flush) | aggregator_id | The number of datapoint batches received during the flush interval |
| aggregator.aggregation_time | gauge (time) | aggregator_id | The time taken (in ms) to aggregate all counter and timer datapoints in this flush interval |
| aggregator.process_time | gauge (time) | aggregator_id | The time taken to process all synchronous flush actions |
| aggregator.reset_time | gauge (time) | aggregator_id | The time taken to reset the aggregator after flush |
| parser.bad_lines_seen | gauge (sparse) | | The number of unparseable lines |
| parser.events_received | gauge (cumulative) | | The number of events parsed |
| parser.metrics_received | gauge (cumulative) | | The number of metrics parsed |
| receiver.datagrams_received | gauge (cumulative) | | The number of datagrams received |
| receiver.avg_datagrams_in_batch | gauge (flush) | | The average number of datagrams per batch (up to receive-batch-size). This can be used to tweak receive-batch-size if necessary to reduce memory usage. |
| channel.avg | gauge (flush) | channel | The average of all samples in the flush interval |
| channel.min | gauge (flush) | channel | The minimum sample seen |
| channel.max | gauge (flush) | channel | The maximum sample seen |
| channel.last | gauge (flush) | channel | The last sample seen |
| channel.capacity | gauge (flush) | channel | The capacity of the channel |
| channel.samples | gauge (flush) | channel | The number of samples seen (guaranteed to be at least 1) |
| heartbeat | gauge (flush) | version, commit | The value 1, tagged by the version (git tag) and short commit hash |
| flusher.total_time | gauge (time) | | Time taken to flush all metrics to all backends for the flush interval |
| backend.created | gauge (cumulative) | backend | Lifetime number of metric batches generated by the backend |
| backend.create.failed | gauge (cumulative) | backend | Lifetime number of metric batches which failed to be serialized (DATALOSS!) |
| backend.retried | gauge (sparse) | backend | Lifetime number of metric batches retried by the backend |
| backend.dropped | gauge (cumulative) | backend | Lifetime number of metric batches dropped by the backend (DATALOSS!) |
| backend.sent | gauge (cumulative) | backend | Lifetime number of metric batches successfully transmitted |
| backend.series.sent | gauge (cumulative) | backend | Lifetime number of metric series successfully transmitted |
| cloudprovider.aws.describeinstancecount | gauge (cumulative) | | The cumulative number of times DescribeInstancesPages has been called |
| cloudprovider.aws.describeinstanceinstances | gauge (cumulative) | | The cumulative number of instances which have been fed in to DescribeInstancesPages |
| cloudprovider.aws.describeinstancepages | gauge (cumulative) | | The cumulative number of pages from DescribeInstancesPages |
| cloudprovider.aws.describeinstanceerrors | gauge (cumulative) | | The cumulative number of errors seen from DescribeInstancesPages |
| cloudprovider.aws.describeinstancefound | gauge (cumulative) | | The cumulative number of instances successfully found via DescribeInstances |
| cloudprovider.cache_positive | gauge (flush) | | The absolute number of positive entries in the cache |
| cloudprovider.cache_negative | gauge (flush) | | The absolute number of negative entries in the cache |
| cloudprovider.cache_refresh_positive | gauge (cumulative) | | The cumulative number of positive refreshes |
| cloudprovider.cache_refresh_negative | gauge (cumulative) | | The cumulative number of refreshes which had an error refreshing and used old data |
| cloudprovider.cache_hit | gauge (cumulative) | | The cumulative number of cache hits (host was in the cache) |
| cloudprovider.cache_miss | gauge (cumulative) | | The cumulative number of cache misses |
| cloudprovider.hosts_queued | gauge (flush) | type | The absolute number of hosts waiting to be looked up |
| cloudprovider.items_queued | gauge (flush) | type | The absolute number of metrics or events waiting for a host lookup to complete |
| http.forwarder.invalid | counter | | The number of failures to prepare a batch of metrics to forward |
| http.forwarder.created | counter | | The number of batches prepared for forwarding |
| http.forwarder.sent | counter | | The number of batches successfully forwarded |
| http.forwarder.retried | counter | | The number of retries sending a batch |
| http.forwarder.dropped | counter | | The number of batches dropped due to inability to forward upstream |
| http.incoming | counter | server-name, result, failure | The number of batches forwarded to the server, and the results of processing them |
| http.incoming.metrics | counter | server-name | The number of metrics received over http |

| Tag | Description |
| --- | --- |
| aggregator_id | The index of an aggregator; the number of aggregators corresponds to the --max-workers flag |
| channel | The name of an internal channel |
| version | The git tag of the build |
| commit | The short git commit of the build |
| backend | The backend sending a particular metric |
| type | Either metric or event |
| result | Either success (a batch of metrics was successfully processed) or failure (a batch of metrics was not processed; the failure tag gives the reason) |
| failure | The reason a batch of metrics was not processed |
| server-name | The name of an http-server as specified in the config file |
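
Because cumulative gauges such as backend.dropped never reset, interpreting them usually means looking at the change between consecutive flushes rather than the raw value. The sketch below shows one way a consumer might do that. `deltaTracker` is a hypothetical helper, the sample values are made up for illustration, and treating a decrease as a restart from zero is an assumption about how lifetime values behave when the process restarts.

```go
package main

import "fmt"

// deltaTracker turns successive samples of a cumulative gauge into
// per-interval increases. Hypothetical helper, not part of gostatsd.
type deltaTracker struct {
	prev   float64
	primed bool
}

// Observe returns the increase since the previous sample. ok is false for the
// first sample, because there is nothing to compare it against yet.
func (d *deltaTracker) Observe(v float64) (delta float64, ok bool) {
	if !d.primed {
		d.prev, d.primed = v, true
		return 0, false
	}
	delta = v - d.prev
	d.prev = v
	if delta < 0 {
		// Assumption: the value went backwards because the process
		// restarted from zero, so the new value is all new growth.
		return v, true
	}
	return delta, true
}

func main() {
	var d deltaTracker
	// Made-up successive values of backend.dropped; the final drop to 1
	// stands in for a process restart.
	for _, v := range []float64{0, 0, 3, 3, 1} {
		if delta, ok := d.Observe(v); ok && delta > 0 {
			fmt.Printf("backend.dropped rose by %v: possible data loss\n", delta)
		}
	}
}
```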

A number of channels are tracked internally and emit metrics under the channel.* space. They all have a channel tag, and may have the additional tags specified below. Channels are sampled at a regular interval. After a flush, basic stats about the sampled data are sent (internal metrics lag regular metrics by a flush interval) and the samples are reset.

| Channel name | Additional tags | Description |
| --- | --- | --- |
| dispatch_aggregator_map | aggregator_id | Channel to dispatch metric maps to a given aggregator. |
| backend_events_sem | | Semaphore limiting the number of events in flight at once. Corresponds to the `--max-concurrent-events` flag. |
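
The sample-then-flush cycle for channels can be sketched as follows. This is an illustration, not gostatsd's implementation: `channelSampler` and `channelReport` are hypothetical types, samples are assumed to be the channel's length, and taking one extra sample at flush time is only one possible way to guarantee that channel.samples is at least 1.

```go
package main

import "fmt"

// channelReport mirrors the channel.* metrics: avg, min, max, last, capacity
// and samples.
type channelReport struct {
	Avg      float64
	Min, Max int
	Last     int
	Capacity int
	Samples  int
}

// channelSampler accumulates channel length samples between flushes.
// Hypothetical type; gostatsd's own implementation may differ.
type channelSampler struct {
	sum, min, max, last, capacity, samples int
}

// Sample records one observation of a channel's length and capacity.
func (c *channelSampler) Sample(length, capacity int) {
	if c.samples == 0 || length < c.min {
		c.min = length
	}
	if c.samples == 0 || length > c.max {
		c.max = length
	}
	c.sum += length
	c.last = length
	c.capacity = capacity
	c.samples++
}

// Flush takes one final sample, returns the stats for the interval, and
// resets the sampler. Assumption: the final sample is what guarantees that
// at least one sample exists.
func (c *channelSampler) Flush(length, capacity int) channelReport {
	c.Sample(length, capacity)
	r := channelReport{
		Avg:      float64(c.sum) / float64(c.samples),
		Min:      c.min,
		Max:      c.max,
		Last:     c.last,
		Capacity: c.capacity,
		Samples:  c.samples,
	}
	*c = channelSampler{} // samples are reset after every flush
	return r
}

func main() {
	ch := make(chan int, 4)
	var s channelSampler
	s.Sample(len(ch), cap(ch)) // channel is empty
	ch <- 1
	ch <- 2
	s.Sample(len(ch), cap(ch)) // two items queued
	ch <- 3
	fmt.Printf("%+v\n", s.Flush(len(ch), cap(ch)))
}
```

A channel.max that sits at or near channel.capacity suggests the channel is filling up between samples.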

If both `--internal-namespace` and `--namespace` are specified, and metrics are dispatched internally, the resulting metric name will be namespace.internal_namespace.metric.
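
As a rough illustration of that naming rule, the sketch below joins the three parts with dots. `qualifiedName` is a hypothetical function, the values `prod` and `statsd` are made up, and skipping empty parts is an assumption: the text above only describes the case where both flags are set.

```go
package main

import (
	"fmt"
	"strings"
)

// qualifiedName builds namespace.internal_namespace.metric.
// Hypothetical helper; skipping empty parts is an assumption, since only the
// case with both namespaces set is documented.
func qualifiedName(namespace, internalNamespace, metric string) string {
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, internalNamespace, metric} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, ".")
}

func main() {
	// Made-up values standing in for --namespace and --internal-namespace.
	fmt.Println(qualifiedName("prod", "statsd", "heartbeat"))
}
```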
