Monitor Type: prometheus/prometheus
Accepts Endpoints: Yes
Multiple Instances Allowed: Yes
This monitor scrapes the Prometheus server's own internal collector metrics from a Prometheus exporter and sends them to SignalFx. It is a wrapper around the prometheus-exporter monitor that provides a restricted but expandable set of metrics.
To activate this monitor in the Smart Agent, add the following to your agent config:
    monitors:  # All monitor config goes under this key
     - type: prometheus/prometheus
       ...  # Additional config
For a list of monitor options that are common to all monitors, see Common Configuration.
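Because this monitor accepts endpoints, the required host and port options can be filled in by service discovery instead of being hard-coded. The fragment below is a minimal sketch, assuming a k8s-api observer is configured and that the Prometheus server runs in a container whose image name contains "prometheus" and listens on port 9090 (the usual Prometheus default). The rule itself is illustrative and should be adapted to your environment.

```yaml
observers:
  - type: k8s-api   # assumed observer; use whichever observer fits your platform

monitors:
  - type: prometheus/prometheus
    # Hypothetical rule: match discovered endpoints whose container image
    # name contains "prometheus" and whose port is 9090.
    discoveryRule: container_image =~ "prometheus" && port == 9090
```

With a discovery rule in place, one monitor instance is created per matching endpoint, which is why host and port are omitted in this sketch.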
| Config option | Required | Type | Description |
| --- | --- | --- | --- |
| httpTimeout | no | int64 | HTTP timeout duration for both reads and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (default: 10s) |
| username | no | string | Basic Auth username to use on each request, if any. |
| password | no | string | Basic Auth password to use on each request, if any. |
| useHTTPS | no | bool | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (default: false) |
| httpHeaders | no | map of strings | A map of HTTP header names to values. Multiple comma-separated values for the same message-header are supported. |
| skipVerify | no | bool | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (default: false) |
| sniServerName | no | string | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. |
| caCertPath | no | string | Path to the CA cert that has signed the TLS cert. Unnecessary if skipVerify is set to true. |
| clientCertPath | no | string | Path to the client TLS cert to use for TLS required connections |
| clientKeyPath | no | string | Path to the client TLS key to use for TLS required connections |
| host | yes | string | Host of the exporter |
| port | yes | integer | Port of the exporter |
| useServiceAccount | no | bool | Use pod service account to authenticate. (default: false) |
| metricPath | no | string | Path to the metrics endpoint on the exporter server. (default: /metrics) |
| sendAllMetrics | no | bool | Send all the metrics that come out of the Prometheus exporter without any filtering. This option has no effect when using the prometheus exporter monitor directly since there is no built-in filtering, only when embedding it in other monitors. (default: false) |
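As a rough sketch of how these options combine, the fragment below configures two instances of the monitor: one using only the required options, and one scraping over HTTPS with Basic Auth. The hostnames, credentials, and CA path are placeholders invented for illustration, and port 9090 is assumed because it is the Prometheus server's usual default.

```yaml
monitors:
  # Minimal instance: only the required host and port are set.
  - type: prometheus/prometheus
    host: prometheus.example.internal        # hypothetical hostname
    port: 9090                               # assumed: Prometheus default port

  # Second instance: HTTPS with certificate verification and Basic Auth.
  - type: prometheus/prometheus
    host: prometheus-secure.example.internal # hypothetical hostname
    port: 9090
    useHTTPS: true
    skipVerify: false                        # keep TLS verification enabled
    caCertPath: /etc/ssl/certs/example-ca.pem  # hypothetical path
    username: scraper                        # hypothetical credentials
    password: change-me
    httpTimeout: 15s                         # duration string, per the table above
    metricPath: /metrics                     # the default, shown for clarity
```

Defining two entries of the same type is possible because the monitor allows multiple instances.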
These are the metrics available for this monitor. This monitor emits all metrics by default; however, none are categorized as container/host -- they are all custom.
- net_conntrack_dialer_conn_attempted_total (cumulative): Total number of connections attempted by the dialer of a given name
- net_conntrack_dialer_conn_closed_total (cumulative): Total number of connections closed which originated from the dialer of a given name
- net_conntrack_dialer_conn_established_total (cumulative): Total number of connections successfully established by the dialer of a given name
- net_conntrack_dialer_conn_failed_total (cumulative): Total number of connections that failed to dial by the dialer of a given name
- net_conntrack_listener_conn_accepted_total (cumulative): Total number of connections opened to the listener of a given name
- net_conntrack_listener_conn_closed_total (cumulative): Total number of connections closed that were made to the listener of a given name
- prometheus_api_remote_read_queries (gauge): The current number of remote read queries being executed or waiting
- prometheus_build_info (gauge): A metric with a constant '1' value labeled by version, revision, branch, and goversion from which prometheus was built
- prometheus_config_last_reload_success_timestamp_seconds (gauge): Timestamp of the last successful configuration reload
- prometheus_config_last_reload_successful (gauge): Whether the last configuration reload attempt was successful
- prometheus_engine_queries (gauge): The current number of queries being executed or waiting
- prometheus_engine_queries_concurrent_max (gauge): The max number of concurrent queries
- prometheus_engine_query_duration_seconds (cumulative): Query timings
- prometheus_engine_query_duration_seconds_count (cumulative): Query timings (count)
- prometheus_http_request_duration_seconds (cumulative): Histogram of latencies for HTTP requests
- prometheus_http_request_duration_seconds_bucket (cumulative): Histogram of latencies for HTTP requests in the respective bucket
- prometheus_http_request_duration_seconds_count (cumulative): Histogram of latencies for HTTP requests (count)
- prometheus_http_response_size_bytes (cumulative): Histogram of response size for HTTP requests
- prometheus_http_response_size_bytes_bucket (cumulative): Histogram of response size for HTTP requests in the respective bucket
- prometheus_http_response_size_bytes_count (cumulative): Histogram of response size for HTTP requests (count)
- prometheus_notifications_alertmanagers_discovered (gauge): The number of alertmanagers discovered and active
- prometheus_notifications_dropped_total (cumulative): Total number of alerts dropped due to errors when sending to Alertmanager
- prometheus_notifications_queue_capacity (gauge): The capacity of the alert notifications queue
- prometheus_notifications_queue_length (gauge): The number of alert notifications in the queue
- prometheus_rule_evaluation_duration_seconds (cumulative): The duration for a rule to execute
- prometheus_rule_evaluation_duration_seconds_count (cumulative): The duration for a rule to execute (count)
- prometheus_rule_evaluation_failures_total (cumulative): The total number of rule evaluation failures
- prometheus_rule_group_duration_seconds (cumulative): The duration of rule group evaluations
- prometheus_rule_group_duration_seconds_count (cumulative): The duration of rule group evaluations (count)
- prometheus_rule_group_interval_seconds (gauge): The interval of a rule group
- prometheus_rule_group_iterations_missed_total (cumulative): The total number of rule group evaluations missed due to slow rule group evaluation
- prometheus_rule_group_iterations_total (cumulative): The total number of scheduled rule group evaluations, whether executed or missed
- prometheus_rule_group_last_duration_seconds (gauge): The duration of the last rule group evaluation
- prometheus_sd_azure_refresh_duration_seconds (cumulative): The duration of an Azure-SD refresh in seconds
- prometheus_sd_azure_refresh_duration_seconds_count (cumulative): The duration of an Azure-SD refresh in seconds (count)
- prometheus_sd_azure_refresh_failures_total (cumulative): Number of Azure-SD refresh failures
- prometheus_sd_configs_failed_total (cumulative): Total number of service discovery configurations that failed to load
- prometheus_sd_consul_rpc_duration_seconds (cumulative): The duration of a Consul RPC call in seconds
- prometheus_sd_consul_rpc_duration_seconds_count (cumulative): The duration of a Consul RPC call in seconds (count)
- prometheus_sd_consul_rpc_failures_total (cumulative): The number of Consul RPC call failures
- prometheus_sd_discovered_targets (gauge): Current number of discovered targets
- prometheus_sd_dns_lookup_failures_total (cumulative): The number of DNS-SD lookup failures
- prometheus_sd_dns_lookups_total (cumulative): The number of DNS-SD lookups
- prometheus_sd_ec2_refresh_duration_seconds (cumulative): The duration of an EC2-SD refresh in seconds
- prometheus_sd_ec2_refresh_duration_seconds_count (cumulative): The duration of an EC2-SD refresh in seconds (count)
- prometheus_sd_ec2_refresh_failures_total (cumulative): The number of EC2-SD scrape failures
- prometheus_sd_file_read_errors_total (cumulative): The number of File-SD read errors
- prometheus_sd_file_scan_duration_seconds (cumulative): The duration of the File-SD scan in seconds
- prometheus_sd_file_scan_duration_seconds_count (cumulative): The duration of the File-SD scan in seconds (count)
- prometheus_sd_gce_refresh_duration (cumulative): The duration of a GCE-SD refresh in seconds
- prometheus_sd_gce_refresh_duration_count (cumulative): The duration of a GCE-SD refresh in seconds (count)
- prometheus_sd_gce_refresh_failures_total (cumulative): The number of GCE-SD refresh failures
- prometheus_sd_kubernetes_cache_last_resource_version (gauge): Last resource version from the Kubernetes API
- prometheus_sd_kubernetes_cache_list_duration_seconds (cumulative): Duration of a Kubernetes API call in seconds
- prometheus_sd_kubernetes_cache_list_duration_seconds_count (cumulative): Duration of a Kubernetes API call in seconds (count)
- prometheus_sd_kubernetes_cache_list_items (cumulative): Count of items in a list from the Kubernetes API
- prometheus_sd_kubernetes_cache_list_items_count (cumulative): Count of items in a list from the Kubernetes API (count)
- prometheus_sd_kubernetes_cache_list_total (cumulative): Total number of list operations
- prometheus_sd_kubernetes_cache_short_watches_total (cumulative): Total number of short watch operations
- prometheus_sd_kubernetes_cache_watch_duration_seconds (cumulative): Duration of watches on the Kubernetes API
- prometheus_sd_kubernetes_cache_watch_duration_seconds_count (cumulative): Duration of watches on the Kubernetes API (count)
- prometheus_sd_kubernetes_cache_watch_events (cumulative): Number of items in watches on the Kubernetes API
- prometheus_sd_kubernetes_cache_watch_events_count (cumulative): Number of items in watches on the Kubernetes API (count)
- prometheus_sd_kubernetes_cache_watches_total (cumulative): Total number of watch operations
- prometheus_sd_kubernetes_events_total (cumulative): The number of Kubernetes events handled
- prometheus_sd_marathon_refresh_duration_seconds (cumulative): The duration of a Marathon-SD refresh in seconds
- prometheus_sd_marathon_refresh_duration_seconds_count (cumulative): The duration of a Marathon-SD refresh in seconds (count)
- prometheus_sd_marathon_refresh_failures_total (cumulative): The number of Marathon-SD refresh failures
- prometheus_sd_openstack_refresh_duration_seconds (cumulative): The duration of an OpenStack-SD refresh in seconds
- prometheus_sd_openstack_refresh_duration_seconds_count (cumulative): The duration of an OpenStack-SD refresh in seconds (count)
- prometheus_sd_openstack_refresh_failures_total (cumulative): The number of OpenStack-SD scrape failures
- prometheus_sd_received_updates_total (cumulative): Total number of update events received from the SD providers
- prometheus_sd_triton_refresh_duration_seconds (cumulative): The duration of a Triton-SD refresh in seconds
- prometheus_sd_triton_refresh_duration_seconds_count (cumulative): The duration of a Triton-SD refresh in seconds (count)
- prometheus_sd_triton_refresh_failures_total (cumulative): The number of Triton-SD scrape failures
- prometheus_sd_updates_delayed_total (cumulative): Total number of update events that couldn't be sent immediately
- prometheus_sd_updates_total (cumulative): Total number of update events sent to the SD consumers
- prometheus_target_interval_length_seconds (cumulative): Actual intervals between scrapes
- prometheus_target_interval_length_seconds_count (cumulative): Actual intervals between scrapes (count)
- prometheus_target_scrape_pool_sync_total (cumulative): Total number of syncs that were executed on a scrape pool
- prometheus_target_scrapes_exceeded_sample_limit_total (cumulative): Total number of scrapes that hit the sample limit and were rejected
- prometheus_target_scrapes_sample_duplicate_timestamp_total (cumulative): Total number of samples rejected due to duplicate timestamps but different values
- prometheus_target_scrapes_sample_out_of_bounds_total (cumulative): Total number of samples rejected due to timestamp falling outside of the time bounds
- prometheus_target_scrapes_sample_out_of_order_total (cumulative): Total number of samples rejected due to being out of the expected order
- prometheus_target_sync_length_seconds (cumulative): Actual interval to sync the scrape pool
- prometheus_target_sync_length_seconds_count (cumulative): Actual interval to sync the scrape pool (count)
- prometheus_treecache_watcher_goroutines (gauge): The current number of watcher goroutines
- prometheus_treecache_zookeeper_failures_total (cumulative): The total number of ZooKeeper failures
- prometheus_tsdb_blocks_loaded (gauge): Number of currently loaded data blocks
- prometheus_tsdb_checkpoint_creations_failed_total (cumulative): Total number of checkpoint creations that failed
- prometheus_tsdb_checkpoint_creations_total (cumulative): Total number of checkpoint creations attempted
- prometheus_tsdb_checkpoint_deletions_failed_total (cumulative): Total number of checkpoint deletions that failed
- prometheus_tsdb_checkpoint_deletions_total (cumulative): Total number of checkpoint deletions attempted
- prometheus_tsdb_compaction_chunk_range_seconds (cumulative): Final time range of chunks on their first compaction
- prometheus_tsdb_compaction_chunk_range_seconds_bucket (cumulative): Final time range of chunks on their first compaction in the respective bucket
- prometheus_tsdb_compaction_chunk_range_seconds_count (cumulative): Final time range of chunks on their first compaction (count)
- prometheus_tsdb_compaction_chunk_samples (cumulative): Final number of samples on their first compaction
- prometheus_tsdb_compaction_chunk_samples_bucket (cumulative): Final number of samples on their first compaction in the respective bucket
- prometheus_tsdb_compaction_chunk_samples_count (cumulative): Final number of samples on their first compaction (count)
- prometheus_tsdb_compaction_chunk_size_bytes (cumulative): Final size of chunks on their first compaction
- prometheus_tsdb_compaction_chunk_size_bytes_bucket (cumulative): Final size of chunks on their first compaction in the respective bucket
- prometheus_tsdb_compaction_chunk_size_bytes_count (cumulative): Final size of chunks on their first compaction (count)
- prometheus_tsdb_compaction_duration_seconds (cumulative): Duration of compaction runs
- prometheus_tsdb_compaction_duration_seconds_bucket (cumulative): Duration of compaction runs in the respective bucket
- prometheus_tsdb_compaction_duration_seconds_count (cumulative): Duration of compaction runs (count)
- prometheus_tsdb_compactions_failed_total (cumulative): Total number of compactions that failed for the partition
- prometheus_tsdb_compactions_total (cumulative): Total number of compactions that were executed for the partition
- prometheus_tsdb_compactions_triggered_total (cumulative): Total number of triggered compactions for the partition
- prometheus_tsdb_head_active_appenders (gauge): Number of currently active appender transactions
- prometheus_tsdb_head_chunks (gauge): Total number of chunks in the head block
- prometheus_tsdb_head_chunks_created_total (cumulative): Total number of chunks created in the head
- prometheus_tsdb_head_chunks_removed_total (cumulative): Total number of chunks removed in the head
- prometheus_tsdb_head_gc_duration_seconds (cumulative): Runtime of garbage collection in the head block
- prometheus_tsdb_head_gc_duration_seconds_count (cumulative): Runtime of garbage collection in the head block (count)
- prometheus_tsdb_head_max_time (gauge): Maximum timestamp of the head block
- prometheus_tsdb_head_min_time (gauge): Minimum time bound of the head block
- prometheus_tsdb_head_samples_appended_total (cumulative): Total number of appended samples
- prometheus_tsdb_head_series (gauge): Total number of series in the head block
- prometheus_tsdb_head_series_created_total (cumulative): Total number of series created in the head
- prometheus_tsdb_head_series_not_found_total (cumulative): Total number of requests for series that were not found
- prometheus_tsdb_head_series_removed_total (cumulative): Total number of series removed in the head
- prometheus_tsdb_head_truncations_failed_total (cumulative): Total number of head truncations that failed
- prometheus_tsdb_head_truncations_total (cumulative): Total number of head truncations attempted
- prometheus_tsdb_lowest_timestamp (gauge): Lowest timestamp value stored in the database
- prometheus_tsdb_reloads_failures_total (cumulative): Number of times the database failed to reload block data from disk
- prometheus_tsdb_reloads_total (cumulative): Number of times the database reloaded block data from disk
- prometheus_tsdb_retention_cutoffs_failures_total (cumulative): Number of times the database failed to cut off block data from disk
- prometheus_tsdb_retention_cutoffs_total (cumulative): Number of times the database cut off block data from disk
- prometheus_tsdb_symbol_table_size_bytes (gauge): Size of symbol table on disk (in bytes)
- prometheus_tsdb_tombstone_cleanup_seconds (cumulative): The time taken to recompact blocks to remove tombstones
- prometheus_tsdb_tombstone_cleanup_seconds_bucket (cumulative): The time taken to recompact blocks to remove tombstones in the respective bucket
- prometheus_tsdb_tombstone_cleanup_seconds_count (cumulative): The time taken to recompact blocks to remove tombstones (count)
- prometheus_tsdb_wal_completed_pages_total (cumulative): Total number of completed pages
- prometheus_tsdb_wal_fsync_duration_seconds (cumulative): Duration of WAL fsync
- prometheus_tsdb_wal_fsync_duration_seconds_count (cumulative): Duration of WAL fsync (count)
- prometheus_tsdb_wal_page_flushes_total (cumulative): Total number of page flushes
- prometheus_tsdb_wal_truncate_duration_seconds (cumulative): Duration of WAL truncation
- prometheus_tsdb_wal_truncate_duration_seconds_count (cumulative): Duration of WAL truncation (count)
- prometheus_tsdb_wal_truncations_failed_total (cumulative): Total number of WAL truncations that failed
- prometheus_tsdb_wal_truncations_total (cumulative): Total number of WAL truncations attempted
- promhttp_metric_handler_requests_in_flight (gauge): Current number of scrapes being served
- promhttp_metric_handler_requests_total (cumulative): Total number of scrapes by HTTP status code

The agent does not do any built-in filtering of metrics coming out of this monitor.
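Since nothing is filtered by default, some series (histogram buckets, for example) may need to be dropped explicitly to keep the datapoint volume manageable. The fragment below is a minimal sketch, assuming the agent's common datapointsToExclude option with a metricNames list that accepts glob patterns. Both the option and its syntax should be checked against the Common Configuration reference for your agent version. The hostname is a placeholder and port 9090 is assumed as the Prometheus default.

```yaml
monitors:
  - type: prometheus/prometheus
    host: prometheus.example.internal  # hypothetical hostname
    port: 9090                         # assumed: Prometheus default port
    datapointsToExclude:
      # Assumed filter syntax: drop every histogram bucket series
      # (names ending in _bucket) emitted by this monitor instance.
      - metricNames:
          - "*_bucket"
```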