How to get datasources metrics ? #146

Open
bgarzon94 opened this issue May 15, 2024 · 1 comment

Comments

@bgarzon94

Hello, I'm new to druid-exporter. I've successfully deployed druid-exporter on k8s with Helm, but I noticed that datasource metrics are not being collected on the Druid dashboard, even though the metrics for the other Druid resources are displayed there.

[Screenshot: Druid dashboard 1]

[Screenshot: Druid dashboard 2]

I checked https://druid.apache.org/docs/latest/operations/metrics/ but couldn't find a metric that reports datasource information, such as the total number of datasources.

Here are the druid.monitoring.monitors and druid.emitter settings for my Druid services:

common.runtime.properties:

druid.emitter=http
druid.emitter.http.recipientBaseUrl=http://druid-exporter-prometheus-druid-exporter:8080/druid

Coordinator:

druid.monitoring.monitors=["org.apache.druid.client.cache.CacheMonitor", "org.apache.druid.java.util.metrics.JvmMonitor", "org.apache.druid.java.util.metrics.CpuAcctDeltaMonitor", "org.apache.druid.java.util.metrics.JvmThreadsMonitor", "org.apache.druid.server.metrics.EventReceiverFirehoseMonitor"]

Historical:

druid.monitoring.monitors=["org.apache.druid.client.cache.CacheMonitor", "org.apache.druid.java.util.metrics.JvmMonitor", "org.apache.druid.java.util.metrics.CpuAcctDeltaMonitor", "org.apache.druid.java.util.metrics.JvmThreadsMonitor", "org.apache.druid.server.metrics.EventReceiverFirehoseMonitor", "org.apache.druid.server.metrics.HistoricalMetricsMonitor","org.apache.druid.server.metrics.QueryCountStatsMonitor"]

Broker:

druid.monitoring.monitors=["org.apache.druid.client.cache.CacheMonitor", "org.apache.druid.java.util.metrics.JvmMonitor", "org.apache.druid.java.util.metrics.CpuAcctDeltaMonitor", "org.apache.druid.java.util.metrics.JvmThreadsMonitor", "org.apache.druid.server.metrics.EventReceiverFirehoseMonitor", "org.apache.druid.server.metrics.HistoricalMetricsMonitor","org.apache.druid.server.metrics.QueryCountStatsMonitor"]

Router:

druid.monitoring.monitors=["org.apache.druid.client.cache.CacheMonitor", "org.apache.druid.java.util.metrics.JvmMonitor", "org.apache.druid.java.util.metrics.CpuAcctDeltaMonitor", "org.apache.druid.java.util.metrics.JvmThreadsMonitor", "org.apache.druid.server.metrics.EventReceiverFirehoseMonitor", "org.apache.druid.server.metrics.QueryCountStatsMonitor"]

MiddleManager:

druid.monitoring.monitors=["org.apache.druid.client.cache.CacheMonitor", "org.apache.druid.java.util.metrics.JvmMonitor", "org.apache.druid.java.util.metrics.CpuAcctDeltaMonitor", "org.apache.druid.java.util.metrics.JvmThreadsMonitor", "org.apache.druid.server.metrics.EventReceiverFirehoseMonitor"]

And this is my metricsDimensions config:

{
  "query/time" : { "dimensions" : ["dataSource", "type"], "type" : "timer"},
  "query/bytes" : { "dimensions" : ["dataSource", "type"], "type" : "count"},
  "query/node/time" : { "dimensions" : ["server"], "type" : "timer"},
  "query/node/ttfb" : { "dimensions" : ["server"], "type" : "timer"},
  "query/node/bytes" : { "dimensions" : ["server"], "type" : "count"},
  "query/node/backpressure": { "dimensions" : ["server"], "type" : "timer"},
  "query/intervalChunk/time" : { "dimensions" : [], "type" : "timer"},

  "query/segment/time" : { "dimensions" : [], "type" : "timer"},
  "query/wait/time" : { "dimensions" : [], "type" : "timer"},
  "segment/scan/pending" : { "dimensions" : [], "type" : "gauge"},
  "query/segmentAndCache/time" : { "dimensions" : [], "type" : "timer" },
  "query/cpu/time" : { "dimensions" : ["dataSource", "type"], "type" : "timer" },

  "query/count" : { "dimensions" : [], "type" : "count" },
  "query/success/count" : { "dimensions" : [], "type" : "count" },
  "query/failed/count" : { "dimensions" : [], "type" : "count" },
  "query/interrupted/count" : { "dimensions" : [], "type" : "count" },
  "query/timeout/count" : { "dimensions" : [], "type" : "count" },

  "query/cache/delta/numEntries" : { "dimensions" : [], "type" : "count" },
  "query/cache/delta/sizeBytes" : { "dimensions" : [], "type" : "count" },
  "query/cache/delta/hits" : { "dimensions" : [], "type" : "count" },
  "query/cache/delta/misses" : { "dimensions" : [], "type" : "count" },
  "query/cache/delta/evictions" : { "dimensions" : [], "type" : "count" },
  "query/cache/delta/hitRate" : { "dimensions" : [], "type" : "count", "convertRange" : true },
  "query/cache/delta/averageBytes" : { "dimensions" : [], "type" : "count" },
  "query/cache/delta/timeouts" : { "dimensions" : [], "type" : "count" },
  "query/cache/delta/errors" : { "dimensions" : [], "type" : "count" },

  "query/cache/total/numEntries" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/total/sizeBytes" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/total/hits" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/total/misses" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/total/evictions" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/total/hitRate" : { "dimensions" : [], "type" : "gauge", "convertRange" : true },
  "query/cache/total/averageBytes" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/total/timeouts" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/total/errors" : { "dimensions" : [], "type" : "gauge" },

  "ingest/events/thrownAway" : { "dimensions" : ["dataSource"], "type" : "count" },
  "ingest/events/unparseable" : { "dimensions" : ["dataSource"], "type" : "count" },
  "ingest/events/duplicate" : { "dimensions" : ["dataSource"], "type" : "count" },
  "ingest/events/processed" : { "dimensions" : ["dataSource", "taskType", "taskId"], "type" : "count" },
  "ingest/events/messageGap" : { "dimensions" : ["dataSource"], "type" : "gauge" },
  "ingest/rows/output" : { "dimensions" : ["dataSource"], "type" : "count" },
  "ingest/persists/count" : { "dimensions" : ["dataSource"], "type" : "count" },
  "ingest/persists/time" : { "dimensions" : ["dataSource"], "type" : "timer" },
  "ingest/persists/cpu" : { "dimensions" : ["dataSource"], "type" : "timer" },
  "ingest/persists/backPressure" : { "dimensions" : ["dataSource"], "type" : "gauge" },
  "ingest/persists/failed" : { "dimensions" : ["dataSource"], "type" : "count" },
  "ingest/handoff/failed" : { "dimensions" : ["dataSource"], "type" : "count" },
  "ingest/merge/time" : { "dimensions" : ["dataSource"], "type" : "timer" },
  "ingest/merge/cpu" : { "dimensions" : ["dataSource"], "type" : "timer" },

  "ingest/kafka/lag" : { "dimensions" : ["dataSource"], "type" : "gauge" },
  "ingest/kafka/maxLag" : { "dimensions" : ["dataSource"], "type" : "gauge" },
  "ingest/kafka/avgLag" : { "dimensions" : ["dataSource"], "type" : "gauge" },

  "task/success/count" : { "dimensions" : ["dataSource"], "type" : "count" },
  "task/failed/count" : { "dimensions" : ["dataSource"], "type" : "count" },
  "task/running/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
  "task/pending/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
  "task/waiting/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },

  "taskSlot/total/count" : { "dimensions" : [], "type" : "gauge" },
  "taskSlot/idle/count" : { "dimensions" : [], "type" : "gauge" },
  "taskSlot/busy/count" : { "dimensions" : [], "type" : "gauge" },
  "taskSlot/lazy/count" : { "dimensions" : [], "type" : "gauge" },
  "taskSlot/blacklisted/count" : { "dimensions" : [], "type" : "gauge" },

  "task/run/time" : { "dimensions" : ["dataSource", "taskType"], "type" : "timer" },
  "segment/added/bytes" : { "dimensions" : ["dataSource", "taskType"], "type" : "count" },
  "segment/moved/bytes" : { "dimensions" : ["dataSource", "taskType"], "type" : "count" },
  "segment/nuked/bytes" : { "dimensions" : ["dataSource", "taskType"], "type" : "count" },

  "segment/assigned/count" : { "dimensions" : ["tier"], "type" : "count" },
  "segment/moved/count" : { "dimensions" : ["tier"], "type" : "count" },
  "segment/dropped/count" : { "dimensions" : ["tier"], "type" : "count" },
  "segment/deleted/count" : { "dimensions" : ["tier"], "type" : "count" },
  "segment/unneeded/count" : { "dimensions" : ["tier"], "type" : "count" },
  "segment/unavailable/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
  "segment/underReplicated/count" : { "dimensions" : ["dataSource", "tier"], "type" : "gauge" },
  "segment/cost/raw" : { "dimensions" : ["tier"], "type" : "count" },
  "segment/cost/normalization" : { "dimensions" : ["tier"], "type" : "count" },
  "segment/cost/normalized" : { "dimensions" : ["tier"], "type" : "count" },
  "segment/loadQueue/size" : { "dimensions" : ["server"], "type" : "gauge" },
  "segment/loadQueue/failed" : { "dimensions" : ["server"], "type" : "gauge" },
  "segment/loadQueue/count" : { "dimensions" : ["server"], "type" : "gauge" },
  "segment/dropQueue/count" : { "dimensions" : ["server"], "type" : "gauge" },
  "segment/size" : { "dimensions" : ["dataSource"], "type" : "gauge" },
  "segment/overShadowed/count" : { "dimensions" : [], "type" : "gauge" },

  "segment/max" : { "dimensions" : [], "type" : "gauge"},
  "segment/used" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge" },
  "segment/usedPercent" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge", "convertRange" : true },
  "segment/pendingDelete" : { "dimensions" : [], "type" : "gauge"},

  "jvm/pool/committed" : { "dimensions" : ["poolKind", "poolName"], "type" : "gauge" },
  "jvm/pool/init" : { "dimensions" : ["poolKind", "poolName"], "type" : "gauge" },
  "jvm/pool/max" : { "dimensions" : ["poolKind", "poolName"], "type" : "gauge" },
  "jvm/pool/used" : { "dimensions" : ["poolKind", "poolName"], "type" : "gauge" },
  "jvm/bufferpool/count" : { "dimensions" : ["bufferpoolName"], "type" : "gauge" },
  "jvm/bufferpool/used" : { "dimensions" : ["bufferpoolName"], "type" : "gauge" },
  "jvm/bufferpool/capacity" : { "dimensions" : ["bufferpoolName"], "type" : "gauge" },
  "jvm/mem/init" : { "dimensions" : ["memKind"], "type" : "gauge" },
  "jvm/mem/max" : { "dimensions" : ["memKind"], "type" : "gauge" },
  "jvm/mem/used" : { "dimensions" : ["memKind"], "type" : "gauge" },
  "jvm/mem/committed" : { "dimensions" : ["memKind"], "type" : "gauge" },
  "jvm/gc/count" : { "dimensions" : ["gcName", "gcGen"], "type" : "count" },
  "jvm/gc/cpu" : { "dimensions" : ["gcName", "gcGen"], "type" : "count" },

  "ingest/events/buffered" : { "dimensions" : ["serviceName", "bufferCapacity"], "type" : "gauge"},

  "sys/swap/free" : { "dimensions" : [], "type" : "gauge"},
  "sys/swap/max" : { "dimensions" : [], "type" : "gauge"},
  "sys/swap/pageIn" : { "dimensions" : [], "type" : "gauge"},
  "sys/swap/pageOut" : { "dimensions" : [], "type" : "gauge"},
  "sys/disk/write/count" : { "dimensions" : ["fsDevName"], "type" : "count"},
  "sys/disk/read/count" : { "dimensions" : ["fsDevName"], "type" : "count"},
  "sys/disk/write/size" : { "dimensions" : ["fsDevName"], "type" : "count"},
  "sys/disk/read/size" : { "dimensions" : ["fsDevName"], "type" : "count"},
  "sys/net/write/size" : { "dimensions" : [], "type" : "count"},
  "sys/net/read/size" : { "dimensions" : [], "type" : "count"},
  "sys/fs/used" : { "dimensions" : ["fsDevName", "fsDirName", "fsTypeName", "fsSysTypeName", "fsOptions"], "type" : "gauge"},
  "sys/fs/max" : { "dimensions" : ["fsDevName", "fsDirName", "fsTypeName", "fsSysTypeName", "fsOptions"], "type" : "gauge"},
  "sys/mem/used" : { "dimensions" : [], "type" : "gauge"},
  "sys/mem/max" : { "dimensions" : [], "type" : "gauge"},
  "sys/storage/used" : { "dimensions" : ["fsDirName"], "type" : "gauge"},
  "sys/cpu" : { "dimensions" : ["cpuName", "cpuTime"], "type" : "gauge"},

  "coordinator-segment/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
  "historical-segment/count" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge" },

  "jetty/numOpenConnections" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/caffeine/total/requests" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/caffeine/total/loadTime" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/caffeine/total/evictionBytes" : { "dimensions" : [], "type" : "gauge" },
  "query/cache/memcached/total" : { "dimensions" : ["[MEM] Reconnecting Nodes (ReconnectQueue)",
    "[MEM] Request Rate: All",
    "[MEM] Average Bytes written to OS per write",
    "[MEM] Average Bytes read from OS per read",
    "[MEM] Response Rate: All (Failure + Success + Retry)",
    "[MEM] Response Rate: Retry",
    "[MEM] Response Rate: Failure",
    "[MEM] Response Rate: Success"],
    "type" : "gauge" },
  "query/cache/caffeine/delta/requests" : { "dimensions" : [], "type" : "count" },
  "query/cache/caffeine/delta/loadTime" : { "dimensions" : [], "type" : "count" },
  "query/cache/caffeine/delta/evictionBytes" : { "dimensions" : [], "type" : "count" },
  "query/cache/memcached/delta" : { "dimensions" : ["[MEM] Reconnecting Nodes (ReconnectQueue)",
    "[MEM] Request Rate: All",
    "[MEM] Average Bytes written to OS per write",
    "[MEM] Average Bytes read from OS per read",
    "[MEM] Response Rate: All (Failure + Success + Retry)",
    "[MEM] Response Rate: Retry",
    "[MEM] Response Rate: Failure",
    "[MEM] Response Rate: Success"],
    "type" : "count" }
}
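
As a quick way to see which entries in a metricsDimensions mapping like the one above are labelled per datasource, here is a minimal Python sketch. It only inspects the JSON itself and does not talk to Druid or to druid-exporter. The helper name `metrics_with_dimension` and the `SAMPLE` excerpt are hypothetical, chosen for illustration; `SAMPLE` copies four entries from the config above.

```python
import json


def metrics_with_dimension(metrics_dimensions, dimension="dataSource"):
    """Return the sorted metric names whose "dimensions" list contains `dimension`.

    Hypothetical helper: it only reads the mapping and makes no
    assumptions about how druid-exporter consumes it.
    """
    return sorted(
        name
        for name, spec in metrics_dimensions.items()
        if dimension in spec.get("dimensions", [])
    )


# Four entries copied from the metricsDimensions config above.
SAMPLE = json.loads("""
{
  "query/time" : { "dimensions" : ["dataSource", "type"], "type" : "timer"},
  "query/count" : { "dimensions" : [], "type" : "count" },
  "segment/size" : { "dimensions" : ["dataSource"], "type" : "gauge" },
  "segment/loadQueue/size" : { "dimensions" : ["server"], "type" : "gauge" }
}
""")

if __name__ == "__main__":
    # Prints one metric name per line: query/time, then segment/size.
    for name in metrics_with_dimension(SAMPLE):
        print(name)
```

To run it against the full mapping, load the real file with `json.load` and pass the result to the same function. Passing `dimension="server"` or `dimension="tier"` lists the metrics carrying those dimensions instead.
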
@Subhashini2610

@bgarzon94 @iamabhishek-dubey I am also facing this issue. Kindly help.
