[BUG] smartctl_device_bytes_written not present while data_units_written is present in smartctl #240

Open
achetronic opened this issue Aug 28, 2024 · 2 comments

@achetronic

Hello there,

I have read this topic about an issue I am having.
Apparently it is solved, but it is not working for me.

I have several NVMe drives in one machine. The smartctl output for nvme4, for example, is the following:

{
  "json_format_version": [
    1,
    0
  ],
  "smartctl": {
    "version": [
      7,
      3
    ],
    "svn_revision": "5338",
    "platform_info": "x86_64-linux-6.1.0-23-amd64",
    "build_info": "(local build)",
    "argv": [
      "smartctl",
      "-aj",
      "/dev/nvme4"
    ],
    "exit_status": 0
  },
  "local_time": {
    "time_t": 1724836392,
    "asctime": "Wed Aug 28 09:13:12 2024 UTC"
  },
  "device": {
    "name": "/dev/nvme4",
    "info_name": "/dev/nvme4",
    "type": "nvme",
    "protocol": "NVMe"
  },
  "model_name": "SAMSUNG MZQL27T6HBLA-00A07",
  "serial_number": "XXXREDACTED",
  "firmware_version": "GDC5902Q",
  "nvme_pci_vendor": {
    "id": 5197,
    "subsystem_id": 5197
  },
  "nvme_ieee_oui_identifier": 9528,
  "nvme_total_capacity": 7681501126656,
  "nvme_unallocated_capacity": 0,
  "nvme_controller_id": 6,
  "nvme_version": {
    "string": "1.4",
    "value": 66560
  },
  "nvme_number_of_namespaces": 32,
  "smart_support": {
    "available": true,
    "enabled": true
  },
  "smart_status": {
    "passed": true,
    "nvme": {
      "value": 0
    }
  },
  "nvme_smart_health_information_log": {
    "critical_warning": 0,
    "temperature": 39,
    "available_spare": 100,
    "available_spare_threshold": 10,
    "percentage_used": 0,
    "data_units_read": 48260767,
    "data_units_written": 32950276,
    "host_reads": 399428829,
    "host_writes": 381541751,
    "controller_busy_time": 259,
    "power_cycles": 59,
    "power_on_hours": 853,
    "unsafe_shutdowns": 50,
    "media_errors": 0,
    "num_err_log_entries": 0,
    "warning_temp_time": 0,
    "critical_comp_time": 0,
    "temperature_sensors": [
      39,
      49
    ]
  },
  "temperature": {
    "current": 39
  },
  "power_cycle_count": 59,
  "power_on_time": {
    "hours": 853
  }
}
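
For reference, here is a minimal Go sketch of how a byte count could be derived from the `data_units_written` field above. It assumes the NVMe convention that one data unit stands for 1000 units of 512 bytes. The names `smartctlOutput`, `dataUnitsToBytes`, and `parseBytesWritten` are hypothetical and are not taken from the exporter's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// smartctlOutput mirrors only the fields of the `smartctl -aj` output used here.
type smartctlOutput struct {
	HealthLog struct {
		DataUnitsRead    uint64 `json:"data_units_read"`
		DataUnitsWritten uint64 `json:"data_units_written"`
	} `json:"nvme_smart_health_information_log"`
}

// bytesPerDataUnit assumes the NVMe convention: one data unit = 1000 * 512 bytes.
const bytesPerDataUnit = 1000 * 512

// dataUnitsToBytes is a hypothetical helper, not the exporter's own function.
func dataUnitsToBytes(units uint64) uint64 {
	return units * bytesPerDataUnit
}

// parseBytesWritten decodes smartctl JSON and returns the bytes written.
func parseBytesWritten(raw []byte) (uint64, error) {
	var out smartctlOutput
	if err := json.Unmarshal(raw, &out); err != nil {
		return 0, err
	}
	return dataUnitsToBytes(out.HealthLog.DataUnitsWritten), nil
}

func main() {
	// Trimmed-down version of the nvme4 output shown above.
	raw := []byte(`{"nvme_smart_health_information_log": {"data_units_written": 32950276}}`)
	written, err := parseBytesWritten(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(written) // prints 16870541312000
}
```

With the value reported for nvme4, that works out to roughly 16.9 TB written.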

But smartctl_device_bytes_written is not being exported. The metrics that are present are the following:

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 7.933e-05
go_gc_duration_seconds{quantile="0.25"} 0.000139238
go_gc_duration_seconds{quantile="0.5"} 0.000166328
go_gc_duration_seconds{quantile="0.75"} 0.000216837
go_gc_duration_seconds{quantile="1"} 0.000497763
go_gc_duration_seconds_sum 0.161315562
go_gc_duration_seconds_count 894
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 12
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.22.0"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 3.372272e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 1.414980312e+09
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 8094
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 5.778626e+06
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 2.844528e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 3.372272e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 5.423104e+06
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 5.554176e+06
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 4045
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 3.137536e+06
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 1.097728e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.7248370635039225e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 5.782671e+06
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 76800
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 78000
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 300800
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 342720
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 5.256616e+06
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 2.812306e+06
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 1.572864e+06
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 1.572864e+06
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 1.8635792e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 21
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 14.27
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 13
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 2.1532672e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.72476677555e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.268129792e+09
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19
# HELP smartctl_device Device info
# TYPE smartctl_device gauge
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="nvme0",firmware_version="E2MU200",form_factor="",interface="nvme",model_family="unknown",model_name="Micron_7450_MTFDKCC960TFR",protocol="NVMe",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="XXXREDACTED"} 1
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="nvme1",firmware_version="E2MU200",form_factor="",interface="nvme",model_family="unknown",model_name="Micron_7450_MTFDKCC960TFR",protocol="NVMe",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="XXXREDACTED"} 1
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="nvme2",firmware_version="GDC5902Q",form_factor="",interface="nvme",model_family="unknown",model_name="SAMSUNG MZQL27T6HBLA-00A07",protocol="NVMe",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="XXXREDACTED"} 1
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="nvme3",firmware_version="GDC5902Q",form_factor="",interface="nvme",model_family="unknown",model_name="SAMSUNG MZQL27T6HBLA-00A07",protocol="NVMe",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="XXXREDACTED"} 1
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="nvme4",firmware_version="GDC5902Q",form_factor="",interface="nvme",model_family="unknown",model_name="SAMSUNG MZQL27T6HBLA-00A07",protocol="NVMe",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="XXXREDACTED"} 1
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="nvme5",firmware_version="GDC5902Q",form_factor="",interface="nvme",model_family="unknown",model_name="SAMSUNG MZQL27T6HBLA-00A07",protocol="NVMe",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="XXXREDACTED"} 1
# HELP smartctl_device_available_spare Normalized percentage (0 to 100%) of the remaining spare capacity available
# TYPE smartctl_device_available_spare counter
smartctl_device_available_spare{device="nvme0"} 100
smartctl_device_available_spare{device="nvme1"} 100
smartctl_device_available_spare{device="nvme2"} 100
smartctl_device_available_spare{device="nvme3"} 100
smartctl_device_available_spare{device="nvme4"} 100
smartctl_device_available_spare{device="nvme5"} 100
# HELP smartctl_device_available_spare_threshold When the Available Spare falls below the threshold indicated in this field, an asynchronous event completion may occur. The value is indicated as a normalized percentage (0 to 100%)
# TYPE smartctl_device_available_spare_threshold counter
smartctl_device_available_spare_threshold{device="nvme0"} 10
smartctl_device_available_spare_threshold{device="nvme1"} 10
smartctl_device_available_spare_threshold{device="nvme2"} 10
smartctl_device_available_spare_threshold{device="nvme3"} 10
smartctl_device_available_spare_threshold{device="nvme4"} 10
smartctl_device_available_spare_threshold{device="nvme5"} 10
# HELP smartctl_device_block_size Device block size
# TYPE smartctl_device_block_size gauge
smartctl_device_block_size{blocks_type="logical",device="nvme0"} 0
smartctl_device_block_size{blocks_type="logical",device="nvme1"} 0
smartctl_device_block_size{blocks_type="logical",device="nvme2"} 0
smartctl_device_block_size{blocks_type="logical",device="nvme3"} 0
smartctl_device_block_size{blocks_type="logical",device="nvme4"} 0
smartctl_device_block_size{blocks_type="logical",device="nvme5"} 0
smartctl_device_block_size{blocks_type="physical",device="nvme0"} 0
smartctl_device_block_size{blocks_type="physical",device="nvme1"} 0
smartctl_device_block_size{blocks_type="physical",device="nvme2"} 0
smartctl_device_block_size{blocks_type="physical",device="nvme3"} 0
smartctl_device_block_size{blocks_type="physical",device="nvme4"} 0
smartctl_device_block_size{blocks_type="physical",device="nvme5"} 0
# HELP smartctl_device_capacity_blocks Device capacity in blocks
# TYPE smartctl_device_capacity_blocks gauge
smartctl_device_capacity_blocks{device="nvme0"} 0
smartctl_device_capacity_blocks{device="nvme1"} 0
smartctl_device_capacity_blocks{device="nvme2"} 0
smartctl_device_capacity_blocks{device="nvme3"} 0
smartctl_device_capacity_blocks{device="nvme4"} 0
smartctl_device_capacity_blocks{device="nvme5"} 0
# HELP smartctl_device_capacity_bytes Device capacity in bytes
# TYPE smartctl_device_capacity_bytes gauge
smartctl_device_capacity_bytes{device="nvme0"} 0
smartctl_device_capacity_bytes{device="nvme1"} 0
smartctl_device_capacity_bytes{device="nvme2"} 0
smartctl_device_capacity_bytes{device="nvme3"} 0
smartctl_device_capacity_bytes{device="nvme4"} 0
smartctl_device_capacity_bytes{device="nvme5"} 0
# HELP smartctl_device_critical_warning This field indicates critical warnings for the state of the controller
# TYPE smartctl_device_critical_warning counter
smartctl_device_critical_warning{device="nvme0"} 0
smartctl_device_critical_warning{device="nvme1"} 0
smartctl_device_critical_warning{device="nvme2"} 0
smartctl_device_critical_warning{device="nvme3"} 0
smartctl_device_critical_warning{device="nvme4"} 0
smartctl_device_critical_warning{device="nvme5"} 0
# HELP smartctl_device_media_errors Contains the number of occurrences where the controller detected an unrecovered data integrity error. Errors such as uncorrectable ECC, CRC checksum failure, or LBA tag mismatch are included in this field
# TYPE smartctl_device_media_errors counter
smartctl_device_media_errors{device="nvme0"} 0
smartctl_device_media_errors{device="nvme1"} 0
smartctl_device_media_errors{device="nvme2"} 0
smartctl_device_media_errors{device="nvme3"} 0
smartctl_device_media_errors{device="nvme4"} 0
smartctl_device_media_errors{device="nvme5"} 0
# HELP smartctl_device_num_err_log_entries Contains the number of Error Information log entries over the life of the controller
# TYPE smartctl_device_num_err_log_entries counter
smartctl_device_num_err_log_entries{device="nvme0"} 0
smartctl_device_num_err_log_entries{device="nvme1"} 0
smartctl_device_num_err_log_entries{device="nvme2"} 0
smartctl_device_num_err_log_entries{device="nvme3"} 0
smartctl_device_num_err_log_entries{device="nvme4"} 0
smartctl_device_num_err_log_entries{device="nvme5"} 0
# HELP smartctl_device_nvme_capacity_bytes NVMe device total capacity bytes
# TYPE smartctl_device_nvme_capacity_bytes gauge
smartctl_device_nvme_capacity_bytes{device="nvme0"} 9.60197124096e+11
smartctl_device_nvme_capacity_bytes{device="nvme1"} 9.60197124096e+11
smartctl_device_nvme_capacity_bytes{device="nvme2"} 7.681501126656e+12
smartctl_device_nvme_capacity_bytes{device="nvme3"} 7.681501126656e+12
smartctl_device_nvme_capacity_bytes{device="nvme4"} 7.681501126656e+12
smartctl_device_nvme_capacity_bytes{device="nvme5"} 7.681501126656e+12
# HELP smartctl_device_percentage_used Device write percentage used
# TYPE smartctl_device_percentage_used counter
smartctl_device_percentage_used{device="nvme0"} 0
smartctl_device_percentage_used{device="nvme1"} 0
smartctl_device_percentage_used{device="nvme2"} 0
smartctl_device_percentage_used{device="nvme3"} 0
smartctl_device_percentage_used{device="nvme4"} 0
smartctl_device_percentage_used{device="nvme5"} 0
# HELP smartctl_device_power_cycle_count Device power cycle count
# TYPE smartctl_device_power_cycle_count counter
smartctl_device_power_cycle_count{device="nvme0"} 30
smartctl_device_power_cycle_count{device="nvme1"} 30
smartctl_device_power_cycle_count{device="nvme2"} 12
smartctl_device_power_cycle_count{device="nvme3"} 12
smartctl_device_power_cycle_count{device="nvme4"} 12
smartctl_device_power_cycle_count{device="nvme5"} 12
# HELP smartctl_device_power_on_seconds Device power on seconds
# TYPE smartctl_device_power_on_seconds counter
smartctl_device_power_on_seconds{device="nvme0"} 1.89e+06
smartctl_device_power_on_seconds{device="nvme1"} 1.89e+06
smartctl_device_power_on_seconds{device="nvme2"} 1.8072e+06
smartctl_device_power_on_seconds{device="nvme3"} 1.8072e+06
smartctl_device_power_on_seconds{device="nvme4"} 1.8072e+06
smartctl_device_power_on_seconds{device="nvme5"} 1.8072e+06
# HELP smartctl_device_smart_status General smart status
# TYPE smartctl_device_smart_status gauge
smartctl_device_smart_status{device="nvme0"} 1
smartctl_device_smart_status{device="nvme1"} 1
smartctl_device_smart_status{device="nvme2"} 1
smartctl_device_smart_status{device="nvme3"} 1
smartctl_device_smart_status{device="nvme4"} 1
smartctl_device_smart_status{device="nvme5"} 1
# HELP smartctl_device_smartctl_exit_status Exit status of smartctl on device
# TYPE smartctl_device_smartctl_exit_status gauge
smartctl_device_smartctl_exit_status{device="nvme0"} 0
smartctl_device_smartctl_exit_status{device="nvme1"} 0
smartctl_device_smartctl_exit_status{device="nvme2"} 0
smartctl_device_smartctl_exit_status{device="nvme3"} 0
smartctl_device_smartctl_exit_status{device="nvme4"} 0
smartctl_device_smartctl_exit_status{device="nvme5"} 0
# HELP smartctl_device_temperature Device temperature celsius
# TYPE smartctl_device_temperature gauge
smartctl_device_temperature{device="nvme0",temperature_type="current"} 40
smartctl_device_temperature{device="nvme1",temperature_type="current"} 40
smartctl_device_temperature{device="nvme2",temperature_type="current"} 42
smartctl_device_temperature{device="nvme3",temperature_type="current"} 43
smartctl_device_temperature{device="nvme4",temperature_type="current"} 41
smartctl_device_temperature{device="nvme5",temperature_type="current"} 40
# HELP smartctl_devices Number of devices configured or dynamically discovered
# TYPE smartctl_devices gauge
smartctl_devices 6
# HELP smartctl_version smartctl version
# TYPE smartctl_version gauge
smartctl_version{build_info="(local build)",json_format_version="1.0",smartctl_version="7.4",svn_revision="5530"} 1
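
A quick way to confirm which metric families an exporter exposes is to scan the `# TYPE` lines of the scrape output. The following is only a sketch: `metricNames` is a hypothetical helper, and the sample text is a shortened copy of the output above rather than a live scrape.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// metricNames collects the metric family names declared by "# TYPE" lines
// in Prometheus text exposition output.
func metricNames(exposition string) map[string]bool {
	names := make(map[string]bool)
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// A TYPE line looks like: "# TYPE <name> <type>".
		if len(fields) >= 4 && fields[0] == "#" && fields[1] == "TYPE" {
			names[fields[2]] = true
		}
	}
	return names
}

func main() {
	// Shortened sample of the exporter output shown above.
	sample := `# HELP smartctl_device_media_errors Media errors
# TYPE smartctl_device_media_errors counter
smartctl_device_media_errors{device="nvme4"} 0
# TYPE smartctl_devices gauge
smartctl_devices 6
`
	names := metricNames(sample)
	fmt.Println(names["smartctl_device_media_errors"])  // prints true
	fmt.Println(names["smartctl_device_bytes_written"]) // prints false
}
```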

I am using the latest version of the exporter:
Screenshot from 2024-08-28 10-29-04

Shouldn't this be solved by this PR?

@k0ste
Contributor

k0ste commented Aug 28, 2024

Yes, it should. This code is not released yet. Maybe it will be once someone with the merge button merges PR #235.

@achetronic
Author

> Yes, it should. This code is not released yet. Maybe it will be once someone with the merge button merges PR #235.

Who do we have to invoke? :) I think this is not the most widely used exporter, but it is super useful and needed in some scenarios.
