NAS-132432 / 24.10.1 / Cherry-pick important zfs commits from upstream to stable/electriceel for 24.10.1 freeze #259

Merged: 25 commits, Nov 14, 2024

Commits on Nov 11, 2024

  1. Fix an uninitialized data access (openzfs#16511)

    zfs_acl_node_alloc allocates an uninitialized data buffer, but upstack
    zfs_acl_chmod only partially initializes it.  KMSAN reported that this
    memory remained uninitialized at the point when it was read by
    lzjb_compress, which suggests a possible kernel memory disclosure bug.
    
    The full KMSAN warning may be found in the PR.
    openzfs#16511
    
    Signed-off-by:	Alan Somers <[email protected]>
    Sponsored by:	Axcient
    Reviewed-by: Alexander Motin <[email protected]>
    Reviewed-by: Tony Hutter <[email protected]>
    asomers authored and ixhamza committed Nov 11, 2024
    c45a833
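One common remedy for this class of bug is to hand out zero-filled buffers, so that bytes the caller never writes cannot leak stale memory. The sketch below illustrates that idea only; `acl_node_sketch`, `acl_node_alloc_zeroed` and `acl_node_free` are hypothetical names, and the upstream patch may fix the problem differently.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an ACL node; not the real ZFS structure. */
struct acl_node_sketch {
    size_t size;
    unsigned char *data;
};

/*
 * Allocate a node whose buffer is zero-filled, so bytes the caller never
 * writes cannot carry stale heap contents downstream.
 * Assumes bytes > 0 (calloc may return NULL for a zero-sized request).
 */
static struct acl_node_sketch *
acl_node_alloc_zeroed(size_t bytes)
{
    struct acl_node_sketch *n = malloc(sizeof (*n));
    if (n == NULL)
        return (NULL);
    n->data = calloc(1, bytes);
    if (n->data == NULL) {
        free(n);
        return (NULL);
    }
    n->size = bytes;
    return (n);
}

/* Release the buffer and the node; accepts NULL. */
static void
acl_node_free(struct acl_node_sketch *n)
{
    if (n != NULL) {
        free(n->data);
        free(n);
    }
}
```

A caller that fills only the first few bytes still sees zeros in the rest of the buffer.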
  2. Remove extra newline from spa_set_allocator().

    zfs_dbgmsg() does not need newline at the end of the message.
    
    While there, slightly update/sync FreeBSD __dprintf().
    
    Reviewed by: Brian Behlendorf <[email protected]>
    Signed-off-by:	Alexander Motin <[email protected]>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#16536
    amotin authored and ixhamza committed Nov 11, 2024
    5102987
  3. Avoid fault diagnosis if multiple vdevs have errors

    When multiple drives are throwing errors, it is likely not
    a drive failing but rather a failure above the drives, like
    a controller.  The active cases context of the drive's peers
    is now considered when making a diagnosis.
    
    Sponsored-by: Klara, Inc.
    Sponsored-by: Wasabi Technology, Inc.
    Reviewed by: Brian Behlendorf <[email protected]>
    Signed-off-by: Don Brady <[email protected]>
    Closes openzfs#16531
    don-brady authored and ixhamza committed Nov 11, 2024
    75a8c79
  4. arcstat: add structural, types, states breakdown

    Add ARC structural breakdown, ARC types breakdown, ARC states
    breakdown similar to arc_summary.  Additional cleanups included.
    
    Reviewed-by: Alexander Motin <[email protected]>
    Signed-off-by: Theera K. <[email protected]>
    Closes openzfs#16509
    tkittich authored and ixhamza committed Nov 11, 2024
    0d4d1b2
  5. zio_compress: introduce max size threshold

    The default compression is now lz4, which can stop the
    compression process by itself on incompressible data.
    Additional size checks would only make our
    compressratio worse.
    
    New usable compression thresholds are:
    - less than BPE_PAYLOAD_SIZE (embedded_data feature);
    - at least one saved sector.
    
    The old 12.5% threshold is kept to minimize the effect
    on existing user expectations of CPU utilization.
    
    If data wasn't compressed, it will be saved as
    ZIO_COMPRESS_OFF, so if we really need to recompress
    data without ashift info and check anything,
    we can just compress it with a zero threshold.
    So, we don't need a new feature flag here!
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Tony Hutter <[email protected]>
    Reviewed-by: Alexander Motin <[email protected]>
    Signed-off-by: George Melikov <[email protected]>
    Closes openzfs#9416
    gmelikov authored and ixhamza committed Nov 11, 2024
    3699b88
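The two usable thresholds named in this commit message can be sketched as a small predicate. This is an illustration under assumptions, not the upstream code: the helper names are hypothetical, `SKETCH_BPE_PAYLOAD_SIZE` is assumed to be 112 bytes, and the retained 12.5% threshold is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed embedded-block payload size in bytes (illustrative value). */
#define SKETCH_BPE_PAYLOAD_SIZE 112

/* Round len up to a multiple of the sector size (1 << ashift). */
static size_t
round_up_sector(size_t len, unsigned ashift)
{
    size_t sector = (size_t)1 << ashift;
    return ((len + sector - 1) & ~(sector - 1));
}

/*
 * Keep a compressed result only if it is small enough to be embedded,
 * or if it occupies at least one sector less than the source on disk.
 */
static bool
compression_is_useful(size_t s_len, size_t c_len, unsigned ashift)
{
    if (c_len >= s_len)
        return (false);
    if (c_len < SKETCH_BPE_PAYLOAD_SIZE)
        return (true);
    return (round_up_sector(c_len, ashift) < round_up_sector(s_len, ashift));
}
```

With 4 KiB sectors (`ashift` 12), shrinking a 4096-byte block to 3000 bytes saves nothing on disk, while the same result with 512-byte sectors (`ashift` 9) saves two sectors.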
  6. ZLE compression: don't use BPE_PAYLOAD_SIZE

    The ZLE compressor needs additional bytes to process
    the d_len argument efficiently.
    Don't use BPE_PAYLOAD_SIZE as d_len with it
    until the ZLE compressor is reworked.
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Tony Hutter <[email protected]>
    Reviewed-by: Alexander Motin <[email protected]>
    Signed-off-by: George Melikov <[email protected]>
    Closes openzfs#9416
    gmelikov authored and ixhamza committed Nov 11, 2024
    4ecdf62
  7. arc_hdr_authenticate: make explicit error

    On compression we could be more explicit here for cases
    where we cannot recompress the data.
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Tony Hutter <[email protected]>
    Co-authored-by: Alexander Motin <[email protected]>
    Signed-off-by: George Melikov <[email protected]>
    Closes openzfs#9416
    gmelikov authored and ixhamza committed Nov 11, 2024
    3ace0a4
  8. Evicting too many bytes from MFU metadata

    Without updating 'm' we evict from MFU metadata all that we wanted
    to evict from all metadata, including already evicted MRU metadata
    ('m' is the total amount of metadata we had at the beginning,
    and 'w' is the total amount of metadata we want to have). 
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Alexander Motin <[email protected]>
    Signed-off-by: Theera K. <[email protected]>
    Closes openzfs#16521
    Closes openzfs#16546
    tkittich authored and ixhamza committed Nov 11, 2024
    22d183c
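The arithmetic behind this fix can be shown with a deliberately simplified sketch. It uses the commit message's `m` (metadata held at the start) and `w` (metadata we want to keep); the function names and the single-step structure are assumptions, and the real eviction code splits work across states in more detail.

```c
#include <stdint.h>

/* Bytes by which 'have' exceeds 'want', or 0 if it does not. */
static uint64_t
excess(uint64_t have, uint64_t want)
{
    return (have > want ? have - want : 0);
}

/*
 * m: metadata held when eviction started.
 * w: metadata we want to end up with.
 * mru_evicted: bytes the MRU metadata pass already evicted.
 * Returns how much is still left to evict from MFU metadata.
 */
static uint64_t
mfu_meta_to_evict(uint64_t m, uint64_t w, uint64_t mru_evicted)
{
    /* Update m first, so already evicted bytes are not counted twice. */
    m -= (mru_evicted < m) ? mru_evicted : m;
    return (excess(m, w));
}
```

With `m = 100`, `w = 60` and 30 bytes already evicted from MRU metadata, only 10 more bytes are due from MFU metadata; without the update of `m` the result would have been the full 40.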
  9. Properly release key in spa_keystore_dsl_key_hold_dd()

    Since dsl_crypto_key_open() references the key, 0d23f5e should
    have called dsl_crypto_key_rele() to drop it first instead of
    calling dsl_crypto_key_free() directly.  The final result should
    actually be the same, but without triggering dck_holds assertion.
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Signed-off-by:	Alexander Motin <[email protected]>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#16567
    amotin authored and ixhamza committed Nov 11, 2024
    e42b277
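The hold/release pattern described here can be sketched with a toy reference count. `key_sketch`, `key_hold` and `key_rele` are hypothetical stand-ins, not the real `dsl_crypto_key_*` functions; the point is only that the holder drops its reference and lets the last release destroy the object, rather than freeing it while a hold is still outstanding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical key object with a hold count. */
struct key_sketch {
    int holds;
    bool freed;
};

/* Take a reference on the key. */
static void
key_hold(struct key_sketch *k)
{
    k->holds++;
}

/* Drop one reference; the key is destroyed when the last hold goes away. */
static void
key_rele(struct key_sketch *k)
{
    assert(k->holds > 0);
    if (--k->holds == 0)
        k->freed = true;
}
```

The end state matches a direct free, but the hold count is back at zero by the time the object is destroyed, so an assertion on outstanding holds would not fire.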
  10. Restrict raidz faulted vdev count

    Specifically, a child in a replacing vdev won't count when assessing
    the dtl during a vdev_fault()
    
    Sponsored-by: Klara, Inc.
    Sponsored-by: Wasabi Technology, Inc.
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Tino Reichardt <[email protected]>
    Signed-off-by: Don Brady <[email protected]>
    Closes openzfs#16569
    don-brady authored and ixhamza committed Nov 11, 2024
    4a72da5
  11. ARC: Cache arc_c value during arc_evict()

    Since an arc_evict() run can take some time, a change of arc_c
    during it may cause an undesired shift in the balance of ARC states.
    In particular, a reduction of arc_c may cause eviction from the MFU
    data state even though it is already below the target.  Instead we
    should evict as originally planned and, if needed, do another round
    after.
    
    Reviewed-by: Theera K. <[email protected]>
    Reviewed-by: George Melikov <[email protected]>
    Reviewed-by: Brian Behlendorf <[email protected]>
    Signed-off-by:	Alexander Motin <[email protected]>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#16576
    Closes openzfs#16605
    amotin authored and ixhamza committed Nov 11, 2024
    c168599
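A minimal sketch of the caching idea, assuming a hypothetical global `target_size` in place of arc_c and a toy rule that gives each of two states half of the target. The `between` callback only simulates a concurrent change of the target while eviction is in progress; none of these names exist in ZFS.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for arc_c: may change while eviction runs. */
static uint64_t target_size = 1000;

/* Simulates a concurrent reduction of the target. */
static void
shrink_target(void)
{
    target_size = 600;
}

static uint64_t
over(uint64_t used, uint64_t limit)
{
    return (used > limit ? used - limit : 0);
}

/*
 * Plan eviction for two states against one snapshot of the target.
 * Returns the total to evict and stores the MFU share in *mfu_evicted.
 */
static uint64_t
evict_with_snapshot(uint64_t mru, uint64_t mfu, void (*between)(void),
    uint64_t *mfu_evicted)
{
    const uint64_t c = target_size;    /* read once, reuse below */
    uint64_t e_mru = over(mru, c / 2);

    if (between != NULL)
        between();                     /* target may change here */

    *mfu_evicted = over(mfu, c / 2);   /* still uses the snapshot */
    return (e_mru + *mfu_evicted);
}
```

With 700 bytes in MRU and 400 in MFU, the snapshot keeps MFU untouched even though the target drops to 600 mid-run; re-reading the global for the second state would have evicted 100 bytes from it.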
  12. Fix generation of kernel uevents for snapshot rename on linux

    `zvol_rename_minors()` needs to be given the full path not just the
    snapshot name.  Use code removed in a0bd735 as a guide
    to providing the necessary values.
    
    Add ZTS check for /dev changes after snapshot rename.  After
    renaming a snapshot with 'snapdev=visible' ensure that the /dev
    entries are updated to reflect the rename.
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Signed-off-by: James Dingwall <[email protected]>
    Closes openzfs#14223 
    Closes openzfs#16600
    JKDingwall authored and ixhamza committed Nov 11, 2024
    2bfbe0e
  13. zpool/zfs: allow --json wherever -j is allowed

    Mostly so that when the JSON formatting options are also used, they all
    look the same. To my eye, `-j --json-flat-vdevs` suggests that they are
    different or unrelated, while `--json --json-flat-vdevs` invites no
    further questions.
    
    Sponsored-by: Klara, Inc.
    Sponsored-by: Wasabi Technology, Inc.
    Reviewed-by: Umer Saleem <[email protected]>
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Tony Hutter <[email protected]>
    Signed-off-by: Rob Norris <[email protected]>
    Closes openzfs#16632
    robn authored and ixhamza committed Nov 11, 2024
    4b9e7f4
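Accepting a long spelling wherever a short flag is allowed is typically done by mapping both to the same option character in `getopt_long()`. The following is a generic sketch of that technique, not the zpool/zfs option tables; `wants_json` is a hypothetical helper.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if either -j or --json appears on the command line. */
static bool
wants_json(int argc, char **argv)
{
    /* The long option maps to the same character as the short one. */
    static const struct option long_opts[] = {
        { "json", no_argument, NULL, 'j' },
        { NULL, 0, NULL, 0 }
    };
    bool json = false;
    int c;

    optind = 1;    /* restart scanning in case of repeated calls */
    while ((c = getopt_long(argc, argv, "j", long_opts, NULL)) != -1) {
        if (c == 'j')
            json = true;
    }
    return (json);
}
```

Because both spellings return `'j'`, the rest of the command needs no separate code path for the long form.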
  14. Pack dmu_buf_impl_t by 16 bytes

    On 64-bit FreeBSD this reduces it from 296 to 280 bytes.  On small
    block workloads dbufs may consume gigabytes of ARC, and this saves
    5% of it.
    
    Reviewed-by: Tino Reichardt <[email protected]>
    Reviewed-by: Brian Atkinson <[email protected]>
    Reviewed-by: Brian Behlendorf <[email protected]>
    Signed-off-by:	Alexander Motin <[email protected]>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#16684
    amotin authored and ixhamza committed Nov 11, 2024
    e303a29
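Savings of this kind usually come from reordering or narrowing fields so the compiler inserts less padding. The two toy structures below are hypothetical and unrelated to the real dmu_buf_impl_t layout; they only show how field order changes `sizeof`.

```c
#include <stdint.h>

/* Small fields interleaved with 8-byte fields force padding after each. */
struct dbuf_loose {
    uint8_t  a;
    uint64_t b;
    uint8_t  c;
    uint64_t d;
    uint8_t  e;
};

/* Same fields, largest first: the small ones share one padded tail. */
struct dbuf_tight {
    uint64_t b;
    uint64_t d;
    uint8_t  a;
    uint8_t  c;
    uint8_t  e;
};
```

On a typical LP64 ABI the first layout takes 40 bytes and the second 24; the exact numbers depend on the platform's alignment rules.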
  15. On the first vdev open ignore impossible ashift hints

    If on the first open the device's logical ashift is bigger than the
    one set by the pool's ashift property, ignore the latter as unusable
    instead of creating a vdev that will fail most I/Os due to misalignment.
    
    Reviewed-by: Rob Norris <[email protected]>
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Ameer Hamza <[email protected]>
    Signed-off-by:  Alexander Motin <[email protected]>
    Sponsored by:   iXsystems, Inc.
    Closes openzfs#16690
    amotin authored and ixhamza committed Nov 11, 2024
    e8ae9fc
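The rule stated above reduces to a small comparison. `choose_ashift` is a hypothetical helper written for illustration, not the function the patch touches.

```c
/*
 * Pick the ashift for a newly opened vdev.
 * logical_ashift: smallest block size the device can address (log2).
 * pool_hint: value requested through the pool's ashift property.
 */
static unsigned
choose_ashift(unsigned logical_ashift, unsigned pool_hint)
{
    /* A hint below the device's logical ashift cannot work: ignore it. */
    if (pool_hint < logical_ashift)
        return (logical_ashift);
    return (pool_hint);
}
```

A pool hint of 9 (512-byte blocks) on a device with logical ashift 12 (4 KiB blocks) is ignored, while a hint of 12 on a 512-byte device is honored.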
  16. vdev_disk: try harder to ensure IO alignment rules

    It seems our notion of "properly" aligned IO was incomplete. In
    particular, dm-crypt does its own splitting, and assumes that a logical
    block will never cross an order-0 page boundary (ie, the physical page
    size, not compound size). This effectively means that it needs to be
    possible to split a BIO at any page or block size boundary and have it
    work correctly.
    
    This updates the alignment check function to enforce these rules (to the
    extent possible).
    
    Our response to misaligned data is to make some new allocation that is
    properly aligned, and copy the data into it. It turns out that
    linearising (via abd_borrow_buf()) is not enough, because we allocate eg
    4K blocks from a general purpose slab, and so may receive (or already
    have) a 4K block that crosses pages.
    
    So instead, we allocate a new ABD, which is guaranteed to be aligned
    properly to block sizes, and then copy everything into it, and back out
    on the way back.
    
    Sponsored-by: Klara, Inc.
    Sponsored-by: Wasabi Technology, Inc.
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Alexander Motin <[email protected]>
    Reviewed-by: Tony Hutter <[email protected]>
    Signed-off-by: Rob Norris <[email protected]>
    Closes openzfs#16687 openzfs#16631 openzfs#15646 openzfs#15533 openzfs#14533
    robn authored and ixhamza committed Nov 11, 2024
    9886013
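The constraint that no logical block may cross an order-0 page boundary can be sketched as an address check. The helpers and `SKETCH_PAGE_SIZE` (assumed to be 4096) are hypothetical; the real check operates on ABDs and BIOs rather than raw addresses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed order-0 page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096u

/* True if the byte range [addr, addr + len) lies within a single page. */
static bool
within_one_page(uintptr_t addr, size_t len)
{
    if (len == 0)
        return (true);
    return ((addr / SKETCH_PAGE_SIZE) ==
        ((addr + len - 1) / SKETCH_PAGE_SIZE));
}

/*
 * True if every block_size chunk of the buffer stays inside one page,
 * so the buffer can be split at any block boundary.
 * Assumes block_size > 0.
 */
static bool
blocks_page_safe(uintptr_t addr, size_t len, size_t block_size)
{
    for (size_t off = 0; off < len; off += block_size) {
        size_t n = (len - off < block_size) ? len - off : block_size;
        if (!within_one_page(addr + off, n))
            return (false);
    }
    return (true);
}
```

A 4 KiB block that starts 2 KiB into a page, as a general purpose slab may return, fails the check for 4 KiB logical blocks but passes for 512-byte ones.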
  17. vdev_disk: move abd return and free off the interrupt handler

    Freeing an ABD can take sleeping locks to update various stats. We
    aren't allowed to sleep on an interrupt handler. So, move the free off
    to the io_done callback.
    
    We should never have been freeing things in the interrupt handler, but
    we got away with it because we were usually freeing a linear ABD, which
    at most is returning two objects to a cache and never sleeping. Scatter
    ABDs can be used now, and those have more complex locking.
    
    Sponsored-by: Klara, Inc.
    Sponsored-by: Wasabi Technology, Inc.
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Alexander Motin <[email protected]>
    Reviewed-by: Tony Hutter <[email protected]>
    Signed-off-by: Rob Norris <[email protected]>
    Closes openzfs#16687
    robn authored and ixhamza committed Nov 11, 2024
    465f165
  18. Added output to zpool online and offline

    I was surprised to discover today that `zpool online` and
    `zpool offline` don't print any information about why they failed in
    many cases; they just return 1.
    
    Let's improve that where we can without changing the library function.
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Tony Hutter <[email protected]>
    Reviewed-by: Alexander Motin <[email protected]>
    Reviewed-by: Allan Jude <[email protected]>
    Signed-off-by: Rich Ercolani <[email protected]>
    Closes openzfs#16244
    rincebrain authored and ixhamza committed Nov 11, 2024
    d5e828f
  19. Verify parent_dev before calling udev_device_get_sysattr_value

    Not all udev devices have parent devices.
    Calling udev_device_get_ functions yields an assertion error
    if called with a NULL pointer.
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Alexander Motin <[email protected]>
    Signed-off-by: Sietse <[email protected]>
    Co-authored-by: Sietse <[email protected]>
    Closes openzfs#16705 
    Closes openzfs#16717
    Uglymotha authored and ixhamza committed Nov 11, 2024
    c7a8970
  20. Use simple folio migration function

    Avoids using fallback_migrate_folio, which starts unnecessary writeback
    (leading to BUG in migrate_folio_extra).
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Tony Hutter <[email protected]>
    Reviewed-by: Brian Atkinson <[email protected]>
    Signed-off-by: tstabrawa <[email protected]>
    Closes openzfs#16568
    Closes openzfs#16723
    tstabrawa authored and ixhamza committed Nov 11, 2024
    a51e85e
  21. JSON: fix user properties output for zfs list

    This commit fixes JSON output for zfs list when user properties are
    requested with -o flag. This case needed to be handled specifically
    since zfs_prop_to_name does not return the property name for user
    properties; instead, it is stored in pl->pl_user_prop.
    
    Reviewed-by: Ameer Hamza <[email protected]>
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Alexander Motin <[email protected]>
    Signed-off-by: Umer Saleem <[email protected]>
    Closes openzfs#16732
    usaleem-ix authored and ixhamza committed Nov 11, 2024
    5821cd9

Commits on Nov 13, 2024

  1. JSON: fix user properties output for zpool list

    This commit fixes JSON output for zpool list when user properties are
    requested with -o flag. This case needed to be handled specifically
    since zpool_prop_to_name does not return the property name for user
    properties; instead, it is stored in pl->pl_user_prop.
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Alexander Motin <[email protected]>
    Signed-off-by: Umer Saleem <[email protected]>
    Closes openzfs#16734
    usaleem-ix authored and ixhamza committed Nov 13, 2024
    6b05248
  2. Fix user properties output for zpool list

    In zpool_get_user_prop, when called from zpool_expand_proplist and
    collect_pool, we often have zpool_props present in zpool_handle_t equal
    to NULL. This mostly happens when only one user property is requested
    using zpool list -o <user_property>. Checking for this case and
    correctly initializing the zpool_props field in zpool_handle_t fixes
    this issue.
    
    Interestingly, this issue does not occur if we query any other property
    like name or guid along with a user property with -o flag because while
    accessing properties like guid, zpool_prop_get_int is called which
    checks for this case specifically and calls zpool_get_all_props.
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Alexander Motin <[email protected]>
    Signed-off-by: Umer Saleem <[email protected]>
    Closes openzfs#16734
    usaleem-ix authored and ixhamza committed Nov 13, 2024
    9641126
  3. L2ARC: Move different stats updates earlier

    ..., before we make the header or the log block visible to others.
    It should fix assertion on allocated space going negative if the
    header is freed once the lock is dropped, while the write is still
    going.
    
    Reviewed-by: Brian Behlendorf <[email protected]>
    Reviewed-by: Rob Norris <[email protected]>
    Signed-off-by: Alexander Motin <[email protected]>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#16040
    Closes openzfs#16743
    amotin authored and ixhamza committed Nov 13, 2024
    ce92af2
  4. zvol_os.c: Increase optimal IO size

    Since zvol read and write can process up to (DMU_MAX_ACCESS / 2) bytes
    in a single operation, the current optimal I/O size is too low. SCST
    directly reports this value as the optimal transfer length for the
    target SCSI device. Increasing it from the previous volblocksize results
    in performance improvement for large block parallel I/O workloads.
    
    Signed-off-by: Ameer Hamza <[email protected]>
    ixhamza committed Nov 13, 2024
    2ef56a7