Releases: yosefe/ucx
v1.9.0-pre41
- Add support for static build (see https://openucx.readthedocs.io/en/master/faq.html#build-user-application-with-ucx)
- Add support for different number of RoCE LAG paths on server and client
- Add support for user-provided memory handle #247
- Add support for setting local bind address on client endpoint #250
- Add support for building debug RPM #221
- Support setting UCT level parameters from UCP API #209
- Set UCX_SOCKADDR_CM_ENABLE=y by default
- Disable DevX objects by default #185 #178
- Fixes in keepalive protocol #243 #205 #196 #184
- Fixes in rendezvous cancel protocol #233 #230 #227
- Fix hang in case of device fatal error #231
- Fix simultaneous disconnect #200
- Fix low bandwidth with multi-rail eager #194
- Fix uct_rdmacm_cm_cqs hash key #237
- Multiple fixes in io_demo test application
- Logging and memory tracker improvements
v1.9.0-pre40
Features:
Bugfixes:
- ea0288b Fix string buffer grow
- 13d124c Fix err code for rdma_<establish|accept> failure
- ac26299 Fix error handling in ucp_stream_am_handler
- 0aef7b3 Fix RC scatter-to-cqe configuration
- a864933, 472a471 Remove ucp_ep_flush progress callback when endpoint is closed
- de3247c, 529eb4f Track memory usage by application
- 271e82a Do not progress rendezvous operation if endpoint already failed
- 41500a7 Move keepalive to a separate progress callback
- df2dead Do not enable KA on CM lane
- 6aeb722 Do not access UCP endpoint after it's destroyed or its error callback has been invoked
v1.9.0-pre39
v1.9.0-pre38
Bugfixes:
- Fix post_recv called without buffers to post #117
- Reduce log level of simultaneous error during disconnect #119
- Fix RoCE LAG detection on MLNX_OFED 5.x #124
- Process registration cache invalidation queue during progress, to release stale regions #123
- Fix assertion checks in am_zcopy flow #125
- Add API to return registration cache information and counters #126
v1.9.0-pre37
Features:
- Add non-blocking resource cleanup by a background process, enabled by UCX_IB_CLEANUP_THREAD=y
- Make rdma_cm address/route resolve timeout configurable, instead of always using 1 second. Example: UCX_RDMACM_TIMEOUT=10s
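Both features above are switched on through environment variables, so one way to try them is to launch the application with those variables set. The following is a minimal sketch in Python, not part of UCX: the helper names `ucx_env` and `run_with_ucx` are hypothetical, and only the variable names and the values `y` and `10s` come from the notes above.

```python
import os
import subprocess

# Settings taken from the release notes above.
UCX_SETTINGS = {
    "UCX_IB_CLEANUP_THREAD": "y",  # enable non-blocking background cleanup
    "UCX_RDMACM_TIMEOUT": "10s",   # rdma_cm address/route resolve timeout
}


def ucx_env(overrides, base=None):
    """Return a copy of the environment with the given UCX settings applied.

    Hypothetical helper: 'base' defaults to the current process environment
    and is never modified in place.
    """
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def run_with_ucx(cmd, overrides):
    """Run 'cmd' (a list of arguments) with the UCX settings in its environment."""
    return subprocess.run(cmd, env=ucx_env(overrides), check=True)
```

For example, `run_with_ucx(["./my_ucx_app"], UCX_SETTINGS)` would start a (hypothetical) application binary with both settings applied, leaving the parent environment untouched.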
Bugfixes:
- Fix heap corruption caused by ucm_set_event_handler() in multi-threaded application
- Fix RoCE LAG detection: take GID index from iface to find associated netdev, instead of always checking gid 0
- Disable backtrace by default to avoid deadlock with malloc/free
- Fix leak of listener requests during ucp_listener_destroy()
- Add lock on rc_iface->eps access, to fix race condition between main thread and pack_cb called from progress thread
- Use ibv_pd handle instead of device name as CQ hash key, for rdma_cm temporary QP
v1.9.0-pre36
Fixes for simultaneous disconnect (peer failure during ep_close):
- Flush operation failed with an assertion because the endpoint's number of lanes changed (#90, #91)
- Assertion failed because not all pending operations were removed, since flush_internal exited prematurely (#92)
- ep_close operation did not complete: when ep_flushed returned an error, the lane was not accounted for (#93)
v1.9.0-pre35
- Fix potential race condition when keepalive is sent from main thread, while RC endpoint is created from progress thread, and keepalive tries to use a partially initialized endpoint:
  - UCT/RC: Protect rc_iface->ep_list with a spinlock
  - UCT/RC: Initialize ep->connected=0 before keepalive could run
- Fix race condition between installing malloc() hooks and IB async event thread, which can lead to segfault during application start
v1.9.0-pre34
- Reserve 1 CQ credit for qp-flush NOP, to avoid disabling CQ moderation
- Fix memory leak in IO demo test
- Simplify active connection counting in IO demo test
v1.9.0-pre33
- Fix reordering when destroying RC endpoint, due to releasing CQ and RDMA_READ credits
- Enhance logging in rdma_cm
v1.9-pre32
- Add limit for memory registration cache, configured by UCX_IB_RCACHE_MAX_REGIONS and UCX_IB_RCACHE_MAX_SIZE
- Fix keepalive protocol - don't send when QP is in INIT state
- Fix RPM build - missing io_demo installed file
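To illustrate what a registration cache limited by both region count and total size means in practice, here is a toy model in Python. It is a sketch under stated assumptions, not UCX's implementation: the class `BoundedRegionCache`, its exact-match lookup by (address, length), and the least-recently-used eviction order are all assumptions made for illustration. The two limits play the roles of UCX_IB_RCACHE_MAX_REGIONS and UCX_IB_RCACHE_MAX_SIZE.

```python
from collections import OrderedDict


class BoundedRegionCache:
    """Toy cache of registered regions, bounded by count and total bytes."""

    def __init__(self, max_regions, max_size):
        self.max_regions = max_regions  # limit on number of cached regions
        self.max_size = max_size        # limit on total cached bytes
        self.total_size = 0
        self._regions = OrderedDict()   # (address, length) -> handle, oldest first

    def __len__(self):
        return len(self._regions)

    def __contains__(self, key):
        return key in self._regions

    def get(self, address, length, register):
        """Return a cached handle, or call register(address, length) on a miss."""
        key = (address, length)
        if key in self._regions:
            self._regions.move_to_end(key)  # mark as most recently used
            return self._regions[key]
        handle = register(address, length)
        self._regions[key] = handle
        self.total_size += length
        self._evict()
        return handle

    def _evict(self):
        # Assumed policy: drop least recently used regions until both limits
        # hold, always keeping the region that was just added.
        while len(self._regions) > 1 and (
            len(self._regions) > self.max_regions
            or self.total_size > self.max_size
        ):
            (_, length), _ = self._regions.popitem(last=False)
            self.total_size -= length
```

Without such limits a cache of this kind can only grow as the application touches new buffers. With them, stale regions are dropped once either bound is exceeded.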