Functionality
- Introduced a new internal submission mechanism for platforms based on Linux* OS kernel versions where MMAP is no longer permitted. For more details, refer to the Intel Security Advisory. When MMAP is unavailable, the write system call is used instead. This may introduce additional overhead for small data sizes (4KB and smaller) in the Inflate functionality, but no performance implications are expected for larger data sizes or Deflate.
- Updated the QPL device search mechanism to a new default behavior. Platforms with Sub-NUMA clustering configured such that not all NUMA nodes have an accelerator instance can now use any IAA instance on the same socket for execution, unless the user specifies otherwise. You can still restrict device selection to the NUMA node of the current thread by specifying QPL_DEVICE_NUMA_ID_CURRENT, or to a specific NUMA node by setting job->numa_id = <numa_node_id>. Additionally, you can extend the search to the entire system by setting QPL_DEVICE_NUMA_ID_ANY.
- Added support for host fallback in the asynchronous API when using the Auto Path feature.
- Implemented an internal mechanism to save intermediate job states in the dynamic Deflate job. This feature prevents duplicate work when executing with the synchronous API on the Hardware Path and encountering the QPL_STS_QUEUES_ARE_BUSY_ERR error. In such cases, the job is resubmitted without repeating the already completed work.
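The device selection rules from the Sub-NUMA clustering item above can be illustrated with a small model. This is only a sketch of the behavior as these notes describe it, not QPL code: `NumaPolicy`, `Device`, `ThreadLocation`, and `eligible_devices` are hypothetical names invented for the example, and the real `QPL_DEVICE_NUMA_ID_*` constants and the `numa_id` job field come from the QPL headers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only. They are not QPL types, and
// the real QPL_DEVICE_NUMA_ID_* constants are defined in the library headers.
enum class NumaPolicy {
    SameSocket,   // new default: any IAA instance on the calling thread's socket
    CurrentNode,  // models QPL_DEVICE_NUMA_ID_CURRENT
    AnyNode,      // models QPL_DEVICE_NUMA_ID_ANY
    ExplicitNode  // models job->numa_id = <numa_node_id>
};

struct Device {
    int id;
    int numa_node;
    int socket;
};

struct ThreadLocation {
    int numa_node;
    int socket;
};

// Returns the ids of the devices a job may use under the given policy.
// explicit_node is only read when policy == NumaPolicy::ExplicitNode.
std::vector<int> eligible_devices(const std::vector<Device>& devices,
                                  const ThreadLocation& where,
                                  NumaPolicy policy,
                                  int explicit_node = -1) {
    std::vector<int> ids;
    for (const Device& d : devices) {
        bool ok = false;
        switch (policy) {
            case NumaPolicy::SameSocket:   ok = (d.socket == where.socket); break;
            case NumaPolicy::CurrentNode:  ok = (d.numa_node == where.numa_node); break;
            case NumaPolicy::AnyNode:      ok = true; break;
            case NumaPolicy::ExplicitNode: ok = (d.numa_node == explicit_node); break;
        }
        if (ok) ids.push_back(d.id);
    }
    return ids;
}
```

As an assumed example, take two sockets, each split into two NUMA nodes, with one accelerator on node 0 (socket 0) and one on node 2 (socket 1). For a thread running on node 1, the model's default policy selects the device on node 0 because it shares the socket, while the current-node policy finds no device at all, which is the situation the new default is meant to avoid.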
Usability and Documentation
- Added support for Canned mode in QPL Benchmarks Frameworks.
- Optimized memory usage and reduced startup time for benchmarks when utilizing an exact filter.
- Introduced a new build option, -DQPL_USE_CLANG_TIDY={ON,OFF}, to enable building QPL with clang-tidy checks. Clang-tidy support is limited to Linux* OS and requires building QPL with the Clang* compiler. Additionally, introduced a configuration file for clang-tidy and refactored QPL to comply with it.
- Added a new example demonstrating dictionary compression with the Hardware Path for compression and the Software Path for decompression.
- Added new test cases for Select, Scan, and Extract operations to validate the functionality of Force Array Output Modification.
- Expanded the bad argument scenarios for the Force Array Output Modification tests to include additional cases for the Software Path.
- Added new tests to validate error handling for bad arguments when submitting jobs on the Hardware Path and Auto Path.
Deprecated Functionality
- Deprecated support for canned mode with indexing on the Software Path to align with the Hardware Path.
Bug Fixes
- Resolved the issue with compression verification when utilizing IAA 2.0.
- Corrected the test setup for auto_path in tb_c_api_deflate_with_dictionary.level_none, tb_c_api_deflate_with_dictionary.hw_multi_chunk, and tn_c_api_deflate.{dynamic/fixed/static}_default_stored_block_overflow.
- Added an execution path check to ensure proper handling of unsupported paths in the Force Array Output Modification.
- Resolved potential undefined behavior by fixing uninitialized pointers in the canned_one_chuck_hw_vs_sw.cpp test.
- Removed tests related to the unsupported Software Path for the canned mode with indexing.
- Fixed invalid parquet generation for tn_c_api_expand.tn_rle_input_error_handling.
Known Limitations
- Intel(R) QPL can be built from the directly downloadable archives (.tar, .tgz) only without the tests and benchmarks frameworks, using the -DQPL_BUILD_TESTS=OFF build option. The tests and benchmarks require submodules that GitHub* does not include in the archives during release creation.
- Known test failures are listed below. Some tests fail only under certain conditions, which are noted in parentheses.
  - Functional tests:
    - (software_path, auto_path only on platforms without IAA) ta_c_api_deflate_stateful.{dynamic/fixed/static}_default_verify
    - (software_path, auto_path) ta_c_api_deflate_stateful.{dynamic/fixed/static}_high_verify
    - (hardware_path, auto_path on IAA 2.0) ta_c_api_deflate_index_extended.PerformOperation
    - (auto_path) ta_c_api_huffman_only{_verify./.}{dynamic/static}_be
    - (auto_path) ta_c_api_inflate_huffman_only.generated_data
    - (auto_path) ta_c_api_deflate_index.{dynamic/static}_blocks_default_level_verify
    - (auto_path) tb_c_api_expand.source_errors
    - (auto_path) ta_c_api_deflate_inflate_canned_in_loops.default_level
- Compression verification on qpl_path_software works only in indexing mode or, in other modes, with data smaller than 32KB.
- Inflate does not report the error code QPL_STS_BIG_HEADER_ERR when a header is too big to fit in the input buffer.
- The implementation of QPL_FLAG_CRC32C is in progress.
- When using qpl_path_hardware, compression and decompression with indexing mode on IAA 2.0 are limited to data sizes smaller than 4KB.
Thanks to the Contributors
The release includes contributions from the project team and @aekoroglu, @fwph, and @Permanence-AI-Coder.