Documentation for hipBLAS is available at https://rocm.docs.amd.com/projects/hipBLAS/en/latest/.
- Level 2 functions have additional ILP64 API for both C and FORTRAN (_64 name suffix) with int64_t function arguments
- Level 1 "_ex" functions have additional ILP64 API
- install.sh script invokes rmake.py script, along with various improvements within the build scripts
- Library dependencies in install.sh script have been changed from "rocblas" and "rocsolver" to the development packages "rocblas-dev" and "rocsolver-dev"
- Linux AOCL dependency updated to release 4.2 gcc build
- Windows vcpkg dependencies updated to release 2024.02.14
- New build option to automatically use hipconfig --platform to determine HIP platform
- Level 1 functions have additional ILP64 API for both C and Fortran (
_64
name suffix) with int64_t function arguments - New functions hipblasGetMathMode and hipblasSetMathMode
- USE_CUDA build option; use HIP_PLATFORM=amd or HIP_PLATFORM=nvidia to override hipconfig
- Some Level 2 function argument names have changed from
m
ton
to match legacy BLAS; there was no change in implementation. - Updated client code to use YAML-based testing
- Renamed
.doxygen
and.sphinx
folders todoxygen
andsphinx
, respectively - Added CMake support for documentation
- New option to define
HIPBLAS_USE_HIP_BFLOAT16
to switch API to use thehip_bfloat16
type - New
hipblasGemmExWithFlags
API
hipblasDatatype_t
; usehipDataType
insteadhipblasComplex
; usehipComplex
insteadhipblasDoubleComplex
; usehipDoubleComplex
instead- Use of
hipblasDatatype_t
forhipblasGemmEx
for compute-type; usehipblasComputeType_t
instead
hipblasXtrmm
(calculates B <- alpha * op(A) * B) has been replaced withhipblasXtrmm
(calculates C <- alpha * op(A) * B)
- Updated documentation requirements
- rocSOLVER now depends on rocSPARSE
- Added const qualifier to hipBLAS functions (swap, sbmv, spmv, symv, trsm) where missing
hipblasInt8Datatype_t enum
hipblasSetInt8Datatype
hipblasGetInt8Datatype functions
- In-place trmm will be replaced by trmm that includes both in-place and out-of-place functionality
- Optional opt-in to use the
__half for hipblasHalf
type (with C++) when you defineHIPBLAS_USE_HIP_HALF
- added scripts to plot performance for multiple functions
- data driven hipblas-bench and hipblas-test execution via external yaml format data files
- client smoke test added for quick validation using command hipblas-test --yaml hipblas_smoke.yaml
- Data type conversion functions support more rocBLAS and cuBLAS data types
geqrf
now returns successfully when nullptrs are passed with n == 0 || m == 0getrs
now returns successfully when given nullptrs with corresponding size = 0getrs
gives info = -1 when transpose is not an expected typegels
now returns successfully when given nullptrs with corresponding size = 0gels
now gives info = -1 when transpose is not in ('N', 'T') for real cases and not in ('N', 'C') for complex cases
- Changed reference code for Windows to OpenBLAS
- hipBLAS client executables all now begin with the
hipblas-
prefix
- Allow for selection of int8 data type
- Added support for
hipblasXgels
andhipblasXgelsStridedBatched
operations (with s,d,c,z precisions), only supported with rocBLAS backend - Added support for
hipblasXgelsBatched
operations (with s,d,c,z precisions)
- New
--cudapath
option ininstall.sh
, which allows you to specify the CUDA build you want to use - New
--installcuda
option ininstall.sh
to install CUDA using a package manager (this can also be used with the new--installcudaversion
option that allows you to specify the CUDA version you want to install)
#includes
now support a compiler version- Fixed client dependency support in
install.sh
- New packages for test and benchmark executables on all supported operating systems using CPack
- Added file and folder reorganization changes with backward compatibility support for
rocm-cmake
wrapper functions - Added user-specified initialization option to
hipblas-bench
- Version gathering in performance-measuring script
hipblas-test
output now has library version and device information- New
--rocsolver-path
command line option that you can use to specify a path (absolute or relative) to the pre-built rocSOLVER - Added
--cmake_install
command line option to update CMake to the minimum version - Added
cmake-arg
parameter to pass in cmake arguments while building - ReadtheDocs infrastructure support for the hipBLAS documentation
- Added
hipblasVersionMinor
(hipblaseVersionMinor
remains for backwards compatibility) - Doxygen warnings in
hipblas.h
header file
rocblas-path
command line option can be specified as absolute or relative path- Help message improvements in
install.sh
andrmake.py
- Updated GoogleTest dependency from 1.10.0 to 1.11.0
hipblas-bench
rocSOLVER functions- Added
ROCM_MATHLIBS_API_USE_HIP_COMPLEX
to opt-in to usehipFloatComplex
andhipDoubleComplex
- Compilation warning for future trmm changes
hipblas.h
documentation- Added option to forgo pivoting for getrf and getri when ipiv is nullptr
- Code coverage option
- Use of incorrect
HIP_PATH
when building from source - Windows packaging
- Allowing negative increments in
hipblas-bench
- Removed boost dependency
- Additional support for
hipblas-bench
HIPBLAS_STATUS_UNKNOWN
for unsupported backend status codes
- Avoid large offset overflow for
gemv
andhemv
inhipblas-test
- Packaging has been split into a runtime package (
hipblas
) and a development package (hipblas-devel
): The development package depends on the runtime package. When installing the runtime package, the package manager will suggest the installation of the development package to aid users transitioning from the previous version's combined package. This suggestion by package manager is for all supported operating systems (except CentOS 7) to aid in the transition. Thesuggestion
feature in the runtime package is introduced as a deprecated feature and will be removed in a future ROCm release.
hipblasStatusToString
- Added
catch()
blocks around API calls to prevent the leak of C++ exceptions
- Updates for rocBLAS
gemm_ex
changes: When using the rocBLAS backend, hipBLAS queries the preferable layout of int8 data passed togemm_ex
and passes in the resulting flag (you must specify your preferred data format when callinggemm_ex
with a rocBLAS backend) - Added
hipblas-bench
with support forcopy
,swap
, andscal
- Added the following functions, which include batched and strided-batched support with the rocBLAS
backend:
axpy_ex
dot_ex
nrm2_ex
rot_ex
scal_ex
- Complex unit test bug caused by incorrect
caxpy
andzaxpy
function signatures
- Added a changelog
hipblas-bench
, with support forgemv
,trsm
, andgemm
- rocSOLVER is now a CPack dependency
hipblasSetAtomicsMode
andhipblasGetAtomicsMode
- Build doesn't look for CUDA backend unless
--cuda
flag is passed
- Device memory reallocates on demand
--static
build flag allows the creation of a static library
--rocblas-path
command line option to choose path to pre-built rocBLASsgetriBatched
dgetriBatched
cgetriBatched
zgetriBatched
TrsmEx
TrsmBatchedEx
TrsmStridedBatchedEx
hipblasSetVectorAsync
hipblasGetVectorAsync
hipblasSetMatrixAsync
hipblasGetMatrixAsync
- Fortran support for
getrf
,getrs
,geqrf
, and all variants thereof
- Added the following functions, which include batched and strided-batched support with the rocBLAS
backend:
stbsv
,dtbsv
,ctbsv
,ztbsv
ssymm
,dsymm
,csymm
,zsymm
cgeam
,zgeam
chemm
,zhemm
strtri
,dtrtri
,ctrtri
, `ztrtrisdgmm
,ddgmm
,cdgmm
,zdgmm
GemmBatchedEx
andGemmStridedBatchedEx
- Fortran support for BLAS functions
- Added the following functions, which include batched and strided-batched support with the rocBLAS
backend:
sgbmv
,dgbmv
,cgbmv
,zgbmv
chemv
,zhemv
stbmv
,dtbmv
,ctbmv
,ztbmv
strmv
,trmv
,ctrmv
,ztrmv
chbmv
,zhbmv
cher
,zher
cher2
,zher2
chpmv
,zhpmv
chpr
,zhpr
chpr2
,zhpr2
ssbmv
,dsbmv
sspmv
,dspmv
ssymv
,dsymv
,csymv
,zsymv
stpmv
,dtpmv
,ctpmv
,ztpmv
cgeru
,cgerc
,zgeru
,zgerc
sspr
,dspr
,cspr
,zspr
sspr2
,dspr2
csyr
,zsyr
ssyr2
,dsyr2
,csyr2
,zsyr2
stpsv
,dtpsv
,ctpsv
,ztpsv
ctrsv
,ztrsv
cherk
,zherk
cherkx
,zherkx
cher2k
,zher2k
ssyrk
,dsyrk
,csyrk
,zsyrk
ssyr2k
,dsyr2k
,csyr2k
,zsyr2k
ssyrkx
,dsyrkx
,csyrkx
,zsyrkx
ctrmm
,ztrmm
ctrsm
,ztrsm