Releases: jipolanco/NonuniformFFTs.jl
NonuniformFFTs v0.6.7
Fixed
- Avoid error when creating high-accuracy GPU plans. This affected plans that cannot be treated using the `:shared_memory` method (because they require large memory buffers), such as plans with `ComplexF64` data associated to a large kernel width (e.g. `HalfSupport(8)`). Such plans can still be computed using the `:global_memory` method, but until now this failed (see the sketch below).
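For illustration, a minimal sketch of such a plan, assuming a CUDA device and the documented `backend` keyword (the grid size is arbitrary):

```julia
using NonuniformFFTs
using CUDA

Ns = (256, 256)  # uniform grid dimensions (illustrative)
plan = PlanNUFFT(
    ComplexF64, Ns;
    m = HalfSupport(8),           # large kernel width → high accuracy, large buffers
    backend = CUDABackend(),      # run the transform on the GPU
    gpu_method = :global_memory,  # required here; buffers don't fit in :shared_memory
)
```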
Merged pull requests:
- CompatHelper: bump compat for Atomix to 1, (keep existing compat) (#48) (@github-actions[bot])
NonuniformFFTs v0.6.6
- Improve parallel performance of `set_points!` with the `CPU` backend. (#47)
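As a reminder of where `set_points!` fits in, a usage sketch following the package's documented type-1 workflow; `exec_type1!` and the array sizes below are illustrative:

```julia
using NonuniformFFTs

N = 256      # number of Fourier modes
Np = 10_000  # number of non-uniform points

xp = rand(Np) .* 2π         # non-uniform points in [0, 2π)
vp = randn(ComplexF64, Np)  # values at those points

plan = PlanNUFFT(ComplexF64, N)  # CPU backend is the default
set_points!(plan, xp)            # the step whose multi-threaded performance improved in #47
ûs = Array{ComplexF64}(undef, size(plan))
exec_type1!(ûs, plan, vp)        # type-1 NUFFT: non-uniform points → uniform grid
```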
Merged pull requests:
- Improve parallel performance of `set_points!` on CPU (#47) (@jipolanco)
NonuniformFFTs v0.6.5
Fixed
- Fix scalar indexing error on the latest AMDGPU.jl (v1.1.1). It is unclear whether the error was caused by a recent change in AMDGPU.jl or in GPUArrays.jl.
Merged pull requests:
- Bump codecov/codecov-action from 4 to 5 (#46) (@dependabot[bot])
NonuniformFFTs v0.6.4
Changed
- Avoid large GPU allocation in type-2 transforms when using the CUDA backend. The allocation was due to CUDA.jl creating a copy of the input in complex-to-real FFTs (see CUDA.jl#2249).
Merged pull requests:
- Avoid GPU allocation in CUDA type-2 NUFFTs (#45) (@jipolanco)
NonuniformFFTs v0.6.3
Merged pull requests:
- CompatHelper: bump compat for StructArrays to 0.7, (keep existing compat) (#44) (@github-actions[bot])
NonuniformFFTs v0.6.2
Changed
- Improve performance of atomic operations (affecting type-1 transforms) on AMD GPUs by using `@atomic :monotonic` (see the sketch after this list).
- Change a few defaults on AMD GPUs to improve performance. This is based on experiments with an AMD MI210, where the new defaults should give better performance. We now default to fast polynomial approximation of kernel functions and to the backwards Kaiser-Bessel kernel (as on the CPU).
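For readers unfamiliar with atomic orderings, here is a small CPU-side illustration of what the `:monotonic` (relaxed) ordering means in Julia. The package itself applies this ordering to GPU array updates during spreading, which is not shown here; the struct and loop below are purely illustrative:

```julia
# Relaxed (:monotonic) atomics guarantee atomicity of each update but impose no
# ordering constraints between threads, which is cheaper than the default ordering.
mutable struct Accumulator
    @atomic total::Float64
end

acc = Accumulator(0.0)
Threads.@threads for i in 1:1000
    @atomic :monotonic acc.total += 1.0
end
@atomic acc.total  # 1000.0
```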
NonuniformFFTs v0.6.1
Fixed
- Fix type-2 transforms on the GPU when performing multiple transforms at once (`ntransforms > 1`) and when `gpu_method = :shared_memory` (which is not currently the default). See the sketch below.
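A minimal sketch of the configuration that was affected, assuming a CUDA device and that the documented `ntransforms` option accepts a `Val`-wrapped count:

```julia
using NonuniformFFTs
using CUDA

Ns = (128, 128)
plan = PlanNUFFT(
    ComplexF64, Ns;
    ntransforms = Val(2),         # perform two transforms at once
    backend = CUDABackend(),
    gpu_method = :shared_memory,  # the non-default method that was broken in this case
)
```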
Merged pull requests:
- Fix type-2 GPU shared memory with ntransforms > 1 (#43) (@jipolanco)
NonuniformFFTs v0.6.0
Added
- Add an alternative implementation of GPU transforms based on shared-memory arrays. This is disabled by default and can be enabled by passing `gpu_method = :shared_memory` when creating a plan (the default is `:global_memory`). See the sketch after this list.
- Add the possibility to switch between fast approximation of kernel functions (previously the default and only choice) and direct evaluation (previously not implemented). These correspond to the new `kernel_evalmode` plan creation option, whose possible values are `FastApproximation()` and `Direct()`. The default depends on the actual backend: currently, `FastApproximation()` is used on CPUs and `Direct()` on GPUs, where it is sometimes faster.
- The `AbstractNFFTs.plan_nfft` function is now implemented for full compatibility with the AbstractNFFTs.jl interface.
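A minimal sketch of the new options described above, assuming a CUDA device and the documented `backend` keyword (grid sizes are arbitrary):

```julia
using NonuniformFFTs
using CUDA

Ns = (128, 128)

# Opt in to the new shared-memory GPU method (the default remains :global_memory):
p_shared = PlanNUFFT(ComplexF64, Ns; backend = CUDABackend(), gpu_method = :shared_memory)

# Explicitly choose the kernel evaluation method instead of relying on the backend default:
p_direct = PlanNUFFT(ComplexF64, Ns; backend = CUDABackend(), kernel_evalmode = Direct())
p_fast   = PlanNUFFT(ComplexF64, Ns; backend = CUDABackend(), kernel_evalmode = FastApproximation())
```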
Changed
- BREAKING: Change the default precision of transforms. By default, transforms on `Float64` or `ComplexF64` data now have a relative precision of the order of $10^{-7}$. This corresponds to setting `m = HalfSupport(4)` and oversampling factor `σ = 2.0`. Previously, the default was `m = HalfSupport(8)` and `σ = 2.0`, corresponding to a relative precision of the order of $10^{-14}$ (see the sketch after this list for how to recover the previous accuracy).
- BREAKING: The `PlanNUFFT` constructor can no longer be used to create plans compatible with AbstractNFFTs.jl / NFFT.jl. Instead, a separate (and unexported) `NonuniformFFTs.NFFTPlan` type is now defined which may be used for this purpose. Alternatively, one can now use the `AbstractNFFTs.plan_nfft` function.
- On GPUs, we now default to direct evaluation of kernel functions (e.g. Kaiser-Bessel) instead of polynomial approximations, as this seems to be faster and uses far fewer GPU registers.
- On CUDA and AMDGPU, the default kernel is now `KaiserBesselKernel` instead of `BackwardsKaiserBesselKernel`. Direct evaluation of the KB kernel (based on Bessel functions) seems to be a bit faster than the backwards KB kernel, both on CUDA and AMDGPU. Accuracy doesn't change much since both kernels have similar precision.
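For users who relied on the previous accuracy, a sketch of how to recover it by passing the old parameters explicitly (assuming the documented `m` and `σ` keywords):

```julia
using NonuniformFFTs

Ns = (256, 256)

plan_default  = PlanNUFFT(ComplexF64, Ns)                               # new default: ~1e-7 relative accuracy
plan_accurate = PlanNUFFT(ComplexF64, Ns; m = HalfSupport(8), σ = 2.0)  # ~1e-14, the previous default
```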
Merged pull requests:
- CompatHelper: bump compat for GPUArraysCore to 0.2, (keep existing compat) (#36) (@github-actions[bot])
- Add shared-memory GPU implementations of spreading and interpolation (#37) (@jipolanco)
- Change default accuracy of transforms (#38) (@jipolanco)
- Use direct evaluation of kernel functions on GPU (#39) (@jipolanco)
- Allow choosing the kernel evaluation method (#40) (@jipolanco)
- Automatically determine batch size in shared-memory GPU transforms (#41) (@jipolanco)
- Define `AbstractNFFTs.plan_nfft` and create separate plan type (#42) (@jipolanco)
NonuniformFFTs v0.5.6
Merged pull requests:
- Simplify main GPU kernels using Adapt (#34) (@jipolanco)
- Minor optimisations to GPU kernels (#35) (@jipolanco)