[enhancement]: Ship a correctly rounded threaded OpenBLAS as an Artifact #131

orkolorko · 2023-01-31T23:59:23Z

Feature description

I think it would be a good idea to ship a version of OpenBLAS built with the CONSISTENT_FPCSR=1 flag together with the library as an Artifact, or to compile it during installation.

The main reason is that the system (or Julia) OpenBLAS distribution may not have this flag enabled.
While Julia may be started with only 1 thread, OpenBLAS may, unless explicitly told otherwise, run with multiple threads enabled and have a different rounding mode on each thread.

Currently, a workaround that gives consistent rounding is to start Julia with the environment variable

OPENBLAS_NUM_THREADS=1

set, but this affects performance.
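For reference, the thread count can also be restricted from inside a running session rather than at launch. This is a minimal sketch, assuming the standard `LinearAlgebra.BLAS` wrappers shipped with recent Julia versions; it limits every BLAS call in the session, so the same performance caveat applies.

```julia
using LinearAlgebra

# Restrict the BLAS backend (OpenBLAS by default) to a single thread
# for the rest of this session.
BLAS.set_num_threads(1)

# Read the setting back to confirm it took effect.
nthreads_blas = BLAS.get_num_threads()
println("BLAS threads: ", nthreads_blas)
```

This only changes how many threads BLAS uses; it does not by itself say anything about how a multithreaded build handles rounding modes.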

See
Julia Threads + BLAS Threads
Using directed rounding in Octave/Matlab


lucaferranti · 2023-02-07T03:51:22Z

Hi @orkolorko, apologies for the delay in answering.

This sounds very interesting!

Exploiting BLAS multithreading is also what makes Rump's multiplication algorithm faster. We can use matrix multiplication as a benchmark to see how this affects performance.

OlivierHnt · 2024-08-24T06:22:49Z

Glancing at the code, it seems that OpenBLASConsistentFPCSR_jll is being used, so I am curious: what is the missing piece preventing this issue from being closed?

Also, do I understand correctly that this resolves the issue raised in "Parallel Implementation of Interval Matrix Multiplication", by N. Revol and P. Théveny? I quote (cf. page 2):

The difficulties one has to face when implementing an interval matrix multiplication are manifold. Implementing interval arithmetic through floating-point arithmetic relies on changes of the rounding modes, either rounding downwards and upwards with the representation by endpoints, or rounding to nearest and upwards with the so-called midpoint-radius representation, using the midpoints and radii. Whether the rounding mode is kept unchanged or modified by BLAS routines is undocumented. Furthermore, whether the rounding mode is properly saved and restored at context switches, as in a multithreaded execution, is not documented either [...].

If so, this means that Rump's algorithms, which are faster than the algorithms presented in the article, can be safely used for rigorous numerics.

orkolorko · 2024-08-24T06:48:42Z

Hi @OlivierHnt, together with @lucaferranti we implemented some of Rump's algorithms in https://github.com/JuliaBallArithmetic/BallArithmetic.jl, if you want to have a look. The idea is to bring this back to IntervalLinearAlgebra.jl in the future.

OlivierHnt · 2024-08-25T09:49:46Z

Thanks for the link. However, I am not sure these algorithms are rigorous, since I did not see how you ensure that $A \cdot B$ and $|A| \cdot |B|$ are both computed in the same order.

I glanced at JuliaPackaging/Yggdrasil#6215, which added OpenBLASConsistentFPCSR_jll. You did not support aarch64 at the time. I have an M1 chip, so maybe I could give you a hand with debugging this.
Do you have instructions I could follow to do so?

orkolorko · 2024-08-26T00:21:59Z

It has been a long time since I did this, and I don't remember it very well. Is this flag supported on aarch64?

I understand what you mean by order, as in Theorem 3.4, p. 42. So the best thing is probably to fall back to single-threaded computation to guarantee it; I will think about this.

Edit: I checked. At the time when we compiled the library, CONSISTENT_FPCSR was not supported on aarch64, and there was an open issue on OpenBLAS. I don't know whether it is supported now.

Edit 2: It seems like it was added in version 0.3.22 (26-Mar-2023).

OlivierHnt · 2024-08-26T01:39:35Z

I understand what you mean by order, as in Theorem 3.4, p. 42. So the best thing is probably to fall back to single-threaded computation to guarantee it; I will think about this.

The other (and better in terms of performance) option is to not use this theorem, at the cost of having an additional matrix multiplication as described in "Fast interval matrix multiplication" by Rump (e.g. Algorithm 4.5 instead of Algorithm 4.7).

Edit 2: It seems like it was added in version 0.3.22 (26-Mar-2023).

Ah that's good to hear. How does one update the artefact?

orkolorko · 2024-08-26T02:36:16Z

I will update BallArithmetic.jl with MMul4 as default, thanks!

About the artifact, I think what is needed is to update this line so that the platform list contains aarch64.

Something like

platforms = expand_gfortran_versions(supported_platforms(; exclude=p -> (arch(p) != "x86_64" && arch(p) != "aarch64")))
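To see what that `exclude` predicate keeps, it can be tried on a few platforms without running the build recipe. This is only a sketch: the candidate list is hypothetical and hand-picked for illustration (the real list comes from `supported_platforms()` in BinaryBuilder), and it assumes the `Base.BinaryPlatforms` module available in Julia 1.6 and later.

```julia
using Base.BinaryPlatforms: Platform, arch

# Same predicate as in the line above: exclude every platform whose
# architecture is neither x86_64 nor aarch64.
exclude = p -> (arch(p) != "x86_64" && arch(p) != "aarch64")

# Hypothetical candidates, chosen only to exercise the predicate.
candidates = [
    Platform("x86_64", "linux"),
    Platform("aarch64", "macos"),
    Platform("i686", "linux"),
    Platform("armv7l", "linux"),
]

# Keep the platforms that the predicate does not exclude.
kept = filter(!exclude, candidates)
println(arch.(kept))
```

With this predicate, only the x86_64 and aarch64 entries remain in `kept`.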

In BallArithmetic.jl there are some smoking gun tests to check that everything is working fine, if you want.

orkolorko added the enhancement New feature or request label Jan 31, 2023

orkolorko mentioned this issue Feb 7, 2023

[OpenBLAS] CONSISTENT_FPCSR=1 for certified computation JuliaPackaging/Yggdrasil#6214

Closed

orkolorko mentioned this issue Feb 7, 2023

Added OpenBLASConsistentFPCSR JuliaPackaging/Yggdrasil#6215

Merged

OlivierHnt mentioned this issue Sep 27, 2024

Improve support for linear algebra JuliaIntervals/IntervalArithmetic.jl#682

Merged
