Reductions and NaN values #467

nspark · 2021-05-06T19:04:36Z

Description

Currently, the specification does not define how NaN values are handled in reductions over floating-point types.

Per C18 §7.12.14-1, a NaN value is unordered with respect to a numeric value or another NaN. As a result, it is not clear what the result of shmem_double_max_reduce or shmem_double_max_to_all should be in the presence of NaN values.

In C, comparisons involving NaN values can be unintuitive at first; for example, a comparison-based MAX macro is sensitive to operand order:

#include <math.h>
#define MAX(a, b) ((a) > (b) ? (a) : (b))

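MAX(1.0, NAN);  /* evaluates to NaN: (1.0 > NAN) is false, so the second operand is selected */
MAX(NAN, 1.0);  /* evaluates to 1.0: (NAN > 1.0) is false, so the second operand is selected */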

C provides fmax and fmin to handle these situations gracefully:

#include <math.h>
fmax(1.0, NAN);  /* returns 1.0 */
fmax(NAN, 1.0);  /* returns 1.0 */
fmax(NAN, NAN);  /* returns NaN */

In the tests I've performed on OpenSHMEM implementations readily accessible to me (certainly not all that exist), none handle min/max reductions correctly for NaN values. They did seem to handle sum reductions correctly.

Suggestions

MAX and MIN reductions should be semantically equivalent to C's fmax and fmin functions.
- Thus, such reductions will effectively drop NaN values unless all input values are NaN, in which case the result should be NaN.
SUM and PROD reductions should produce a NaN value if any input value is NaN.

Considerations

Some OpenSHMEM implementations leverage hardware-accelerated reductions. I am not aware of the state of NaN handling for such hardware.


nspark · 2021-05-06T19:50:36Z

I expected MPI might have plenty to say about handling NaN values, but all I see is the following:

According to IEEE specifications, the “NaN” (not a number) is system dependent. It should not be interpreted within MPI as anything other than “NaN.”

Advice to implementors. The MPI treatment of “NaN” is similar to the approach used in XDR (see ftp://ds.internic.net/rfc/rfc1832.txt). (End of advice to implementors.)

nspark · 2021-05-24T20:58:56Z

To implementors/vendors: Are there performance concerns if the result (e.g., of a MAX reduction where some entries are NaN values) is implementation defined but required to be single-valued? That is, all PEs would be expected to return the same value. (This result is not currently the case on all implementations.)

jdinan · 2021-06-15T21:23:01Z

Not for MAX, but for an arithmetic operation like SUM, there could be an associativity requirement in order to ensure that all PEs get identical results.
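To make the associativity point concrete, the following small sketch (with hypothetical helper names, assuming IEEE 754 double arithmetic and no value-changing compiler options such as -ffast-math) sums the same three addends in two groupings. If different PEs combined contributions in different orders, they could see this kind of discrepancy.

```c
/* Hypothetical helpers: the same three addends, grouped two ways. */
static double sum_left(double a, double b, double c)
{
    return (a + b) + c;   /* combine a and b first */
}

static double sum_right(double a, double b, double c)
{
    return a + (b + c);   /* combine b and c first */
}
```

For a = 1e100, b = -1e100, c = 1.0, the left grouping gives 1.0, while the right grouping gives 0.0 because the 1.0 is absorbed when it is added to -1e100.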

nspark · 2021-06-25T19:50:56Z

One concern that has been raised (off issue, obviously) is that "proper" NaN behavior may require an additional collective (to test for NaNs), which is not desirable.

Again, in my (limited) testing, implementations handled the sum reduction properly in the face of NaN values. The max (and, presumably, min) reductions were the ones that did not necessarily return the same value on all PEs.
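One possible explanation for the inconsistent max results, offered only as an assumption and not as a diagnosis of any particular implementation, is a comparison-based max applied in a different order on different PEs. The sketch below uses a hypothetical helper, fold_cmp_max, to show that the same set of values can fold to different results depending on order.

```c
#include <math.h>
#include <stddef.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: left-to-right fold of n >= 1 values using the
 * comparison-based MAX macro rather than fmax. */
static double fold_cmp_max(const double *vals, size_t n)
{
    double acc = vals[0];
    for (size_t i = 1; i < n; i++)
        acc = MAX(acc, vals[i]);   /* any comparison with NaN is false */
    return acc;
}
```

Folding {1.0, NAN, 2.0} yields 2.0, whereas folding the same values as {2.0, 1.0, NAN} yields NaN. Replacing MAX with fmax makes both orders return 2.0.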

manjugv added this to the OpenSHMEM 1.6 milestone Jun 25, 2021

manjugv assigned nspark Jun 25, 2021

nspark mentioned this issue Mar 23, 2022

Add inclusive and exclusive scan (prefix sum) operations #488

Merged

kwaters4 mentioned this issue Aug 30, 2024

reductions: fix text for associative binary ops davidozog/openshmem-specification#12

Closed

4 tasks

jdinan modified the milestones: OpenSHMEM 1.6, OpenSHMEM 1.7 Sep 26, 2024

davidozog self-assigned this Nov 7, 2024
