Clarify that clip() behavior is undefined when min or max is outside the bounds of x #814

asmeurer · 2024-06-13T22:30:22Z

As discussed in today's consortium meeting.

See the discussion at numpy/numpy#24976.

…the bounds of x As discussed in today's consortium meeting. See the discussion at numpy/numpy#24976.

kgryte · 2024-09-19T06:14:54Z

@asmeurer You opened this PR, but, based on your comments in numpy/numpy#24976 (comment), it is not clear whether you still support making this change. Are you okay with this PR moving forward, assuming that the current standardized behavior remains the same (i.e., no type promotion)?

asmeurer · 2024-09-19T19:12:37Z

Yes, I think this text is especially important given that there is no promotion. If there were promotion this text would serve no purpose, because x would just promote to the common data type and there would be no "out-of-bounds" values.

fancidev · 2024-10-22T02:35:04Z

I find another corner case that seems not elaborated by the spec. Suppose x is float32, min and max are float64, min < max, but min and max are so close together that they are between two adjacent float32 numbers. Then there is no float32 number whose value is between min and max. The spec doesn’t specify the behavior in this case.

Maybe this and the case min > max could be combined into one rule:

”If there exists no element in the data type of x whose value is greater than or equal to min and less than or equal to max, the behavior is undefined.”

I’d also like to take the chance to try clarify the precise meaning of “clamp … x to the range [min, max]”, which seems a bit vague to me. Something like the following might be helpful:

”If the value of min and max are exactly representable by the data type of x and min <= max, then clip(x, min, max) has the same value as maximum(minimum(x, max), min) and the same type as x.”

If this definition is deemed undesirable because some implementation imposed a stricter requirement, the following more strict version might help:

”If min and max have the same data type as x and min <= max, then clip(x, min, max) is equivalent to maximum(minimum(x, max), min). Otherwise, the behavior is implementation defined.”

Btw, the parameter doc of min and max seems to contain a typo: x1 should be x if I read correctly.

asmeurer · 2024-10-22T18:15:32Z

The more general issue with a min/max float64 is that rounding might work against you when downcasting to float32. This does not require min and max to be near each other, but rather just near x. See numpy/numpy#24976 (comment). (this is why I personally believe the non-promoting rule in clip was a bad idea. This sort of thing is exactly why type promotion behavior exists)

”If min and max have the same data type as x and min <= max, then clip(x, min, max) is equivalent to maximum(minimum(x, max), min). Otherwise, the behavior is implementation defined.”

Only defining clip for the case where x, min, and max have the same data type would be a significant change from the existing text, and probably not what people would want.

fancidev · 2024-10-22T21:42:22Z

The referenced numpy issue is an interesting one. It shows that Array API conformant semantics cannot be implemented by promotion followed by downcasting.

A general definition of clip that is consistent with the currently proposed semantics could be the following:

Let T be the data type of x. If there exists a value in T that is greater than or equal to min and less than or equal to max, then clip(x, min, max) returns (1) the smallest element in T that is greater than or equal to min if x < min, (2) the largest element in T that is less than or equal to max if x > max, or (3) x otherwise.

In math notation, this becomes

$$ \mathrm{clip}(x,a,b):= {\arg\min}_{ t \in T\cap [ a , b]} | t-x | $$

This definition consistently accounts for any mixture of data type. It is certainly implementable, but since some implementations already impose a stricter requirement on the input types, it may not be suitable for a standard. That’s why I proposed two more restrictive versions in the previous comment, one assuming min and max are representable in T, and one assuming min and max have the type T.

src/array_api_stubs/_2023_12/elementwise_functions.py

src/array_api_stubs/_draft/elementwise_functions.py

kgryte · 2024-10-31T05:25:09Z

this is why I personally believe the non-promoting rule in clip was a bad idea. This sort of thing is exactly why type promotion behavior exists

As a general principle, yes, but clip doesn't fall neatly into the same category as, e.g., add, mul, etc, where type promotion makes more intuitive sense.

If we wanted to allow type promotion wiggle room for some libraries while also allowing other libraries to make the reasonable choice of raising an exception for undefined behavior, we could limit portable clip behavior to only the scenario in which x, min, and max all have the same data type. We'd need to tweak the language of the spec a bit to open the door for implementing libraries to implement non-portable behavior without running afoul of the standard.

Were we to circumscribe portable behavior, this would require downstream libraries building on the Array API to be explicit about what behavior they want (e.g., by wrapping x in astype or min and max in astype). E.g., if they always want type promotion semantics, then use result_type to determine the desired output array dtype and then use astype on x, min, and max. If they want min and max to always be casted to the dtype of x (either upcast or downcast), then use astype on min and max. Et cetera.

Clarify that clip() behavior is undefined when min or max is outside …

5f2bb2e

…the bounds of x As discussed in today's consortium meeting. See the discussion at numpy/numpy#24976.

asmeurer mentioned this pull request Jun 13, 2024

ENH: Ensure that output of np.clip has the same dtype as the main array numpy/numpy#24976

Open

kgryte added this to the v2024 milestone Sep 19, 2024

kgryte added Narrative Content Narrative documentation content. topic: Type Promotion Type promotion. Needs Discussion Needs further discussion. labels Sep 19, 2024

kgryte reviewed Oct 31, 2024

View reviewed changes

src/array_api_stubs/_2023_12/elementwise_functions.py Outdated Show resolved Hide resolved

kgryte reviewed Oct 31, 2024

View reviewed changes

src/array_api_stubs/_draft/elementwise_functions.py Outdated Show resolved Hide resolved

Apply suggestions from code review

55d2a19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarify that clip() behavior is undefined when min or max is outside the bounds of x #814

Clarify that clip() behavior is undefined when min or max is outside the bounds of x #814

asmeurer commented Jun 13, 2024

kgryte commented Sep 19, 2024

asmeurer commented Sep 19, 2024

fancidev commented Oct 22, 2024 •

edited

Loading

asmeurer commented Oct 22, 2024

fancidev commented Oct 22, 2024 •

edited

Loading

kgryte commented Oct 31, 2024 •

edited

Loading

Clarify that clip() behavior is undefined when min or max is outside the bounds of x #814

Are you sure you want to change the base?

Clarify that clip() behavior is undefined when min or max is outside the bounds of x #814

Conversation

asmeurer commented Jun 13, 2024

kgryte commented Sep 19, 2024

asmeurer commented Sep 19, 2024

fancidev commented Oct 22, 2024 • edited Loading

asmeurer commented Oct 22, 2024

fancidev commented Oct 22, 2024 • edited Loading

kgryte commented Oct 31, 2024 • edited Loading

fancidev commented Oct 22, 2024 •

edited

Loading

fancidev commented Oct 22, 2024 •

edited

Loading

kgryte commented Oct 31, 2024 •

edited

Loading