Handle IEEE Floating Point Special Values #105
Conversation
Question: why not have the check be `if (v*y != 0)` etc.?
The zero handling affects the space complexity of reverse mode. Shouldn't we keep that and instead have the code do both the product check `v*y != 0` and the `-1` trick?
The CI is stuck due to the GitHub Actions setup using an unsupported Ubuntu version. I've pushed a commit to
Done, thank you! I am confused by what is going on with the CI on ghc-9.0.2, though. It looks like it cannot infer the
We are. It used to be
I am not sure what you mean by this. The space complexity should be unchanged, and I have not seen any indication of it being different. The tape is still the same (except for containing `-1` indices for unused operands).
It looks like the failure on ghc-9.0.2 is related to the simplified subsumption change and can be worked around via eta-expansion. Then ghc-9.2.4 introduces the
```diff
 {
-    buffer[i] += v*x;
+    double x = v * pTape->val[idx*2];
+    if (x != 0) buffer[i] += x;
```
Oh, I see what you mean; I didn't realize this new `x` was the old `x * v` stuff. I was just not reading the code correctly, thanks for clarifying!
Alright, it has been a few weeks; what is the verdict on this? Is more feedback needed? Is more work needed on my part? It would be nice if some people using the
Sorry for forgetting about this; it dropped off my radar. A handful of observations:
I think this change will not severely impact performance, since branches shouldn't be the bottleneck.
OTOH, I could be missing something.
Sorry for the delay; the last few weeks have been very busy for me.
I am not 100% sure, but I believe that this is due to the function in my application having more independent parts. Although the Jacobian in my application does not contain any zeroes (meaning that each parameter can influence every entry in the output vector), there are still subexpressions that arise as part of the calculation and only influence a subset of the entries in the output vector. Thus, the partial derivatives of some output entries with respect to these subexpressions can be zero. Since these zero entries are no longer skipped entirely, the performance is impacted. My guess is that this occurs to a lesser degree in the `BlackScholes` benchmark. It might be a good idea to add a more decomposable benchmark, but I currently do not have the time to do this.
It could be a good idea, but I will leave that up to you, since it is mainly a matter of library design and maintenance burden considerations.
Thanks, that information is helpful to know. We don't really have a good way to gauge whether there are any other users of this. That being said, it would be nice to track this fast-math idea in an issue so that we can revisit it later if someone comes along and asks for a more performant version.
I'd be happy to merge this afterwards. Thanks!
Done and done. Once this is merged, I can also reference the relevant code location from #106 so that it is easy to find the starting point for working on this issue.
Thanks, @julmb! Let me know if you'd like a new release.
Thank you, and I'm not in any rush, no worries.
Alright, it's been a while, and I could actually use a new release with these changes, as my test suite fails without them. Would it be possible to do a release on Hackage?
Sorry for the delay on this! I've uploaded a new version to Hackage.
No worries, and thank you very much!
This PR adds several changes primarily aimed at addressing issue #104.
Expand Test Suite

I added tests for the combinators `diff`, `grad`, `jacobian`, and `hessian`. These include both basic correctness tests and specific regression tests for issue #104. Some of the tests were taken directly from the package documentation. For instance, the module `Numeric.AD.Mode.Reverse` lists several examples for its combinators.

The test suite is currently instantiated for the modes `Numeric.AD.Mode.Reverse` and `Numeric.AD.Mode.Reverse.Double`, but it could easily be extended to cover the other modes. However, there are failing tests in some of the other modes, and I am not sure whether adjusting those should be part of this pull request. I am also not sure if or when I will have time to look at them, so the test suite is restricted to `Numeric.AD.Mode.Reverse` and `Numeric.AD.Mode.Reverse.Double` for now.

It is worth noting that with the current setup, it is not possible to test both the pure and the `+ffi` versions of `Numeric.AD.Mode.Reverse.Double` at the same time, since they are implemented within the same module and activated via a preprocessor switch. This is especially troublesome because the trickier `+ffi` version is not active by default. It would be nice if the `+ffi` version could be its own module, but I do not know the rationale behind the original preprocessor switch design.

Adjust `+ffi` Version of `Numeric.AD.Mode.Reverse.Double`
There are three changes.

1. As described in #104 (ReverseDouble with `+ffi` produces different result for square root of product), the current implementation uses index `0` and value `0.0` for the unused second operand in tape entries corresponding to unary operations. I changed this to use index `-1` so that these unused operands can be distinguished from actual operands with index `0` and value `0.0`. This is a pure refactoring and should not change any behavior.
2. Instead of checking whether an operand's value is `0.0` in order to decide whether to skip that operand during backpropagation, the code now checks whether the index is `>= 0`. This allows tape entries to be processed properly in the presence of `NaN` and `Infinity` values and makes the tests under `tests.reverse-double.issue-104.inside` pass. However, it no longer skips over entries where the backpropagation really would not have an effect on the partial derivative.
3. Instead of checking whether the corresponding partial derivative is `0.0`, the code now checks whether the increments to each of the partial derivatives are `0.0`. This allows correct processing of `NaN` and `Infinity` values and makes the tests under `tests.reverse-double.issue-104.outside` pass. It also restores some of the skipping lost in change 2. However, there is still a performance impact compared to the original behavior of skipping the entire update if the corresponding partial derivative is `0.0`.
For the `bench/BlackScholes.hs` benchmark (adjusted for `Numeric.AD.Mode.Reverse.Double`), I was not able to measure any performance impact from these changes. For two of my own applications, I measured runtime increases by factors of 1.07 and 1.21, respectively.