define_unit override argument #239

rjfarber · 2022-07-08T10:58:11Z

I'm not sure this is fully desirable (or maybe this capability exists and I'm just missing it) but I'm currently in the situation where I messed up a custom defined unit and wanted to simply override the definition (similar to how derived_fields can be redefined).

So I added an optional argument in def define_unit(...,override=False) and don't trigger the raise if override is True. I have to run to a meeting but thought I might as well kick off this discussion rather than spend a lot of time on making this a complete PR right now in case this is not desirable.

Anyway happy Friday!


neutrinoceros

Hi @rjfarber
This looks like a reasonable request to me and your implementation is sound. I'll take a deeper look tomorrow to make sure this is the simplest way to do it (but it seems likely).
For now I just have a couple minor suggestions:

- force_override or allow_override would make slightly more sense to me as an argument name, because override=True reads as something that's never a no-op.
- I'd like any new argument to be keyword-only, because that's much easier for us to maintain in the future.
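To illustrate the keyword-only suggestion, here is a toy signature (not the real `define_unit`, whose parameters are not reproduced here): a bare `*` in the parameter list forces everything after it to be passed by name.

```python
def define_unit(symbol, value, *, allow_override=False):
    """Toy signature only: everything after the bare `*` is keyword-only."""
    return symbol, value, allow_override


define_unit("r_cloud", 3.0, allow_override=True)  # accepted

try:
    define_unit("r_cloud", 3.0, True)  # positional use is rejected
except TypeError as exc:
    error_message = str(exc)
```

Because callers must spell out the name, the argument can later be renamed, reordered, or joined by other options without silently changing the meaning of existing calls.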

neutrinoceros · 2022-07-12T16:45:58Z

So I don't think there's any documented/supported way to override pre-defined units at the moment.
Within my admittedly limited understanding of the code involved, I also cannot foresee any fundamental problem with this approach.
Here's what I could grasp: under the hood, this function updates a UnitRegistry instance, which is a relatively flat data structure: I don't think that it preserves a hierarchy of dependencies between units. My conclusion is that there should be no risk of getting stuck with a corrupt state with cyclic references and whatnot.

If you want to pursue this, feel free to. Most importantly we would need some documentation and tests.

jzuhone · 2022-07-15T19:16:54Z

So the UnitRegistry class has a modify method:

unyt/unyt/unit_registry.py

Lines 199 to 229 in 10ab887

    
    def modify(self, symbol, base_value):
        """
        Change the base value of a unit symbol.  Useful for adjusting code
        units after parsing parameters.

        Parameters
        ----------
        symbol : str
           The name of the symbol to modify
        base_value : float
           The new base_value for the symbol.
        """
        self._unit_system_id = None
        if symbol not in self.lut:
            raise SymbolNotFoundError(
                "Tried to modify the symbol '%s', but it does not exist "
                "in this registry." % symbol
            )
        if hasattr(base_value, "in_base"):
            new_dimensions = base_value.units.dimensions
            base_value = base_value.in_base("mks")
            base_value = base_value.value
        else:
            new_dimensions = self.lut[symbol][1]
        self.lut[symbol] = (float(base_value), new_dimensions) + self.lut[symbol][2:]

I'm not sure if this does what you want it to or not.

neutrinoceros · 2022-07-16T07:46:51Z

Oh I missed that, thanks for pointing it out !
So UnitRegistry.modify only works for previously defined units, while define_unit currently only works for adding new units. I think making define_unit the go-to function to do both is a good idea and solidifies the API.

unyt.define_unit is a top-level, user-facing function that encapsulates lower-level concepts such as UnitRegistry management, so it may make sense to plug a call to UnitRegistry.modify into the override implementation.
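One way this could look, sketched with a heavily simplified, hypothetical `ToyRegistry` rather than the real `UnitRegistry` (the `add` method, the two-element `lut` entries, and the `KeyError` are all assumptions made for the example):

```python
class ToyRegistry:
    """Hypothetical, heavily simplified stand-in for UnitRegistry."""

    def __init__(self):
        self.lut = {}  # symbol -> (base_value, dimensions)

    def add(self, symbol, base_value, dimensions):
        self.lut[symbol] = (float(base_value), dimensions)

    def modify(self, symbol, base_value):
        # Only works for symbols that already exist, like the quoted method.
        if symbol not in self.lut:
            raise KeyError(symbol)
        self.lut[symbol] = (float(base_value),) + self.lut[symbol][1:]


def define_unit(symbol, base_value, dimensions, registry, *, allow_override=False):
    if symbol in registry.lut:
        if not allow_override:
            raise RuntimeError(f"Unit symbol '{symbol}' already exists")
        # Existing unit: delegate to modify (this toy keeps the old dimensions).
        registry.modify(symbol, base_value)
    else:
        registry.add(symbol, base_value, dimensions)  # brand new unit


reg = ToyRegistry()
define_unit("m_cloud", 1.0, "mass", reg)
define_unit("m_cloud", 2.5, "mass", reg, allow_override=True)
```

The user only ever calls `define_unit`; whether the registry is extended or modified is decided internally.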

jzuhone · 2022-07-16T12:40:40Z

I guess I have a question about the use case--@rjfarber, if you messed up the unit, is there a reason we don't just fix the unit in the script and then re-run? Or is there a reason we should start with an incorrect value and then change it?

I'm only a bit hesitant to introduce this functionality because in normal cases we actually don't want to redefine units--a kpc is going to be a kpc, and if the standard changes then we change the hard-coded value instead. UnitRegistry.modify is there mainly for code units. I worry making it too easy to change units could lead to unpredictable behavior.

However, maybe I am not understanding the use case well enough.

rjfarber · 2022-07-16T21:21:07Z

Hi John,

Regarding the use case: if you're running in a Jupyter notebook and already have a decent amount of data loaded (which can take minutes), it's rather annoying to shut down the script, reload the data, etc. It really slows down the development process. I ran into this when dealing with non-dimensional units (cloud radii, cloud mass, etc.), which is a problem for dimensionalized codes like FLASH as much as for code-unit codes like Athena.

It does sound like UnitRegistry.modify can handle this case, but I agree with Clément that it'd be nicer to have it all under define_unit. Although I'll definitely defer to the judgement of you two and the other long-time yt devs here :)

Regarding making it too easy to change units: if the user decides to use a "force_override" method, hopefully they know what they're doing. Personally I'm a fan of giving the user absolute power to `sudo rm -rf /` vs. coddling the user, but that gets philosophical, I'll admit :)

Also, sorry everyone for being MIA! I'll plan to return to this tomorrow :)

Best,
Ryan


neutrinoceros · 2022-07-16T21:28:03Z

Furthermore, giving that power to users is perfectly in line with how yt.derived_field works. One could argue that yt and unyt don't have to follow the exact same design principles but they do share a significant fraction of their users ...

jzuhone · 2022-07-19T14:47:11Z

@rjfarber @neutrinoceros ok, so I am now convinced of the utility of this (and thanks for the specific Jupyter example, I totally get it).

so I actually think what would be preferable in this situation is a new function, modify_unit, that wraps UnitRegistry.modify and does whatever else needs to be done. It's an extra function, but the name also makes clear what's happening and we can limit what's going on inside of it to unit modification.

What do you guys think? One issue that might be tricky to solve (as I discovered recently myself) is making sure that when you modify a unit it propagates to the units that you can get from the top level, i.e. from unyt import Zsun.

I'm happy to take a crack at this if you're interested, or you can feel free to do so!

neutrinoceros · 2022-07-19T15:01:42Z

> so I actually think what would be preferable in this situation is a new function, modify_unit, that wraps UnitRegistry.modify and does whatever else needs to be done. It's an extra function, but the name also makes clear what's happening and we can limit what's going on inside of it to unit modification.
> What do you guys think?

TBH I'm not a fan: it'd be less discoverable, and even if you know about it, you'd need to change your code to run it again.
In the context of a Jupyter notebook, you want to be able to write code that runs without interrupting your workflow, no matter how many times you (re)run the cell (at least, ideally). I think that's only achieved with an allow_override-like argument.
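The re-run argument can be made concrete with a toy comparison. Both functions below are hypothetical sketches over a plain dict, not unyt code: a cell built on `modify_unit` alone fails in a fresh kernel, while a cell using `allow_override=True` behaves the same on every execution.

```python
registry = {}  # hypothetical stand-in: symbol -> base value


def define_unit(symbol, value, *, allow_override=False):
    if symbol in registry and not allow_override:
        raise RuntimeError(f"'{symbol}' already exists")
    registry[symbol] = value


def modify_unit(symbol, value):
    if symbol not in registry:
        raise KeyError(f"'{symbol}' does not exist")
    registry[symbol] = value


def notebook_cell():
    """Body of a cell the user may execute any number of times."""
    define_unit("r_cloud", 3.0, allow_override=True)


notebook_cell()  # first run: defines the unit
notebook_cell()  # re-run: redefines it, no code edit needed

try:
    modify_unit("m_cloud", 1.0)  # fresh kernel: nothing to modify yet
except KeyError:
    first_run_failed = True
```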

> What do you guys think? One issue that might be tricky to solve (as I discovered recently myself) is making sure that when you modify a unit it propagates to the units that you can get from the top level, i.e. from unyt import Zsun

Indeed, this sounds like a major challenge, regardless of the approach. Unit objects are intentionally immutable, so I guess you'd need to replace any existing one with a new one?
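A toy illustration of the propagation problem, assuming (this is not verified against unyt here) that a unit object copies its base value when it is created. `ToyUnit`, the `lut` dict, and the numeric values are all hypothetical:

```python
from dataclasses import dataclass

lut = {"Zsun": 0.013}  # hypothetical registry entry: symbol -> base value


@dataclass(frozen=True)
class ToyUnit:
    """Immutable unit that copies its base value at construction time."""

    symbol: str
    base_value: float


# Analogue of a module-level object obtained via `from unyt import Zsun`.
Zsun = ToyUnit("Zsun", lut["Zsun"])

lut["Zsun"] = 0.02  # modify the registry entry afterwards

stale = Zsun.base_value != lut["Zsun"]  # the existing object did not follow
Zsun = ToyUnit("Zsun", lut["Zsun"])     # so it has to be replaced wholesale
```

Any name that was imported before the modification still points at the old object, which is why replacing objects in one place may not be enough.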

jzuhone · 2022-09-07T01:14:50Z

@rjfarber I haven't forgotten about this, sorry for the delay. I think in a couple of weeks we should try getting it merged in

rjfarber · 2022-09-07T16:06:16Z

Okay cool! Did you and Clément come to a decision/understanding about an "allow_override" argument vs. a new modify_unit function for the implementation?

Best,
Ryan


neutrinoceros · 2022-09-08T12:16:49Z

We haven't discussed it in other channels. I'm still inclined to think an override-like argument to an existing function is the best option.

jzuhone · 2022-09-14T19:32:26Z

@rjfarber so I've come around to this idea--but I would like to see a couple of things (potentially) added here:

- It would be nice if we could indicate that a unit is being overridden somehow. But we don't have a logger, and I am hesitant to add one, especially since we don't have more use cases. And we shouldn't use a print statement or issue a warning either. So not sure if there is a good way to log this, open to suggestions.
- We just need a short bit in the docs about this, one or two sentences is probably fine. If you need help with that let me know.

neutrinoceros · 2022-09-14T19:45:47Z

> we don't have a logger, and I am hesitant to add one, especially since we don't have more use cases. And we shouldn't use a print statement or issue a warning either. So not sure if there is a good way to log this, open to suggestions.

Agreed that adding a logger for this seems like a long shot, and a plain print statement would be terrible style, but what's wrong with a warning ? If you fear this would be too noisy, I can see two ways we could add a warning and keep it "quiet" in most cases:

- we could subclass (or use) the builtin ResourceWarning, which isn't shown by default but will be treated as an error in CI here and downstream, greatly reducing the risk that this ends up in production code by accident: https://docs.python.org/3/library/exceptions.html#ResourceWarning
- or we could add another option to silence the warning. I must say I'm not a fan of this option.
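A sketch of the first option, assuming a hypothetical `UnitOverrideWarning` category and a dict-based toy registry (neither is part of unyt). Python's default filters ignore `ResourceWarning` and its subclasses, while a CI-style "error" filter turns the same call into an exception:

```python
import warnings


class UnitOverrideWarning(ResourceWarning):
    """Hypothetical category; inherits ResourceWarning's default filtering."""


registry = {"r_cloud": 1.0}


def define_unit(symbol, value, *, allow_override=False):
    if symbol in registry:
        if not allow_override:
            raise RuntimeError(f"'{symbol}' already exists")
        warnings.warn(
            f"overriding existing unit '{symbol}'",
            UnitOverrideWarning,
            stacklevel=2,
        )
    registry[symbol] = value


# CI-style configuration: escalate every warning to an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        define_unit("r_cloud", 3.0, allow_override=True)
    except UnitOverrideWarning:
        caught_in_ci = True  # the override never reaches the registry
```

Under the "error" filter the exception is raised before the assignment, so the registry keeps its previous value.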

rjfarber · 2022-09-15T09:52:00Z

Yup, I'll definitely add to the docs!

Regarding adding some indication to the user that an override is taking place, I agree a print statement would not be ideal. I'm cool with Clément's suggestion of using a warning.

I guess you both will be horrified to hear this, but one of my favorite stackoverflow answers is:

> I don't condone it, but you could just *suppress all warnings* with this:
>
>     import warnings
>     warnings.filterwarnings("ignore")

(https://stackoverflow.com/questions/14463277/how-to-disable-python-warnings)

Best,
Ryan
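For comparison, a narrower alternative to a process-wide filter is to silence warnings only around the call that emits them. This sketch uses the standard library's `warnings.catch_warnings`; the `noisy` function is a made-up placeholder:

```python
import warnings


def noisy():
    warnings.warn("something questionable happened", UserWarning)
    return 42


# Silence warnings only inside this block; filters are restored on exit.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = noisy()
```

Outside the `with` block the previous filters apply again, so unrelated warnings stay visible.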


neutrinoceros · 2022-09-15T10:18:19Z

> I guess you both will be horrified to hear this, but one of my favorite stackoverflow answers is:
>
> I don't condone it, but you could just suppress all warnings with this:

First of all: 😱
Second of all: we're well aware that's a thing, and we'd prefer to not put our users in a position where they would need or want this.


rjfarber · 2022-09-16T14:55:27Z

Don't worry, I mostly do that because of matplotlib, I think 😅

Anyway, I made modifications to the define_unit docstring, and now I think I understand John's concern better: define_unit was intended for creating brand new units rather than modifying them, so I had to change the wording kind of repetitively.

I thought this modification had worked in my original use case, but I'm not sure where that script is. And when trying to write a non-Jupyter example (to add to the examples in the docstring), it doesn't seem like my unit actually gets updated 🤔 Also, tests/define_unit_test.py is now failing on the "Test custom registry" part.

Just thought I'd update you all that I'm back at this! Hopefully I can figure out the above problems with fresh eyes tomorrow :)

Best,
Ryan


Commit 966319a: starting discussion on allowing override for unit existing in registry. Not nearly a complete PR since I'm not fully sure this is desirable beyond me (and possibly I should be doing this another way...)

neutrinoceros reviewed Jul 10, 2022


Commit 1f20977: Added a temporary test file I'd like to delete and a modification to the official test_define_unit.py one. Modified docstring and removed an if in define_unit. Commiting to push and then reinstall to test.
