Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support the online correlator in the cli #263

Open
harveydevereux opened this issue Aug 9, 2024 · 3 comments
Open

Support the online correlator in the cli #263

harveydevereux opened this issue Aug 9, 2024 · 3 comments
Assignees
Labels
enhancement New/improved feature or request

Comments

@harveydevereux
Copy link
Collaborator

We can support requesting the built-in correlations from the cli, so that it is not just an interface for direct python users. I'll need to look into how this is done elsewhere in janus.

We could have user composeable correlations using dicts?

--correlate {a: stress_xy, b: stress_xy, blocks: 1, points: 10, ...}

which should be easily translatable.

Or even things simply like

--vaf

but that may clash with post-process?


Questions

  • How can we handle correlator parameters for cli requested correlations? It looks like we can parse dicts from cli.
  • How far shall we go, do we want "high-level" things like the VAF, SAF (stress auto-correlation function) etc. or allow for more complex arguments like above so a user may compose observables together. Or both.
@harveydevereux harveydevereux self-assigned this Aug 9, 2024
@ElliottKasoar ElliottKasoar added the enhancement New/improved feature or request label Aug 9, 2024
@ElliottKasoar
Copy link
Member

Questions

  • How can we handle correlator parameters for cli requested correlations? It looks like we can parse dicts from cli.

Yes, there's a TyperDict class that we currently use, which allows dictionaries to be passed as --option "{'key': value}", which you can see examples of for all kwargs. Note that they need an extra level parsing via janus_core.cli.utils.parse_typer_dicts to convert them to actual dictionaries.

With configs, these are also parsed as expected with nesting e.g.

key1:
  key2: value

I would image this is the most flexible thing to use, and means we don't pollute the options too much, but there may be other alternatives.

  • How far shall we go, do we want "high-level" things like the VAF, SAF (stress auto-correlation function) etc. or allow for more complex arguments like above so a user may compose observables together. Or both.

With a kwargs input as described above, is there anything we couldn't do, in general?

The only immediate limitation I can think of is of the observables, which I think will be limited to named functions that we look up, as we've discussed previously, but in theory anything that the function takes should be generally passable through (potentially nested) kwargs, shouldn't it?

@harveydevereux
Copy link
Collaborator Author

With a kwargs input as described above, is there anything we couldn't do, in general?

The only immediate limitation I can think of is of the observables, which I think will be limited to named functions that we look up, as we've discussed previously, but in theory anything that the function takes should be generally passable through (potentially nested) kwargs, shouldn't it?

We will have to limit to the named built-ins. It can all be done from parsing the kwargs there is no problem. Just this could lead to (powerful) but extremely long cli agrument inputs though e.g.

--correlation_kwargs "[{'a': ShearStress, 'b': ShearStress, 'name': 'shearstress', 'blocks': 4, 'points': 100, 'averaging': 2, 'update_frequency': 10}, {'a': Velocity, 'b': Velocity, 'name': 'vaf', 'blocks': 1, 'points': 5000, 'averaging': 2, 'update_frequency': 2}, ...]"

Looking at the cli examples most things are very simple, --arch mace_mp requires only a little explanation for example.

I suppose I'm asking is there any appetite for common shorthands?

That could be fraught with difficulties so maybe it is best left verbose. But E.g. if someones wants the VAF for 5 ps there could be some special option --vaf 5 or --vaf "{'max_lag': {5, 'units': 'ps'}}" or something. It could even maybe have an option to do post-process or online.

Max correlation lag is of course all controllable from the parameters blocks, points, averaging and update_frequency. And I suppose this complicates things more for cli users, so maybe I should not worry about it.

But looking at the cli examples most things are very simple, --arch mace_mp requires only a little explanation for example.

One thing that would help is perhaps sensible defaults for the parameters, which I assume can be done if the dict if partial, and also only needing to specify 'a': xxx for an auto-correlation.

@ElliottKasoar
Copy link
Member

We will have to limit to the named built-ins. It can all be done from parsing the kwargs there is no problem. Just this could lead to (powerful) but extremely long cli agrument inputs though e.g.

--correlation_kwargs "[{'a': ShearStress, 'b': ShearStress, 'name': 'shearstress', 'blocks': 4, 'points': 100, 'averaging': 2, 'update_frequency': 10}, {'a': Velocity, 'b': Velocity, 'name': 'vaf', 'blocks': 1, 'points': 5000, 'averaging': 2, 'update_frequency': 2}, ...]"

Is there an alternative? In theory I suppose you could create an interface for the observables that allows them to be passed via a tuple/list or something as well as their keywords, to simplify the input if you pass them in the correct order (along the lines of ("ShearStress", 'shearstress', 4, 100, 2, 10)), but if all of that information is needed, I'm not sure there's a way around something relatively complicated. A default name might be inferable though.

Looking at the cli examples most things are very simple, --arch mace_mp requires only a little explanation for example.

I suppose I'm asking is there any appetite for common shorthands?

That could be fraught with difficulties so maybe it is best left verbose. But E.g. if someones wants the VAF for 5 ps there could be some special option --vaf 5 or --vaf "{'max_lag': {5, 'units': 'ps'}}" or something. It could even maybe have an option to do post-process or online.

Max correlation lag is of course all controllable from the parameters blocks, points, averaging and update_frequency. And I suppose this complicates things more for cli users, so maybe I should not worry about it.

But looking at the cli examples most things are very simple, --arch mace_mp requires only a little explanation for example.

One thing that would help is perhaps sensible defaults for the parameters, which I assume can be done if the dict if partial, and also only needing to specify 'a': xxx for an auto-correlation.

I wrote the above before I read the rest of this... I think we can include shorthands/defaults where it makes sense, but I think at least for now they should be nested within correlation_kwargs. If, within correlation_kwargs, shorthands make sense, that's fine with me, but I can't think of anything that would make sense as a top level option that wouldn't also make sense within correlation_kwargs.

Ideally I think we want to be able to control as much as possible, as generally we end up coming back to add it later if we don't immediately, but that doesn't mean they all always need to be specified.

I think we also have to accept that this is inevitably more complicated than some other options, but yes defaults and shorthands might simplify some things.

I tend to unpack the dicts when they go to their relevant Python function, so none of them ever have to be complete at the moment - they almost provide a way to update defaults.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New/improved feature or request
Projects
None yet
Development

No branches or pull requests

2 participants