Support the online correlator in the cli #263

harveydevereux · 2024-08-09T15:26:45Z

We can support requesting the built-in correlations from the cli, so that it is not just an interface for direct python users. I'll need to look into how this is done elsewhere in janus.

We could have user composeable correlations using dicts?

--correlate {a: stress_xy, b: stress_xy, blocks: 1, points: 10, ...}

which should be easily translatable.

Or even things simply like

--vaf

but that may clash with post-process?

Questions

How can we handle correlator parameters for cli requested correlations? It looks like we can parse dicts from cli.
How far shall we go, do we want "high-level" things like the VAF, SAF (stress auto-correlation function) etc. or allow for more complex arguments like above so a user may compose observables together. Or both.

The text was updated successfully, but these errors were encountered:

ElliottKasoar · 2024-08-13T16:08:11Z

Questions

How can we handle correlator parameters for cli requested correlations? It looks like we can parse dicts from cli.

Yes, there's a TyperDict class that we currently use, which allows dictionaries to be passed as --option "{'key': value}", which you can see examples of for all kwargs. Note that they need an extra level parsing via janus_core.cli.utils.parse_typer_dicts to convert them to actual dictionaries.

With configs, these are also parsed as expected with nesting e.g.

key1:
  key2: value

I would image this is the most flexible thing to use, and means we don't pollute the options too much, but there may be other alternatives.

How far shall we go, do we want "high-level" things like the VAF, SAF (stress auto-correlation function) etc. or allow for more complex arguments like above so a user may compose observables together. Or both.

With a kwargs input as described above, is there anything we couldn't do, in general?

The only immediate limitation I can think of is of the observables, which I think will be limited to named functions that we look up, as we've discussed previously, but in theory anything that the function takes should be generally passable through (potentially nested) kwargs, shouldn't it?

harveydevereux · 2024-08-14T08:25:17Z

With a kwargs input as described above, is there anything we couldn't do, in general?

The only immediate limitation I can think of is of the observables, which I think will be limited to named functions that we look up, as we've discussed previously, but in theory anything that the function takes should be generally passable through (potentially nested) kwargs, shouldn't it?

We will have to limit to the named built-ins. It can all be done from parsing the kwargs there is no problem. Just this could lead to (powerful) but extremely long cli agrument inputs though e.g.

--correlation_kwargs "[{'a': ShearStress, 'b': ShearStress, 'name': 'shearstress', 'blocks': 4, 'points': 100, 'averaging': 2, 'update_frequency': 10}, {'a': Velocity, 'b': Velocity, 'name': 'vaf', 'blocks': 1, 'points': 5000, 'averaging': 2, 'update_frequency': 2}, ...]"

Looking at the cli examples most things are very simple, --arch mace_mp requires only a little explanation for example.

I suppose I'm asking is there any appetite for common shorthands?

That could be fraught with difficulties so maybe it is best left verbose. But E.g. if someones wants the VAF for 5 ps there could be some special option --vaf 5 or --vaf "{'max_lag': {5, 'units': 'ps'}}" or something. It could even maybe have an option to do post-process or online.

Max correlation lag is of course all controllable from the parameters blocks, points, averaging and update_frequency. And I suppose this complicates things more for cli users, so maybe I should not worry about it.

But looking at the cli examples most things are very simple, --arch mace_mp requires only a little explanation for example.

One thing that would help is perhaps sensible defaults for the parameters, which I assume can be done if the dict if partial, and also only needing to specify 'a': xxx for an auto-correlation.

ElliottKasoar · 2024-08-16T11:01:17Z

We will have to limit to the named built-ins. It can all be done from parsing the kwargs there is no problem. Just this could lead to (powerful) but extremely long cli agrument inputs though e.g.

--correlation_kwargs "[{'a': ShearStress, 'b': ShearStress, 'name': 'shearstress', 'blocks': 4, 'points': 100, 'averaging': 2, 'update_frequency': 10}, {'a': Velocity, 'b': Velocity, 'name': 'vaf', 'blocks': 1, 'points': 5000, 'averaging': 2, 'update_frequency': 2}, ...]"

Is there an alternative? In theory I suppose you could create an interface for the observables that allows them to be passed via a tuple/list or something as well as their keywords, to simplify the input if you pass them in the correct order (along the lines of ("ShearStress", 'shearstress', 4, 100, 2, 10)), but if all of that information is needed, I'm not sure there's a way around something relatively complicated. A default name might be inferable though.

Looking at the cli examples most things are very simple, --arch mace_mp requires only a little explanation for example.

I suppose I'm asking is there any appetite for common shorthands?

That could be fraught with difficulties so maybe it is best left verbose. But E.g. if someones wants the VAF for 5 ps there could be some special option --vaf 5 or --vaf "{'max_lag': {5, 'units': 'ps'}}" or something. It could even maybe have an option to do post-process or online.

Max correlation lag is of course all controllable from the parameters blocks, points, averaging and update_frequency. And I suppose this complicates things more for cli users, so maybe I should not worry about it.

But looking at the cli examples most things are very simple, --arch mace_mp requires only a little explanation for example.

One thing that would help is perhaps sensible defaults for the parameters, which I assume can be done if the dict if partial, and also only needing to specify 'a': xxx for an auto-correlation.

I wrote the above before I read the rest of this... I think we can include shorthands/defaults where it makes sense, but I think at least for now they should be nested within correlation_kwargs. If, within correlation_kwargs, shorthands make sense, that's fine with me, but I can't think of anything that would make sense as a top level option that wouldn't also make sense within correlation_kwargs.

Ideally I think we want to be able to control as much as possible, as generally we end up coming back to add it later if we don't immediately, but that doesn't mean they all always need to be specified.

I think we also have to accept that this is inevitably more complicated than some other options, but yes defaults and shorthands might simplify some things.

I tend to unpack the dicts when they go to their relevant Python function, so none of them ever have to be complete at the moment - they almost provide a way to update defaults.

harveydevereux self-assigned this Aug 9, 2024

ElliottKasoar added the enhancement New/improved feature or request label Aug 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support the online correlator in the cli #263

Support the online correlator in the cli #263

harveydevereux commented Aug 9, 2024

ElliottKasoar commented Aug 13, 2024

Questions

harveydevereux commented Aug 14, 2024

ElliottKasoar commented Aug 16, 2024

Support the online correlator in the cli #263

Support the online correlator in the cli #263

Comments

harveydevereux commented Aug 9, 2024

Questions

ElliottKasoar commented Aug 13, 2024

Questions

harveydevereux commented Aug 14, 2024

ElliottKasoar commented Aug 16, 2024