Split ERT 3 parameters into separate records #1746

pinkwah · 2021-06-08T15:56:58Z

Issue
Resolves #1720

Todo

Write a test that uploads a set of parameters and then checks that the stored result is correct
Check that all realisations inside an EnsembleRecord contain the same index and contain valid data
~~Check that all parameter EnsembleRecords are of the Mapping[str, Numeric] type rather than Iterable[Numeric], which is another alternative.~~ We handle this case too now.
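
The index check in the second todo item could look roughly like the sketch below. It uses plain dicts as a stand-in for realisation data; `RealisationData` and `common_index` are hypothetical names for illustration, not part of ert3.

```python
from typing import Dict, List, Set, Union

Index = Union[str, int]
# Hypothetical stand-in for one realisation's data: index -> numeric value.
RealisationData = Dict[Index, float]


def common_index(realisations: List[RealisationData]) -> Set[Index]:
    """Return the index shared by all realisations, or raise if they disagree."""
    if not realisations:
        raise ValueError("ensemble record contains no realisations")
    expected = set(realisations[0])
    for number, data in enumerate(realisations):
        if set(data) != expected:
            raise ValueError(f"realisation {number} has a different index")
    return expected
```

Comparing against the first realisation keeps the check linear in the ensemble size and reports which realisation deviates.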

pinkwah · 2021-06-08T16:34:54Z

ert3/storage/_storage.py

markusdregi · 2021-06-10T07:08:04Z

ert3/engine/_record.py

@@ -47,4 +47,5 @@ def sample_record(
        record_name=record_name,
        ensemble_record=ensrecord,
        experiment_name=experiment_name,
+        is_parameter=True,


Why does this have to leak out here? Is this not double configuration in the sense that the record is already registered as a parameter upon initialization of experiment?

ERT Storage is informed of the record containing parameters only once. So in that sense it's not a double configuration.

The alternative is to ask ERT Storage again, which adds an unnecessary request, or to store the information in a global variable. We already know it's a parameter at this point, so being specific here is fine. We also avoid an O(log n) lookup to determine whether it's a parameter.

I agree with Zohar, but I also understand that is_parameter seems to provide the same information twice. What about making it more general by specifying is_compound=True (my favorite word these days! :) ) or just split=True, meaning this record is split into several ones?

I don't like the word "compound" wrt. parameters since it creates this separation between "parameter" and "compound parameter". We don't need this complexity. What you're calling a "compound parameter" I call a "parameter record", ie. a record that contains one or more parameters.

In my naming convention, both ERT2 and ERT3 publish "parameter records". ERT3 contains multiple parameters within the records, while ERT2 only contains 1 per record. ERT Storage only lets you fetch whole "parameter records". If you want to fetch a single parameter, we require you to fetch the entire record and select the column you want.

Using "compound parameter" kind of makes it sound like ERT Storage needs to consider "parameter" records and "compound parameter" records as separate types, with separate logic for each.

Two comments here :)

First, there is a discrepancy between the notion of a record and a parameter in ert3 and ert-storage. The responsibility for handling this discrepancy should reside entirely in ert3.storage. Afterwards, when we have working software, we should align on the language and definition of a record and a parameter. Until then, within the ert3 module (besides ert3.storage), ert3.data.Record should serve as the only definition of what a record is, and there is neither a "compound" nor a "parameter" record there.

Second, upon initialising the experiment ert3 is passing the list of names of all the parameters (or now a dict also containing the keys). That contains the information needed and since the notion of a parameter is not really well-defined in ert3 I would prefer to keep it as much as possible out of the implementation. In particular, it is only used in the export code for now... Also, if we move in the direction currently suggested in #1731 this distinction between a "parameter" or not a "parameter" should disappear again very soon. And then it is also nice to keep the current introduction of the concept as local as possible (currently only for initialising experiment and export).

The distinction between parameters and non-parameters will still be an issue even with #1731. In fact, ERT Storage already uses a similar format for matrix data (what you call "numerical data"). This distinction is very relevant for webviz-ert, which needs logic for visualising parameters. That requires specifying which records contain parameter data (and then also tying them to a parameter prior distribution if it exists).

markusdregi · 2021-06-10T07:09:01Z

ert3/engine/_run.py

    ert3.storage.init_experiment(
        workspace=workspace_root,
        experiment_name=experiment_name,
-        parameters=parameter_names,
+        parameters=parameters,


Should we call it parameter_layout or similar?

Layout isn't descriptive (ie, the word doesn't mean anything by itself, except in GUIs). parameter_records?

parameter_variables?

According to ert3 each element of the structure is providing the name of a record, together with the index of that particular record. That is the currently used notation within ert3 and I think we should stick with that until we agree on something else ™️

record_index makes sense, parameter_index can make sense. parameter or record followed by structure, layout, blueprint, schema or a similar term might also make sense...

markusdregi · 2021-06-10T07:11:52Z

ert3/storage/_storage.py

 ) -> None:
    if experiment_name is None:
        experiment_name = f"{workspace}.{_ENSEMBLE_RECORDS}"

-    _add_numerical_data(workspace, experiment_name, record_name, ensemble_record)
+    if is_parameter:


Would be nice to see whether we could unify the code path for parameters and not parameters a bit more. But that can be left as a refactoring exercise when everything is working :)

The way we upload parameters and non-parameters is in large part due to the concrete issue this PR is solving, and to equinor/ert-storage#126.

xjules · 2021-06-10T07:15:28Z

ert3/storage/_storage.py

-    return _get_numerical_data(workspace, experiment_name, record_name)
+    param_names = _get_experiment_parameters(workspace, experiment_name)
+    if record_name in param_names:
+        ensemble_records = [


I would change the name here, for instance to parameter_instance, record_instance or record_element, to make it clear that we want to build the record from elements.

ert3/storage/_storage.py

markusdregi

This is starting to look good! 👏🏻

markusdregi · 2021-06-21T08:34:08Z

ert3/engine/_run.py

+        indices: Set[Union[str, int]] = set()
+        for record in ensemble_record.records:
+            assert record.index is not None
+            indices |= set(record.index)


I think set.extend reads nicer.

set has no extend. The alternative is indices = indices.union(set(record.index)) which I decided against.

Sorry, I meant update 🤷🏻

Updated 👍
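
For reference, the two in-place spellings discussed in this thread behave the same for this use. The sketch below uses a hypothetical `record_indices` list in place of the `record.index` values from the diff.

```python
from typing import List, Set, Union

# Hypothetical per-record indices, standing in for record.index in the diff.
record_indices: List[List[Union[str, int]]] = [["a", "b"], ["b", "c"], [0]]

with_operator: Set[Union[str, int]] = set()
with_update: Set[Union[str, int]] = set()
for index in record_indices:
    with_operator |= set(index)  # in-place union; the right operand must be a set
    with_update.update(index)    # in-place union; accepts any iterable directly
```

`set.update` avoids building a temporary set per record, which is one reason it can read more cleanly here.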

markusdregi · 2021-06-21T08:34:47Z

ert3/engine/_run.py

+    parameters_config: ert3.config.ParametersConfig,
+) -> List[str]:
+    if record_source[0] == "storage":
+        assert len(record_source) == 2


Should we move source and name extraction to the top of the function body instead of indexing record_source in various places?
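
A minimal sketch of that suggestion, assuming record_source always has exactly two elements. The function name `describe_source` and the "stochastic" namespace are illustrative assumptions, not taken from the diff.

```python
from typing import List


def describe_source(record_source: List[str]) -> str:
    # Unpack once at the top; a wrong length raises ValueError here,
    # which replaces the scattered `assert len(record_source) == 2` checks.
    source, name = record_source
    if source == "storage":
        return f"record {name!r} is read from storage"
    if source == "stochastic":  # assumed namespace, for illustration only
        return f"record {name!r} is sampled"
    raise ValueError(f"unknown source {source!r}")
```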

markusdregi · 2021-06-21T08:36:26Z

ert3/engine/_run.py

+    parameters: Dict[str, List[str]] = {}
+    for input_record in ensemble.input:
+        record_name = input_record.record
+        record_source = input_record.source.split(".")


You don't have to do it in this PR, but I think it is time to create a delimiter variable for this. It is now present in the ert3.storage module as well. Could you create an issue if you leave it unresolved?
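
One possible shape for such a delimiter variable is sketched below. `RECORD_DELIMITER`, `join_name` and `split_name` are hypothetical names; where the constant should live in ert3 is left open.

```python
from typing import Tuple

# Hypothetical module-level constant shared by ert3.engine and ert3.storage.
RECORD_DELIMITER = "."


def join_name(record: str, param: str) -> str:
    """Build the combined '<record><delimiter><param>' name."""
    return f"{record}{RECORD_DELIMITER}{param}"


def split_name(name: str) -> Tuple[str, str]:
    """Split at the first delimiter; reject names with a missing or empty part."""
    record, sep, param = name.partition(RECORD_DELIMITER)
    if not sep or not record or not param:
        raise ValueError(f"expected '<record>{RECORD_DELIMITER}<param>', got {name!r}")
    return record, param
```

Using `str.partition` instead of a bare `split` means a name with a trailing delimiter or no delimiter at all fails with a clear message rather than an unpacking error.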

markusdregi · 2021-06-21T08:36:47Z

ert3/storage/_storage.py

        json={
-            "parameter_names": list(parameters),
+            "parameter_names": [
+                f"{record}.{param}"


ref the delimiter comment

markusdregi · 2021-06-21T08:39:18Z

ert3/storage/_storage.py

-) -> ert3.data.Record:
+def _response2records(
+    response_content: bytes, record_type: ert3.data.RecordType
+) -> List[ert3.data.Record]:


This is exactly what the EnsembleRecord type is to represent, perhaps we should return that instead of just the list?

markusdregi · 2021-06-21T09:18:25Z

ert3/storage/_storage.py


-    return _get_numerical_data(workspace, experiment_name, record_name)
+    param_names = _get_experiment_parameters(workspace, experiment_name)
+    if record_name in param_names:


I think it would make sense to separate this entire if-clause into a separate function...

I separated only the combining part.
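
As a rough illustration of what a separate combining step does, the sketch below turns per-parameter columns back into one mapping per realisation. It works on plain dicts and lists; `combine_parameters` is a hypothetical name and not the helper added in this PR.

```python
from typing import Dict, List


def combine_parameters(columns: Dict[str, List[float]]) -> List[Dict[str, float]]:
    """Turn per-parameter columns into one mapping per realisation."""
    sizes = {len(values) for values in columns.values()}
    if len(sizes) > 1:
        raise ValueError("parameters have differing ensemble sizes")
    ensemble_size = sizes.pop() if sizes else 0
    return [
        {name: values[i] for name, values in columns.items()}
        for i in range(ensemble_size)
    ]
```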

markusdregi · 2021-06-21T09:51:25Z

ert3/storage/_storage.py



 def get_ensemble_record_names(
-    *, workspace: Path, experiment_name: Optional[str] = None
+    *, workspace: Path, experiment_name: Optional[str] = None, _flatten: bool = True


Are we sure we want to expose this implementation detail only for the purpose of testing? 🤔

markusdregi

Please address my comment on the hanging ".". Besides that this looks really good 🦾

markusdregi · 2021-06-22T11:39:10Z

ert3/storage/_storage.py

+        raise ert3.exceptions.StorageError(response.text)
+    parameters: MutableMapping[str, List[str]] = {}
+    for name in response.json():
+        key, val = name.split(".")


pinkwah · 2021-06-28T07:40:30Z

Jenkins test this please

xjules reviewed Jun 9, 2021

xjules reviewed Jun 9, 2021

pinkwah changed the base branch from master to main June 9, 2021 14:09

markusdregi reviewed Jun 10, 2021

xjules reviewed Jun 10, 2021

pinkwah force-pushed the ert3-split-params branch from 0495b10 to 585bb0b Compare June 10, 2021 13:29

pinkwah self-assigned this Jun 11, 2021

pinkwah force-pushed the ert3-split-params branch 4 times, most recently from 5f6c64b to 93ab252 Compare June 15, 2021 11:13

pinkwah marked this pull request as ready for review June 15, 2021 11:14

xjules reviewed Jun 18, 2021

xjules reviewed Jun 18, 2021

pinkwah force-pushed the ert3-split-params branch 5 times, most recently from 43d2b7c to 306e35d Compare June 18, 2021 12:27

markusdregi reviewed Jun 21, 2021

pinkwah force-pushed the ert3-split-params branch 4 times, most recently from 422ed10 to dec2f37 Compare June 21, 2021 11:26

markusdregi approved these changes Jun 22, 2021

markusdregi force-pushed the ert3-split-params branch 2 times, most recently from d26c243 to dec2f37 Compare June 24, 2021 07:07

markusdregi self-assigned this Jun 24, 2021

markusdregi force-pushed the ert3-split-params branch from dec2f37 to aa7119e Compare June 25, 2021 13:58

Split ERT 3 parameters into separate records

bc61f89

markusdregi force-pushed the ert3-split-params branch from aa7119e to bc61f89 Compare June 25, 2021 14:08

markusdregi merged commit 2683d06 into equinor:main Jun 28, 2021

pinkwah deleted the ert3-split-params branch February 15, 2022 08:32
