Add support for categorical parameter types #9513

Open
xjules opened this issue Dec 11, 2024 · 0 comments

xjules commented Dec 11, 2024

Update from meeting (13.12.2024):

One idea discussed was to decouple the parameter group concept and reduce it to a pure metadata attribute (see ScalarParameter). This would essentially make every parameter "independent". It also requires a different approach to storing parameters, where for testing purposes we could start with either pandas or polars.

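from __future__ import annotations  # annotations stay unevaluated, so placeholders below do not fail

from dataclasses import dataclass
from pathlib import Path
from typing import Literal, Union


@dataclass
class UniformSettings:
    # Fields without defaults must come before fields with defaults.
    min: float
    max: float
    name: Literal["uniform"] = "uniform"


@dataclass
class PolarsData:
    name: Literal["polars"]
    data_set_file: Path


@dataclass
class ScalarParameter:
    name: str
    distribution: Union[UniformSettings, ...]  # further distribution settings elided
    active: bool
    input_source: Literal["design_matrix", "sampled"]
    dataset_file: Union[PolarsData, XArrayData]  # XArrayData is not defined in this sketch
    group: None | str = None  # the group is only a metadata attribute


# FieldParameter and SurfaceParameter are not defined in this sketch.
parameter_configs: list[Union[FieldParameter, SurfaceParameter, ScalarParameter]]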

@yngve-sk also suggested encoding categories as integers and storing the mapping as a stand-alone entry in parameters.json.
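
A minimal sketch of what the integer encoding could look like, using only the standard library. The function names and the `category_mappings` key are hypothetical; the actual layout of parameters.json has not been decided in this issue.

```python
import json


def encode_categories(values: list[str]) -> tuple[list[int], dict[str, int]]:
    """Assign each distinct category an integer code, in sorted order."""
    mapping = {category: code for code, category in enumerate(sorted(set(values)))}
    return [mapping[value] for value in values], mapping


def decode_categories(codes: list[int], mapping: dict[str, int]) -> list[str]:
    """Invert the mapping to recover the original category labels."""
    inverse = {code: category for category, code in mapping.items()}
    return [inverse[code] for code in codes]


values = ["sand", "shale", "sand", "carbonate"]
codes, mapping = encode_categories(values)

# Hypothetical stand-alone entry; the key name is an assumption.
entry = {"category_mappings": {"facies": mapping}}
serialized = json.dumps(entry, indent=2)

# Round trip: the mapping survives JSON serialization and decodes the codes.
restored = json.loads(serialized)["category_mappings"]["facies"]
assert decode_categories(codes, restored) == values
```

With this approach the stored values stay numeric, so the existing float-based storage path would only need the mapping to translate codes back into labels.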


Currently, all parameters in a GEN_KW group are assumed to be numbers (float), which prevents categorical data from being read from or written to storage directly.

def save_parameters(

Another issue relates to the design matrix group itself. It often contains a mixture of numerical and categorical parameters, so when it is loaded from an Excel sheet as a DESIGN_MATRIX parameter group, all of its parameters end up as strings in storage. When storing such a parameter group, ert uses the xr.Dataset concept of variables:

ds = xr.Dataset(

and netCDF to actually store the parameters:

def save_parameters(

This, again, is based on variables, where a single variable corresponds to a single parameter group and all of its values must be of the same type.

What we need is support for a mixture of types within a single parameter group, or another strategy for handling this.
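
One possible strategy, sketched here under assumptions: treat each parameter as its own column and infer its type individually, so that numerical and categorical parameters from the same design matrix are stored separately. The helper names are hypothetical and this is not how ert currently behaves.

```python
from typing import Optional


def try_parse_floats(values: list[str]) -> Optional[list[float]]:
    """Return the values as floats, or None if any cell is not numeric."""
    try:
        return [float(value) for value in values]
    except ValueError:
        return None


def split_by_type(
    raw: dict[str, list[str]],
) -> tuple[dict[str, list[float]], dict[str, list[str]]]:
    """Split string-valued columns into numeric and categorical parameters."""
    numeric: dict[str, list[float]] = {}
    categorical: dict[str, list[str]] = {}
    for name, values in raw.items():
        parsed = try_parse_floats(values)
        if parsed is not None:
            numeric[name] = parsed
        else:
            categorical[name] = list(values)
    return numeric, categorical


# A design matrix as it might arrive when every cell is read as a string.
raw = {
    "porosity": ["0.1", "0.25"],
    "facies": ["sand", "shale"],
    "multiplier": ["1", "2"],
}
numeric, categorical = split_by_type(raw)
```

The numeric columns could then keep using the existing float-based path, while the categorical columns are handled by whatever encoding is chosen.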

xjules added this to SCOUT Dec 11, 2024
xjules converted this from a draft issue Dec 11, 2024
xjules added the sensitivity and needs-discussion labels Dec 11, 2024
xjules changed the title from "Add support for categorical parameters" to "Add support for categorical parameter types" Dec 11, 2024
xjules removed the needs-discussion label Dec 13, 2024