multiple variables in a single grib file successfully loaded into dataset and dataframe #290

Open
wants to merge 2 commits into
base: master
vasarabharat

I was trying to fetch multiple parameters in a single GRIB file using the CDS API, but the file failed to load into a dataset because of the errors shown below. I applied the fixes described here.

When more than one variable is found in a GRIB file and one of its coordinates conflicts with an existing key such as "time" or "step", we append a numeric suffix (_0, _1, ...) to the conflicting key.
The error resolved by this fix:
Traceback (most recent call last):
  File "/home/bharat/.local/lib/python3.8/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/bharat/.local/lib/python3.8/site-packages/cfgrib/dataset.py", line 591, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='time' value=Variable(dimensions=(), data=1640944800) new_value=Variable(dimensions=(), data=1640908800)

The output dataset will then have time, time_0, time_1, etc. as identifier names.
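
The renaming idea can be sketched in isolation as follows. This is a minimal illustration, not the code in this PR: `merge_with_suffix` is a hypothetical helper, plain integers stand in for cfgrib `Variable` objects, and the two timestamps are the ones from the traceback above.

```python
from typing import Any, Dict


def merge_with_suffix(master: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, str]:
    """Merge `update` into `master` in place, renaming conflicting keys.

    A key whose value differs from the one already stored is saved under
    the first free name of the form key_0, key_1, ... instead of raising.
    Returns a mapping from each key in `update` to the name it was stored under.
    """
    renamed: Dict[str, str] = {}
    for key, value in update.items():
        if key not in master or master[key] == value:
            # No conflict: store (or keep) the value under its own name.
            master[key] = value
            renamed[key] = key
            continue
        n = 0
        while True:
            candidate = f"{key}_{n}"
            if candidate not in master:
                # First free suffix: store the conflicting value here.
                master[candidate] = value
                break
            if master[candidate] == value:
                # Same value already stored under this suffix: reuse it.
                break
            n += 1
        renamed[key] = candidate
    return renamed


variables = {"time": 1640944800}
renamed = merge_with_suffix(variables, {"time": 1640908800})
print(variables)  # {'time': 1640944800, 'time_0': 1640908800}
print(renamed)    # {'time': 'time_0'}
```

Returning the rename mapping is a design choice of this sketch: a caller would need it to point each data variable at the suffixed coordinate that actually belongs to it.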

A file containing multiple variables may also mix GRIB editions. To handle this, we combine the distinct values into a single string.
The error resolved by this fix:
Traceback (most recent call last):
  File "gribtest.py", line 9, in <module>
    ds = xr.open_dataset(gribFileName, engine="cfgrib")
  File "/home/bharat/Documents/GitHub/xarray/xarray/backends/api.py", line 495, in open_dataset
    backend_ds = backend.open_dataset(
  File "/home/bharat/.local/lib/python3.8/site-packages/cfgrib/xarray_plugin.py", line 99, in open_dataset
    store = CfGribDataStore(
  File "/home/bharat/.local/lib/python3.8/site-packages/cfgrib/xarray_plugin.py", line 39, in __init__
    self.ds = opener(filename, **backend_kwargs)
  File "/home/bharat/.local/lib/python3.8/site-packages/cfgrib/dataset.py", line 764, in open_file
    return open_from_index(index, read_keys, time_dims, extra_coords, **kwargs)
  File "/home/bharat/.local/lib/python3.8/site-packages/cfgrib/dataset.py", line 706, in open_from_index
    dimensions, variables, attributes, encoding = build_dataset_components(
  File "/home/bharat/.local/lib/python3.8/site-packages/cfgrib/dataset.py", line 675, in build_dataset_components
    attributes = build_dataset_attributes(index, filter_by_keys, encoding)
  File "/home/bharat/.local/lib/python3.8/site-packages/cfgrib/dataset.py", line 599, in build_dataset_attributes
    attributes = enforce_unique_attributes(index, GLOBAL_ATTRIBUTES_KEYS, filter_by_keys)
  File "/home/bharat/.local/lib/python3.8/site-packages/cfgrib/dataset.py", line 273, in enforce_unique_attributes
    raise DatasetBuildError("multiple values for key %r" % key, key, fbks)
cfgrib.dataset.DatasetBuildError: multiple values for key 'edition'
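
Combining the values could look roughly like the sketch below. It is illustrative only: `combine_values` is a hypothetical helper rather than the function changed in this PR, the " AND " separator is taken from the GRIB_edition attribute in the output shown further down, and keeping first-seen order is an assumption.

```python
from typing import Hashable, Iterable


def combine_values(values: Iterable[Hashable], sep: str = " AND ") -> str:
    """Join the distinct values, in first-seen order, into one string."""
    # dict.fromkeys drops duplicates while preserving insertion order.
    distinct = list(dict.fromkeys(values))
    return sep.join(str(v) for v in distinct)


# Edition numbers as they might be read from the messages of a mixed file.
edition = combine_values([1, 1, 2, 1, 2])
print(edition)  # 1 AND 2
```

With a single distinct value the result is just that value as a string, so files with one edition would be unaffected apart from the type of the attribute.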

The final output dataset looks like this (all variables downloaded for 2018-Jan-01 10:00):

Dimensions:                (latitude: 1801, longitude: 3600)
Coordinates:
    number                 int64 ...
    time                   datetime64[ns] ...
    step                   timedelta64[ns] ...
    surface                float64 ...
  * latitude               (latitude) float64 90.0 89.9 89.8 ... -89.9 -90.0
  * longitude              (longitude) float64 0.0 0.1 0.2 ... 359.7 359.8 359.9
    valid_time             datetime64[ns] ...
    depthBelowLandLayer    float64 ...
Data variables: (12/98)
    u10                    (latitude, longitude) float32 ...
    v10                    (latitude, longitude) float32 ...
    d2m                    (latitude, longitude) float32 ...
    t2m                    (latitude, longitude) float32 ...
    evabs                  (latitude, longitude) float32 ...
    evaow                  (latitude, longitude) float32 ...
    ...                     ...
    depthBelowLandLayer_4  float64 ...
    swvl3                  (latitude, longitude) float32 ...
    time_20                datetime64[ns] ...
    step_20                timedelta64[ns] ...
    depthBelowLandLayer_5  float64 ...
    swvl4                  (latitude, longitude) float32 ...
Attributes:
    GRIB_edition:            1 AND 2
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2022-03-03T15:23 GRIB to CDM+CF via cfgrib-0.9.1...

If we convert this into a DataFrame, it looks like this:
latitude longitude number time step surface
0 90.0 0.0 0 2018-01-01 0 days 10:00:00 0.0
1 90.0 0.1 0 2018-01-01 0 days 10:00:00 0.0
2 90.0 0.2 0 2018-01-01 0 days 10:00:00 0.0
3 90.0 0.3 0 2018-01-01 0 days 10:00:00 0.0
4 90.0 0.4 0 2018-01-01 0 days 10:00:00 0.0
... ... ... ... ... ... ...
6483595 -90.0 359.5 0 2018-01-01 0 days 10:00:00 0.0
6483596 -90.0 359.6 0 2018-01-01 0 days 10:00:00 0.0
6483597 -90.0 359.7 0 2018-01-01 0 days 10:00:00 0.0
6483598 -90.0 359.8 0 2018-01-01 0 days 10:00:00 0.0
6483599 -90.0 359.9 0 2018-01-01 0 days 10:00:00 0.0

             valid_time       u10       v10         d2m         t2m  \

0 2018-01-01 10:00:00 NaN NaN NaN NaN
1 2018-01-01 10:00:00 NaN NaN NaN NaN
2 2018-01-01 10:00:00 NaN NaN NaN NaN
3 2018-01-01 10:00:00 NaN NaN NaN NaN
4 2018-01-01 10:00:00 NaN NaN NaN NaN
... ... ... ... ... ...
6483595 2018-01-01 10:00:00 -6.011181 1.568486 244.020462 247.884399
6483596 2018-01-01 10:00:00 -6.011181 1.568486 244.020462 247.884399
6483597 2018-01-01 10:00:00 -6.011181 1.568486 244.020462 247.884399
6483598 2018-01-01 10:00:00 -6.011181 1.568486 244.020462 247.884399
6483599 2018-01-01 10:00:00 -6.011181 1.568486 244.020462 247.884399

     evabs  evaow  evatc  evavt   fal              time_0 step_0  \

0 NaN NaN NaN NaN NaN 2018-01-01 10:00:00 0 days
1 NaN NaN NaN NaN NaN 2018-01-01 10:00:00 0 days
2 NaN NaN NaN NaN NaN 2018-01-01 10:00:00 0 days
3 NaN NaN NaN NaN NaN 2018-01-01 10:00:00 0 days
4 NaN NaN NaN NaN NaN 2018-01-01 10:00:00 0 days
... ... ... ... ... ... ... ...
6483595 0.0 0.0 0.0 0.0 0.85 2018-01-01 10:00:00 0 days
6483596 0.0 0.0 0.0 0.0 0.85 2018-01-01 10:00:00 0 days
6483597 0.0 0.0 0.0 0.0 0.85 2018-01-01 10:00:00 0 days
6483598 0.0 0.0 0.0 0.0 0.85 2018-01-01 10:00:00 0 days
6483599 0.0 0.0 0.0 0.0 0.85 2018-01-01 10:00:00 0 days

           lblt              time_1 step_1  licd              time_2  \

0 NaN 2018-01-01 10:00:00 0 days NaN 2018-01-01 10:00:00
1 NaN 2018-01-01 10:00:00 0 days NaN 2018-01-01 10:00:00
2 NaN 2018-01-01 10:00:00 0 days NaN 2018-01-01 10:00:00
3 NaN 2018-01-01 10:00:00 0 days NaN 2018-01-01 10:00:00
4 NaN 2018-01-01 10:00:00 0 days NaN 2018-01-01 10:00:00
... ... ... ... ... ...
6483595 277.130127 2018-01-01 10:00:00 0 days 3.0 2018-01-01 10:00:00
6483596 277.130127 2018-01-01 10:00:00 0 days 3.0 2018-01-01 10:00:00
6483597 277.130127 2018-01-01 10:00:00 0 days 3.0 2018-01-01 10:00:00
6483598 277.130127 2018-01-01 10:00:00 0 days 3.0 2018-01-01 10:00:00
6483599 277.130127 2018-01-01 10:00:00 0 days 3.0 2018-01-01 10:00:00

    step_2        lict              time_3 step_3      lmld  \

0 0 days NaN 2018-01-01 10:00:00 0 days NaN
1 0 days NaN 2018-01-01 10:00:00 0 days NaN
2 0 days NaN 2018-01-01 10:00:00 0 days NaN
3 0 days NaN 2018-01-01 10:00:00 0 days NaN
4 0 days NaN 2018-01-01 10:00:00 0 days NaN
... ... ... ... ... ...
6483595 0 days 251.126831 2018-01-01 10:00:00 0 days 0.009766
6483596 0 days 251.126831 2018-01-01 10:00:00 0 days 0.009766
6483597 0 days 251.126831 2018-01-01 10:00:00 0 days 0.009766
6483598 0 days 251.126831 2018-01-01 10:00:00 0 days 0.009766
6483599 0 days 251.126831 2018-01-01 10:00:00 0 days 0.009766

                 time_4 step_4        lmlt              time_5 step_5  \

0 2018-01-01 10:00:00 0 days NaN 2018-01-01 10:00:00 0 days
1 2018-01-01 10:00:00 0 days NaN 2018-01-01 10:00:00 0 days
2 2018-01-01 10:00:00 0 days NaN 2018-01-01 10:00:00 0 days
3 2018-01-01 10:00:00 0 days NaN 2018-01-01 10:00:00 0 days
4 2018-01-01 10:00:00 0 days NaN 2018-01-01 10:00:00 0 days

(output continues)

When more than one variable is found in a GRIB file and conflicts with existing keys like "time" and "step", we add a numeric suffix (_0, _1, ...) to that key.

When a file contains multiple variables, it may contain multiple GRIB editions; to handle this, we combine those values into a single string.
@FussyDuck

FussyDuck commented Mar 3, 2022

CLA assistant check
All committers have signed the CLA.

@vasarabharat
Author

Hi @alexamici, could you review this PR? It fixes loading of GRIB files that contain multiple weather variables.

@iainrussell
Member

Hi @vasarabharat, testing your branch on my local data does not give similar results to you (i.e. GRIB file with multiple variables on different steps still does not load) - could you provide the request you used to get data from the CDS, or else post a link to a GRIB file that illustrates the problem? Thanks! It may be that this is not the preferred solution to the issue, but I'd like to assess it better.
