Added icon to nwp providers #72

gabrielelibardi · 2024-10-24T17:55:32Z

Pull Request

Description

I added ICON to the NWP providers. Specifically, the changes should allow opening an xarray object lazily from a list of .zarr or .zarr.zip paths downloaded from https://huggingface.co/datasets/openclimatefix/dwd-icon-eu.
This pull request addresses issue #66 (comment).
In principle this should work even if one uses remote paths pointing directly to the .zarr.zip files. However, because many requests are made to the Hugging Face server in a short time, this may result in a 429 error. There are ways around this, as mentioned in the issue, but they have not yet been implemented.

Fixes #

How Has This Been Tested?

See the test_load_icon_eu test added to ocf-data-sampler/tests/load/test_load_nwp.py

[X ] Yes

Checklist:

My code follows OCF's coding style guidelines
I have performed a self-review of my own code
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
I have checked my code and corrected any misspellings

Sukh-P · 2024-10-28T11:04:01Z

ocf_data_sampler/load/nwp/providers/icon.py

+
+from ocf_data_sampler.load.nwp.providers.utils import open_zarr_paths
+
+def transform_to_channels(nwp : xr.Dataset):


Am I right in thinking that the input here is an xarray Dataset with a separate data variable for each NWP variable, and that we want to go from that to a DataArray (i.e. a single data variable but with an extra channel dimension)?

I think a simpler approach might be something like what is done here: https://github.com/openclimatefix/ocf_datapipes/blob/main/ocf_datapipes/load/nwp/providers/gfs.py#L26, where we use to_array() on the Dataset to convert it to a DataArray and then rename the variable dimension created by to_array() to channel.

But I may have misunderstood the intention/need for this function
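
For illustration, here is a minimal sketch of the to_array() approach described above. It is not the code from this PR or from ocf_datapipes; the variable names (t_2m, u_10m) and the dimension names are made up for the example.

```python
import numpy as np
import xarray as xr

# Toy Dataset with one data variable per NWP variable.
# The variable and dimension names are hypothetical.
dims = ("time", "latitude", "longitude")
ds = xr.Dataset(
    {
        "t_2m": (dims, np.zeros((2, 3, 4))),
        "u_10m": (dims, np.ones((2, 3, 4))),
    }
)

# to_array() stacks all data variables along a new dimension,
# which xarray names "variable" by default; rename it to "channel".
da = ds.to_array().rename({"variable": "channel"})

print(da.dims)
print(da.channel.values)
```

The resulting DataArray keeps the original variable names as coordinate values on the channel dimension, so a single NWP variable can still be selected with da.sel(channel="t_2m").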

you are perfectly right, I deleted this and use to_array() instead, thx for pointing it out!

Sukh-P · 2024-10-28T11:10:22Z

Thanks for creating this PR and the great work already done on trying to support ICON data in this library!

One thing to note: if this is merged as is, people may assume this library already supports ICON data. However, creating samples from it won't work until normalisation constants are added and ICON is listed as an NWP provider here. My suggestion is that this is either added in this PR or in a subsequent PR, or that the outstanding work is clearly documented in a GitHub issue or the README. Thanks!

gabrielelibardi · 2024-10-29T20:46:48Z

I can compute the std and mean constants. Do you have a script that does this for other NWPs? How large a sample do you take?

Sukh-P · 2024-11-01T14:38:12Z

Thanks, that would be great! I don't think we have a script on GitHub currently, so I just created this gist to share some example code showing how I have calculated some of the normalisation stats previously. In the example I used 200 samples, and I think that would be fine for this too.
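
As a rough illustration of the idea (this is not the code from the gist), per-channel mean and std could be estimated from a random subset of init times along these lines. The function name estimate_channel_stats and the dimension names init_time and channel are assumptions made for this sketch.

```python
import numpy as np
import xarray as xr


def estimate_channel_stats(da, n_samples=200, seed=0):
    """Estimate per-channel mean and std from a random subset of init times.

    Hypothetical helper: assumes `da` has "init_time" and "channel" dimensions.
    """
    rng = np.random.default_rng(seed)
    n_times = da.sizes["init_time"]
    # Draw at most n_samples distinct init times
    idx = rng.choice(n_times, size=min(n_samples, n_times), replace=False)
    sample = da.isel(init_time=np.sort(idx))
    # Reduce over every dimension except the channel dimension
    reduce_dims = [d for d in sample.dims if d != "channel"]
    return sample.mean(dim=reduce_dims), sample.std(dim=reduce_dims)


# Toy data: two channels holding constant values 1.0 and 5.0
data = np.stack([np.full((10, 3, 3), 1.0), np.full((10, 3, 3), 5.0)], axis=1)
toy = xr.DataArray(
    data,
    dims=("init_time", "channel", "y", "x"),
    coords={"channel": ["a", "b"]},
)
mean, std = estimate_channel_stats(toy, n_samples=4)
```

Sampling init times rather than reading the full archive keeps the computation cheap on lazily loaded data, at the cost of some noise in the estimates.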

gabrielelibardi added 4 commits October 23, 2024 15:37

icon data for the load test

c00fc9c

added icon to providers

7fddd03

added test_load_icon_eu

f4828a6

added path to icon-eu dataset

0240bda

Sukh-P reviewed Oct 28, 2024

gabrielelibardi added 3 commits October 29, 2024 21:40

got rid of transform_to_channels using to_array instead

a68183f

cutting the steps to only include data with hourly granularity

3efd9f1

adjusted test for icon_eu

dee99fb
