Added icon to nwp providers #72
from ocf_data_sampler.load.nwp.providers.utils import open_zarr_paths

def transform_to_channels(nwp : xr.Dataset):
Am I right in thinking that the input here is an xarray Dataset which has multiple data variables for each NWP variable, and we want to go from that to a DataArray (e.g. one data variable but an extra channel dimension)?
I think a simpler approach might be to do something like what is done here https://github.com/openclimatefix/ocf_datapipes/blob/main/ocf_datapipes/load/nwp/providers/gfs.py#L26 where we use to_array() on the Dataset to convert it to a DataArray and then rename the variable dimension, which is created with to_array(), to channel.
But I may have misunderstood the intention/need for this function.
You are perfectly right. I deleted this and now use to_array() instead, thanks for pointing it out!
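The to_array() approach discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the code merged in the PR: the function name `dataset_to_channels` and the variable names `t_2m` and `u_10m` are hypothetical, and the toy Dataset only stands in for real ICON data.

```python
import numpy as np
import xarray as xr


def dataset_to_channels(nwp: xr.Dataset) -> xr.DataArray:
    """Stack a Dataset's data variables along a new `channel` dimension."""
    # to_array() adds a leading dimension named "variable" by default;
    # renaming it gives the `channel` dimension mentioned in the review.
    return nwp.to_array().rename({"variable": "channel"})


# Toy Dataset with two hypothetical NWP variables on a tiny grid.
dims = ("step", "latitude", "longitude")
ds = xr.Dataset(
    {
        "t_2m": (dims, np.zeros((2, 3, 4))),
        "u_10m": (dims, np.ones((2, 3, 4))),
    },
    coords={"step": [0, 1]},
)

da = dataset_to_channels(ds)
print(da.dims)  # channel comes first, followed by the original dimensions
```

The same result could presumably be obtained with `nwp.to_array(dim="channel")`, which names the new dimension directly.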
Thanks for creating this PR and for the great work already done on supporting ICON data in this library! One thing to note: if this is merged as is, people may assume the library already supports ICON data. However, without normalisation constants added and ICON listed as an NWP provider here, creating samples from it won't work. My suggestion is that this is either added in this PR or in a subsequent PR, or that the outstanding work is clearly documented in a GitHub issue or the README. Thanks!
I can compute the std and mean constants. Do you have a script that does this for the other NWP providers? How large a sample do you take?
Thanks, that would be great! I don't think we have a script on GitHub currently, so I just created this gist to share some example code showing how I have calculated some of the normalisation stats previously. In the example I used 200 samples, and I think that would be fine for this too.
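For readers without access to the gist, the general idea of estimating per-channel normalisation constants from a random subset of init times might look like the sketch below. It is not the gist's code: the function `channel_stats`, the dimension names `init_time` and `channel`, and the toy data are all assumptions made for illustration.

```python
import numpy as np
import xarray as xr


def channel_stats(da: xr.DataArray, n_samples: int = 200, seed: int = 0):
    """Estimate per-channel mean and std from a random subset of init times."""
    rng = np.random.default_rng(seed)
    n_times = da.sizes["init_time"]
    # Draw distinct init-time indices; cap at the number available.
    idx = rng.choice(n_times, size=min(n_samples, n_times), replace=False)
    sample = da.isel(init_time=np.sort(idx))
    # Reduce over every dimension except `channel`.
    reduce_dims = [d for d in sample.dims if d != "channel"]
    return sample.mean(reduce_dims), sample.std(reduce_dims)


# Toy data: two hypothetical channels holding constant values 3.0 and 7.0.
da = xr.DataArray(
    np.stack([np.full((10, 2, 2), 3.0), np.full((10, 2, 2), 7.0)]),
    dims=("channel", "init_time", "latitude", "longitude"),
    coords={"channel": ["t_2m", "u_10m"]},
)

mean, std = channel_stats(da, n_samples=5)
```

With real data, `mean` and `std` would each hold one value per channel, which could then be copied into the library's normalisation constants.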
Pull Request
Description
I added ICON to the NWP providers. Specifically, the changes should allow an xarray object to be opened lazily from a list of .zarr or .zarr.zip paths downloaded from here https://huggingface.co/datasets/openclimatefix/dwd-icon-eu.
This pull request addresses issue #66 (comment).
In principle this should work even if one uses remote paths directly to the .zarr.zip files. However, because many requests are made to the Hugging Face server in a short time, this may result in a 429 error. There are ways around this, as mentioned in the issue, that have not yet been implemented.
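One generic way to soften rate limiting is to retry with exponential backoff. The sketch below is not necessarily one of the workarounds mentioned in the issue, and it is not part of this PR: `TooManyRequests` is a hypothetical stand-in for whatever exception the storage client raises on a 429 response, and `flaky_open` simulates a remote open call.

```python
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 error from a storage client."""


def with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on TooManyRequests with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TooManyRequests:
            if attempt == max_retries:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** attempt)


# Simulated remote open that is rate limited twice before succeeding.
attempts = {"n": 0}


def flaky_open():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TooManyRequests()
    return "opened"


delays = []  # record the delays instead of actually sleeping
result = with_backoff(flaky_open, sleep=delays.append)
```

In real use, `fn` would wrap the call that opens a remote .zarr.zip path, and the default `sleep=time.sleep` would pause between attempts.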
Fixes #
How Has This Been Tested?
See the test_load_icon_eu test added to ocf-data-sampler/tests/load/test_load_nwp.py.
Checklist: