Open Source PVnet #279

Open
7 tasks
peterdudfield opened this issue Nov 25, 2024 · 0 comments

peterdudfield commented Nov 25, 2024

The idea is to make sure PVnet is accessible and usable for open-source users and contributors.

The main problem at the moment is that a lot of the NWP data is private.

Other context: we are moving over from ocf-datapipes to ocf-datasampler, so I would vote we try to use ocf-datasampler at all points.

Here's a rough list of tasks that need doing:

  • Identify open-source gridded NWP data that is already in Zarr format. We need to make sure it has enough variables for solar forecasts and enough years (>2) of data. Note that OCF already publishes satellite data, which could be used. We try to use at least 1 year for training and 1 year for testing. It would be better to have more like 5 years of training data, but let's see what we can do.
  • Target data. I would suggest using the UK PVlive national solar generation as target data. This can be retrieved by API and is fairly easy to use. Context: at OCF we actually predict GSP-level solar generation and use a small ML model to sum it up. I suggest that for this project we start simple, with just the UK national total. Also note that the capacity changes over time, so we need to collect that too; this is possible via the API above.
  • Pipeline to make batches. Use or create a pipeline in ocf-datasampler to make batches. This could be the site pipeline we already have, but I can imagine it needs adapting for this problem (but maybe not).
  • Decide on train / test split. I would suggest 1 year for testing and 1-5 years for training, depending on whether we have the data, etc.
  • Clear script to make batches. Do we need extra compute for this? We might have this already.
  • Clear script to train model + evaluate model. We might have this already, but we should have clear docs for it.
  • Compare and benchmark against OCF results. We can probably share some high-level benchmarks for what we can achieve with paid-for NWP data and slightly more complicated GSP + National models.
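
On the target data point: one likely reason to collect capacity alongside generation is so the target can be expressed as a fraction of installed capacity, which keeps it comparable as capacity grows. Below is a rough sketch of that idea only. The record layout, the values, and the `normalise_by_capacity` helper are all made up for illustration and are not taken from the PVlive API or from PVnet.

```python
from datetime import datetime, timezone

# Hypothetical rows: (timestamp, generation in MW, installed capacity in MWp).
# The numbers are invented; real values would come from the PVlive API.
records = [
    (datetime(2020, 6, 1, 12, tzinfo=timezone.utc), 6000.0, 12000.0),
    (datetime(2023, 6, 1, 12, tzinfo=timezone.utc), 7500.0, 15000.0),
]


def normalise_by_capacity(rows):
    """Return (timestamp, generation / capacity) pairs.

    Rows with a missing or non-positive capacity are skipped rather than
    producing a division error or a meaningless target.
    """
    normalised_rows = []
    for timestamp, generation_mw, capacity_mwp in rows:
        if capacity_mwp is None or capacity_mwp <= 0:
            continue
        normalised_rows.append((timestamp, generation_mw / capacity_mwp))
    return normalised_rows


normalised = normalise_by_capacity(records)
```

In this made-up example both rows end up with the same normalised value even though the raw generation differs, which is the effect we would want from tracking capacity.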

If all these steps are complete, then it will be ready to use for different countries and different geographies.
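
For the train / test split in the task list, a rough sketch of a purely chronological split is below, assuming we hold out the most recent year for testing. The `split_by_time` helper and the 2020-2023 date range are hypothetical and only there to illustrate the idea; the real years depend on what data we can find.

```python
from datetime import datetime, timedelta, timezone


def split_by_time(timestamps, test_start, test_end):
    """Split timestamps chronologically.

    Everything before test_start goes to train; everything in
    [test_start, test_end) goes to test. Nothing is shuffled, so the
    test period is strictly after the training period.
    """
    train = [t for t in timestamps if t < test_start]
    test = [t for t in timestamps if test_start <= t < test_end]
    return train, test


# Hypothetical example: daily timestamps covering 2020-2023,
# with 2023 held out as the test year.
start = datetime(2020, 1, 1, tzinfo=timezone.utc)
timestamps = [start + timedelta(days=i) for i in range(4 * 365 + 1)]  # 2020 is a leap year

train, test = split_by_time(
    timestamps,
    test_start=datetime(2023, 1, 1, tzinfo=timezone.utc),
    test_end=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

Splitting by time rather than at random avoids leaking near-identical neighbouring timestamps between train and test, which matters for weather-driven data.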
