Describe the bug
Currently batch_size is not taken into account when creating batches for sites in the save_batches script, so each netcdf file contains a single example. This has to be accounted for when setting the number of batches to create: the value is effectively the number of examples, so with a batch size of 8 you need to request 8x as many.
This also affects batch creation for WindNet.
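As a workaround until this is fixed, the count passed to the script can be scaled by hand. A minimal sketch of the arithmetic, assuming one example is written per netcdf file; the helper name `num_files_to_request` is hypothetical and not part of the save_batches script:

```python
def num_files_to_request(num_batches: int, batch_size: int) -> int:
    """Return how many single-example files are needed to cover
    `num_batches` training batches of `batch_size` examples each.

    Assumes each saved netcdf file holds exactly one example.
    """
    if num_batches < 0 or batch_size < 1:
        raise ValueError("num_batches must be >= 0 and batch_size >= 1")
    return num_batches * batch_size


# With a batch size of 8, 100 intended batches need 800 saved files.
print(num_files_to_request(100, 8))
```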
To Reproduce
Run the save_batches script with the renewable variable in the config set to a value such as "pv_india" that uses the pvnet_site_datapipe function.
Expected behavior
Each netcdf file output to the train and val folders should contain batch_size examples, rather than one example per netcdf file.
Additional context
This can be changed in this repo, or by editing the pvnet_site_datapipe function in the pvnet_site.py file in the ocf_datapipes repo to take batch_size as a parameter. However, it would need testing to confirm that loading batches for model training through this function does not break once each saved file contains multiple examples.
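For illustration only, the grouping step that such a change would need looks roughly like the sketch below. It is plain Python over an arbitrary iterable of examples and does not use the ocf_datapipes API; `group_into_batches` is a hypothetical name, and how the grouped examples would be combined into a single netcdf file is left out.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def group_into_batches(examples: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of up to `batch_size` examples.

    The final list may be shorter if the number of examples is not
    a multiple of `batch_size`.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(examples)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Ten placeholder examples grouped with batch_size=4.
for batch in group_into_batches(range(10), 4):
    print(batch)
```

Whether a short final batch should be kept or dropped is a design choice that would need to match what the training code expects.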
Was the conclusion of this that it's easier to save examples rather than batches, because the examples are .netcdf files and the batches are numpy files?
Do we just need to rename things from batches to examples to make this clear?
I don't think this is urgent, but I wanted there to be some visibility of this deviation from previous behaviour. It only affects batch creation (the number of examples saved) rather than training, since batches are recreated from the saved samples using the batch size correctly. If you're aware that the setting means number of samples rather than number of batches, it's fine. Renaming would be a little tricky because this only applies to the site/India PVNet side, but we could add some comments somewhere to make this clearer.
IMO, to keep things consistent with UK PVNet, we should make the change so that saved batches contain the number of examples specified by the batch size. It will just take a bit of work to make sure that doesn't break the conversion into numpy batches that happens during training.