Training on other dataset + Error on using run.py #33

Open
ZHANGZ1YUE opened this issue Sep 6, 2022 · 8 comments
Labels
bug Something isn't working

Comments

@ZHANGZ1YUE

Describe the bug
Hi! Thank you very much for providing this implementation of dgmr. The model and blocks look very organized and straightforward!
However, I have run into some issues running your code, mostly because run.py is quite complicated and I find its logic hard to follow (most likely because I do not understand what the dataset looks like).
1: Could you please explain a bit about how you preprocess the dataset? I hope to run the model on my own dataset, so I need to prepare it to match your preprocessing. (By the way, if I want to visualize any of the rainfall data frames, what should I do?)
2: I encountered an error when using run.py. The problem is exactly the same as in the following issue (#32 (comment)). I changed the number of GPUs to 1, and the problem remains. I could not find a solution in the previous issue, as the conversation is a bit confusing. Do I have to manually download something from the GCP bucket to my machine? If so, what should I download and how should I use it?

To Reproduce
python run.py

Expected behavior
The same error as in #32 (comment) appears.

Thank you again for your great work.

@ZHANGZ1YUE ZHANGZ1YUE added the bug Something isn't working label Sep 6, 2022
@jacobbieker
Member

Hi,

  1. It's preprocessed the same way as in the DGMR code and paper: the preprocessing is copied and pasted from their open-source code, and the only changes are those needed to turn it into PyTorch format. So I would refer to the paper for that.
  2. What is the error exactly? If it's the first error in that issue, it seems that the dataset script cannot download the data from GCP for some reason. They originally fixed it by moving to a GPU machine, somehow? But if you are planning on using the model with your own dataset, you can ignore this: just swap out the dataset loader in run.py for your own dataset loader and the error should go away.
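
For anyone taking the "swap out the dataset loader" route, here is a minimal sketch of what a replacement could look like. It is not the loader from run.py. The class name `RadarSequenceDataset` is hypothetical, the 4 context / 18 target frame split follows the DGMR paper, and the `(time, channel, height, width)` layout of each sample is an assumption that should be checked against what the training step in run.py actually consumes. It uses only NumPy; a PyTorch `DataLoader` accepts any object with `__len__` and `__getitem__` and its default collate function converts NumPy arrays to tensors.

```python
import numpy as np


class RadarSequenceDataset:
    """Hypothetical map-style dataset over a stack of rainfall frames.

    Each sample is a sliding window split into context frames and
    target frames. The default 4/18 split follows the DGMR paper; the
    output layout (time, channel, height, width) is an assumption.
    """

    def __init__(self, frames, num_input_frames=4, num_target_frames=18):
        frames = np.asarray(frames, dtype=np.float32)
        if frames.ndim != 3:
            raise ValueError("expected frames with shape (time, height, width)")
        self.frames = frames
        self.num_input = num_input_frames
        self.window = num_input_frames + num_target_frames
        if frames.shape[0] < self.window:
            raise ValueError("not enough frames for a single window")

    def __len__(self):
        # Number of complete windows that fit in the sequence.
        return self.frames.shape[0] - self.window + 1

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        clip = self.frames[idx : idx + self.window]
        clip = clip[:, None, :, :]  # add a single channel axis
        return clip[: self.num_input], clip[self.num_input :]


# Synthetic stand-in for real radar data: 30 frames of 64x64 pixels.
rng = np.random.default_rng(0)
dataset = RadarSequenceDataset(rng.random((30, 64, 64)))
inputs, targets = dataset[0]
print(len(dataset), inputs.shape, targets.shape)
```

With real data, the array passed in would hold your own preprocessed rainfall frames in time order, and the object can then be wrapped in `torch.utils.data.DataLoader(dataset, batch_size=...)`.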

@ZHANGZ1YUE
Author

ZHANGZ1YUE commented Sep 6, 2022

Thank you for the quick response!

1: I have understood and will look into their code again.
2: It is indeed the first error in that issue.

Before running on my own dataset, I hope to see some results with the dataset provided by the original code. I am using an Nvidia GPU machine with CUDA available, but the error persists. I did not change anything (except the number of GPUs from 6 -> 1) and ran python run.py directly.

@jacobbieker
Member

Ah okay, I probably won't have much time for the next month or so to add this, but if you are familiar with HuggingFace datasets, I've converted and uploaded the validation and test sets of the full dataset here: https://huggingface.co/datasets/openclimatefix/nimrod-uk-1km-validation and https://huggingface.co/datasets/openclimatefix/nimrod-uk-1km-test. The training set is a lot larger, so it has been taking a lot longer. But you should be able to use those, and possibly train on the validation set to see how well it works for you?

@peterdudfield
Contributor

@all-contributors please add @jacobbieker for code

@allcontributors
Contributor

@peterdudfield

I've put up a pull request to add @jacobbieker! 🎉

@peterdudfield
Contributor

@all-contributors please add @ZHANGZ1YUE for bug

@allcontributors
Contributor

@peterdudfield

I've put up a pull request to add @ZHANGZ1YUE! 🎉

@ZHANGZ1YUE
Author

Thank you very much for the new information about the dataset! I will check it out later when I have time; I'm currently building my own data class to use with your model code. (Just to say, the code for the model structure is brilliant!)
