About def get_different_scale(self) and RuntimeError: .....size of tensor must match #81

sgflower66 · 2019-08-29T07:53:48Z

I hit the same problem after epoch 15 (PyTorch 1.0, Python 3.6.3, my own data, 4 GPUs).

After reading the earlier issues and their solutions, I have a guess (I'm not certain): the problem is in dataset.py, line 53:

    def get_different_scale(self):
        if self.seen < 4000*self.batch_size:
            wh = 13*32                          # 416
        elif self.seen < 8000*self.batch_size:
            wh = (random.randint(0,3) + 13)*32  # 416, ..., 512
        elif self.seen < 12000*self.batch_size:
            wh = (random.randint(0,5) + 12)*32  # 384, ..., 544
        .....

So maybe we get different shapes within the same batch (dataset.py, line 14):

    def custom_collate(batch):
        data = torch.stack([item[0] for item in batch], 0)

for example [X,X,416,X] and [X,X,317,X].

Although the shape only changes once self.seen crosses one of the xx*self.batch_size thresholds, maybe the error is caused by multi-GPU training?
This is only a guess, and I don't know how to fix it. Many people seem to have the same question, so the problem may be important. Looking forward to your reply.
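To make the guess above concrete, here is a minimal pure-Python sketch of how a per-sample counter can cause a size change in the middle of a batch. It rests on assumptions that the thread does not confirm: that the input size is re-drawn whenever a counter like `seen` reaches a multiple of some interval, and that the loader walks the samples in order in a single process. `redraw_offsets` and `redraw_every` are hypothetical names used only for illustration.

```python
def redraw_offsets(start_seen, num_samples, batch_size, redraw_every):
    """Return, for every sample at which the size would be re-drawn,
    its position inside its batch (0 means the first item of a batch)."""
    offsets = []
    seen = start_seen
    for index in range(num_samples):
        # Assumption: the size is re-drawn when the counter hits the interval.
        if seen % redraw_every == 0:
            offsets.append(index % batch_size)
        seen += 1
    return offsets


batch_size = 8
redraw_every = 10 * batch_size  # hypothetical: re-draw every 10 batches

# Counter starts on a batch boundary: every re-draw lands on the first item.
aligned = redraw_offsets(0, 320, batch_size, redraw_every)
# Counter starts 3 samples off: every re-draw lands in the middle of a batch.
shifted = redraw_offsets(3, 320, batch_size, redraw_every)

print(aligned)  # [0, 0, 0, 0]
print(shifted)  # [5, 5, 5, 5]
```

If the counter is not a multiple of the batch size when an epoch starts, a re-draw that lands at offset 5 would leave the first five images of that batch at the old size and the rest at the new one, which is the kind of mismatch torch.stack rejects.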

Originally posted by @sgflower66 in #55 (comment)

sgflower66 · 2019-08-30T02:32:26Z

I have now set the number of samples to an integer multiple of the batch size, and it seems to work well: training has run for more than 10 epochs so far, whereas before the error appeared within 5 epochs.
I hope this solves the problem.
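One way to apply this workaround is to trim the sample list before building the dataset. The sketch below is only an illustration: `trim_to_batch_multiple` and the file names are hypothetical and are not part of the repository.

```python
def trim_to_batch_multiple(samples, batch_size):
    """Drop the trailing samples that would form an incomplete batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    usable = len(samples) - len(samples) % batch_size
    return samples[:usable]


# Hypothetical file list: 103 images with a batch size of 8.
image_paths = ["img_%04d.jpg" % i for i in range(103)]
trimmed = trim_to_batch_multiple(image_paths, 8)

print(len(trimmed), len(trimmed) % 8)  # 96 0
```

PyTorch's DataLoader also accepts drop_last=True, which discards the final incomplete batch. Whether that alone is enough here has not been tested in this thread, since the dataset keeps its own sample counter.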

Ginbor · 2019-10-18T06:37:48Z

What helped: setting the number of samples to an integer multiple of the batch size and resetting the model.seen parameter.

Ginbor · 2019-11-02T12:14:58Z

In my case, the problem disappeared when I stopped using the savemodel() function. I suspect the problem appears after cur_model.save_weights(). Also, in my case the training dataset satisfies len(train_dataset) % batch_size == 0.

sgflower66 changed the title ~~I met the same problem after epoch 15. (pytorch1.0, python 3.6.3, my own data, 4 gpus)~~ About def get_different_scale(self) and RuntimeError: .....size of tensor must match Aug 29, 2019

sgflower66 closed this as completed Aug 30, 2019
