[2024-11-20 00:29:31,165][INFO] - workflow: [('train', 1)], max: 24 epochs
[2024-11-20 00:29:31,165][INFO] - Checkpoints will be saved to /dssg/home/acct-meyl/meyl-1/XAZ/projects/Occupancy/SparseOcc/outputs/SparseOcc/test_11_20 by HardDiskBackend.
[2024-11-20 00:29:43,765][INFO] - Epoch [1/24][1/3517] loss: 438.24, eta: 12 days, 7:02:32, time: 12.58s, data: 7520ms, mem: 20194M
Traceback (most recent call last):
File "train.py", line 181, in
main()
File "train.py", line 177, in main
runner.run([train_loader], [('train', 1)])
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
epoch_runner(data_loaders[i], **kwargs)
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 49, in train
for i, data_batch in enumerate(self.data_loader):
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 681, in next
data = self._next_data()
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1356, in _next_data
return self._process_data(data)
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
data.reraise()
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/_utils.py", line 461, in reraise
raise exception
TypeError: Caught TypeError in DataLoader worker process 2.
Original Traceback (most recent call last):
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/dssg/home/acct-meyl/meyl-1/XAZ/mmlabs/mmdet3d_1.0.0rc6/mmdetection3d/mmdet3d/datasets/custom_3d.py", line 435, in getitem
data = self.prepare_train_data(idx)
File "/dssg/home/acct-meyl/meyl-1/XAZ/mmlabs/mmdet3d_1.0.0rc6/mmdetection3d/mmdet3d/datasets/custom_3d.py", line 229, in prepare_train_data
example = self.pipeline(input_dict)
File "/dssg/home/acct-meyl/meyl-1/XAZ/mmlabs/mmdet3d_1.0.0rc6/mmdetection3d/mmdet3d/datasets/pipelines/compose.py", line 49, in call
data = t(data)
File "/dssg/home/acct-meyl/meyl-1/XAZ/projects/Occupancy/SparseOcc/loaders/pipelines/transforms.py", line 229, in call
img = Image.fromarray(np.uint8(results['img'][i]))
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
[2024-11-20 00:29:46,034][INFO] - Epoch [1/24][2/3517] loss: 237.76, eta: 2 days, 5:14:18, time: 2.27s, data: 16ms, mem: 21039M
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 2968278 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 2968279 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 2968281 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 2 (pid: 2968280) of binary: /dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/bin/python
Traceback (most recent call last):
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/bin/torchrun", line 33, in
sys.exit(load_entry_point('torch==1.12.1', 'console_scripts', 'torchrun')())
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/init.py", line 345, in wrapper
return f(*args, **kwargs)
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/dssg/home/acct-meyl/meyl-1/.conda/envs/mm3d/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
Could someone explain what is causing this error?
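For context, the final TypeError means that results['img'][i] is None by the time np.uint8(...) is applied in transforms.py, i.e. one of the camera images was never actually loaded. The usual cause is a missing or corrupt image file, or a stale data_path in the dataset info pkl that no longer exists on disk. Below is a minimal sketch to scan for unreadable images before training; the data_root and info-file name are hypothetical placeholders, and the 'infos'/'cams'/'data_path' layout is assumed from the standard mmdet3d nuScenes info format, so adjust to your setup.

import pickle
import cv2

data_root = 'data/nuscenes'                          # hypothetical path
info_path = f'{data_root}/nuscenes_infos_train.pkl'  # hypothetical file name

with open(info_path, 'rb') as f:
    infos = pickle.load(f)['infos']  # assumed mmdet3d-style info dict

bad = []
for info in infos:
    for cam in info['cams'].values():  # assumed per-camera entries
        path = cam['data_path']
        # cv2.imread returns None (rather than raising) for missing or
        # corrupt files, which is exactly what yields a None image downstream.
        if cv2.imread(path) is None:
            bad.append(path)

print(f'{len(bad)} unreadable image(s)')
for p in bad[:20]:
    print(p)

Any path this prints would make Image.fromarray(np.uint8(results['img'][i])) fail with exactly the NoneType error shown in the trace above.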