I am following the README closely, but I cannot get the benchmark to run. It seems that some datasets or files are missing. How can I include them?
Here is the output when I run the docker command. Every benchmark is reported as "successful", but first, the whole run finishes in 2-3 seconds, and second, the results cannot be fetched ("something wrong").
~/deeplearning-benchmark/pytorch$ docker run \
--rm --shm-size=128g \
--gpus '"device=0"' \
-v ~/DeepLearningExamples/PyTorch:/workspace/benchmark \
-v ~/data:/data \
-v $(pwd)"/scripts":/scripts \
-v $(pwd)"/results":/results \
nvcr.io/nvidia/${NAME_NGC} \
/bin/bash -c "cp -r /scripts/* /workspace; ./run_benchmark.sh 3090_v1 resnet50 1500"
=============
== PyTorch ==
=============
NVIDIA Release 22.10 (build 46164382)
PyTorch Version 1.13.0a0+d0d6b1f
Container image Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copyright (c) 2014-2022 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006 Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015 Google Inc.
Copyright (c) 2015 Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting termcolor
Downloading termcolor-2.3.0-py3-none-any.whl (6.9 kB)
Installing collected packages: termcolor
Successfully installed termcolor-2.3.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting git+https://github.com/NVIDIA/dllogger
Cloning https://github.com/NVIDIA/dllogger to /tmp/pip-req-build-xx58adtb
Running command git clone -q https://github.com/NVIDIA/dllogger /tmp/pip-req-build-xx58adtb
Resolved https://github.com/NVIDIA/dllogger to commit 0540a43971f4a8a16693a9de9de73c1072020769
Building wheels for collected packages: DLLogger
Building wheel for DLLogger (setup.py): started
Building wheel for DLLogger (setup.py): finished with status 'done'
Created wheel for DLLogger: filename=DLLogger-1.0.0-py3-none-any.whl size=5670 sha256=9a5ccfa3c6907044bbf0ca1d6de8aa439dc4c5d50a432f74a47ab6a000696e85
Stored in directory: /tmp/pip-ephem-wheel-cache-gx4hhblw/wheels/ad/94/cf/8f3396cb8d62d532695ec557e193fada55cd366e14fd9a02be
Successfully built DLLogger
Installing collected packages: DLLogger
Successfully installed DLLogger-1.0.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
cp: cannot create regular file 'benchmark/LanguageModeling/BERT': No such file or directory
cp: cannot create regular file 'benchmark/SpeechSynthesis/Tacotron2': No such file or directory
3090_v1
PyTorch_resnet50_AMP started:
/workspace /workspace
./benchmark_pytorch.sh: line 240: cd: benchmark/Classification/ConvNets: No such file or directory
************************************************************
/data/imagenet --arch resnet50 --amp --static-loss-scale 256 --epochs 2 --prof 100 --batch-size 448 --raport-file benchmark.json --print-freq 1 --training-only --data-backend synthetic
************************************************************
python: can't open file './multiproc.py': [Errno 2] No such file or directory
PyTorch_resnet50_AMP ended.
/workspace
PyTorch_resnet50_FP32 started:
/workspace /workspace
./benchmark_pytorch.sh: line 240: cd: benchmark/Classification/ConvNets: No such file or directory
************************************************************
/data/imagenet --arch resnet50 --epochs 2 --prof 100 --batch-size 224 --raport-file benchmark.json --print-freq 1 --training-only --data-backend synthetic
************************************************************
python: can't open file './multiproc.py': [Errno 2] No such file or directory
PyTorch_resnet50_FP32 ended.
/workspace
Check results folder : /results/3090_v1
['PyTorch_SSD_AMP', 'PyTorch_SSD_FP32', 'PyTorch_bert_base_squad_FP16', 'PyTorch_bert_base_squad_FP32', 'PyTorch_bert_large_squad_FP16', 'PyTorch_bert_large_squad_FP32', 'PyTorch_gnmt_FP16', 'PyTorch_gnmt_FP32', 'PyTorch_ncf_FP16', 'PyTorch_ncf_FP32', 'PyTorch_resnet50_AMP', 'PyTorch_resnet50_FP32', 'PyTorch_tacotron2_FP16', 'PyTorch_tacotron2_FP32', 'PyTorch_transformerxlbase_FP16', 'PyTorch_transformerxlbase_FP32', 'PyTorch_transformerxllarge_FP16', 'PyTorch_transformerxllarge_FP32', 'PyTorch_waveglow_FP16', 'PyTorch_waveglow_FP32', 'summary.txt', 'sys_pytorch.txt']
PyTorch_SSD_AMP : sucessful
PyTorch_SSD_FP32 : sucessful
PyTorch_bert_base_squad_FP16 : sucessful
PyTorch_bert_base_squad_FP32 : sucessful
PyTorch_bert_large_squad_FP16 : sucessful
PyTorch_bert_large_squad_FP32 : sucessful
PyTorch_gnmt_FP16 : sucessful
PyTorch_gnmt_FP32 : sucessful
PyTorch_ncf_FP16 : sucessful
PyTorch_ncf_FP32 : sucessful
PyTorch_resnet50_AMP : sucessful
PyTorch_resnet50_AMP : sucessful
PyTorch_resnet50_AMP : sucessful
PyTorch_resnet50_FP32 : sucessful
PyTorch_resnet50_FP32 : sucessful
PyTorch_resnet50_FP32 : sucessful
PyTorch_tacotron2_FP16 : sucessful
PyTorch_tacotron2_FP32 : sucessful
PyTorch_transformerxlbase_FP16 : sucessful
PyTorch_transformerxlbase_FP32 : sucessful
PyTorch_transformerxllarge_FP16 : sucessful
PyTorch_transformerxllarge_FP32 : sucessful
PyTorch_waveglow_FP16 : sucessful
PyTorch_waveglow_FP32 : sucessful
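The "No such file or directory" errors in the log above refer to paths under `benchmark/`, which the docker command bind-mounts from `~/DeepLearningExamples/PyTorch` on the host. One possible cause, not confirmed here, is that this host directory is missing or empty. A minimal sketch for checking this before launching the container follows; `check_dirs` and the `EXAMPLES_DIR` variable are hypothetical names introduced for illustration, and the sub-directories are the ones named in the error messages.

```shell
#!/usr/bin/env bash
# Sketch: report which host directories expected by the benchmark are missing.
# check_dirs is a hypothetical helper, not part of the benchmark repository.
check_dirs() {
    local base="$1"; shift
    local missing=0
    local sub
    for sub in "$@"; do
        if [ -d "$base/$sub" ]; then
            echo "ok       $base/$sub"
        else
            echo "MISSING  $base/$sub"
            missing=1
        fi
    done
    # Return 1 if at least one directory was not found, 0 otherwise.
    return "$missing"
}

# Host path taken from the -v option of the docker command above;
# EXAMPLES_DIR is an assumed override, not something the benchmark reads.
check_dirs "${EXAMPLES_DIR:-$HOME/DeepLearningExamples/PyTorch}" \
    Classification/ConvNets LanguageModeling/BERT SpeechSynthesis/Tacotron2 \
    || echo "Some model directories are missing on the host."
```

If any line is reported as MISSING, the container would see the same gap under `/workspace/benchmark`, which would be consistent with the `cd` and `multiproc.py` errors in the log.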