MODEL_DIR=/workspace/tf-max-series-resnet50v1-5-inference
PRECISION=fp32
OUTPUT_DIR=/home/user/nvmepool/resnet50-log
performance
Running with default batch size of 1024
Precision is fp32
resnet50 v1.5 int8 inference
lspci: Unable to load libkmod resources: error -2
2024-04-16 18:51:30.419047: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-04-16 18:51:30.420511: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-16 18:51:30.450607: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-16 18:51:30.450999: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-16 18:51:30.994173: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-16 18:51:31.290250: I itex/core/wrapper/itex_cpu_wrapper.cc:42] Intel Extension for Tensorflow* AVX512 CPU backend is loaded.
2024-04-16 18:51:31.763276: I itex/core/wrapper/itex_gpu_wrapper.cc:35] Intel Extension for Tensorflow* GPU backend is loaded.
2024-04-16 18:51:31.800082: W itex/core/ops/op_init.cc:58] Op: _QuantizedMaxPool3D is already registered in Tensorflow
2024-04-16 18:51:31.823464: I itex/core/devices/gpu/itex_gpu_runtime.cc:129] Selected platform: Intel(R) Level-Zero
2024-04-16 18:51:31.823492: I itex/core/devices/gpu/itex_gpu_runtime.cc:154] number of sub-devices is zero, expose root device.
2024-04-16 18:51:31.823496: I itex/core/devices/gpu/itex_gpu_runtime.cc:154] number of sub-devices is zero, expose root device.
2024-04-16 18:51:31.823499: I itex/core/devices/gpu/itex_gpu_runtime.cc:154] number of sub-devices is zero, expose root device.
2024-04-16 18:51:31.823502: I itex/core/devices/gpu/itex_gpu_runtime.cc:154] number of sub-devices is zero, expose root device.
using default data type: float32
Run inference
Inference with dummy data.
WARNING:tensorflow:From /workspace/tf-max-series-resnet50v1-5-inference/models/image_recognition/tensorflow/resnet50v1_5/inference/gpu/int8/eval_image_classifier_inference.py:186: FastGFile.__init__ (from tensorflow.python.platform.gfile) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.gfile.GFile.
WARNING:tensorflow:From /usr/local/lib/python3.10/dist-packages/tensorflow/python/tools/strip_unused_lib.py:84: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This API was designed for TensorFlow v1. See https://www.tensorflow.org/guide/migrate for instructions on how to migrate your code to TensorFlow v2.
WARNING:tensorflow:From /usr/local/lib/python3.10/dist-packages/tensorflow/python/tools/optimize_for_inference_lib.py:112: remove_training_nodes (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This API was designed for TensorFlow v1. See https://www.tensorflow.org/guide/migrate for instructions on how to migrate your code to TensorFlow v2.
2024-04-16 18:52:12.985296: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform XPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-04-16 18:52:12.985330: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform XPU ID 1, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-04-16 18:52:12.985335: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform XPU ID 2, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-04-16 18:52:12.985339: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform XPU ID 3, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-04-16 18:52:12.985364: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:XPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: XPU, pci bus id: <undefined>)
2024-04-16 18:52:12.985909: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:XPU:1 with 0 MB memory) -> physical PluggableDevice (device: 1, name: XPU, pci bus id: <undefined>)
2024-04-16 18:52:12.986271: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:XPU:2 with 0 MB memory) -> physical PluggableDevice (device: 2, name: XPU, pci bus id: <undefined>)
2024-04-16 18:52:12.986843: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:XPU:3 with 0 MB memory) -> physical PluggableDevice (device: 3, name: XPU, pci bus id: <undefined>)
2024-04-16 18:52:12.989942: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform XPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-04-16 18:52:12.989963: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform XPU ID 1, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-04-16 18:52:12.989968: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform XPU ID 2, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-04-16 18:52:12.989971: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform XPU ID 3, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-04-16 18:52:12.989982: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:XPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: XPU, pci bus id: <undefined>)
2024-04-16 18:52:12.989992: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:XPU:1 with 0 MB memory) -> physical PluggableDevice (device: 1, name: XPU, pci bus id: <undefined>)
2024-04-16 18:52:12.989998: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:XPU:2 with 0 MB memory) -> physical PluggableDevice (device: 2, name: XPU, pci bus id: <undefined>)
2024-04-16 18:52:12.990006: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:XPU:3 with 0 MB memory) -> physical PluggableDevice (device: 3, name: XPU, pci bus id: <undefined>)
2024-04-16 18:52:12.990775: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:375] MLIR V1 optimization pass is not enabled
2024-04-16 18:52:12.998438: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type XPU is enabled.
2024-04-16 18:52:15.343963: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type XPU is enabled.
Iteration 1: 43.758534 sec
Iteration 2: 0.000638 sec
...
All drivers report that they are functioning correctly. NUMA works properly on the host OS (Ubuntu 22.04, with an updated kernel). I've tried running the Docker container with the `--cap-add SYS_NICE` flag, but the log output is unchanged. FP32 and FP16 both perform as expected; INT8, oddly, is slower than either.
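For comparing precisions on equal terms, a small script can pull the `Iteration N: X sec` lines out of each run's log and compute throughput with the first iteration discarded, since it is dominated by one-time startup cost (43.76 s in the log above). This is only a sketch: the function names are hypothetical, treating exactly one iteration as warm-up is an assumption, and the batch size of 1024 is taken from the "Running with default batch size of 1024" line.

```python
import re

# Matches lines such as "Iteration 2: 0.000638 sec" from the benchmark log.
ITER_RE = re.compile(r"^Iteration (\d+): ([0-9.]+) sec$")


def parse_iteration_times(log_text):
    """Return the per-iteration wall times (seconds) found in a log, in order."""
    times = []
    for line in log_text.splitlines():
        match = ITER_RE.match(line.strip())
        if match:
            times.append(float(match.group(2)))
    return times


def steady_state_throughput(times, batch_size, warmup=1):
    """Images per second over the iterations after the warm-up ones.

    `warmup=1` is an assumption: only the first iteration is treated as
    startup cost. Raise it if later iterations are also unusually slow.
    """
    steady = times[warmup:]
    if not steady:
        raise ValueError("no iterations left after discarding warm-up")
    return batch_size * len(steady) / sum(steady)


if __name__ == "__main__":
    # Only the two iteration lines visible in the log above; the rest are elided.
    sample = "Iteration 1: 43.758534 sec\nIteration 2: 0.000638 sec"
    times = parse_iteration_times(sample)
    print("iteration times:", times)
    print("images/sec after warm-up:", steady_state_throughput(times, batch_size=1024))
```

With just the two iterations shown here, the result rests on a single sample and should not be read as a real measurement; running the same script over the complete FP32, FP16, and INT8 logs would give comparable numbers.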
Any ideas?
Thanks!
Running the following (pointing to pre-processed data)
I get the following output: