[1,0]<stderr>:Traceback (most recent call last):
[1,0]<stderr>: File "/ws/HugeCTR/sparse_operation_kit/SOK_DLRM_Benchmark/main.py", line 129, in <module>
[1,0]<stderr>: trainer = Trainer(
[1,0]<stderr>: File "/ws/HugeCTR/sparse_operation_kit/SOK_DLRM_Benchmark/trainer.py", line 161, in __init__
[1,0]<stderr>: self._embedding_optimizer = tf.keras.mixed_precision.LossScaleOptimizer(
[1,0]<stderr>: File "/usr/local/lib/python3.10/dist-packages/keras/mixed_precision/loss_scale_optimizer.py", line 343, in __call__
[1,0]<stderr>: raise TypeError(msg)
[1,0]<stderr>:TypeError: "inner_optimizer" must be an instance of `tf.keras.optimizers.Optimizer` or `tf.keras.optimizers.experimental.Optimizer`, but got: <sparse_operation_kit.optimizer.OptimizerWrapperV2 object at 0x7f1b15b44910>.
To Reproduce
Steps to reproduce the behavior:
How to build including docker pull & docker run commands
How to run including the JSON config file used
Expected behavior
A clear and concise description of what you expected to happen.
Screenshots
If applicable, add screenshots to help explain your problem.
Environment (please complete the following information):
OS: [e.g. Ubuntu xx.yy]
Graphic card: [e.g. a single NVIDIA V100 or NVIDIA DGX A100]
The optimizer in SOK is not a TensorFlow optimizer, so you cannot wrap it with tf.keras.mixed_precision.LossScaleOptimizer. Instead, you can get the loss scale value from the dense part's optimizer, then adjust the embedding gradients according to that scale and pass them to the SOK optimizer.
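The workaround above can be sketched as follows. This is a minimal, framework-agnostic illustration, not SOK's or the benchmark's actual code: the helper names `unscale_gradients` and `apply_embedding_gradients` and the `RecordingOptimizer` stand-in are hypothetical, and the only thing assumed about the embedding optimizer is that it exposes `apply_gradients`, as in the trainer from the traceback. Plain Python floats stand in for tensors. In a real training step, dtype casting would also be needed, and sparse gradients such as `tf.IndexedSlices` would likely need their values scaled rather than the object itself.

```python
def unscale_gradients(scaled_grads, loss_scale):
    """Undo loss scaling: gradients of (loss * loss_scale) are divided by loss_scale.

    Entries that are None (variables with no gradient) are passed through.
    """
    inv_scale = 1.0 / loss_scale
    return [None if g is None else g * inv_scale for g in scaled_grads]


def apply_embedding_gradients(embedding_optimizer, scaled_grads, variables, loss_scale):
    """Unscale the embedding gradients, then hand them to the embedding optimizer.

    `embedding_optimizer` is assumed to expose `apply_gradients(grads_and_vars)`.
    `loss_scale` is assumed to be read from the dense part's loss-scale optimizer.
    """
    grads = unscale_gradients(scaled_grads, loss_scale)
    embedding_optimizer.apply_gradients(zip(grads, variables))
    return grads


class RecordingOptimizer:
    """Hypothetical stand-in for the embedding optimizer, for demonstration only."""

    def __init__(self):
        self.applied = []

    def apply_gradients(self, grads_and_vars):
        # Record what would have been applied instead of updating variables.
        self.applied = list(grads_and_vars)


if __name__ == "__main__":
    optimizer = RecordingOptimizer()
    # Gradients computed from a loss that was multiplied by 1024.
    apply_embedding_gradients(optimizer, [1024.0, None, -512.0], ["emb0", "emb1", "emb2"], 1024.0)
    print(optimizer.applied)
```

With dynamic loss scaling, the dense optimizer may skip a step when gradients are non-finite; the embedding update would presumably need to be skipped in the same steps, which this sketch does not handle.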