
RNNT training on CPU #95

Open
Peach-He opened this issue Jan 20, 2022 · 1 comment

Comments

@Peach-He

Thanks for the work on supporting RNN-T training on CPU (models/language_modeling/pytorch/rnnt/training/cpu). I quickly evaluated the training code and found that the WER stays at 1.00 even after training for 10+ epochs.
I also found a related issue about the loss function used in training: HawkAaron/warp-transducer#93
The gradient on CPU is incorrect there. Is this a known issue? Has anyone actually reached the final WER of 0.058 rather than 1.0?
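One way to confirm a suspected gradient bug like the one linked above is a finite-difference check: compare the implementation's analytic gradient against a numerical estimate. A minimal, library-free sketch of the technique (the quadratic `loss` below is a hypothetical stand-in for the transducer loss, not the warp-transducer kernel itself):

```python
def loss(x):
    # Stand-in scalar loss; any scalar function of a list of floats works.
    return sum(v * v for v in x)

def analytic_grad(x):
    # Hand-derived gradient of the stand-in loss: d/dx_i sum(x_j^2) = 2*x_i.
    # For the real check, this would be the gradient the CPU kernel returns.
    return [2.0 * v for v in x]

def numerical_grad(f, x, eps=1e-6):
    # Central differences: (f(x + eps) - f(x - eps)) / (2 * eps), per coordinate.
    grad = []
    for i in range(len(x)):
        hi = list(x); hi[i] += eps
        lo = list(x); lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2.0 * eps))
    return grad

def max_grad_error(x):
    # Largest absolute gap between the analytic and numerical gradients.
    a = analytic_grad(x)
    n = numerical_grad(loss, x)
    return max(abs(ai - ni) for ai, ni in zip(a, n))
```

If the CPU gradient from the transducer loss were substituted into `analytic_grad`, a large `max_grad_error` on small random inputs would confirm the bug reported in warp-transducer#93 (for PyTorch losses, `torch.autograd.gradcheck` does the same comparison).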

ashahba pushed a commit that referenced this issue Apr 1, 2022
* update TFX dockerfiles requests version

* Fix syntax
@sramakintel
Contributor

sramakintel commented Mar 25, 2024

@Peach-He: The RNNT CPU training scripts have been updated recently. Can you try again and see whether that resolves your issue? You can refer to the latest optimizations here: https://www.intel.com/content/www/us/en/developer/articles/containers/cpu-reference-model-containers.html
