
Benchmarks 2024 02 11 TFLM GCC

Philipp van Kempen edited this page Feb 11, 2024 · 8 revisions

Setup

Simulator

Toolchains

Models

Package Versions

  • MLonMCU : main

  • TFLM : a549448bb234cf3fed15ad5dabf83d06f82326ce

  • Spike : 0bc176b3fca43560b9e8586cdbc41cfde073e17a

  • Spike PK : 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Miscellaneous

  • Used -Os flag for compilation.
  • Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: gcc)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
|---|---|---|---|---|---|---|---|
| 174698646 (0.1x) | 132407 (0.878) | 36204 (1.0) | 0 | TFLM | Reference | RV32GC | - |
| 174698646 (0.1x) | 132413 (0.878) | 36204 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
| 174698646 (0.1x) | 132413 (0.878) | 36204 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
| 157549999 (0.1x) | 144774 (0.96) | 36148 (0.998) | 0 | TFLM | Reference | RV32GCP | - |
| 16644695 (Base) | 150798 (Base) | 36212 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
| 16644695 (1.0x) | 150804 (1.0) | 36212 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 16644695 (1.0x) | 150804 (1.0) | 36212 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 7002129 (2.4x) | 151322 (1.003) | 36212 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
| 2571441 (6.5x) | 151322 (1.003) | 36212 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
| 13498251 (1.2x) | 162366 (1.077) | 36156 (0.998) | 0 | muRISCV-NN | Scalar | RV32GCP | - |
| 15939667 (1.0x) | 164764 (1.093) | 36156 (0.998) | 0 | muRISCV-NN | Packed | RV32GCP | - |
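The Speedup and rel. columns appear to be plain ratios against the row marked Base (muRISCV-NN Scalar on RV32GC). The sketch below reproduces a few of the Audio Wake Words entries under that assumption. The helper names `speedup` and `relative` are illustrative and not part of MLonMCU.

```python
# Assumption: "Base" is the muRISCV-NN Scalar RV32GC row of the same model.
BASE = {"cycles": 16644695, "rom": 150798, "ram": 36212}


def speedup(cycles, base_cycles=BASE["cycles"]):
    """Cycle ratio base/run; values above 1 mean the run is faster than Base."""
    return base_cycles / cycles


def relative(value, base_value):
    """Memory ratio run/base; values above 1 mean the run uses more memory."""
    return value / base_value


# A few rows copied from the Audio Wake Words (aww) table.
runs = {
    "TFLM Reference RV32GC": {"cycles": 174698646, "rom": 132407, "ram": 36204},
    "muRISCV-NN Vector VLEN=128": {"cycles": 7002129, "rom": 151322, "ram": 36212},
    "muRISCV-NN Vector VLEN=1024": {"cycles": 2571441, "rom": 151322, "ram": 36212},
}

for name, run in runs.items():
    print(
        f"{name}: {speedup(run['cycles']):.1f}x, "
        f"ROM {relative(run['rom'], BASE['rom']):.3f}, "
        f"RAM {relative(run['ram'], BASE['ram']):.3f}"
    )
```

Because the ratios are rounded to one decimal place, all TFLM Reference rows for the larger models show as 0.1x even though their cycle counts differ.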

Notes

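  • TFLM Reference kernels perform extremely badly with GCC (LLVM is more than 2-4x faster here).
  • Cycle counts for the RV32GCV runs with Loop+SLP are identical to the RV32GC runs, so auto-vectorization appears to have no effect. Is it disabled in GCC? (Check MLonMCU config!)
  • Packed muRISCV-NN kernels give little speedup (1-1.2x) for CNNs. Even the Scalar kernels built with GCC for RV32GCP (RVP) perform better. DNNs are fine.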
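The auto-vectorization question can be checked directly against the numbers. The sketch below compares each Loop+SLP run with the non-vectorized run of the same kernels and mode, using the Audio Wake Words cycle counts. The tuple layout and the `autovec_gain` helper are assumptions made for illustration.

```python
# Rows from the Audio Wake Words (aww) results:
# (kernels, mode, arch, vlen, auto_vectorization, cycles)
AWW_ROWS = [
    ("TFLM", "Reference", "RV32GC", 0, None, 174698646),
    ("TFLM", "Reference", "RV32GCV", 128, "Loop+SLP", 174698646),
    ("TFLM", "Reference", "RV32GCV", 1024, "Loop+SLP", 174698646),
    ("muRISCV-NN", "Scalar", "RV32GC", 0, None, 16644695),
    ("muRISCV-NN", "Scalar", "RV32GCV", 128, "Loop+SLP", 16644695),
    ("muRISCV-NN", "Scalar", "RV32GCV", 1024, "Loop+SLP", 16644695),
]


def autovec_gain(rows):
    """Map (kernels, mode, vlen) to baseline_cycles / auto-vectorized cycles."""
    # Baseline: the run of the same kernels and mode without auto-vectorization.
    baseline = {(k, m): c for k, m, _arch, _vlen, av, c in rows if av is None}
    return {
        (k, m, vlen): baseline[(k, m)] / c
        for k, m, _arch, vlen, av, c in rows
        if av is not None
    }


gains = autovec_gain(AWW_ROWS)
unchanged = all(gain == 1.0 for gain in gains.values())
print("auto-vectorization changed nothing:", unchanged)
```

A gain of exactly 1.0 in every row is consistent with the vectorizer not running at all, but it does not prove it. The generated assembly or the compiler flags passed by MLonMCU would need to be inspected to confirm.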

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
|---|---|---|---|---|---|---|---|
| 745801113 (0.1x) | 172997 (0.934) | 68968 (1.0) | 0 | TFLM | Reference | RV32GC | - |
| 745801113 (0.1x) | 173013 (0.934) | 68968 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
| 745801113 (0.1x) | 173013 (0.934) | 68968 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
| 697912970 (0.1x) | 185266 (1.0) | 68912 (0.999) | 0 | TFLM | Reference | RV32GCP | - |
| 80995160 (Base) | 185298 (Base) | 68960 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
| 80995160 (1.0x) | 185330 (1.0) | 68960 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 80995160 (1.0x) | 185330 (1.0) | 68960 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 29960342 (2.7x) | 186654 (1.007) | 68960 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
| 8347502 (9.7x) | 186654 (1.007) | 68960 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
| 62976509 (1.3x) | 196964 (1.063) | 68904 (0.999) | 0 | muRISCV-NN | Scalar | RV32GCP | - |
| 68431268 (1.2x) | 199958 (1.079) | 68904 (0.999) | 0 | muRISCV-NN | Packed | RV32GCP | - |

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
|---|---|---|---|---|---|---|---|
| 3094956 (0.6x) | 333908 (0.989) | 19432 (1.0) | 0 | TFLM | Reference | RV32GC | - |
| 3094956 (0.6x) | 333914 (0.989) | 19432 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
| 3094956 (0.6x) | 333914 (0.989) | 19432 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
| 3097895 (0.6x) | 346264 (1.026) | 19380 (0.997) | 0 | TFLM | Reference | RV32GCP | - |
| 1969300 (Base) | 337546 (Base) | 19432 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
| 1969300 (1.0x) | 337552 (1.0) | 19432 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 1969300 (1.0x) | 337552 (1.0) | 19432 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 608151 (3.2x) | 337800 (1.001) | 19432 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
| 428279 (4.6x) | 337800 (1.001) | 19432 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
| 1873278 (1.1x) | 349708 (1.036) | 19380 (0.997) | 0 | muRISCV-NN | Scalar | RV32GCP | - |
| 942627 (2.1x) | 351426 (1.041) | 19380 (0.997) | 0 | muRISCV-NN | Packed | RV32GCP | - |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
|---|---|---|---|---|---|---|---|
| 495266208 (0.1x) | 406111 (0.957) | 134520 (1.0) | 0 | TFLM | Reference | RV32GC | - |
| 495266207 (0.1x) | 406117 (0.957) | 134520 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
| 495266207 (0.1x) | 406117 (0.957) | 134520 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
| 445892345 (0.1x) | 418478 (0.986) | 134464 (1.0) | 0 | TFLM | Reference | RV32GCP | - |
| 49676633 (Base) | 424502 (Base) | 134528 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
| 49676633 (1.0x) | 424508 (1.0) | 134528 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 49676633 (1.0x) | 424508 (1.0) | 134528 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 21932927 (2.3x) | 425026 (1.001) | 134528 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
| 10503478 (4.7x) | 425026 (1.001) | 134528 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
| 40750242 (1.2x) | 436070 (1.027) | 134472 (1.0) | 0 | muRISCV-NN | Scalar | RV32GCP | - |
| 49184061 (1.0x) | 438468 (1.033) | 134472 (1.0) | 0 | muRISCV-NN | Packed | RV32GCP | - |

Original data

Click here to download the raw files for this benchmark.
