Benchmarks CUSTOM TFLM GCC Os

Philipp van Kempen edited this page Nov 16, 2024 · 1 revision

Setup

Simulator

Toolchains

Models

Package Versions

  • MLonMCU : main

  • TFLM : main

  • Spike : 0bc176b3fca43560b9e8586cdbc41cfde073e17a

  • Spike PK : 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Miscellaneous

  • All code was compiled with the -Os flag.
  • Benchmarks were generated with the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -Os)
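
The Speedup and rel. columns in the tables below appear to be derived from the row marked Base in each table (muRISCV-NN, Scalar, RV32GC). The following minimal sketch assumes that speedup is base cycles divided by measured cycles and that the relative memory figure is the measured size divided by the base size. This reading is inferred from the published numbers, not taken from the MLonMCU source, and the helper names are hypothetical. The sample values come from the Audio Wake Words table.

```python
def speedup(base_cycles: int, cycles: int) -> float:
    """Assumed definition: values above 1 mean fewer cycles than the Base row."""
    return base_cycles / cycles


def relative(base_value: int, value: int) -> float:
    """Assumed definition: values above 1 mean more memory than the Base row."""
    return value / base_value


# Base row of the aww table: muRISCV-NN, Scalar, RV32GC
base = {"cycles": 16660013, "rom": 149852, "ram": 36212}
# Compared row: muRISCV-NN, Vector, RV32GCV, VLEN 256
row = {"cycles": 2845137, "rom": 151032, "ram": 36212}

print(f"{speedup(base['cycles'], row['cycles']):.1f}x")  # 5.9x
print(round(relative(base["rom"], row["rom"]), 3))       # 1.008
print(round(relative(base["ram"], row["ram"]), 3))       # 1.0
```

Under this reading, the TFLM reference kernels at 0.1x need roughly ten times as many cycles as the scalar muRISCV-NN baseline.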

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 174727877 (0.1x) | 132543 (0.884) | 36204 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 157574076 (0.1x) | 144914 (0.967) | 36148 (0.998) | 128 | TFLM | Reference | RV32GCP | False | - |
| 16660013 (Base) | 149852 (Base) | 36212 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 4113193 (4.1x) | 151032 (1.008) | 36212 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2845137 (5.9x) | 151032 (1.008) | 36212 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2156449 (7.7x) | 151032 (1.008) | 36212 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2114737 (7.9x) | 151032 (1.008) | 36212 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2114737 (7.9x) | 151032 (1.008) | 36212 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2118126 (7.9x) | 151032 (1.008) | 36212 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 13526056 (1.2x) | 161456 (1.077) | 36156 (0.998) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 15955430 (1.0x) | 163734 (1.093) | 36156 (0.998) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 745826315 (0.1x) | 173133 (0.939) | 68968 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 697937407 (0.1x) | 185404 (1.006) | 68912 (0.999) | 128 | TFLM | Reference | RV32GCP | False | - |
| 81003295 (Base) | 184350 (Base) | 68960 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 15486943 (5.2x) | 186362 (1.011) | 68960 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 9799935 (8.3x) | 186362 (1.011) | 68960 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 7206287 (11.2x) | 186362 (1.011) | 68960 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 5940767 (13.6x) | 186362 (1.011) | 68960 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 4999076 (16.2x) | 186362 (1.011) | 68960 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 4748549 (17.1x) | 186362 (1.011) | 68960 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 62985030 (1.3x) | 196058 (1.064) | 68904 (0.999) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 68447304 (1.2x) | 198930 (1.079) | 68904 (0.999) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 3106372 (0.6x) | 334042 (0.991) | 19432 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3121245 (0.6x) | 346402 (1.027) | 19380 (0.997) | 128 | TFLM | Reference | RV32GCP | False | - |
| 1789745 (Base) | 337180 (Base) | 19432 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 584271 (3.1x) | 338094 (1.003) | 19432 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 465903 (3.8x) | 338094 (1.003) | 19432 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 406719 (4.4x) | 338094 (1.003) | 19432 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 377463 (4.7x) | 338094 (1.003) | 19432 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 373779 (4.8x) | 338094 (1.003) | 19432 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 371895 (4.8x) | 338094 (1.003) | 19432 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 1631834 (1.1x) | 349340 (1.036) | 19380 (0.997) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 959447 (1.9x) | 350936 (1.041) | 19380 (0.997) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 495297621 (0.1x) | 406249 (0.959) | 134520 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 445917090 (0.1x) | 418618 (0.988) | 134464 (1.0) | 128 | TFLM | Reference | RV32GCP | False | - |
| 49691399 (Base) | 423556 (Base) | 134528 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 13486989 (3.7x) | 424736 (1.003) | 134528 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 10158361 (4.9x) | 424736 (1.003) | 134528 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8869239 (5.6x) | 424736 (1.003) | 134528 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8364740 (5.9x) | 424736 (1.003) | 134528 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8316126 (6.0x) | 424736 (1.003) | 134528 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8319515 (6.0x) | 424736 (1.003) | 134528 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 40776459 (1.2x) | 435160 (1.027) | 134472 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 49187703 (1.0x) | 437438 (1.033) | 134472 (1.0) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |

Original data

Click here to download the raw files for this benchmark.
