CUDA: Faster Mixtral prompt processing (#4538) #20

Job Run time
1m 45s
7m 0s
5m 40s
1m 29s
1m 42s
1m 43s
5m 53s
1m 45s
3m 51s
15m 51s
6m 12s
5m 24s
3m 52s
1m 19s
2m 3s
3m 51s
3m 57s
3m 14s
17m 13s
2m 12s
3m 18s
17m 5s
3m 4s
7m 1s
2m 59s
2m 26s
44s
Total: 2h 12m 33s