CUDA: Faster Mixtral prompt processing (#4538) #20
| Job | Run time |
|---|---|
| | 1m 45s |
| | 7m 0s |
| | 5m 40s |
| | 1m 29s |
| | 1m 42s |
| | 1m 43s |
| | 5m 53s |
| | 1m 45s |
| | 3m 51s |
| | 15m 51s |
| | 6m 12s |
| | 5m 24s |
| | 3m 52s |
| | 1m 19s |
| | 2m 3s |
| | 3m 51s |
| | 3m 57s |
| | 3m 14s |
| | 17m 13s |
| | 2m 12s |
| | 3m 18s |
| | 17m 5s |
| | 3m 4s |
| | 7m 1s |
| | 2m 59s |
| | 2m 26s |
| | 44s |
| Total | 2h 12m 33s |