[pull] master from ggerganov:master #24

pull · 2024-01-18T23:42:34Z

See Commits and Changes for more details.



PR #4818 (merged last week) reintroduced a config check for vocab_size that was addressed in PR #4258 (merged 2023-11-30). Without the fix, llama2 models can't be converted. The error is: `ValueError: The model's vocab size is set to -1 in params.json. Please update it manually. Maybe 32000?`
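A minimal sketch of the kind of fallback that avoids this error, assuming the vocabulary size can be read from the row count of the token-embedding tensor when `params.json` reports `-1`. The helper name `resolve_vocab_size` and its parameters are hypothetical and are not taken from either PR; the value 32000 comes from the hint in the error message above.

```python
import json


def resolve_vocab_size(params: dict, embedding_rows: int) -> int:
    """Return a usable vocab size (hypothetical helper, not the converter's API).

    params         -- parsed contents of params.json
    embedding_rows -- number of rows in the token-embedding tensor (assumed known)
    """
    vocab_size = params.get("vocab_size", -1)
    if vocab_size == -1:
        # params.json carries a placeholder; fall back to the tensor shape
        # instead of raising ValueError.
        return embedding_rows
    return vocab_size


# Example: a llama2-style params.json with the -1 placeholder.
params = json.loads('{"dim": 4096, "vocab_size": -1}')
n_vocab = resolve_vocab_size(params, embedding_rows=32000)
print(n_vocab)
```

An explicit `vocab_size` in `params.json` still takes precedence in this sketch; only the `-1` placeholder triggers the fallback.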


* falcon arch fix for tied output embeddings
* Update llama.cpp
  Co-authored-by: Georgi Gerganov
* Update llama.cpp
* Update llama.cpp
  Co-authored-by: Georgi Gerganov
* Update llama.cpp

---------

Co-authored-by: Georgi Gerganov

ikawrakow and others added 9 commits January 18, 2024 19:18

HellaSwag: speed up by parallelizing log-prob evaluation (#5020)

3e945cc

For Mistral-7B and fp16, time on my system goes down from 536 seconds to 423 seconds for the full evaluation dataset (10042 tasks).

Co-authored-by: Iwan Kawrakow
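The idea of scoring candidate endings in parallel can be sketched as follows. This is an illustration only, not the code from the commit: the function names, the toy logits, and the use of a thread pool are all assumptions, and in pure Python the threads show the structure of the work split rather than a real speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def log_prob(logits, token):
    """Log-softmax probability of `token` given one row of logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return logits[token] - log_sum


def score_ending(item):
    """Sum the log-probs of an ending's tokens; item = (logit rows, token ids)."""
    logit_rows, tokens = item
    return sum(log_prob(row, tok) for row, tok in zip(logit_rows, tokens))


def pick_ending(endings, n_workers=4):
    """Score all endings concurrently and return (best index, all scores)."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        scores = list(pool.map(score_ending, endings))
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy data (made up): two endings, two tokens each, vocabulary of size 2.
rows = [[0.0, 2.0], [1.0, 0.0]]
endings = [(rows, [1, 0]), (rows, [0, 1])]
best, scores = pick_ending(endings)
print(best, scores)
```

Each ending is scored independently of the others, which is what makes the log-prob step straightforward to distribute across workers.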

scripts : add get-winogrande.sh

e9240cd

perplexity : fix winogrande N tasks option

d391ae9

imatrix : fix assert for src0 non-cont check

2d5419d

llama : fix mlock with no-mmap with Metal (#5025)

96d7f56

server : defer tasks when "slot unavailable" (#5018)

821f0a2

* server: defer task when no slot is available
* remove unnecessary log

---------

Co-authored-by: Xuan Son Nguyen
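Deferring a task when no slot is free, rather than rejecting it, can be sketched like this. The class `SlotScheduler` and its methods are hypothetical names used for illustration; the server's actual implementation is in C++ and is not reproduced here.

```python
from collections import deque


class SlotScheduler:
    """Toy scheduler: tasks wait in a deferred queue until a slot frees up."""

    def __init__(self, n_slots):
        self.free_slots = list(range(n_slots))
        self.deferred = deque()   # tasks that arrived while all slots were busy
        self.running = {}         # task id -> slot id

    def submit(self, task_id):
        """Assign a slot if one is free; otherwise defer the task."""
        if not self.free_slots:
            self.deferred.append(task_id)
            return None
        slot = self.free_slots.pop(0)
        self.running[task_id] = slot
        return slot

    def release(self, task_id):
        """Free the task's slot and start the oldest deferred task, if any."""
        self.free_slots.append(self.running.pop(task_id))
        if not self.deferred:
            return None
        next_task = self.deferred.popleft()
        return next_task, self.submit(next_task)


sched = SlotScheduler(n_slots=1)
first = sched.submit("a")      # gets the only slot
second = sched.submit("b")     # no slot available, so the task is deferred
resumed = sched.release("a")   # "b" is picked up once the slot is free
print(first, second, resumed)
```

Deferred tasks are resumed in arrival order in this sketch; a real server may choose a different policy.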

cmake : add ggml public headers (#5011)

9b6ea42

pull bot added the ⤵️ pull label Jan 19, 2024

teleprint-me closed this Jan 19, 2024

