Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: bench: add more ftype, fix triggers and bot comment #6466

Merged
merged 7 commits into from
Apr 4, 2024

Conversation

phymbert
Copy link
Collaborator

@phymbert phymbert commented Apr 3, 2024

Motivation

  • PR comment pops up in all PRs even unrelated to speed, it is a little bit distracting.
  • benchmark results vary probably because a fixed seed is not set
  • add more model ftypes

Proposition

  • reduce file workflow path trigger to only llama.cpp, ggml.c and cuda files.
  • add seed param in k6 script.js
  • reduce the comment to the first line, so it does not use too much space in the PR, add a warning notice
  • add q8_0 and f16 phi-2 quants
  • add more metrics in the commit status to later on show performance history

Tests

Tested here on a self-hosted GCP L4 ( sic ^) ) :

References

@phymbert phymbert added performance Speed related topics server/webui labels Apr 3, 2024
@phymbert phymbert requested a review from ggerganov April 3, 2024 19:53
@phymbert
Copy link
Collaborator Author

phymbert commented Apr 3, 2024

@ggerganov Georgi, as more and more models are MOE based, I suggest later on adding mixtral8x7b, thoughts ?

This comment was marked as off-topic.

Copy link
Owner

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can continue to scale with more models in the future. For now, let's give it some more time to see how the existing benchmarks perform and gather feedback about what's useful.

@phymbert
Copy link
Collaborator Author

phymbert commented Apr 4, 2024

Understood, please merge once you have restarted the github manager

@ggerganov
Copy link
Owner

Updated to master of https://github.com/ggml-org/ci and restarted

@ggerganov ggerganov merged commit 7a2c926 into ggerganov:master Apr 4, 2024
25 of 26 checks passed
@phymbert phymbert deleted the hp/server/bench/add-quants branch April 4, 2024 09:58
tybalex pushed a commit to rubra-ai/tools.cpp that referenced this pull request Apr 17, 2024
* ci: bench: change trigger path to not spawn on each PR

* ci: bench: add more file type for phi-2: q8_0 and f16.
- do not show the comment by default

* ci: bench: add seed parameter in k6 script

* ci: bench: artefact name perf job

* Add iteration in the commit status, reduce again the autocomment

* ci: bench: add per slot metric in the commit status

* Fix trailing spaces
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Speed related topics server/webui
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants