-
Notifications
You must be signed in to change notification settings - Fork 579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sentence Transformers test (soon) no longer expected to fail #1918
Merged
Wauplin
merged 5 commits into
huggingface:main
from
tomaarsen:contrib/sentence_transformers
Dec 18, 2023
Merged
Changes from 2 commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
0461bfe
Sentence Transformers test (soon) no longer expected to fail
tomaarsen 07a94e7
Merge branch 'main' into contrib/sentence_transformers
Wauplin 26e6d61
Sleep before calling model_info
tomaarsen a74d62a
Merge branch 'contrib/sentence_transformers' of https://github.com/to…
tomaarsen ae7389c
Add sleep to spaCy tests, too
tomaarsen File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like for some reason the
library_name
attribute is not set when requesting the model info 😕There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly odd. When I install with
git+https://github.com/UKPLab/sentence-transformers.git#egg=sentence-transformers
, it fails the first 2 times that I runpytest
, and passes the 2 subsequent times :/There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When it fails, the only
siblings
inModelInfo
is[RepoSibling(rfilename='.gitattributes', size=None, blob_id=None, lfs=None)]
, so the repo only has the gitattributes. If Ibreakpoint
and inspect the times that it does pass, then all files are seen by themodel_info
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and between each run, is
model_id = f"{user}/{repo_name}"
changing? The value should be unique for each run (generated here). I'm asking because having dependency between tests is oddThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it does differ between runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, now the spaCy tests are failing & sentence_transformers is passing. Perhaps it's indeed a flaky test setup where
model_info
gets slightly outdated data if it is called "too soon".There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @Kakulukian @coyotte508 do you know if something has changed recently on the
/api/models/<repo_id>
endpoint that would make the model cache longer to update? In the test and thread above, we are doing:/api/models/repo_id
)model_info.library_name
.It looks like adding a 1 second delay between steps 2. and 3. makes the test more robust. But I don't remember this was the case before. It is not a problem to add this delay in our tests but prefer to let you know in case it's a bigger problem server-side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we rely on cache instead of building from scratch. So if the cache update (following the push) is going on in the background, it can display outdated info.
You can bypass this by passing the commit ID instead of
HEAD
, eg/api/models/<repo_id>/revision/<commit_id>
You can pass
main
as the commit id (for now it should work, maybe later we'll optimize)We can add support for
Cache-Control
header to skip cache if needed later onThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this seeems to happen only for
commit
endpoint, not push (as it's been awaited since https://github.com/huggingface/moon-landing/pull/5501)We can fix it but potentially commit endpoint will take longer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation! No need to optimize anything on the endpoint as it's mostly a problem in internal tests. Good to know about the revision workaround 👍