-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Experimental ability to cache Reference
s
#8111
Conversation
6dcb278
to
036a501
Compare
1fab3a0
to
c3f1303
Compare
9672006
to
ee9cca6
Compare
Reference
sReference
s
ba20f8d
to
b27e9b9
Compare
servers/quarkus-common/src/main/java/org/projectnessie/quarkus/config/QuarkusStoreConfig.java
Outdated
Show resolved
Hide resolved
...oned/storage/cache/src/main/java/org/projectnessie/versioned/storage/cache/CacheBackend.java
Show resolved
Hide resolved
* <p>Database specific implementations of {@link Persist} must not implement this function. | ||
*/ | ||
@Nullable | ||
default Reference fetchReferenceForUpdate(@Nonnull String name) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"ForUpdate" looks a bit misleading to me... The client does not have to update it after fetching. How about fetchCurrentReference
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"bypassingCache" would be more correct - but do we care that much?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"ForUpdate" looks really odd in code that just tries to load the latest value 🤷
Why not add a parameter to the old method: fetchReference(String name, boolean refresh)
, for example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That caused like 5248 additional changed lines of code ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we could have it as an overloaded method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to keep the impact low (it's would be rather cosmetic).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding fetchReference(String name, boolean refresh)
instead of fetchReferenceForUpdate(@Nonnull String name)
is about the same in terms of impact, is it not?.. At the same time I find that fetchReference(name, true)
is more intuitive (more precise) than fetchReferenceForUpdate()
at the call sites. fetchReferenceBypassingCache(name)
also works for me, but I think it will bring the term "cache" to implementation that do not care about caching, which is again a bit awkward.
It could also be fetchReference(String name, FetchMode mode)
with FetchMode { LOCAL, FRESH }
... or similar. WDYT?
...rage/cache/src/main/java/org/projectnessie/versioned/storage/cache/CaffeineCacheBackend.java
Outdated
Show resolved
Hide resolved
Adds ability to cache `Reference` objects, avoiding round-trips to the backend database, beneficial for read heavy workloads. Reference-caching can be enabled via two new configuration options: to define the expiration time for `Reference`s (holding the current HEAD/tip) and to define the expiration time for non-existing `Reference`s. Looking up references happens via the name of the reference, usually without the reference type (aka whether it is a branch or a tag), so Nessie has to look up both types - the given name as a branch and the given name as a tag. This is where negative-caching comes into play, because that caches the existing entry and the non-existing "other" reference type. Hence, if you enable reference-caching, it is recommended to also enable negative reference-caching. Operations that are about to change a reference (committing and reference create/assign/delete operations), always consult the backing database, implicitly refreshing the cache. Mutliple Nessie (against the same repository) do not communicate with each other. If for example a commit happened against one Nessie instance, the other instances may or may not return the new commit. This is why this feature is still experimental and only useful for Nessie setups with a _single_ Nessie instance. This change will be later with another change to allow distributed cache-eviction, so that other Nessie instances accessing the same repository work fine.
Should we perhaps expose a way for getting a fresh |
Also, do we want to keep it "experimental" (PR title) still? |
I don't understand, TBH. How should a client "flip back" to an old version? The cached value is always up to date (assuming a single-node Nessie instance, later with #8463 on all instances)
At least until #8463 is not merged. |
Adds ability to cache
Reference
objects, avoiding round-trips to the backend database, beneficial for read heavy workloads.Reference-caching can be enabled via two new configuration options: to define the expiration time for
Reference
s (holding the current HEAD/tip) and to define the expiration time for non-existingReference
s.Looking up references happens via the name of the reference, usually without the reference type (aka whether it is a branch or a tag), so Nessie has to look up both types - the given name as a branch and the given name as a tag. This is where negative-caching comes into play, because that caches the existing entry and the non-existing "other" reference type. Hence, if you enable reference-caching, it is recommended to also enable negative reference-caching.
Operations that are about to change a reference (committing and reference create/assign/delete operations), always consult the backing database, implicitly refreshing the cache.
Mutliple Nessie (against the same repository) do not communicate with each other. If for example a commit happened against one Nessie instance, the other instances may or may not return the new commit. This is why this feature is still experimental and only useful for Nessie setups with a single Nessie instance. This change will be later with another change to allow distributed cache-eviction, so that other Nessie instances accessing the same repository work fine.