Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.x] Enable background re-caching of static cache #9396

Open
wants to merge 29 commits into
base: 5.x
Choose a base branch
from

Conversation

ryanmitchell
Copy link
Contributor

@ryanmitchell ryanmitchell commented Jan 25, 2024

This PR enables background re-caching of the static cache through a new static_caching.background_recache variable (or STATAMIC_BACKGROUND_RECACHE .env value).

When set to true the static cache will re-cache the page without invalidating the previous cache, so at every point a static file will exist and end users will not encounter slow loading pages.

Invalidation continues to occur as it currently does when content is deleted, or when this variable is set to false (the default).

@ryanmitchell ryanmitchell marked this pull request as ready for review January 25, 2024 21:24
@aerni
Copy link
Contributor

aerni commented Jan 26, 2024

Do I understand correctly, that this will serve the old cached file until the new cached file is ready? And then it gets rid of the old cached file? Or what exactly is the purpose of this? Sounds interesting for sure.

Also, from a configuration point of view, it might be easier to just have the config to be a boolean and then assign a random string automatically if the config is true.

@jasonvarga
Copy link
Member

this will serve the old cached file until the new cached file is ready? And then it gets rid of the old cached file?

When the new cached file is ready, it'll just stomp over the top of the old one.

it might be easier to just have the config to be a boolean and then assign a random string automatically if the config is true.

We need to know the value ahead of time, so it can't just be a randomly changing string.

@ryanmitchell
Copy link
Contributor Author

@aerni the idea of it being a fixed string was to provide some sort of security - that someone malicious couldn’t just constantly force your site to recache.

@aerni
Copy link
Contributor

aerni commented Jan 26, 2024

Personally, I'd be a little intimidated by the instruction to "add a unique string here that should not be easy to guess". Like, what is "easy to guess"? Couldn't we just use the APP_KEY as the unique string?

When the new cached file is ready, it'll just stomp over the top of the old one.

By "stomp over" I suppose you mean the old file will just not be served anymore? Won't this result in a lot of abandoned files pretty fast? Should there maybe be some sort of cleanup?

@ryanmitchell
Copy link
Contributor Author

Maybe we could make it a Boolean config and the token could be a Crypt or Sha of the url. The app key is used in the Crypt so it wouldn’t be game-able.

@ryanmitchell
Copy link
Contributor Author

By "stomp over" I suppose you mean the old file will just not be served anymore? Won't this result in a lot of abandoned files pretty fast? Should there maybe be some sort of cleanup?

it literally replaces the same file. So no clean up will be needed.

@aerni
Copy link
Contributor

aerni commented Jan 26, 2024

Maybe we could make it a Boolean config and the token could be a Crypt or Sha of the url. The app key is used in the Crypt so it wouldn’t be game-able.

Oh yeah, makes sense. It's early morning here 😄

@ryanmitchell
Copy link
Contributor Author

@aerni here you go - its a boolean now. Feels simpler.

@aerni
Copy link
Contributor

aerni commented Jan 26, 2024

Love it, thanks! Curious as to why you decided to make this opt-in, though. Feels like everyone who uses static caching would want to enable this feature. Or is there any potential downside to it?

@ryanmitchell
Copy link
Contributor Author

I make all my PRs opt in until Jason changes them to not be :)

@ryanmitchell ryanmitchell changed the base branch from 4.x to 5.x May 10, 2024 10:07
@ryanmitchell ryanmitchell changed the title [4.x] Enable background re-caching of static cache [5.x] Enable background re-caching of static cache May 10, 2024
@Krzemo
Copy link

Krzemo commented Jun 12, 2024

this will serve the old cached file until the new cached file is ready? And then it gets rid of the old cached file?

When the new cached file is ready, it'll just stomp over the top of the old one.

An edge case idea:
In theory, isn't file access concurrency a concern here?
What if file is being read and locked and you try to overwrite it?
What if being written to and server tries to read it?
I realize this is purely theoretic and performance and reliability impact is probably negligible, but ....

@ryanmitchell
Copy link
Contributor Author

@Krzemo Are those things not already a potential issue with the existing implementation? My understanding is modern filesystems will lock the file during writing, so any read operations will wait until the file is unlocked, so it shouldn't be an issue.

@Krzemo
Copy link

Krzemo commented Jun 17, 2024

@ryanmitchell Im not worried about OS not doing its job.

Maybe asking different questions will help explain what I mean (I feel like I should run tests myself first - will do next week if still needed):

  1. Does file get locked when HTML generation starts and freed when done or HTML is ready first and file gets locked for a brief moment only when it is being written? Can a file be locked because of first option for i.e. 1 minute+?
  2. Does locking a file for writing make it unreadable? Can it end up in the same situation like with no background refresh where a user have to wait for file generation to finish not because there is no static file, but because file is locked for reading?

@ryanmitchell
Copy link
Contributor Author

I understand now. The file is only locked during the writing not during the regeneration. So it should be a fraction of a second.

@ryanmitchell
Copy link
Contributor Author

If this is starting to gain some traction again, the thing I'd love to see included would be a way to recache all pages, so that on deploy of a site you could overwrite the existing static cache in the background.

@jasonvarga
Copy link
Member

Isn't that the point of this PR anyway?

@ryanmitchell
Copy link
Contributor Author

ryanmitchell commented Sep 12, 2024

The PR makes it possible yes, but the static:warm task would need to be updated to either recache by default, or provide an option to do it?

@jasonvarga
Copy link
Member

Well yeah I would have expected that happen if you run the command and have the config enabled. Otherwise at the moment we're relying on manually registered listeners to do any background recaching.

@ryanmitchell
Copy link
Contributor Author

Cool - I didn't want to assume. Thats been added here: ryanmitchell@b962385

Copy link
Member

@jasonvarga jasonvarga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you invalidate a URL (not recache) it'll make sure to invalidate any combinations of query strings too.

e.g. If you have visited
/page, /page?foo=bar, and /page?hello=world
when you invalidate /page it'll invalidate all 3.

This doesn't happen when you do recacheUrl.

See:

public function invalidateUrl($url, $domain = null)
{
$this
->getUrls($domain)
->filter(fn ($value) => $value === $url || str_starts_with($value, $url.'?'))
->each(function ($value, $key) {
$this->cache->forget($this->normalizeKey('responses:'.$key));
$this->forgetUrl($key);
});
UrlInvalidated::dispatch($url, $domain);
}

It loops through all cached urls where it matches or starts with the given url.

AbstractCacher@recacheUrl doesn't loop. It just dispatches it on the given url

@ryanmitchell
Copy link
Contributor Author

Good spot, I've updated that to loop now too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants