Non-existent urls are statically cached #10863

Open
stuartcusackie opened this issue Sep 30, 2024 · 7 comments

@stuartcusackie

stuartcusackie commented Sep 30, 2024

Bug description

I'm noticing that a lot of bad URLs, such as legacy URLs from WordPress and non-existent image paths, are being statically cached.

For example:
https://mysite.ie/app/uploads/2019/04/competitions-at-club-dublin-690x460.jpg
https://mysite.ie/media/good_foods.jpg
https://mysite.ie/wp-content/uploads/2018/06/Leopardstown_10-2.jpg
https://mysite.ie/sitemap.xml.gz
https://mysite.ie/swimming/wp-content/dir/erin1.PhP7
https://mysite.ie/index.php/index.php

It became a small problem recently, when I started listening to the UrlInvalidated event to automatically trigger caching, as described here: #8902

My site only has about 250 entries, but nearly 3,500 UrlInvalidated events are caught by my listeners when the static cache is cleared by my static caching rules. This puts a lot of unnecessary strain on the server through queued jobs.

Can non-existent URLs somehow be ignored by the static cache? All of the above URLs return a 404 error. I assume they are old links to the original site from Google or other indexes.

Thanks.
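
As a stopgap on the listener side, the invalidated URLs could be compared against the set of URLs that belong to real entries, so that only those are queued for re-caching. The following is a minimal, framework-neutral sketch in Python, not Statamic's API. The function names, the list of known entry URLs, and the /swimming entry are hypothetical; in a real site the known URLs would come from the CMS.

```python
from urllib.parse import urlsplit


def normalize_path(url: str) -> str:
    """Reduce a URL to a comparable path: drop scheme, host, query, and trailing slash."""
    path = urlsplit(url).path.rstrip("/")
    return path or "/"


def urls_worth_recaching(invalidated_urls, known_entry_urls):
    """Return invalidated URLs (in order, de-duplicated by path) that match a known entry."""
    known_paths = {normalize_path(u) for u in known_entry_urls}
    seen = set()
    keep = []
    for url in invalidated_urls:
        path = normalize_path(url)
        if path in known_paths and path not in seen:
            seen.add(path)
            keep.append(url)
    return keep


# Hypothetical entry URLs; a real site would list its roughly 250 entries here.
known = ["https://mysite.ie/", "https://mysite.ie/swimming"]

invalidated = [
    "https://mysite.ie/media/good_foods.jpg",   # 404 from the report above
    "https://mysite.ie/index.php/index.php",    # 404 from the report above
    "https://mysite.ie/swimming/",
    "https://mysite.ie/",
    "https://mysite.ie/swimming",               # same path as an earlier URL
]

result = urls_worth_recaching(invalidated, known)
```

With this filter, the number of queued jobs is bounded by the number of real entries rather than by the number of URLs that happen to be in the static cache.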

How to reproduce

Add a listener to handle the UrlInvalidated event, as described here: #8902

Non-existent URLs will accumulate in the static cache over time on a live website.

Logs

No response

Environment

Environment
Laravel Version: 11.25.0
PHP Version: 8.2.18
Composer Version: 2.7.4
Environment: local
Debug Mode: ENABLED
Maintenance Mode: OFF
Timezone: Europe/Dublin
Locale: en

Cache
Config: NOT CACHED
Events: NOT CACHED
Routes: NOT CACHED
Views: CACHED

Drivers
Broadcasting: log
Cache: file
Database: mysql
Logs: single
Mail: smtp
Queue: sync
Session: file

Livewire
Livewire: v3.5.8

Statamic
Addons: 7
Sites: 1
Stache Watcher: Enabled
Static Caching: Disabled
Version: 5.27.0 PRO

Statamic Addons
jonassiewertsen/statamic-live-search: 2.1.1
jonassiewertsen/statamic-livewire: 3.8.0
rias/statamic-redirect: 3.8.1
spatie/statamic-responsive-images: 5.0.1
statamic/seo-pro: 6.1.2
stuartcusackie/statamic-cache-requester: 1.2.1
thoughtco/statamic-cache-tracker: 0.9.2

Installation

Fresh statamic/statamic site via CLI

Additional details

No response

@duncanmcclean
Member

Are you able to provide the full output of php please support:details?

@stuartcusackie
Author

stuartcusackie commented Oct 3, 2024

Sorry, updated above.

@jasonvarga
Member

jasonvarga commented Oct 4, 2024

We should be able to pass along to the UrlInvalidated event whether it was a 404 or not. Then you can avoid refetching those URLs.
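
A minimal sketch of how a listener might use such a flag, assuming the event carried one. This is written in Python purely to illustrate the logic. The class, the was_not_found field, and the enqueue callback are all hypothetical names; at the time of this comment the event does not expose a 404 flag, and a real listener would be a PHP class in the Laravel application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UrlInvalidatedEvent:
    """Hypothetical stand-in for the event payload; was_not_found is the proposed flag."""
    url: str
    was_not_found: bool = False


def handle_url_invalidated(event, enqueue):
    """Queue a re-fetch only for URLs that were not cached as 404s.

    Returns True when the URL was queued, False when it was skipped.
    """
    if event.was_not_found:
        return False
    enqueue(event.url)
    return True


# Usage: a plain list stands in for the job queue.
queued = []
handle_url_invalidated(UrlInvalidatedEvent("https://mysite.ie/"), queued.append)
handle_url_invalidated(
    UrlInvalidatedEvent("https://mysite.ie/sitemap.xml.gz", was_not_found=True),
    queued.append,
)
```

After both calls, only the first URL has been queued; the 404 is skipped with a single boolean check and no job is dispatched.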

@stuartcusackie
Author

@jasonvarga That would be perfect. Thanks!

@stuartcusackie
Author

stuartcusackie commented Oct 4, 2024

Actually... I'm just wondering if this would still cause unnecessary processing. The UrlInvalidated event would still be fired thousands of times, and so would my listener, even though it would perform no actions. It seems to me that these URLs shouldn't be cached in the first place.

Maybe it's fine. Just a thought.

@jasonvarga
Member

They intentionally get cached since #10294.

If your 404 page is heavy (it might be because of a nav or who knows what else), you could easily make a site struggle by hitting different 404 pages.

@stuartcusackie
Author

stuartcusackie commented Oct 29, 2024

Here's a screenshot to highlight the problem. This is what happens when I change my main navigation for a site with only 250 entries. It slows down my site for over an hour.

image
