Add Chrome-Lighthouse to Google crawlers list #160

Merged
alaz merged 1 commit into alaz:master on Oct 22, 2024

Conversation

aydinkazim
Contributor

No description provided.

Owner

@alaz alaz left a comment

LGTM

@alaz alaz merged commit 0af46f9 into alaz:master Oct 22, 2024
6 checks passed
@alaz
Owner

alaz commented Oct 22, 2024

Thank you!

@alaz
Owner

alaz commented Oct 24, 2024

Hey @aydinkazim, I can't find where this User Agent is documented. All I found is GoogleChrome/lighthouse#14917, and it seems the User Agent has been removed. I need a documentation reference.

@alaz alaz mentioned this pull request Oct 24, 2024
@aydinkazim
Contributor Author

Hi again 👋

You're right, the official documentation shows that the Chrome-Lighthouse user agent has been removed, as mentioned in GoogleChrome/lighthouse#14917. However, based on the current logs from my bot, it's still utilizing the Chrome-Lighthouse user agent. Here's a snippet from my logs:

I, [2024-10-24T16:20:06.053244 #208883]  INFO -- : 
 REAL BOT IP: 74.125.208.7 AGENT: Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Mobile Safari/537.36 Chrome-Lighthouse

I, [2024-10-24T16:20:06.539276 #208883]  INFO -- : 
 REAL BOT IP: 74.125.208.8 AGENT: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Safari/537.36 Chrome-Lighthouse

I, [2024-10-24T16:20:06.704677 #208883]  INFO -- : 
 REAL BOT IP: 74.125.208.7 AGENT: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Safari/537.36 Chrome-Lighthouse

I, [2024-10-24T16:20:06.878626 #208883]  INFO -- : 
 REAL BOT IP: 74.125.208.7 AGENT: Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Mobile Safari/537.36 Chrome-Lighthouse

I, [2024-10-24T16:20:19.868516 #208883]  INFO -- : 
 REAL BOT IP: 74.125.208.7 AGENT: Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Mobile Safari/537.36 Chrome-Lighthouse

It seems that despite the removal in official references, the bot is still functioning with this user agent. I will continue monitoring this, but it may be a transitional state from Google Lighthouse.
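For anyone who wants to repeat this check on their own logs, here is a minimal Ruby sketch that tallies requests per IP for agents carrying the `Chrome-Lighthouse` token. It assumes the application writes lines in the same `REAL BOT IP: ... AGENT: ...` shape as the snippet above; that format comes from the application that produced these logs, not from Legitbot, and the names `LOG_LINE` and `lighthouse_hits` are made up for illustration.

```ruby
# Matches the application log lines quoted above:
# "REAL BOT IP: <ip> AGENT: <user agent>".
LOG_LINE = /REAL BOT IP: (?<ip>\S+) AGENT: (?<agent>.+)\z/

# Returns a Hash of IP => number of requests whose User-Agent contains the token.
# Lines that do not match the pattern (such as the Logger header line) are skipped.
def lighthouse_hits(lines, token: 'Chrome-Lighthouse')
  lines.each_with_object(Hash.new(0)) do |line, counts|
    match = LOG_LINE.match(line.strip)
    next unless match && match[:agent].include?(token)

    counts[match[:ip]] += 1
  end
end

# Two of the entries from the snippet above, header lines included.
sample = [
  'I, [2024-10-24T16:20:06.053244 #208883]  INFO -- : ',
  ' REAL BOT IP: 74.125.208.7 AGENT: Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) ' \
  'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Mobile Safari/537.36 Chrome-Lighthouse',
  'I, [2024-10-24T16:20:06.539276 #208883]  INFO -- : ',
  ' REAL BOT IP: 74.125.208.8 AGENT: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ' \
  'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Safari/537.36 Chrome-Lighthouse'
]

lighthouse_hits(sample).each { |ip, count| puts "#{ip}: #{count}" }
```

To run it over a whole file, pass `File.foreach(path)` instead of the sample array.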

@alaz
Owner

alaz commented Oct 24, 2024

Thank you for confirming. Judging from the issue, it does not look like they would consider reintroducing it, but let's see. I do not see any harm in adding it to Legitbot in the meantime. I am going to publish a release by the weekend.

@aydinkazim
Contributor Author

Thank you for the update! I completely understand, and I agree that reintroducing it doesn't seem likely based on the current discussions. Looking forward to the upcoming release!

@alaz
Owner

alaz commented Oct 24, 2024

Published as 1.11.1

@chendo

chendo commented Dec 19, 2024

I'm not sure if the Lighthouse user agent should be here, as the Lighthouse tool can be legitimately used by other companies that aren't Google, and this will cause the Rack Attack example on the README to block Lighthouse requests.

Context: I upgraded legitbot as part of a change and discovered that our Calibre snapshots (Calibre uses Lighthouse) started to fail, because we use Rack Attack and legitbot to deny traffic from anything that claims to be a bot it is not.

@aydinkazim is there any particular reason why you want Lighthouse to be in Legitbot?
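To make the failure mode concrete, here is a rough, self-contained Ruby sketch of the "claims to be a bot but comes from the wrong place" check. It is not Legitbot's API or implementation: `BotRule`, `LIGHTHOUSE`, and `classify` are hypothetical names, and the `74.125.208.0/24` network is a placeholder picked only because the logged IPs earlier in this thread fall inside it, not a published Google range.

```ruby
require 'ipaddr'

# Hypothetical stand-in for a bot definition: a User-Agent pattern plus the
# networks the operator is expected to send requests from.
BotRule = Struct.new(:name, :agent_pattern, :networks) do
  # True when the User-Agent string claims to be this bot.
  def claims?(user_agent)
    agent_pattern.match?(user_agent.to_s)
  end

  # True when the client IP falls inside one of the expected networks.
  def from_known_network?(ip)
    addr = IPAddr.new(ip)
    networks.any? { |net| net.include?(addr) }
  end
end

# Placeholder network for illustration only (assumption, see the note above).
LIGHTHOUSE = BotRule.new('lighthouse', /Chrome-Lighthouse/, [IPAddr.new('74.125.208.0/24')])

# Returns :not_a_bot, :valid, or :fake for a request.
def classify(rule, user_agent, ip)
  return :not_a_bot unless rule.claims?(user_agent)

  rule.from_known_network?(ip) ? :valid : :fake
end

agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 ' \
        '(KHTML, like Gecko) Chrome/94.0.4590.2 Safari/537.36 Chrome-Lighthouse'

puts classify(LIGHTHOUSE, agent, '74.125.208.7') # request from inside the placeholder network
puts classify(LIGHTHOUSE, agent, '203.0.113.10') # same agent from a third-party host (documentation IP)
```

A blocklist rule that denies every `:fake` request would reject the second call, which is the situation a third-party Lighthouse runner ends up in once the `Chrome-Lighthouse` token is tied to one operator's networks.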

@alaz
Owner

alaz commented Dec 20, 2024

@chendo I agree that it makes sense to remove it from Legitbot.

@aydinkazim
Contributor Author

aydinkazim commented Dec 20, 2024

Hi again 👋

You're right, the official documentation shows that the Chrome-Lighthouse user agent has been removed, as mentioned in GoogleChrome/lighthouse#14917. However, based on the current logs from my bot, it's still utilizing the Chrome-Lighthouse user agent. Here's a snippet from my logs:

I, [2024-10-24T16:20:06.053244 #208883]  INFO -- : 
 REAL BOT IP: 74.125.208.7 AGENT: Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Mobile Safari/537.36 Chrome-Lighthouse

I, [2024-10-24T16:20:06.539276 #208883]  INFO -- : 
 REAL BOT IP: 74.125.208.8 AGENT: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Safari/537.36 Chrome-Lighthouse

I, [2024-10-24T16:20:06.704677 #208883]  INFO -- : 
 REAL BOT IP: 74.125.208.7 AGENT: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Safari/537.36 Chrome-Lighthouse

I, [2024-10-24T16:20:06.878626 #208883]  INFO -- : 
 REAL BOT IP: 74.125.208.7 AGENT: Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Mobile Safari/537.36 Chrome-Lighthouse

I, [2024-10-24T16:20:19.868516 #208883]  INFO -- : 
 REAL BOT IP: 74.125.208.7 AGENT: Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4590.2 Mobile Safari/537.36 Chrome-Lighthouse

It seems that despite the removal in official references, the bot is still functioning with this user agent. I will continue monitoring this, but it may be a transitional state from Google Lighthouse.

If you check your site using https://pagespeed.web.dev/ and look at your bot logs, you will see "Chrome-Lighthouse." Based on this, I shared my opinion as I mentioned above. Therefore, our pages are also being blocked during the testing phase.

3 participants