Modify to allow comma-separated list of user agents #26

Open
wants to merge 4 commits into
base: master

Conversation

willcritchlow

The underlying code accepts a vector of user agents. Based on conversations with @garyillyes, this is how e.g. googlebot-image works: by running the AllowedByRobots method against both the googlebot and googlebot-image user agents. Before this change, the wrapper in robots_main took only a single user agent argument and passed it as a single-element vector.

Gary suggested that in order to enable the project to replicate the behaviour of googlebots like the images crawler, I submit a pull request to enable robots_main to accept a comma-separated list of user agents (like googlebot,googlebot-image) that should then be passed to AllowedByRobots as a vector.
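As a rough illustration of the change described above, here is a minimal sketch of turning a comma-separated argument into a vector of user agents. The helper name SplitUserAgents is hypothetical and is not taken from the pull request; it uses only the standard library and skips empty entries, which is an assumption about how stray commas should be handled.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the pull request): splits a comma-separated
// list such as "googlebot,googlebot-image" into individual user agents.
// Assumption: empty entries (e.g. from a trailing comma) are skipped.
std::vector<std::string> SplitUserAgents(const std::string& list) {
  std::vector<std::string> agents;
  std::string::size_type start = 0;
  while (start <= list.size()) {
    // Find the next comma, or treat the end of the string as the delimiter.
    std::string::size_type comma = list.find(',', start);
    if (comma == std::string::npos) comma = list.size();
    if (comma > start) agents.push_back(list.substr(start, comma - start));
    start = comma + 1;
  }
  return agents;
}

// The resulting vector is what robots_main would then hand to
// AllowedByRobots. That call is not shown here because its exact
// signature is not given in this conversation.
```

Calling SplitUserAgents("googlebot,googlebot-image") yields a two-element vector, while a single user agent with no comma still yields a single-element vector, so the previous behaviour is preserved.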

This pull request includes that change, as well as changes to comments throughout the project for clarity and correctness with regard to this change.

I have also included some new tests for this functionality, one of which currently fails. It is based on the explicit worked example in the documentation.

It isn't clear to me whether I've got a bug in my change, the parser is wrong, or the documentation is wrong (the same description of how things should work appears in many places throughout the robots.txt help text). I have therefore submitted the pull request with a failing test, hoping that we can clarify during the review process. I hope that's the right approach; I'm not very familiar with submitting to open source projects.

Will Critchlow added 3 commits November 8, 2019 12:55
Take a comma-delimited list of user agents and pass them as a vector.
My best understanding of what the code actually does with a vector of user agents is that it treats all the rules applying to any of the user agents as if they were collapsed into a single ruleset applying to all of them.
Update comments to include possibility of passing in a vector of user agents
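The "collapsed into a single ruleset" reading in the second commit message can be sketched as a toy model. Everything here is an assumption made for illustration: the Rule and Group types, the function names, exact-string user agent matching, and plain prefix matching are all hypothetical and are not the parser's actual data structures or logic. The tie-breaking used (longest matching path wins, allow wins on equal length) follows Google's published robots.txt precedence rules.

```cpp
#include <string>
#include <vector>

// Toy model only: these types are hypothetical, not the parser's own.
struct Rule {
  bool allow;               // true for Allow, false for Disallow
  std::string path_prefix;  // plain prefix, no wildcard support
};

struct Group {
  std::string user_agent;
  std::vector<Rule> rules;
};

// Gathers the rules of every group whose user agent appears in `agents`
// into one flat ruleset (the "collapse" described in the commit message).
std::vector<Rule> CollapseRules(const std::vector<Group>& groups,
                                const std::vector<std::string>& agents) {
  std::vector<Rule> merged;
  for (const Group& group : groups) {
    for (const std::string& agent : agents) {
      if (group.user_agent == agent) {
        merged.insert(merged.end(), group.rules.begin(), group.rules.end());
        break;  // add each group's rules at most once
      }
    }
  }
  return merged;
}

// Longest matching prefix wins; on equal length, Allow wins.
// A path matched by no rule is allowed.
bool AllowedByMergedRules(const std::vector<Rule>& rules,
                          const std::string& path) {
  bool allowed = true;
  long best_len = -1;
  for (const Rule& rule : rules) {
    if (rule.path_prefix.empty()) continue;  // toy model ignores empty rules
    if (path.compare(0, rule.path_prefix.size(), rule.path_prefix) != 0) {
      continue;  // rule does not match this path
    }
    long len = static_cast<long>(rule.path_prefix.size());
    if (len > best_len || (len == best_len && rule.allow)) {
      best_len = len;
      allowed = rule.allow;
    }
  }
  return allowed;
}
```

Under this model, a Disallow for /private in a googlebot group and an Allow for /private/images in a googlebot-image group would, when both agents are passed, allow /private/images/cat.png but block /private/doc. Whether the real parser behaves this way is exactly the open question raised in this pull request.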
@googlebot

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment "@googlebot I fixed it." If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@willcritchlow
Author

@googlebot I fixed it.

@googlebot

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

Tidy up code to Google C++ standards
@willcritchlow
Author

@garyillyes I remembered about this and wondered if you'd had a chance to take a look at it? I was reminded by the launch of Bing's new robots.txt checker...


@Songbird0411 Songbird0411 left a comment

Nice to see my husband is using google to cheat on me for years!


4 participants