Feature request - crawl sitemap.xml #124
Comments
This is a good idea. We haven't been actively updating the project lately, but let's keep this one open.
I believe you should be able to add paths manually to the yaml file after
an initial crawl. Cleaver, is that not still the case?
-- Alex Dergachev, Evolving Web (http://evolvingweb.ca)
Yes, you could manually reformat the paths from sitemap.xml into paths.txt.
Yes, adding to paths.txt by parsing a JSON sitemap is the approach we have been taking. However, it is currently only semi-automated, and I presume that a good automated version using XML instead would be useful to other people as well. :)
We are finding sitediff works quite well; however, it may not find all the URLs on a site by following links from the home page. We would like to be able to (optionally) add the URLs listed in sitemap.xml to the paths.
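For anyone wanting to automate the manual step discussed above, here is a minimal sketch in Python (standard library only) of pulling paths out of a sitemap. This is not part of sitediff; `sitemap_paths` is a hypothetical helper name, and the sketch assumes a standard sitemaps.org `<urlset>` document and that paths.txt takes one path per line. It does not follow sitemap index files.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_paths(xml_text):
    """Return the unique URL paths listed in a sitemap, in document order.

    Hypothetical helper: scheme and host are dropped so that only the
    path (plus query string, if any) remains.
    """
    root = ET.fromstring(xml_text)
    paths = []
    seen = set()
    for loc in root.iter(SITEMAP_NS + "loc"):
        url = (loc.text or "").strip()
        if not url:
            continue
        parts = urlsplit(url)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        if path not in seen:
            seen.add(path)
            paths.append(path)
    return paths


if __name__ == "__main__":
    # Made-up sample sitemap, for illustration only.
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/</loc></url>"
        "<url><loc>https://example.com/about</loc></url>"
        "</urlset>"
    )
    # One path per line; the output could be appended to paths.txt by hand.
    print("\n".join(sitemap_paths(sample)))
```

The output can be reviewed and appended to the existing paths after an initial crawl, which keeps the sitemap step optional.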