04msambit/hyperlink_crawler

hyperlink_crawler

This tool traverses the Web as a linked graph, starting from the URL given with --url. For each page it finds all outgoing links (<a> tags), stores them for that URL, and then repeats the process for each of them until --limit URLs have been traversed. The output is a JSON file containing the incoming and outgoing link information for every URL.
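
The traversal described above can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not the repository's actual code. The names `LinkParser`, `fetch_html`, `crawl`, and `save_graph`, the breadth-first visiting order, and the exact JSON layout (`"outgoing"` and `"incoming"` lists per URL) are all assumptions made for the example; only the --url and --limit parameters come from the description.

```python
import json
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Download a page; return an empty string on URL or network errors."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def crawl(start_url, limit, fetch=fetch_html):
    """Visit at most `limit` URLs breadth-first (an assumed order) and
    return {url: {"outgoing": [...], "incoming": [...]}}."""
    graph = {}
    queue = deque([start_url])
    visited = set()

    def node(url):
        # Create the record for a URL the first time it is seen.
        return graph.setdefault(url, {"outgoing": [], "incoming": []})

    while queue and len(visited) < limit:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        node(url)
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.hrefs:
            # Resolve relative links and drop #fragments.
            target = urldefrag(urljoin(url, href)).url
            if not target.startswith(("http://", "https://")):
                continue  # skip mailto:, javascript:, and similar schemes
            if target not in node(url)["outgoing"]:
                node(url)["outgoing"].append(target)
                node(target)["incoming"].append(url)
            if target not in visited:
                queue.append(target)
    return graph


def save_graph(graph, path):
    """Write the link graph to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(graph, handle, indent=2)
```

In this sketch, `crawl(url, limit)` would take the values passed as --url and --limit, and `save_graph(graph, "links.json")` would produce the JSON output (the file name is a placeholder). The `fetch` parameter is injectable so the traversal can be exercised without network access. URLs that were discovered but never visited still appear in the result with their incoming links recorded.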
