Open Source Website Crawler
An Open Source Crawler/Spider
It can be used by anyone and runs on any Windows or Linux computer. It is not a crawler for industrial use: it is written in a slow programming language and may have its own issues.
The project can be easily used with MongoDB.
The project can also be used for pentesting.
- Cross Platform
- Installer for Linux
- Related CLI tools (includes CLI access to the tool, a not-that-good search tool, etc.)
- Memory efficient (I guess)
- Pool Crawling - Use multiple crawlers at the same time
- Supports robots.txt
- MongoDB [DB]
- Language Detection
- 18+ Checks / Offensive Content Check
- Proxies
- Multi Threading
- URL Scanning
- Keyword, Description and Recurring Words Logging
- Search Website - search_website.py
- Connection Tree Website - tree_website.py
- Tool for finding proxies - proxy_tool.py
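To illustrate two of the features listed above (robots.txt support and pool crawling), here is a minimal sketch using only the Python standard library. It is not OpenCrawler's actual code: the names `ROBOTS_TXT`, `allowed_urls`, `fake_fetch`, `crawl_pool` and the `ExampleCrawler` user agent are hypothetical, and the fetcher is a stub so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt):
    """Parse robots.txt rules from a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, user_agent, urls):
    """Keep only the URLs that robots.txt allows for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def fake_fetch(url):
    """Stub fetcher (assumption): stands in for a real HTTP request."""
    return f"<html>{url}</html>"


def crawl_pool(urls, fetch, workers=4):
    """Fetch several URLs at the same time with a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input URLs.
        return dict(zip(urls, pool.map(fetch, urls)))


parser = build_parser(ROBOTS_TXT)
candidates = [
    "https://example.com/",
    "https://example.com/private/data.html",
    "https://example.com/blog/post",
]
to_crawl = allowed_urls(parser, "ExampleCrawler", candidates)
pages = crawl_pool(to_crawl, fake_fetch)
print(sorted(pages))
```

The URL under `/private/` is filtered out before any fetch happens, and the remaining pages are fetched concurrently. A real crawler would also wait between requests according to `parser.crawl_delay(...)`.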
The first step is to install the project. The provided installer works only on Linux.
On Windows, the application won't be added to the PATH and the requirements won't be installed automatically, so follow the installation procedure for Windows.
git clone https://github.com/merwin-asm/OpenCrawler.git
cd OpenCrawler
chmod +x install.sh && ./install.sh
On Windows, you need git, python3 and pip installed:
git clone https://github.com/merwin-asm/OpenCrawler.git
cd OpenCrawler
pip install -r requirements.txt
The project can be used for:
- Making a (not that good) search engine
- OSINT
- Pentesting
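As a rough idea of what the search engine use case could look like, the sketch below builds a tiny inverted index over already-crawled page text and answers keyword queries. It does not reflect how `search_website.py` works; the function names (`tokenize`, `build_index`, `search`) and the sample pages are assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query, sorted."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical crawl results: URL -> extracted page text.
pages = {
    "https://example.com/a": "Open source web crawler",
    "https://example.com/b": "Crawler with proxy support",
    "https://example.com/c": "Cooking recipes",
}
index = build_index(pages)
print(search(index, "crawler"))
print(search(index, "open crawler"))
```

In practice the page text would come from the crawler's database rather than an in-memory dictionary, and results would need some form of ranking.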
To see available commands on Linux (after running the installer):
opencrawler help
or
man opencrawler
To see available commands on Windows:
python opencrawler help
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- If you have suggestions for adding or removing projects, feel free to open an issue to discuss it, or directly create a pull request after you edit the README.md file with necessary changes.
- Please make sure you check your spelling and grammar.
- Create an individual PR for each suggestion.
Distributed under the MIT License. See LICENSE for more information.
- Merwin A J - CS Student - Built OpenCrawler