Web-crawler-to-detect-malicious-websites

An implementation of a web crawler with limited detection of malicious websites.

There are two Python scripts in this repo.

crawl.py
usage: python3 crawl.py [website]
example: python3 crawl.py http://www.vogellascoding.site90.net/cnsproj/main.html
This script crawls from website to website, finding the links on each page and downloading each linked page in turn. All HTML files go to the 'html' folder and all HTTP header files go to the 'header' folder.
Note: Make sure both folders exist before running the script.
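The link-finding step can be sketched with the standard library alone. This is a minimal illustration, not the code in crawl.py: the names `LinkCollector`, `extract_links` and `ensure_output_dirs` are hypothetical, and skipping non-HTTP links and dropping fragments are assumptions made for the example.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the href targets of <a> tags as absolute URLs (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL and drop any #fragment.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                # Assumption: only http(s) links are worth crawling; duplicates are skipped.
                if absolute.startswith(("http://", "https://")) and absolute not in self.links:
                    self.links.append(absolute)


def extract_links(html_text, base_url):
    """Return the absolute http(s) links found in html_text, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html_text)
    return parser.links


def ensure_output_dirs(root="."):
    """Create the 'html' and 'header' folders under root if they are missing."""
    for name in ("html", "header"):
        os.makedirs(os.path.join(root, name), exist_ok=True)
```

A crawler built on this would download a page, pass its text and URL to `extract_links`, and queue the returned links for the next round. Calling something like `ensure_output_dirs()` first would create the two folders if they do not already exist.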

yara_demo.py
usage: python3 yara_demo.py
example: python3 yara_demo.py [/path/to/rule/file]
This script looks for files in the html folder and, if any are found, analyses their content file by file and prints the results to stdout.
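The file-by-file scan can be sketched as below. This is an assumption about the general shape, not the code in yara_demo.py: `scan_folder` and `yara_matcher` are hypothetical names, and the matcher is passed in as a function so the loop can be tried without YARA installed. `yara_matcher` relies on the third-party yara-python package (`yara.compile(filepath=...)` and `rules.match(filepath=...)`).

```python
import os


def scan_folder(folder, match_file):
    """Apply match_file(path) to every regular file in folder, in name order.

    Returns a dict mapping file name to whatever match_file returned.
    """
    results = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            results[name] = match_file(path)
    return results


def yara_matcher(rule_path):
    """Build a match_file function backed by a compiled YARA rules file."""
    import yara  # third-party package: pip install yara-python

    rules = yara.compile(filepath=rule_path)
    # Each match object exposes the name of the rule that fired as .rule
    return lambda path: [m.rule for m in rules.match(filepath=path)]
```

With yara-python installed, `scan_folder("html", yara_matcher("myrules"))` would return the rule names that matched each downloaded page, which can then be printed to stdout.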

One more file, myrules, is a YARA rules file. It contains the regular expressions used to match malicious signatures.
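To give a feel for what a regular-expression signature looks like, here is a small sketch using Python's `re` module instead of YARA. The three patterns are hypothetical examples chosen for illustration (a zero-sized iframe and two common JavaScript obfuscation idioms); they are not copied from myrules, and `match_signatures` is a made-up helper name.

```python
import re

# Hypothetical signatures for illustration only; they are NOT taken from myrules.
SIGNATURES = {
    # An <iframe> whose width or height attribute is 0, often used to hide content.
    "hidden_iframe": re.compile(
        r"<iframe[^>]*(?:width|height)\s*=\s*[\"']?0[\"']?[^>]*>", re.IGNORECASE
    ),
    # eval(unescape(...)): decodes and runs an escaped string.
    "eval_unescape": re.compile(r"eval\s*\(\s*unescape\s*\(", re.IGNORECASE),
    # document.write(unescape(...)): writes decoded markup into the page.
    "document_write_unescape": re.compile(
        r"document\.write\s*\(\s*unescape\s*\(", re.IGNORECASE
    ),
}


def match_signatures(html_text):
    """Return the sorted names of all signatures found in html_text."""
    return sorted(
        name for name, pattern in SIGNATURES.items() if pattern.search(html_text)
    )
```

A page containing `<iframe src="http://x" width="0" height="0"></iframe>` would be reported under `hidden_iframe`, while a page with none of these constructs gives an empty list.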

Feel free to submit a pull request with any additional signatures you find on a webpage.

Extra files:

phish.py
usage: python3 phish.py [website]
example: python3 phish.py http://vogellascoding.site90.net
This script applies URL heuristics to the website name, also considers the site's page rank on Alexa, and generates a score (out of 1). I treat a score above 0.5 as indicating a phishing site.
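A URL-heuristic score of this kind might look like the sketch below. The individual checks and their weights are assumptions picked for illustration and may differ from what phish.py does; the Alexa rank component is left out because it needs an external lookup. Only the 0.5 threshold comes from the description above. `url_score` and `is_phishing` are hypothetical names.

```python
import ipaddress
from urllib.parse import urlsplit


def _is_ip(host):
    """True if host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_score(url):
    """Return a score between 0 and 1; higher means more suspicious.

    The checks and weights are hypothetical and sum to 1.0.
    """
    host = urlsplit(url).hostname or ""
    checks = [
        (0.30, _is_ip(host)),          # raw IP address instead of a domain name
        (0.20, "@" in url),            # '@' can hide the real host from the reader
        (0.20, host.count(".") > 3),   # unusually deep subdomain nesting
        (0.15, "-" in host),           # hyphenated look-alike host names
        (0.15, len(url) > 75),         # very long URL
    ]
    return round(sum(weight for weight, hit in checks if hit), 2)


def is_phishing(url, threshold=0.5):
    """Flag the URL when its score is strictly above the threshold."""
    return url_score(url) > threshold
```

For example, `http://example.com/` triggers none of the checks and scores 0, while a long URL that uses an IP address host and an `@` sign triggers three checks and lands above the 0.5 threshold.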

d4r3topk/Web-crawler-to-detect-malicious-websites
