This repo contains a multi-stage R script that scrapes a JavaScript-rendered e-commerce website using RSelenium and rvest. It also cleans and formats the data and stores it in a table for analysis.
This is the R version of the Python crawling script I created to crawl the homzmart.com/en website. For a full overview of the project's aims and outcomes, please check the other repo, which contains the Python version of this code and all the details about the project.
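The render-then-parse approach mentioned above can be sketched roughly as follows. This is a minimal illustration and not the script in this repo: the helper names, the Firefox browser choice, the port, the fixed wait, and the `.product-title` selector are all assumptions made for the example.

```r
library(rvest)

# Hypothetical helper: load a page in a real browser so its JavaScript runs,
# then return the rendered HTML as a string. Assumes the RSelenium package
# and a local Firefox installation are available.
fetch_rendered_source <- function(url, wait_seconds = 5) {
  driver <- RSelenium::rsDriver(browser = "firefox", port = 4545L, verbose = FALSE)
  # Always shut down the browser session and the Selenium server on exit.
  on.exit({
    driver$client$close()
    driver$server$stop()
  }, add = TRUE)
  driver$client$navigate(url)
  Sys.sleep(wait_seconds)  # crude fixed wait for client-side rendering
  driver$client$getPageSource()[[1]]
}

# Hypothetical helper: parse rendered HTML with rvest into a data frame.
# ".product-title" is a placeholder selector, not the site's real markup.
parse_product_titles <- function(page_source, selector = ".product-title") {
  page <- read_html(page_source)
  nodes <- html_elements(page, selector)
  data.frame(title = html_text2(nodes), stringsAsFactors = FALSE)
}
```

Under these assumptions, calling `parse_product_titles(fetch_rendered_source("https://homzmart.com/en"))` would return one row per matched element. Keeping the browser step separate from the parsing step means the parsing logic can be tried on saved HTML without starting Selenium.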
-
You need to have R and RStudio installed before you can run this code on your machine. You can install R from this link and RStudio (IDE) from this link.
-
R doesn't need as many installation steps as Python. Simply clone the repo by running this command in Git Bash, then run the script right away.
git clone https://github.com/omar-elmaria/scraping_with_r_selenium_and_rvest.git
-
Please keep in mind that the homzmart website might be updated at some point in the future, which could invalidate the CSS/XPath selectors used in the script and cause it to throw errors.
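One way to make such breakage easier to spot is to fail with a clear message whenever a selector matches nothing. The sketch below is an illustration only, assuming rvest 1.0 or later; `extract_or_stop`, the toy HTML, and both selectors are made up for the example and are not taken from the script.

```r
library(rvest)

# Hypothetical helper: return the text for a CSS selector, or stop with a
# descriptive error when the selector matches no elements.
extract_or_stop <- function(page, selector) {
  nodes <- html_elements(page, selector)
  if (length(nodes) == 0) {
    stop("Selector '", selector,
         "' matched no elements; the page markup may have changed.",
         call. = FALSE)
  }
  html_text2(nodes)
}

# Toy page standing in for a rendered product listing (made-up markup).
page <- read_html("<html><body><span class='price'>1,200 EGP</span></body></html>")

# A selector that still matches returns the element text.
prices <- extract_or_stop(page, ".price")

# A stale selector raises an error; here it is caught and its message kept.
result <- tryCatch(
  extract_or_stop(page, ".old-price-class"),
  error = function(e) conditionMessage(e)
)
```

Without a check like this, a selector that no longer matches simply yields an empty result, and the problem may only surface later as a confusing error in the cleaning steps.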
If you have any questions or wish to build a scraper for a particular use case, feel free to contact me on LinkedIn.