Google Scholar Scraper

Installation

Clone the repository

git clone https://github.com/jyotishp/google-scholar-scraper

Set up a virtual environment (optional)

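First, install virtualenv using pip or your OS package manager. Then run the following in the directory where the repository was cloned. Recent virtualenv releases isolate the environment from the system site-packages by default and no longer accept the --no-site-packages flag, so it is omitted here:

virtualenv -p python3 scholar
source scholar/bin/activate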

Install necessary packages

From the root of the cloned repository, run:

pip install -r requirements.txt

Usage

Example

#!/usr/bin/env python

from google_scholar import GoogleScholarUser

if __name__ == '__main__':
    # The Google Scholar ID of the user you want to scrape.
    # To find it, visit the user's profile page: the user=xxxxxxxx
    # part of the URL is the user_id you need.
    user_id = 'xxxxxxxx'  # Placeholder; replace with a real ID

    scraper = GoogleScholarUser(user_id)
    scraper.get_scholar_articles()
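
The comments in the example above say that the user_id is the user=xxxxxxxx part of the profile URL. As a minimal sketch, that value could be pulled out of a URL with the standard library instead of being copied by hand. The helper name extract_user_id and the sample URL are illustrative assumptions and are not part of this repository.

```python
from urllib.parse import urlparse, parse_qs


def extract_user_id(profile_url):
    """Return the value of the `user` query parameter, or None if absent.

    Hypothetical helper, not provided by google_scholar.py.
    """
    query = parse_qs(urlparse(profile_url).query)
    values = query.get('user')
    return values[0] if values else None


if __name__ == '__main__':
    # Placeholder URL; substitute the address of a real profile page.
    url = 'https://scholar.google.com/citations?user=xxxxxxxx&hl=en'
    print(extract_user_id(url))
```

The returned string can then be passed to GoogleScholarUser in place of a hard-coded user_id.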

As of now, the articles are stored in the scraper.articles variable. Each article is a BeautifulSoup object and therefore still contains HTML tags. To access only the text, you can do something like:

for article in scraper.articles:
    print(article.text)  # Prints only the text, with the HTML tags removed
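
Text taken from HTML this way may carry stray newlines and repeated spaces. The following is a rough sketch of collecting the text of every article into a list with whitespace collapsed. The helper article_texts is a hypothetical name, not part of this repository, and it assumes only that each article exposes a .text attribute as in the loop above. The demo uses stand-in objects so that it runs without scraping anything.

```python
from types import SimpleNamespace


def article_texts(articles):
    """Return the text of each article with runs of whitespace collapsed.

    Hypothetical helper; relies only on each item having a .text attribute.
    """
    return [' '.join(article.text.split()) for article in articles]


if __name__ == '__main__':
    # Stand-ins for scraper.articles, exposing only .text
    sample = [
        SimpleNamespace(text='  A  sample\n title  '),
        SimpleNamespace(text='Another\ttitle'),
    ]
    for line in article_texts(sample):
        print(line)
```

With a real scraper, the call would be article_texts(scraper.articles).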
