The main goal of this Python tool is to generate a list of papers on a given topic that are available in Scopus, using minimal input data.
Input:
- a topic name (used for saving the results)
- a reference paper, given by its Scopus eid
- a list of keywords for your topic of interest
Output:
- Excel file <topic_name>_outputs.xlsx representing the list of papers in Scopus relevant to the given topic
- Interactive graph showing the population of papers in html format
Figure 1 - Inputs and outputs of the Python module
The interactive graph (shown above, with an enlarged view below) represents the population of papers on a given topic and consists of blue dots and lines. Each blue dot represents an article, and the lines between dots represent their "connections". Here, a connection exists if one of the two papers cites the other.
Figure 2 - Interactive graph produced by the Python code and its enlarged view. Blue dots are articles on a given topic and lines are their connections
After processing your query, the tool generates an Excel file with the papers matching your topic, sorted in descending order by their number of connections (within the population graph above).
Figure 3 - Example Excel file with papers on the topic of hosting capacity
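The ranking by number of connections can be illustrated with a small sketch. It assumes that a paper's connection number is the count of distinct papers it is linked to by a citation in either direction; the tool itself may count connections differently. The function name and the paper identifiers below are hypothetical.

```python
from collections import Counter


def rank_by_connections(citations):
    """Return (paper, connection_count) pairs sorted by count, highest first.

    `citations` is an iterable of (citing_paper, cited_paper) pairs.
    """
    # Treat each citation as one undirected link; drop duplicates and self-citations.
    links = {frozenset(pair) for pair in citations if pair[0] != pair[1]}
    counts = Counter()
    for link in links:
        for paper in link:
            counts[paper] += 1
    # Sort by count (descending), then by identifier so ties have a fixed order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical identifiers, not real Scopus records.
example = [("A", "B"), ("C", "B"), ("D", "B"), ("C", "A")]
ranking = rank_by_connections(example)
print(ranking)
```

In this toy example paper B is connected to three other papers and therefore appears first, just as the most connected paper appears in the first row of the Excel file.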
Note: In addition to this Excel file, the tool generates npy files listing the publications outside Scopus and the papers that returned an error such as 404 (this happens when a Scopus record is incomplete, e.g. it has an empty title, author names or abstract). These npy files can be processed further to double-check for relevant papers (this double-checking is not yet included in the current version of the module).
Figure 4 - Workflow of how a paper population is reconstructed inside the Python tool
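The npy files can be inspected with NumPy. The sketch below is only an illustration: the file name is hypothetical (check your output folder for the actual names), and it assumes the lists were stored with numpy.save as object arrays, which is why allow_pickle=True is passed.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_saved_list(npy_path):
    """Load an array stored with numpy.save and return it as a Python list."""
    # allow_pickle=True is required for object arrays (e.g. free-form strings).
    # Only enable it for files you generated yourself.
    return np.load(npy_path, allow_pickle=True).tolist()


# Demonstration with a throwaway file; the name and entries are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_outside_scopus.npy"
    np.save(demo_path, np.array(["Report A", "Thesis B"], dtype=object))
    outside_scopus = load_saved_list(demo_path)

print(outside_scopus)
```

To inspect real results, call load_saved_list with the path of one of the generated npy files instead of the demonstration file.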
First, make sure you have access to the Scopus API via pybliometrics:
Refer to the site for pybliometrics instructions
- To access Scopus via its API, you need to check two things. First, your university needs to be a subscriber (not only to Scopus, but also to its API); second, you need to register API keys at https://dev.elsevier.com/apikey/manage. For each profile, you may register 10 keys.
- Add your API keys to config.ini (see instructions)
- It may be necessary to change the apikey in config.json (see main folder). Note that each key allows 5,000 retrieval requests or 20,000 search requests via the Scopus Search API per week. If the apikey is never changed, its quota may be depleted quickly
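As a rough planning aid, the weekly quota can be turned into an estimate of how many keys a study needs. This is a back-of-the-envelope sketch based only on the figures quoted above (5,000 retrieval requests per key per week, 10 keys per profile); the function name and the planned number of requests are hypothetical.

```python
import math

RETRIEVALS_PER_KEY_PER_WEEK = 5_000  # figure quoted above
MAX_KEYS_PER_PROFILE = 10            # figure quoted above


def keys_needed(expected_retrievals):
    """Number of API keys needed to cover the retrieval requests within one week."""
    return math.ceil(expected_retrievals / RETRIEVALS_PER_KEY_PER_WEEK)


planned = 12_000  # hypothetical number of retrieval requests for one case study
needed = keys_needed(planned)
print(f"{needed} key(s) needed; within one profile: {needed <= MAX_KEYS_PER_PROFILE}")
```

If the estimate exceeds the number of keys you registered, consider narrowing the keywords or spreading the runs over several weeks.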
Using poetry to install all necessary packages and run the code
- Copy the repository to your computer and open it in your code editor (we use Visual Studio Code)
- If you do not have poetry on your computer, you can install it with pip. Just run
pip install poetry
in your terminal.
- Once poetry is installed, run
poetry install
in the terminal. This creates a virtual environment (folder .venv) in which all necessary packages are installed. The installation may take a few minutes, but once it has finished, everything should work as it does on our computer.
- During the installation, accept the prompt to create .venv in the same folder where you copied this Python module. Just click yes.
- This is usually done automatically, but check that the Python interpreter ('.venv': poetry) is selected. In Visual Studio Code, look at the bottom-right corner
- Open main_test.py in your editor and change the name, the reference_paper_eid and the keywords, or run the provided example for the topic self-consumption (for the sake of the example, we intentionally used a long keyword to reduce the number of matching papers and therefore get the results faster).
- Before running the code, make sure that you are on the university network (directly or via VPN) so that you can access Scopus. Otherwise you will get the error 401 Unauthorized
- Run main_test.py
- After the message <<<< Analysis is finished >>>>, check the results in the Excel file <topic_name>_outputs.xlsx and/or the interactive graph
Once everything is installed, you can simply change the input data (see Figure 5) and run your own case studies. You can find the eid of your reference paper on its Scopus webpage (see the example in Figure 6).
Figure 5 - The only input data to be changed in order to run your case in main_test.py
Figure 6 - Where to find the Scopus eid, e.g. 2-s2.0-85101235827.
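Before starting a long run, it can help to check that the eid was copied correctly. The sketch below assumes that an eid consists of the prefix 2-s2.0- followed by digits, as in the example above; the helper function is hypothetical and not part of the tool, and the topic name and keywords are placeholder values.

```python
import re

# Assumption: an eid is "2-s2.0-" followed by one or more digits.
EID_PATTERN = re.compile(r"2-s2\.0-\d+")


def is_plausible_eid(value):
    """Return True if `value` looks like a Scopus eid such as 2-s2.0-85101235827."""
    return EID_PATTERN.fullmatch(value.strip()) is not None


# Placeholder inputs for one case study; only the eid is taken from the example above.
name = "hosting_capacity"
reference_paper_eid = "2-s2.0-85101235827"
keywords = ["hosting capacity", "distribution network"]

if not is_plausible_eid(reference_paper_eid):
    raise ValueError(f"Unexpected eid format: {reference_paper_eid!r}")
print(f"Inputs for topic {name!r} look well formed ({len(keywords)} keywords).")
```

A check like this only catches copy-and-paste mistakes; it does not confirm that the paper exists in Scopus.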