Merge pull request #4 from gsu-library/develop
Add orcid ID integration
vle91 authored Feb 16, 2023
2 parents e4f376e + 236b729 commit 25fb0ef
Showing 8 changed files with 217 additions and 54 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,6 +1,7 @@
node_modules/
reports/
uploads/
config/*
!config/config.sample.php
.htaccess
.htpasswd
config.php
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,9 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.3.0] - 2023-02-15
- Add ORCID ID integration.

## [1.2.0] - 2023-02-01
- Reorganize code.

14 changes: 10 additions & 4 deletions README.md
@@ -3,23 +3,28 @@ Code Repository: https://github.com/gsu-library/datacite-bulk-doi-creator-webapp
Author: Matt Brooks <[email protected]>
Date Created: 2022-06-29
License: [GPL3](LICENSE)
Version: 1.2.0
Version: 1.3.0

## Description
A PHP WebApp that bulk creates DataCite DOIs from a provided CSV file. DOIs are created in the findable state. If you are looking for the Python version of this WebApp, see [DataCite Bulk DOI Creator](https://github.com/gsu-library/datacite-bulk-doi-creator).

For more information about DOIs please see DataCite's [support page](https://support.datacite.org/) and/or resources from their [homepage](https://doi.datacite.org/). Information on their [metadata schemas](https://schema.datacite.org/) is also available.

## Setup
Put the repository files in a folder that is within Apache's webroot.
Put the repository files in a folder that is within your web server's webroot.

### Configuration
### General Configuration
Rename config/config.sample.php to config/config.php and fill in your DOI prefix, username (repository ID), and password. To try the script against the DataCite test API, replace the URL with the test API URL (https://api.test.datacite.org/dois) and use your test credentials. The other configuration options can be adjusted as needed.

**It is important that the config folder and its contents are not readable from a web browser. If not using Apache, the config/.htaccess file should be replaced with something denying web access to the contents of the folder.**

PHP will also need read/write access to both the reports and uploads folders. Make sure owner/group permissions are set accordingly.

### ORCID Configuration
Create an [ORCID](https://orcid.org) account, [register a public API client](https://info.orcid.org/documentation/integration-guide/registering-a-public-api-client/), and set the client ID and secret in the configuration file. The token and API URLs can be changed to point at the ORCID sandbox for testing.

**The config folder will need to be writable by PHP to save the ORCID access token.**
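
As a rough illustration of the two configuration sections above, a config/config.php set up purely for testing might look like the sketch below. The DataCite test URL comes from this README; the ORCID sandbox URLs are the sandbox counterparts of the defaults in config.sample.php. Every prefix, ID, and secret shown is a placeholder, and the sandbox needs its own ORCID sandbox account and client credentials.

```php
<?php
// config/config.php for testing only. Every credential below is a placeholder.
const CONFIG = [
    'url' => 'https://api.test.datacite.org/dois',              // DataCite test API
    'doiPrefix' => '10.XXXXX',                                  // placeholder prefix
    'username' => 'YOUR.REPOSITORY_ID',                         // placeholder
    'password' => 'your-test-password',                         // placeholder
    'maxSubmittedFiles' => 20,
    'maxReportFiles' => 20,
    'maxUploadSize' => 10240,
    'orcidTokenUrl' => 'https://sandbox.orcid.org/oauth/token', // ORCID sandbox token endpoint
    'orcidApiUrl' => 'https://pub.sandbox.orcid.org/',          // ORCID sandbox public API
    'orcidClientId' => 'APP-XXXXXXXXXXXXXXXX',                  // placeholder client ID
    'orcidSecret' => 'your-sandbox-client-secret'               // placeholder
];
```

Switching back to production means restoring the URLs from config.sample.php and replacing both sets of credentials.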

### Authentication
Currently this application uses basic authentication provided by Apache (see [Apache AuthType directive](https://httpd.apache.org/docs/2.4/mod/mod_authn_core.html#authtype)). To use basic authentication [create a .htpasswd file](https://httpd.apache.org/docs/2.4/programs/htpasswd.html) within the config directory, rename .htaccess.sample to .htaccess in the root folder, and set the AuthUserFile directive to the absolute path of the .htpasswd file. The .htpasswd file does not have to live in the config folder, but wherever it lives should not be accessible from the web.

@@ -34,8 +39,9 @@ type - resource type
description - abstract description
publisher - publisher
source_url - URL reference to resource
orcid - ORCID ID (not required, replaces creator fields when configured and present)
creator{n} - full creator name (header example: creator1, creator2, etc.)
creator{n}_type - Personal or Organizational
creator{n}_type - Personal or Organizational (not required, defaults to Personal)
creator{n}_given - creator given name
creator{n}_family - creator family name

9 changes: 7 additions & 2 deletions config/config.sample.php
@@ -2,7 +2,8 @@
/**
* Configuration.
*
* @var array [url, doiPrefix, username, password, maxSubmittedFiles, maxReportedFiles, maxUploadSize]
* @var array [url, doiPrefix, username, password, maxSubmittedFiles, maxReportFiles,
* maxUploadSize, orcidTokenUrl, orcidApiUrl, orcidClientId, orcidSecret]
*/
const CONFIG = [
'url' => 'https://api.datacite.org/dois',
@@ -11,5 +12,9 @@
'password' => '',
'maxSubmittedFiles' => 20,
'maxReportFiles' => 20,
'maxUploadSize' => 10240
'maxUploadSize' => 10240,
'orcidTokenUrl' => 'https://orcid.org/oauth/token',
'orcidApiUrl' => 'https://pub.orcid.org/',
'orcidClientId' => '',
'orcidSecret' => ''
];
13 changes: 0 additions & 13 deletions includes/functions.php
@@ -1,17 +1,4 @@
<?php
/**
* Redirects browser to the index page.
*
* // TODO: check to see if this needs to be loaded on any pages or just on includes.
*
* @return void
*/
function go_home() {
header('location: .');
exit;
}


/**
* Loads the configuration file.
*
198 changes: 181 additions & 17 deletions includes/submit_functions.php
@@ -1,4 +1,17 @@
<?php
/**
* Redirects browser to the index page.
*
* // TODO: move go_home calls to submit.php based on function returns.
*
* @return void
*/
function go_home() {
header('location: .');
exit;
}


/**
* Check for PHP cURL and that both the reports and uploads directories are writable.
*
@@ -50,18 +63,6 @@ function validate_csrf_token() {
}


/**
* Check if all needles exist in haystack.
*
* @param array $needles Array of values to search for.
* @param array $haystack Array of values to search in.
* @return bool
*/
function in_array_all($needles, $haystack) {
return empty(array_diff($needles, $haystack));
}


/**
* Remove oldest files from directory.
*
@@ -163,8 +164,6 @@ function process_upload_headers($uploadFp) {
'publisher',
'source_url',
'creator1',
'creator1_type', // TODO: will depend on orchid id
// don't require and assume personal?
'creator1_given',
'creator1_family',
];
@@ -181,9 +180,8 @@ function process_upload_headers($uploadFp) {
}, $headers);

// Make sure CSV file has all required headers.
// TODO: specify missing headers
if(!in_array_all($requiredHeaders, $headers)) {
array_push($_SESSION['output'], 'The uploaded CSV file is missing required headers.');
if(!empty($missingHeaders = array_diff($requiredHeaders, $headers))) {
array_push($_SESSION['output'], 'The uploaded CSV file is missing the required headers: ' . implode(', ', $missingHeaders));
go_home();
}
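
The header check in this hunk now uses array_diff() directly, so the error message can name exactly what is missing. A minimal standalone sketch of that idea follows; the helper name find_missing_headers and the sample header lists are hypothetical and not part of this commit.

```php
<?php
/**
 * Returns the required headers that are absent from the uploaded headers.
 * Hypothetical helper for illustration only.
 */
function find_missing_headers(array $requiredHeaders, array $headers): array {
    // array_diff() keeps the keys of its first argument, so reindex the result.
    return array_values(array_diff($requiredHeaders, $headers));
}

// Sample data: two of the four required headers are absent from the upload.
$required = ['title', 'year', 'type', 'creator1'];
$uploaded = ['title', 'type', 'orcid'];
$missing = find_missing_headers($required, $uploaded);

if(!empty($missing)) {
    echo 'The uploaded CSV file is missing the required headers: ' . implode(', ', $missing) . PHP_EOL;
}
```

An empty result means every required header was found, which is the same condition the original in_array_all() helper tested.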

@@ -217,3 +215,169 @@ function open_report_file($uploadFullPath) {

return $reportFp;
}


/**
* Creates and returns an array of creators based on the passed row.
*
* @param array $creatorHeaders The creator header names (creator1, creator2, etc.) found in the submitted file.
* @param array $row The current row of data being processed.
* @return array A formatted array of creators for the row.
*/
function get_creators($creatorHeaders, $row) {
    $creators = [];
    $tokenFile = 'config/orcid_token.json';

    // If the row has an ORCID value, use it instead of the creator{n} columns.
    if(!empty($row['orcid'])) {
        // Extract the ORCID ID; a bare ID and a full https://orcid.org/ URL both match.
        if(!preg_match('/(\d{4}-){3}\d{3}(\d|X)/', $row['orcid'], $matches)) {
            array_push($_SESSION['output'], 'A row contains an ORCID ID that is not valid.');
            return $creators;
        }

        if(!file_exists($tokenFile)) {
            // The token is saved to the config folder, so it must be writable.
            if(!is_writable('config')) {
                array_push($_SESSION['output'], 'The config directory is not writable.');
                return $creators;
            }

            if(!($tokenInfo = get_orcid_token())) {
                return $creators;
            }
        }
        else {
            $tokenInfo = json_decode(file_get_contents($tokenFile), true);

            // Get a new token if the saved one is unreadable or expired.
            if(empty($tokenInfo['access_token']) || empty($tokenInfo['expires_on']) || $tokenInfo['expires_on'] <= time()) {
                if(!($tokenInfo = get_orcid_token())) {
                    return $creators;
                }
            }
        }

        $creators = get_orcid_name($matches[0], $tokenInfo['access_token']);
    }
    else {
        // Process multiple creators.
        foreach($creatorHeaders as $x) {
            if(!empty($row[$x])) {
                // nameType is optional and defaults to Personal.
                if(empty($row[$x.'_type']) || $row[$x.'_type'] !== 'Organizational') {
                    $nameType = 'Personal';
                }
                else {
                    $nameType = 'Organizational';
                }

                // The given and family columns may be absent for additional creators.
                array_push($creators, [
                    'name' => $row[$x],
                    'nameType' => $nameType,
                    'givenName' => $row[$x.'_given'] ?? null,
                    'familyName' => $row[$x.'_family'] ?? null
                ]);
            }
        }
    }

    return $creators;
}
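
get_creators() accepts any value that matches the ORCID ID pattern. ORCID IDs also carry a check character computed with ISO 7064 MOD 11-2, so a mistyped ID can be rejected before any API call is made. The sketch below shows one way to add that check. It is an optional extension, not something this commit does, and both function names are hypothetical.

```php
<?php
/**
 * Computes the ORCID check character (ISO 7064 MOD 11-2) for the first 15 digits.
 */
function orcid_check_character(string $baseDigits): string {
    $total = 0;

    foreach(str_split($baseDigits) as $digit) {
        $total = ($total + (int) $digit) * 2;
    }

    $result = (12 - ($total % 11)) % 11;

    return $result === 10 ? 'X' : (string) $result;
}

/**
 * Extracts an ORCID ID from a CSV value and verifies its check character.
 * Returns null when no well-formed, valid ID is found.
 */
function extract_orcid(string $value): ?string {
    // Same pattern as get_creators(); it also matches inside a full ORCID URL.
    if(!preg_match('/(\d{4}-){3}\d{3}(\d|X)/', $value, $matches)) {
        return null;
    }

    $orcid = $matches[0];
    $digits = str_replace('-', '', $orcid);
    $expected = orcid_check_character(substr($digits, 0, 15));

    return $expected === substr($digits, -1) ? $orcid : null;
}

var_dump(extract_orcid('https://orcid.org/0000-0002-1825-0097'));
var_dump(extract_orcid('0000-0002-1825-0098'));
```

The first call uses ORCID's well-known example ID and returns the bare ID; the second has a wrong final digit and returns null.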


/**
* Retrieves a public read token from ORCID, writes it to file, and returns an array of related data.
*
* @return array|null The ORCID access token and related information.
*/
function get_orcid_token() {
    $tokenFile = 'config/orcid_token.json';

    // Are ORCID credentials configured?
    if(empty(CONFIG['orcidClientId']) || empty(CONFIG['orcidSecret'])) {
        array_push($_SESSION['output'], 'ORCID credentials are not configured.');
        return null;
    }

    $postFields = [
        'client_id' => CONFIG['orcidClientId'],
        'client_secret' => CONFIG['orcidSecret'],
        'grant_type' => 'client_credentials',
        'scope' => '/read-public'
    ];
    $ch = curl_init();

    // Skip certificate verification only when dev mode is explicitly enabled.
    if(!empty(CONFIG['devMode'])) {
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    }

    curl_setopt_array($ch, [
        CURLOPT_URL => CONFIG['orcidTokenUrl'],
        CURLOPT_POST => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER => [
            'content-type: application/x-www-form-urlencoded'
        ],
        CURLOPT_POSTFIELDS => http_build_query($postFields)
    ]);

    // curl_exec() returns false on failure, so only decode a string response.
    $response = curl_exec($ch);
    curl_close($ch);
    $result = is_string($response) ? json_decode($response, true) : null;

    // If no access token is provided.
    if(empty($result['access_token'])) {
        array_push($_SESSION['output'], 'There was an error requesting an access token from ORCID.');
        return null;
    }

    unset($result['orcid']);
    // Store an absolute expiry time so the saved token can be checked later.
    $result['expires_on'] = ($result['expires_in'] ?? 0) + time();

    // Save contents to file as JSON.
    if(file_put_contents($tokenFile, json_encode($result, JSON_PRETTY_PRINT)) === false) {
        array_push($_SESSION['output'], 'The ORCID access token could not be saved.');
    }

    return $result;
}
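
The token handling above comes down to two steps: turn the relative expires_in value from ORCID into an absolute expires_on timestamp when the token is saved, then compare that timestamp with the current time before reusing the token. The sketch below isolates that logic in two pure functions so it can be tried without network access. The function names and the 60 second safety margin are assumptions made for this example; the commit itself compares expires_on directly against time().

```php
<?php
/**
 * Adds an absolute expiry timestamp to a token response (hypothetical helper).
 */
function with_expiry(array $tokenResponse, int $now): array {
    $tokenResponse['expires_on'] = $now + (int) ($tokenResponse['expires_in'] ?? 0);

    return $tokenResponse;
}

/**
 * Reports whether a saved token can still be used.
 * $margin (an assumption, in seconds) treats tokens that are about to expire as stale.
 */
function token_is_usable(?array $tokenInfo, int $now, int $margin = 60): bool {
    return !empty($tokenInfo['access_token'])
        && isset($tokenInfo['expires_on'])
        && $tokenInfo['expires_on'] - $margin > $now;
}

$now = 1676505600; // Fixed timestamp so the example is repeatable.
$saved = with_expiry(['access_token' => 'example-token', 'expires_in' => 3600], $now);

var_dump(token_is_usable($saved, $now));        // Freshly issued token.
var_dump(token_is_usable($saved, $now + 3600)); // Token at its expiry time.
var_dump(token_is_usable(null, $now));          // Missing or unreadable token file.
```

Passing the current time in as a parameter, rather than calling time() inside the functions, keeps the check deterministic and easy to test.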


/**
* Returns a creator array from the given ORCID ID and access token.
*
* @param string $orcid The ORCID ID to look up.
* @param string $token The public read access token to use.
* @return array A creator array.
*/
function get_orcid_name($orcid, $token) {
    $apiUrl = CONFIG['orcidApiUrl'].'v3.0/'.$orcid.'/personal-details';
    $creator = [];
    $ch = curl_init();

    // Skip certificate verification only when dev mode is explicitly enabled.
    if(!empty(CONFIG['devMode'])) {
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    }

    curl_setopt_array($ch, [
        CURLOPT_URL => $apiUrl,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER => [
            'content-type: application/orcid+json',
            'Authorization: Bearer '.$token,
        ]
    ]);

    // curl_exec() returns false on failure, so only decode a string response.
    $response = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    $result = is_string($response) ? json_decode($response, true) : null;

    if($code === 200) {
        // Either name part can be missing from an ORCID record.
        $given = $result['name']['given-names']['value'] ?? '';
        $family = $result['name']['family-name']['value'] ?? '';

        if($given === '' && $family === '') {
            array_push($_SESSION['output'], 'ORCID ID '.$orcid.' has no public name.');
        }
        else {
            array_push($creator, [
                'name' => ($given !== '' && $family !== '') ? $family.', '.$given : $family.$given,
                'nameType' => 'Personal',
                'givenName' => $given,
                'familyName' => $family
            ]);
        }
    }
    else if($code === 404) {
        array_push($_SESSION['output'], 'ORCID ID '.$orcid.' not found.');
    }
    else {
        array_push($_SESSION['output'], 'There was an error querying ORCID. Please try again.');
        // Just in case it is a token issue, grab a new token for the next submission.
        get_orcid_token();
    }

    return $creator;
}
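
The name handling in get_orcid_name() can be exercised without calling ORCID by separating the JSON-to-creator mapping from the HTTP request. The sketch below assumes the personal-details response has the shape the function already reads (name, then given-names and family-name, each with a value key). The helper name and the sample payload are hypothetical.

```php
<?php
/**
 * Builds a DataCite-style creator array from a decoded personal-details response.
 * Hypothetical helper; returns null when the record exposes no name.
 */
function creator_from_personal_details(?array $details): ?array {
    $given = $details['name']['given-names']['value'] ?? '';
    $family = $details['name']['family-name']['value'] ?? '';

    if($given === '' && $family === '') {
        return null;
    }

    // Use "Family, Given" when both parts exist, otherwise whichever part is present.
    $name = ($given !== '' && $family !== '') ? $family . ', ' . $given : $family . $given;

    return [
        'name' => $name,
        'nameType' => 'Personal',
        'givenName' => $given,
        'familyName' => $family,
    ];
}

// Sample payload trimmed to the fields the function reads; the values are made up.
$json = '{"name": {"given-names": {"value": "Ada"}, "family-name": null}}';
print_r(creator_from_personal_details(json_decode($json, true)));
```

Because the null coalescing operator tolerates missing or null levels, a record with only a given name produces a creator whose name is that given name, with no leading comma.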
5 changes: 5 additions & 0 deletions index.php
@@ -46,6 +46,11 @@
<div class="col-lg-5 my-3">
<h2><span>Configuration</span></h2>
<ul class="list-group mb-3">
<?php
if(!empty(CONFIG['devMode'])) {
echo '<li class="list-group-item text-danger"><strong>Dev Mode Enabled</strong></li>';
}
?>
<li class="list-group-item">
DOI Prefix: <?= CONFIG['doiPrefix']; ?>
</li>