Merge pull request #1 from share-research/main
Adding schema documentation and some shell scripts
rickjohnson authored Oct 29, 2020
2 parents 160fb2f + af04fc0 commit e64b493
Showing 8 changed files with 97 additions and 1 deletion.
1 change: 0 additions & 1 deletion .gitignore
@@ -9,7 +9,6 @@ scratch
.favorites.json
dist
NOTES.md
data/
.env
bak.env
*.code-workspace
3 changes: 3 additions & 0 deletions README.md
@@ -18,6 +18,9 @@ Run the following command to fetch share data:

make fetch_share_data

# SHARE Schema
A detailed description of the SHARE schema is in schema/share_schema.pdf. The schema folder also contains the script used to generate that PDF from the original HTML pages on the SHARE site.

# Loading and Using Existing SHARE Data from the API

Included is a fairly simple script that demonstrates loading JSON files that were retrieved by the fetch share data script or previously downloaded. It loads the files from a given directory, converts them to JSON objects in code, and then outputs a sample record to the command line. While loading the JSON objects, it also demonstrates how to create simplified versions of the SHARE objects that include only the relevant fields. This is intended only as a starting point: add code to do something with the loaded JSON objects, or copy the approach in another language such as Python or Ruby.
18 changes: 18 additions & 0 deletions schema/get_share_schema_html_to_pdf.sh
@@ -0,0 +1,18 @@
#!/bin/bash
input="./share_schema_links.csv"
COUNTER=0
PDFString="share_schema_home.pdf"
foldername="$(date +%Y%m%d%H%M%S)"
mkdir -p ./"$foldername"
while IFS= read -r line
do
# Number the output files 1.pdf, 2.pdf, ... in the order the links are read
COUNTER=$((COUNTER + 1))

wkhtmltopdf "$line" "./$foldername/$COUNTER.pdf"
PDFString+=" ./$foldername/${COUNTER}.pdf"
echo "$PDFString"
done < "$input"

echo "Concatenating the PDF files..."
# $PDFString is left unquoted on purpose so that each path becomes a separate argument
pdftk $PDFString cat output "./$foldername/share_schema.pdf"
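The script above accumulates the PDF paths in a space-separated string and relies on word splitting when calling pdftk. A minimal sketch of an alternative, assuming Bash arrays are acceptable here, is shown below. The helper name `build_pdf_list` and the folder name `my folder` are hypothetical and do not appear in the commit; the pdftk call is left as a comment so the sketch runs without pdftk installed.

```shell
#!/bin/bash
# Sketch only: collect the PDF paths in an array instead of a string.
# build_pdf_list is a hypothetical helper, not part of the committed script.
build_pdf_list() {
  local folder="$1" count="$2" i
  # The home page PDF goes first, as in the committed script.
  pdf_files=("share_schema_home.pdf")
  for ((i = 1; i <= count; i++)); do
    pdf_files+=("./$folder/$i.pdf")
  done
}

# Example: three rendered pages in a folder whose name contains a space.
build_pdf_list "my folder" 3
printf '%s\n' "${pdf_files[@]}"

# The merge step would then pass every path as its own argument:
# pdftk "${pdf_files[@]}" cat output "./my folder/share_schema.pdf"
```

With an array, each path stays a single argument even if it contains spaces, which the string-based version cannot guarantee.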
Binary file added schema/share_schema.pdf
Binary file not shown.
Binary file added schema/share_schema_home.pdf
Binary file not shown.
58 changes: 58 additions & 0 deletions schema/share_schema_links.csv
@@ -0,0 +1,58 @@
https://share.osf.io/api/schema/CreativeWork
https://share.osf.io/api/schema/DataSet
https://share.osf.io/api/schema/Patent
https://share.osf.io/api/schema/Poster
https://share.osf.io/api/schema/Presentation
https://share.osf.io/api/schema/Publication
https://share.osf.io/api/schema/Article
https://share.osf.io/api/schema/Book
https://share.osf.io/api/schema/ConferencePaper
https://share.osf.io/api/schema/Dissertation
https://share.osf.io/api/schema/Preprint
https://share.osf.io/api/schema/Project
https://share.osf.io/api/schema/Registration
https://share.osf.io/api/schema/Report
https://share.osf.io/api/schema/Thesis
https://share.osf.io/api/schema/WorkingPaper
https://share.osf.io/api/schema/Repository
https://share.osf.io/api/schema/Retraction
https://share.osf.io/api/schema/Software
https://share.osf.io/api/schema/Organization
https://share.osf.io/api/schema/Consortium
https://share.osf.io/api/schema/Department
https://share.osf.io/api/schema/Institution
https://share.osf.io/api/schema/Person
https://share.osf.io/api/schema/AgentIdentifier
https://share.osf.io/api/schema/WorkIdentifier
https://share.osf.io/api/schema/Award
https://share.osf.io/api/schema/Subject
https://share.osf.io/api/schema/Tag
https://share.osf.io/api/schema/IsAffiliatedWith
https://share.osf.io/api/schema/IsEmployedBy
https://share.osf.io/api/schema/IsMemberOf
https://share.osf.io/api/schema/Cites
https://share.osf.io/api/schema/Compiles
https://share.osf.io/api/schema/Corrects
https://share.osf.io/api/schema/Discusses
https://share.osf.io/api/schema/Disputes
https://share.osf.io/api/schema/Documents
https://share.osf.io/api/schema/Extends
https://share.osf.io/api/schema/IsDerivedFrom
https://share.osf.io/api/schema/IsPartOf
https://share.osf.io/api/schema/IsSupplementTo
https://share.osf.io/api/schema/References
https://share.osf.io/api/schema/RepliesTo
https://share.osf.io/api/schema/Retracts
https://share.osf.io/api/schema/Reviews
https://share.osf.io/api/schema/UsesDataFrom
https://share.osf.io/api/schema/Contributor
https://share.osf.io/api/schema/Creator
https://share.osf.io/api/schema/PrincipalInvestigator
https://share.osf.io/api/schema/PrincipalInvestigatorContact
https://share.osf.io/api/schema/Funder
https://share.osf.io/api/schema/Host
https://share.osf.io/api/schema/Publisher
https://share.osf.io/api/schema/ThroughAwards
https://share.osf.io/api/schema/ThroughContributor
https://share.osf.io/api/schema/ThroughSubjects
https://share.osf.io/api/schema/ThroughTags
8 changes: 8 additions & 0 deletions scripts/load_all_files.sh
@@ -0,0 +1,8 @@
#!/bin/bash
input="./SHARE_2.csv"
COUNTER=0
while IFS= read -r line
do
COUNTER=$((COUNTER + 1))
wkhtmltopdf "$line" "$COUNTER.pdf"
done < "$input"
10 changes: 10 additions & 0 deletions scripts/mv_share_files.sh
@@ -0,0 +1,10 @@
#!/bin/bash

for file in *.gz; do
dir=$(echo "${file}" | sed 's/\./_/g')
dir2=$(echo "${dir}" | sed 's/_json-list_gz//g')
mkdir "$dir2"
newfile=$(echo "${file}" | sed 's/\.gz/_6\.gz/g')
mv "$file" "$newfile"
mv "$newfile" "$dir2"
done
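The name derivation in the loop above can also be sketched with Bash parameter expansion instead of `echo | sed` pipelines. The helper names `target_dir` and `renamed_file` and the sample file name are hypothetical, chosen only for illustration. The sketch also assumes each file name ends in exactly one `.json-list.gz` suffix; the sed versions replace every match anywhere in the name, so the two approaches can differ on unusual names.

```shell
#!/bin/bash
# Sketch only: hypothetical helpers mirroring the sed pipelines in the loop.

# Directory name: drop the trailing ".json-list.gz", then turn dots into underscores.
target_dir() {
  local base="${1%.json-list.gz}"
  printf '%s\n' "${base//./_}"
}

# New file name: insert "_6" before the trailing ".gz".
renamed_file() {
  printf '%s\n' "${1%.gz}_6.gz"
}

# Hypothetical input file name.
target_dir "share.2020-10.json-list.gz"    # prints share_2020-10
renamed_file "share.2020-10.json-list.gz"  # prints share.2020-10.json-list_6.gz
```

Parameter expansion avoids spawning a subshell and a sed process per file, and it handles names containing spaces as long as the expansions are quoted.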
