API Documentation: Improve python snippets in documentation #4255

Closed
ferrys opened this issue Nov 2, 2017 · 4 comments

@ferrys
Contributor

ferrys commented Nov 2, 2017

I've recently been working with Harvard CGA on code that calls the Dataverse API, and it has been pretty confusing. It would be helpful to have more Python code snippets as guides for calling the API.

I wrote the Python script below, which creates a dataset and uploads a file to that dataset. The dataset-create-new.json file can be found here. You must create your own sample_file.txt file.

import json
import requests  # http://docs.python-requests.org/en/master/
from datetime import datetime

# --------------------------------------------------
# Update the 3 params below to run this code
# --------------------------------------------------
dataverse_server = 'https://demo.dataverse.org' # no trailing slash
api_key = 'your-api-key'
dataverse_id = 'root'  # alias or database id of the dataverse

# --------------------------------------------------
# Load the dataset metadata (JSON) to send as the request body
# --------------------------------------------------
with open('dataset-create-new.json') as dataset_json_file:
    dataset_json = dataset_json_file.read()
# Round-trip through json so a malformed file fails here, before the request
data = json.dumps(json.loads(dataset_json))

# --------------------------------------------------
# Create new DRAFT dataset 
# --------------------------------------------------
#curl version
#POST http://$SERVER/api/dataverses/$id/datasets/?key=$apiKey
url_dataverse_id = '%s/api/dataverses/%s/datasets/?key=%s' % (dataverse_server, dataverse_id, api_key)

# -------------------
# Make the request
# -------------------
print('-' * 40)
print('making request: %s' % url_dataverse_id)
r = requests.post(url_dataverse_id, data=data)

# -------------------
# Print the response
# -------------------
print('-' * 40)
print(r.json())
print(r.status_code)

# --------------------------------------------------
# Get the database id of the created dataset
# --------------------------------------------------
dataset_id = r.json()['data']['id'] # database id of the dataset

# --------------------------------------------------
# Prepare "file"
# --------------------------------------------------
file_content = 'content: %s' % datetime.now()
files = {'file': ('sample_file.txt', file_content)}

# --------------------------------------------------
# Using a "jsonData" parameter, add optional description + file tags
# --------------------------------------------------
params = dict(description='Blue skies!',
            categories=['Lily', 'Rosemary', 'Jack of Hearts'])

params_as_json_string = json.dumps(params)

payload = dict(jsonData=params_as_json_string)

# --------------------------------------------------
# Add file using the Dataset's id
# --------------------------------------------------
# curl version
# POST http://$SERVER/api/datasets/$id/add?key=$apiKey
url_dataset_id = '%s/api/datasets/%s/add?key=%s' % (dataverse_server, dataset_id, api_key)

# -------------------
# Make the request
# -------------------
print('-' * 40)
print('making request: %s' % url_dataset_id)
r = requests.post(url_dataset_id, data=payload, files=files)

# -------------------
# Print the response
# -------------------
print('-' * 40)
print(r.json())
print(r.status_code)
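
The script above reads `r.json()['data']['id']` without checking whether the request succeeded, and it puts the API key in the URL. A minimal sketch of two small helpers that could address both, assuming the API wraps responses as `{"status": "OK", "data": {...}}` on success and `{"status": "ERROR", "message": "..."}` on failure, and that the server accepts the key in an `X-Dataverse-key` header. The helper names `auth_headers` and `extract_dataset_id` are hypothetical, not part of any Dataverse library.

```python
def auth_headers(api_key):
    """Return request headers carrying the API key, keeping it out of the URL.

    Assumes the server accepts the key in the X-Dataverse-key header.
    """
    return {'X-Dataverse-key': api_key}


def extract_dataset_id(body):
    """Return the dataset's database id from a parsed create-dataset response.

    `body` is the dict produced by r.json(). Raises RuntimeError when the
    response does not report success (assumed shape: status/message/data).
    """
    if body.get('status') != 'OK':
        raise RuntimeError('Dataverse API error: %s' % body.get('message', body))
    return body['data']['id']
```

With these, the create request could become `requests.post(url, data=data, headers=auth_headers(api_key))` with the `?key=` part dropped from the URL, followed by `dataset_id = extract_dataset_id(r.json())`.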

@pdurbin
Member

pdurbin commented Apr 2, 2019

@ferrys I just spoke with @matthew-a-dunlap about his pull request #5715 in which he's removing some python from our API Guide and I think his approach is valid. He's using curl to document the feature he added and I think it's better to focus on good curl examples rather than trying to maintain examples in multiple languages. That said, if you'd like to contribute your code above, you could put it somewhere and create a pull request to update a future version of http://guides.dataverse.org/en/4.12/api/apps.html#python to link to it. Or maybe we could create a new page linking to useful scripts people have written with a disclaimer that they aren't being actively maintained by IQSS? I do think our API docs can be a bit confusing. I'm hoping that Python developers can rally around https://github.com/IQSS/dataverse-client-python to make it awesome and easy to use.

@pdurbin
Member

pdurbin commented Aug 20, 2019

@ferrys are you still interested in this issue? In pull request #6107 I just documented the "why" of documenting curl vs specific languages. Here's what I wrote:

[Screenshot: Screen Shot 2019-08-20 at 5 59 11 PM]

On a related note, I should mention that while hacking on the API Guide in that pull request I didn't bother testing a Python example that adds a file to a dataset. I actually thought about removing it since I didn't test it. What I'm trying to say is that while I like Python, when it comes to documenting Dataverse APIs, it's a maintenance burden to try to support multiple languages. Your script above seems awesome but I probably would have passed over it too when doing the rewrite (well, refresh) of the API Guide I just did.

Also, heads up that pyDataverse is only a pip install away. 😄 We have @skasberger to thank for that! Check out http://guides.dataverse.org/en/4.15.1/api/client-libraries.html#python . I'm using it in https://github.com/IQSS/dataverse-sample-data

@skasberger
Contributor

I agree. Documented use cases should be as language-agnostic as possible. Curl is the best way I see to do this, and other, more specific libraries are linked in several places anyway.

@pdurbin
Member

pdurbin commented Aug 22, 2019

@skasberger thanks for commenting. I still owe @sbarbosadataverse a script for downloading all files in a dataset one by one and I haven't decided yet if I'm going to write it in Bash or Python. This is a workaround for #5588. #6093 might obviate the need for such a script.
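
For the Python option, a rough sketch of the core step might look like the following. It assumes the file listing comes from `GET /api/datasets/$id/versions/:latest/files`, that each entry carries a `label` and a `dataFile` object with an `id`, and that each file can be fetched from `/api/access/datafile/$fileId`. Those response shapes are assumptions to verify against your Dataverse version, and `datafile_download_urls` is a hypothetical helper name.

```python
def datafile_download_urls(server, files_listing):
    """Map a parsed file listing to (filename, download URL) pairs.

    `server` has no trailing slash; `files_listing` is the parsed JSON of
    the dataset's file listing (assumed shape: {"data": [{"label": ...,
    "dataFile": {"id": ...}}, ...]}).
    """
    pairs = []
    for entry in files_listing.get('data', []):
        datafile = entry['dataFile']
        # Prefer the display label; fall back to the stored filename
        name = entry.get('label') or datafile.get('filename')
        url = '%s/api/access/datafile/%s' % (server, datafile['id'])
        pairs.append((name, url))
    return pairs
```

Each URL could then be fetched in turn, for example with `requests.get(url, stream=True)`, writing the response body to a file named after the first element of the pair.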
