Commit
iNaturalist API/Zooniverse Integration (#3983)
* Add webmock to Gemfile
* Add inat obs fixture
* Add client and spec
* Add Observation model and spec
* Add iNat API interface and spec
* Add webmock to spec_helper.rb
* Add SubjectImporter and spec
* Allow public access to update methods
* Expose total_results with an attr_reader
* Expose response full request_url with attr_reader
* Mine's funnier
* Use SubjectSetImport to track state
* Add InatImportWorker and spec
* a bit of cleanup
* liked mine better
* iNat import completion mailer and spec
* Worker for completion mailer
* remove line
* # frozen_string_literal: true
* Remove vestigial class constant
* Use instance vars to lookup ids
* Add specs for missing SubjectImporter params
* Use ss_importer method
* Sate the Hound
* cleanup
* Persist SSI in db so it's immediately updated
* Add some failure mode specs
* Split expects
* Remove unnecessary attr_reader
* Move no_change matcher def to spec_helper
* New route, controller, and spec
* Feed the Hound
* Hound
* más sabueso
* Upsert correctly if subject already exists in set
* Spaces for the Hound
* hound
* Fix specs for 5.1 (and a typo)
* extra space
* Don't duplicate media on subject upserts
* typo
* Clearer and more useful check
* Add spec: count page fetches
* Hound
* Worker needs mini_mime require
Showing 23 changed files with 895 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -90,4 +90,5 @@ group :test do | |
gem 'rspec-its' | ||
gem 'rspec-rails' | ||
gem 'spring-commands-rspec' | ||
gem 'webmock' | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
# frozen_string_literal: true | ||
|
||
class Api::V1::InaturalistController < Api::ApiController | ||
def import | ||
subject_set = SubjectSet.find(params[:subject_set_id]) | ||
|
||
unless subject_set.project.owners_and_collaborators.include?(api_user.user) | ||
raise Api::Unauthorized, 'Must be owner or collaborator to import' | ||
end | ||
|
||
InatImportWorker.perform_async(api_user.id, params[:taxon_id], params[:subject_set_id], params[:updated_since]) | ||
json_api_render(:ok, {}) | ||
end | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# frozen_string_literal: true | ||
|
||
class InatImportCompletedMailer < ApplicationMailer | ||
layout false | ||
|
||
def inat_import_complete(ss_import) | ||
@user = User.find(ss_import.user_id) | ||
@email_to = @user.email | ||
@imported_count = ss_import.imported_count | ||
project_id = ss_import.subject_set.project_id | ||
|
||
lab_url_prefix = "#{Panoptes.frontend_url}/lab/#{project_id}" | ||
@subject_set_lab_url = "#{lab_url_prefix}/subject-sets/#{ss_import.subject_set_id}" | ||
@subject_set_name = ss_import.subject_set.display_name | ||
|
||
@no_errors = ss_import.failed_count.zero? | ||
import_status = @no_errors ? 'was successful!' : 'completed with errors' | ||
subject = "Your iNaturalist subject import #{import_status}" | ||
|
||
mail(to: @email_to, subject: subject) | ||
end | ||
end |
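The mailer above derives the email subject and the lab URL from the import record. A minimal plain-Ruby sketch of that derivation follows. `ImportSummary` is a hypothetical stand-in for `SubjectSetImport`, and `FRONTEND_URL` is an assumed value replacing `Panoptes.frontend_url`; neither name comes from the commit.

```ruby
# frozen_string_literal: true

# Hypothetical stand-in for the SubjectSetImport record used by the mailer.
ImportSummary = Struct.new(:project_id, :subject_set_id, :failed_count, keyword_init: true)

# Assumed value; the real mailer reads Panoptes.frontend_url.
FRONTEND_URL = 'https://www.zooniverse.org'

# Mirrors the lab_url_prefix + subject-sets path built in the mailer.
def subject_set_lab_url(summary)
  "#{FRONTEND_URL}/lab/#{summary.project_id}/subject-sets/#{summary.subject_set_id}"
end

# Mirrors the subject line: success only when no observation failed to import.
def email_subject(summary)
  status = summary.failed_count.zero? ? 'was successful!' : 'completed with errors'
  "Your iNaturalist subject import #{status}"
end

clean = ImportSummary.new(project_id: 1, subject_set_id: 2, failed_count: 0)
failed = ImportSummary.new(project_id: 1, subject_set_id: 2, failed_count: 3)

puts email_subject(clean)
puts email_subject(failed)
puts subject_set_lab_url(clean)
```

Keeping the string building in small pure methods like these would make the wording testable without rendering a mail message, though the commit itself keeps it inline in the mailer action.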
23 changes: 23 additions & 0 deletions
app/views/inat_import_completed_mailer/inat_import_complete.text.erb
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
Hello, | ||
|
||
Your iNaturalist subject import has finished processing. | ||
|
||
<% if @no_errors %> | ||
The iNaturalist observations have been imported successfully. | ||
<% else %> | ||
There were some errors when importing your iNaturalist observations. | ||
<% end %> | ||
|
||
<%= @imported_count %> subjects were imported into subject set '<%= @subject_set_name %>'. | ||
|
||
To view them, visit: <%= @subject_set_lab_url %> | ||
|
||
Cheers, | ||
The Zooniverse Team | ||
|
||
This is an automated email, please do not respond. | ||
|
||
To manage your Zooniverse email subscription preferences visit https://zooniverse.org/settings | ||
|
||
To unsubscribe from all Zooniverse messages please visit https://zooniverse.org/unsubscribe | ||
Please be aware that the above link will unsubscribe you from ALL Zooniverse emails. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
# frozen_string_literal: true | ||
|
||
class InatImportCompletedMailerWorker | ||
include Sidekiq::Worker | ||
|
||
sidekiq_options queue: :data_high | ||
|
||
def perform(ss_import_id) | ||
ss_import = SubjectSetImport.find(ss_import_id) | ||
InatImportCompletedMailer.inat_import_complete(ss_import).deliver | ||
end | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# frozen_string_literal: true | ||
|
||
class InatImportWorker | ||
include Sidekiq::Worker | ||
|
||
# skip retries for this job to avoid re-running imports with errors | ||
sidekiq_options retry: 0, queue: :data_medium | ||
|
||
def perform(user_id, taxon_id, subject_set_id, updated_since = nil) | ||
inat = Inaturalist::ApiInterface.new(taxon_id: taxon_id, updated_since: updated_since) | ||
importer = Inaturalist::SubjectImporter.new(user_id, subject_set_id) | ||
|
||
# Use a SubjectSetImport instance to track progress & store data | ||
ss_import = importer.subject_set_import | ||
|
||
imported_row_count = 0 | ||
inat.observations.each do |obs| | ||
begin | ||
importer.import(obs) | ||
rescue Inaturalist::SubjectImporter::FailedImport | ||
ss_import.update_columns( | ||
failed_count: ss_import.failed_count + 1, | ||
failed_uuids: ss_import.failed_uuids | [obs.external_id] | ||
) | ||
end | ||
|
||
imported_row_count += 1 | ||
|
||
# update the imported_count as we progress through the import so we can use | ||
# this as a progress metric on API resource polling (see SubjectSetWorker) | ||
ss_import.save_imported_row_count(imported_row_count) if (imported_row_count % update_progress_every_rows(inat.total_results)).zero? | ||
end | ||
|
||
ss_import.save_imported_row_count(imported_row_count) | ||
|
||
# Count that subject set, like right now | ||
SubjectSetSubjectCounterWorker.new.perform(subject_set_id) | ||
|
||
# notify the user about the import success / failure | ||
InatImportCompletedMailerWorker.perform_async(ss_import.id) | ||
end | ||
|
||
def update_progress_every_rows(total_results) | ||
SubjectSetImport::ProgressUpdateCadence.calculate(total_results) | ||
end | ||
end |
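The worker loop above does three things per observation: it attempts the import, records a failure without aborting, and saves progress at a fixed cadence. The sketch below reproduces that shape in plain Ruby with no database. `FailedImport` here is a local stand-in for `Inaturalist::SubjectImporter::FailedImport`, and `update_every` is an assumed cadence (about ten updates per import), since `SubjectSetImport::ProgressUpdateCadence.calculate` is not shown in this diff.

```ruby
# frozen_string_literal: true

# Local stand-in for Inaturalist::SubjectImporter::FailedImport.
class FailedImport < StandardError; end

# Assumed cadence: roughly ten progress updates per import, never less than every row.
# The real SubjectSetImport::ProgressUpdateCadence.calculate is not shown in this diff.
def update_every(total_results)
  [total_results / 10, 1].max
end

# Runs the import loop and returns a summary hash instead of writing to a database.
def run_import(observation_ids, total_results)
  failed_ids = []
  progress_saves = []
  processed = 0

  observation_ids.each do |id|
    begin
      # Pretend that odd ids fail to import.
      raise FailedImport, "could not import #{id}" if id.odd?
    rescue FailedImport
      failed_ids |= [id] # array union keeps the list free of duplicates
    end

    # Counted whether or not the import succeeded, as in the worker.
    processed += 1
    progress_saves << processed if (processed % update_every(total_results)).zero?
  end

  { processed: processed, failed_ids: failed_ids, progress_saves: progress_saves }
end

summary = run_import((1..20).to_a, 20)
p summary
```

Note that, as in the worker, the processed counter advances for failed rows too, so it tracks rows handled rather than subjects created.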
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
# frozen_string_literal: true | ||
|
||
module Inaturalist | ||
class ApiInterface | ||
require 'faraday' | ||
require 'faraday_middleware' | ||
require 'json' | ||
|
||
attr_reader :taxon_id, :total_results, :observation_cache, :params | ||
|
||
# max_observations caps the number of imported subjects; -1 means no limit | ||
def initialize(taxon_id:, updated_since: nil, max_observations: -1) | ||
@taxon_id = taxon_id | ||
@max_observations = max_observations | ||
@observation_cache = [] | ||
@id_above = 0 | ||
@params = { taxon_id: @taxon_id } | ||
@params[:updated_since] = updated_since unless updated_since.nil? | ||
@done = false | ||
@total_results = nil | ||
end | ||
|
||
def observations | ||
Enumerator.new do |yielder| | ||
loop do | ||
results = fetch_next_page | ||
raise StopIteration if @done | ||
|
||
results.each do |obs| | ||
yielder.yield Observation.new(obs) | ||
end | ||
end | ||
end | ||
end | ||
|
||
def fetch_next_page | ||
page_params = @params.merge(id_above: @id_above) | ||
response = client.get(page_params) | ||
@total_results ||= response['total_results'] | ||
results = response['results'] | ||
# Stop if a) there are no more results | ||
# b) the total number of desired subjects is hit | ||
# c) the ID of the last seen observation is the same as the last result's id | ||
@done = true if results.empty? || max_cache_hit? || @id_above == results.last['id'] | ||
return if @done | ||
|
||
@observation_cache += results | ||
@id_above = results.last['id'] | ||
@params[:id_above] = @id_above | ||
results | ||
end | ||
|
||
def max_cache_hit? | ||
# Short circuit to turn off limit | ||
return false if @max_observations == -1 | ||
|
||
@observation_cache.size >= @max_observations | ||
end | ||
|
||
def client | ||
@client ||= Client.new | ||
end | ||
end | ||
end |
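`ApiInterface#observations` pages through results by asking for ids above the last one seen, and wraps the paging in an `Enumerator` so callers pull observations lazily. A minimal sketch of that pattern, assuming a hypothetical in-memory `FakeClient` in place of `Inaturalist::Client` (its `per_page` and call counter are illustrative only):

```ruby
# frozen_string_literal: true

# Hypothetical in-memory client: returns up to per_page records with id > id_above.
class FakeClient
  attr_reader :calls

  def initialize(ids, per_page: 2)
    @ids = ids.sort
    @per_page = per_page
    @calls = 0
  end

  def get(params)
    @calls += 1
    page = @ids.select { |id| id > params[:id_above] }.first(@per_page)
    { 'total_results' => @ids.size, 'results' => page.map { |id| { 'id' => id } } }
  end
end

# Lazily yields every record by repeatedly requesting ids above the last one seen.
def each_observation(client)
  Enumerator.new do |yielder|
    id_above = 0
    loop do
      results = client.get(id_above: id_above)['results']
      break if results.empty?

      results.each { |obs| yielder << obs }
      id_above = results.last['id']
    end
  end
end

client = FakeClient.new([5, 9, 12, 20, 31])
ids = each_observation(client).map { |obs| obs['id'] }
p ids
p client.calls

# Taking only the first record fetches only the first page.
lazy_client = FakeClient.new([1, 2, 3])
first = each_observation(lazy_client).first(1)
p first
p lazy_client.calls
```

Paging by id rather than by page number keeps the traversal stable when new observations arrive mid-import, which is presumably why the client orders by `id` ascending.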
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
# frozen_string_literal: true | ||
|
||
module Inaturalist | ||
class Client | ||
attr_reader :url, :request_url, :headers, :default_params | ||
|
||
def initialize | ||
@url = 'https://api.inaturalist.org/v1/observations' | ||
@request_url = nil | ||
@headers = { 'User-Agent' => 'zooniverse-import' } | ||
@default_params = { | ||
verifiable: true, | ||
order: 'asc', | ||
order_by: 'id', | ||
per_page: 200 | ||
} | ||
end | ||
|
||
def get(params) | ||
request_params = @default_params.merge(params) | ||
conn = Faraday.new( | ||
url: @url, | ||
headers: @headers, | ||
params: request_params | ||
) do |f| | ||
f.request :url_encoded | ||
f.request :retry | ||
f.response :raise_error | ||
f.response :json | ||
f.adapter Faraday.default_adapter | ||
end | ||
|
||
begin | ||
response = conn.get | ||
@request_url = response.env.url.to_s | ||
response.body | ||
rescue Faraday::ClientError => e | ||
raise e | ||
end | ||
end | ||
end | ||
end |
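`Client#get` merges caller params over a fixed set of defaults before building the request. The sketch below shows only that merge and the resulting query string, using the standard library's `URI.encode_www_form` instead of Faraday, so no network call is made. The taxon id is a made-up example value, and Faraday may order the query parameters differently than this sketch does.

```ruby
# frozen_string_literal: true

require 'uri'

BASE_URL = 'https://api.inaturalist.org/v1/observations'
DEFAULT_PARAMS = { verifiable: true, order: 'asc', order_by: 'id', per_page: 200 }.freeze

# Caller-supplied params win over the defaults, mirroring Hash#merge in Client#get.
def request_url(params)
  query = URI.encode_www_form(DEFAULT_PARAMS.merge(params))
  "#{BASE_URL}?#{query}"
end

# 46_017 is a hypothetical taxon id used purely for illustration.
url = request_url(taxon_id: 46_017, id_above: 0)
puts url

# Overriding a default replaces it rather than duplicating the key.
small_page_url = request_url(per_page: 50)
puts small_page_url
```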
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
# frozen_string_literal: true | ||
|
||
module Inaturalist | ||
class Observation | ||
require 'mini_mime' | ||
|
||
def initialize(obs) | ||
@obs = obs | ||
end | ||
|
||
def external_id | ||
@obs['id'] | ||
end | ||
|
||
def metadata | ||
@metadata ||= extract_metadata(@obs) | ||
end | ||
|
||
def extract_metadata(obs) | ||
metadata = {} | ||
metadata['id'] = obs['id'] | ||
metadata['change'] = 'No changes were made to this image.' | ||
metadata['observed_on'] = obs['observed_on'] | ||
metadata['time_observed_at'] = obs['time_observed_at'] | ||
metadata['quality_grade'] = obs['quality_grade'] | ||
metadata['num_identification_agreements'] = obs['num_identification_agreements'] | ||
metadata['num_identification_disagreements'] = obs['num_identification_disagreements'] | ||
metadata['location'] = obs['location'] | ||
metadata['geoprivacy'] = obs['geoprivacy'] | ||
metadata['scientific_name'] = obs.dig('taxon', 'name') | ||
metadata | ||
end | ||
|
||
def locations | ||
@locations ||= extract_locations(@obs) | ||
end | ||
|
||
def extract_locations(obs) | ||
locations = [] | ||
obs['photos'].each do |p| | ||
url = p['url'].sub('square', 'original') | ||
mimetype = mime_type_from_file_extension(url) | ||
locations << { mimetype => url } | ||
end | ||
locations | ||
end | ||
|
||
def all_rights_reserved? | ||
@obs['license_code'].nil? | ||
end | ||
|
||
def mime_type_from_file_extension(url) | ||
MiniMime.lookup_by_filename(url).content_type | ||
end | ||
end | ||
end |
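`Observation#locations` turns each photo into a `{ mime_type => url }` hash after swapping the `square` thumbnail for the `original` image. A self-contained sketch of that transformation follows. It swaps MiniMime for a small hand-written extension table (an assumption made so the example needs no gems), falls back to a generic type for unknown extensions, and uses made-up `example.org` photo URLs.

```ruby
# frozen_string_literal: true

# Assumed, deliberately tiny extension table; the real class delegates to MiniMime.
MIME_BY_EXTENSION = {
  '.jpg' => 'image/jpeg',
  '.jpeg' => 'image/jpeg',
  '.png' => 'image/png',
  '.gif' => 'image/gif'
}.freeze

# Builds location hashes ({ mime_type => url }) from an observation's photo records.
def extract_locations(obs)
  obs.fetch('photos', []).map do |photo|
    url = photo['url'].sub('square', 'original')
    mime = MIME_BY_EXTENSION.fetch(File.extname(url).downcase, 'application/octet-stream')
    { mime => url }
  end
end

# Hypothetical observation payload with made-up photo URLs.
sample = {
  'id' => 1,
  'license_code' => nil,
  'photos' => [
    { 'url' => 'https://example.org/photos/10/square.jpg' },
    { 'url' => 'https://example.org/photos/11/square.png' }
  ]
}

locations = extract_locations(sample)
p locations
```

The fallback type is a design choice of this sketch only; the class in the commit calls `content_type` directly on the MiniMime lookup result.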