Discrepancies between Taxonomy API and Public API #85

Open
salix-d opened this issue Apr 26, 2022 · 2 comments
Labels
API-related API-side problem needing work around enhancement Something to add

Comments


salix-d commented Apr 26, 2022

Availability of taxa

Some species can't be found by name or ID using the Taxonomy API (bold_tax_name() or bold_tax_id()) even though they have public records ("Moraea elsiae", for example). bold_stats(ids = "GBVC3127-11") still returns the species name.

I contacted their support to inform them.

I'm currently trying to build a database of all of BOLD's public taxa, with their taxids and all the variations of their names. The goal is to let users confirm that their taxon exists in the BOLD database and to make it easier to get downstream lineages.

Once that's done, I'll try to automate the process so the database can be updated whenever theirs is.

@salix-d salix-d added bug Something to fix enhancement Something to add labels Apr 26, 2022

salix-d commented Apr 26, 2022

Classification of taxa

bold::bold_tax_id(48327)
#  input taxid      taxon tax_rank tax_division parentid   taxonrep
# 1 48327 48327 Rhodophyta   phylum     Protista        1 Rhodophyta

Their Taxonomy API reports Rhodophyta with the tax_division Protista, even though their Taxonomy page lists it under Plants.

@salix-d salix-d added API-related API-side problem needing work around and removed bug Something to fix labels Apr 26, 2022
@salix-d salix-d changed the title Discrepancy between Taxonomy API and Public API Discrepancies between Taxonomy API and Public API Apr 26, 2022

salix-d commented Apr 27, 2022

Names of taxa

When looking up "Suaeda sp. 'Socotra'", the taxon name returned is "Suaeda sp. 'Socotra" (the closing quote is dropped).
However, if we try to get the records using that name, they aren't found; we need to add back the closing quote.

> bold::bold_tax_name("Suaeda sp. \\'Socotra\\'")
#     taxid               taxon tax_rank tax_division parentid parentname specimenrecords                    input
# 1 1082786 Suaeda sp. 'Socotra  species      Plantae   156339     Suaeda               1 Suaeda sp. \\'Socotra\\'
> bold::bold_specimens("Suaeda sp. \\'Socotra")
# Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
#   no lines available in input
> bold::bold_specimens("Suaeda sp. \\'Socotra\\'")
#      processid sampleid recordID catalognum fieldnum      institution_storing collection_code      bin_uri phylum_taxID   phylum_name
# 1 GBCMD0539-06 AY803585   491573         NA          Mined from GenBank, NCBI              NA BOLD:AAJ3826           20    Arthropoda
# 2  GBVE4209-11 AY514841  2288624         NA AY514841 Mined from GenBank, NCBI              NA                        12 Magnoliophyta

This also happens with names ending with a dot or a parenthesis (and possibly other non-alphanumeric characters I haven't seen yet).

This usually isn't a problem when looking for records using higher taxonomic ranks, which I think is the most common way to look for records. Still, we might want a check for species names, or a function that makes sure species names are valid before looking for records.
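
A minimal sketch of what such a check could look like, assuming we still have the name that was looked up next to the name the Taxonomy API returned. restore_trailing_chars() is a hypothetical helper, not a function from the bold package, and it assumes plain (unescaped) names. It only puts the tail back when the returned name is the looked-up name minus trailing non-alphanumeric characters:

```r
# Hypothetical helper (not part of the bold package).
# `input`    : names as they were looked up
# `returned` : names as the Taxonomy API gave them back
restore_trailing_chars <- function(input, returned) {
  stopifnot(is.character(input), is.character(returned),
            length(input) == length(returned))
  # TRUE where `returned` is a strict prefix of `input`
  is_prefix <- nchar(returned) < nchar(input) & startsWith(input, returned)
  # the part of `input` that is missing from `returned`
  missing_tail <- substring(input, nchar(returned) + 1L)
  # only restore when the missing part has no letters or digits
  only_punct <- grepl("^[^[:alnum:]]+$", missing_tail)
  fix <- is_prefix & only_punct
  fix[is.na(fix)] <- FALSE
  returned[fix] <- input[fix]
  returned
}

# The name from this comment gets its closing quote back;
# a name that lost actual letters is left untouched.
restore_trailing_chars(
  c("Suaeda sp. 'Socotra'", "Abies alba"),
  c("Suaeda sp. 'Socotra",  "Abies")
)
```

The restored name could then be passed to bold_specimens() instead of the truncated one. Comparing against the original input avoids guessing which character was dropped.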

salix-d added a commit that referenced this issue Mar 4, 2023
update to assert to be able to check params needing not to be empty or needing to be of length one.
update to setread to be able to add more options if needed.
new function to fix taxon names with single quotes or ending with a parenthesis or dot. see issues #84 & #85