Multiple Species Labels #27

riggsd · 2024-04-30T23:05:12Z

The GUANO spec says that Species Auto ID and Species Manual ID are a "list of strings".

The serialized value should be comma-separated:

Species Manual ID: Mylu, Epfu

The Python library currently returns a str value, which in the above case would be "Mylu, Epfu".

This means that library users would need to split it and strip it to know how many and which species labels were present.

It also means that they might need to parse it and reassemble it in order to append a new value. (The spec doesn't actually state that values must be unique, but for most use cases that's probably desired.)
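To illustrate the workaround library users need today, here is a minimal sketch of the split/strip and parse/reassemble steps. The helper names split_species and append_species are hypothetical and not part of the library; skipping duplicates on append is an assumption, since the spec does not require uniqueness.

```python
def split_species(raw):
    """Split a serialized species value such as "Mylu, Epfu" into labels."""
    # Strip whitespace around each label and drop empty entries.
    return [label.strip() for label in raw.split(",") if label.strip()]


def append_species(raw, new_label):
    """Return the serialized value with new_label appended.

    Assumption: an already-present label is not added a second time.
    """
    labels = split_species(raw)
    if new_label not in labels:
        labels.append(new_label)
    return ", ".join(labels)


labels = split_species("Mylu, Epfu")
updated = append_species("Mylu, Epfu", "Labo")
```

Every caller has to repeat this logic, which is the motivation for returning a list directly.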

Therefore I think we should set up coercion/serialization for these Species fields so that they return a Python list, as follows:

Species field not present: md.get("Species Auto ID") -> None and "Species Auto ID" in md -> False
Species field empty/blank: md["Species Auto ID"] -> []
Species field has a single species label: md["Species Auto ID"] -> ["Mylu"]
Species field has multiple species labels: md["Species Auto ID"] -> ["Mylu", "Epfu"]
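The four cases above could be produced by a coercion/serialization pair along these lines. This is only a sketch: coerce_species_list and serialize_species_list are hypothetical names, and a plain dict stands in for the parsed metadata object rather than the library's actual class.

```python
def coerce_species_list(value):
    """Turn a serialized value like "Mylu, Epfu" into ["Mylu", "Epfu"]."""
    # An empty or blank value yields an empty list.
    return [label.strip() for label in value.split(",") if label.strip()]


def serialize_species_list(labels):
    """Turn ["Mylu", "Epfu"] back into "Mylu, Epfu"."""
    return ", ".join(labels)


# A plain dict stands in for the raw fields read from a file (assumption).
raw_fields = {"Species Manual ID": "Mylu, Epfu", "Species Auto ID": ""}
md = {key: coerce_species_list(value) for key, value in raw_fields.items()}

manual_ids = md["Species Manual ID"]        # multiple labels
auto_ids = md["Species Auto ID"]            # present but blank
missing = md.get("Some Absent Field")       # not present at all
round_trip = serialize_species_list(manual_ids)
```

A field that is present but blank gives an empty list, while an absent field is still distinguishable because the key is missing.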

The spec defines these fields as optional, so the way to check whether a recording is a Mylu recording would be:

if "Mylu" in md.get("Species Auto ID", []): ...

It would be cleaner and more intuitive if these two fields were required, but Timestamp is the only required field at this time.

Another way to make this slightly cleaner would be to add explicit accessor methods for all of the well-known fields: the in operator could still check whether the field was present, while def species_auto_id(self) -> list would always return a list, possibly empty.
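A rough sketch of what such an accessor could look like follows. MetadataSketch is a hypothetical stand-in class used only for illustration; it is not the library's class, and it assumes the field values have already been coerced to lists.

```python
class MetadataSketch:
    """Hypothetical stand-in for a metadata object with an explicit accessor."""

    def __init__(self, fields):
        # Assumption: species fields are already stored as lists of labels.
        self._fields = dict(fields)

    def __contains__(self, key):
        # The in operator still reports whether the field is present.
        return key in self._fields

    def species_auto_id(self) -> list:
        # Always returns a list, empty when the field is absent or blank.
        return list(self._fields.get("Species Auto ID", []))


with_field = MetadataSketch({"Species Auto ID": ["Mylu", "Epfu"]})
without_field = MetadataSketch({})

is_mylu = "Mylu" in with_field.species_auto_id()
empty = without_field.species_auto_id()
```

With an accessor like this, callers no longer need to pass a default to md.get() just to test for a label.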
