Questions about field collisions when GBIF builds occurrence records #4709

Open
Mesibov opened this issue Apr 20, 2023 · 1 comment
Labels: interpretation (Interpretation issues, probably should wait to be solved in Pipelines), question

Comments

Mesibov commented Apr 20, 2023

In many datasets I've seen, event.txt and occurrence.txt share fields, for example eventDate. When GBIF builds occurrence records from event.txt+occurrence.txt, what happens when the "same" field is in both files? Is one field discarded? Which one?

Does GBIF check for inconsistencies in field collisions? For example:
eventDate = 1963-02-06 in event.txt for eventID "something2334"
eventDate = 1973-02-06 in occurrence.txt for N records with eventID "something2334"
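
A publisher could look for this kind of collision before uploading. The following is only a minimal sketch, assuming tab-delimited files joined on eventID; `find_collisions` and the sample rows (including the occurrenceID values) are hypothetical and are not part of any GBIF tool.

```python
import csv
import io


def find_collisions(event_text, occurrence_text, key="eventID"):
    """Return (eventID, field, event_value, occurrence_value) for each conflict.

    Hypothetical helper: compares every field that appears in both files
    and reports rows where both values are non-empty and differ.
    """
    events = {
        row[key]: row
        for row in csv.DictReader(io.StringIO(event_text), delimiter="\t")
    }
    conflicts = []
    for occ in csv.DictReader(io.StringIO(occurrence_text), delimiter="\t"):
        event = events.get(occ.get(key))
        if event is None:
            continue  # occurrence has no matching event row
        for field, occ_value in occ.items():
            if field == key or field not in event:
                continue  # only fields shared by both files can collide
            event_value = event[field]
            if occ_value and event_value and occ_value != event_value:
                conflicts.append((occ[key], field, event_value, occ_value))
    return conflicts


# Made-up sample data mirroring the example above.
EVENT_TXT = "eventID\teventDate\nsomething2334\t1963-02-06\n"
OCCURRENCE_TXT = (
    "occurrenceID\teventID\teventDate\n"
    "occ1\tsomething2334\t1973-02-06\n"
    "occ2\tsomething2334\t1963-02-06\n"
)

conflicts = find_collisions(EVENT_TXT, OCCURRENCE_TXT)
for event_id, field, event_value, occ_value in conflicts:
    print(f"{event_id}: {field} is {event_value} in event.txt, {occ_value} in occurrence.txt")
```

In a real archive the two strings would be read from event.txt and occurrence.txt; the delimiter and column names depend on the dataset's meta.xml.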

Mesibov commented Apr 23, 2023

For anyone following this issue, @timrobertson100 of GBIF has answered a related question here: tdwg/dwc-qa#201 as follows:

"In GBIF processing today, the data is pivoted to occurrences such that the fields on the event will only be used if they are null on the occurrence records. In this instance, those event properties would be dropped. There is exploratory work to bring in an event index where both fields would remain, but that is some way out."
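
The rule in that quote can be sketched as follows. This is an illustration only, not GBIF's pipeline code: `pivot_to_occurrence` is a hypothetical name, the sample values are made up, and treating an empty string as null is an assumption.

```python
def pivot_to_occurrence(event, occurrence):
    """Merge one event row into one occurrence row.

    Hypothetical sketch of the quoted rule: an event field is used only
    when the occurrence has no value for it. Empty strings are treated
    as null here, which is an assumption.
    """
    merged = dict(event)  # start from the event values
    for field, value in occurrence.items():
        if value not in (None, ""):
            merged[field] = value  # a non-null occurrence value wins
    return merged


# Made-up rows: eventDate collides, samplingProtocol is empty on the occurrence.
event = {
    "eventID": "something2334",
    "eventDate": "1963-02-06",
    "samplingProtocol": "pitfall trap",
}
occurrence = {
    "occurrenceID": "occ1",
    "eventID": "something2334",
    "eventDate": "1973-02-06",
    "samplingProtocol": "",
}

record = pivot_to_occurrence(event, occurrence)
print(record["eventDate"], record["samplingProtocol"])
```

Under this reading, the 1963-02-06 value from event.txt in the example above would be dropped without any warning, because the occurrence rows carry their own eventDate.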

@MortenHofft added the question and interpretation (Interpretation issues, probably should wait to be solved in Pipelines) labels on Apr 24, 2023