Migration ratings #169
-
Are these criteria supposed to be ordered by importance, or is this simply a list for now? I would argue that the "provenance of curator" could be checked first: if the event was entered by a curator, we should migrate it automatically without another check.

How would you judge whether a free text is only copy-pasted or not? And is a long free text a criterion for exclusion or for inclusion? Checking the indexing claim is also an interesting way to approach the question of the legitimacy/trustworthiness of a conference.

In general these criteria look reasonable to me when thinking about how to select events for automated migration. I guess you intend to reuse them later to judge the (data) quality of an entry in the later platform, for example to help direct revisions and curation? However, let's not burden ourselves with too many of these criteria already. I guess some of them are easier to determine than others. To speed up the migration process, let's focus on the ones that we can already determine or that need only very little work.

I would give the provenance-of-curator criterion the highest priority. In terms of judging the legitimacy of a conference, we can be very sure that entries made by our curators are legitimate conferences. The other criteria I would prioritize for now by the amount of work that is still needed to use them properly: too much work = leave them out for now and consider them later.
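To make the proposed ordering concrete, here is a minimal sketch of how such a decision could look. Everything in it is an assumption for illustration: the `EventRecord` fields, the function name `migration_decision`, and the choice of which "cheap" criteria are checked are hypothetical and not taken from any existing code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventRecord:
    # All field names are hypothetical, chosen only for this illustration.
    title: str
    entered_by_curator: bool
    homepage_accessible: Optional[bool] = None  # None = not checked yet
    series_known: Optional[bool] = None         # None = not checked yet


def migration_decision(event: EventRecord) -> str:
    """Return 'auto' for automated migration or 'review' for human curation."""
    # Highest-priority criterion: curator entries migrate without another check.
    if event.entered_by_curator:
        return "auto"
    # Only criteria that are cheap to determine are used for now.
    # Assumption: a criterion that has not been checked (None) counts as not met.
    cheap_checks = [event.homepage_accessible, event.series_known]
    if all(check is True for check in cheap_checks):
        return "auto"
    return "review"


print(migration_decision(EventRecord("Curated event", entered_by_curator=True)))
print(migration_decision(EventRecord("Unchecked event", entered_by_curator=False)))
```

More expensive criteria could be added to `cheap_checks` later, once the work needed to compute them has been done.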
-
I would rate the provenance information as the highest priority, so that we can represent the origin of all our data in a future OR version. Series completeness would be interesting as additional information for an editorial team, to show where some additional work should be invested. I would not use it as a criterion for exclusion, though: we would not be able to import the long-running conference series. (Bibliothekstag will have its 109th issue this year. Importing even the complete last 2 decades would still result in a red flag.) "Indexing claim versus actual indexing" would already be an interesting quality metric for the conference, basically a filter for predatory claims.
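The Bibliothekstag case can be illustrated with a small calculation. This is only a sketch under assumptions: the function name `series_completeness` is hypothetical, completeness is taken to mean "imported editions divided by the highest ordinal", and the last two decades are assumed to correspond to 20 editions (one per year).

```python
def series_completeness(imported_ordinals, latest_ordinal):
    """Fraction of a series' editions (1..latest_ordinal) present in the corpus."""
    if latest_ordinal < 1:
        raise ValueError("latest_ordinal must be at least 1")
    # Count each ordinal once and ignore values outside the valid range.
    present = {o for o in imported_ordinals if 1 <= o <= latest_ordinal}
    return len(present) / latest_ordinal


# Assumption: one edition per year, so two decades are editions 90 to 109.
last_two_decades = range(90, 110)
ratio = series_completeness(last_two_decades, 109)
print(f"{ratio:.0%}")  # 18%
```

With a ratio this low even after a complete import of recent editions, the value seems more useful as a hint for the editorial team than as a threshold for exclusion.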
-
To determine which events and event series can be migrated automatically and which ones might need human intervention / curation, a list of observations might be helpful:
on top of checking for bugs such as #152
rating aspects / observations: do we value what we measure, or do we measure what we value?
-- top ordinal/- of events in corpus (%)
-- is the series known?
-- from the series data we can estimate when the next CFP is to be expected: do we have the data in time?
-- is the conference home page accessible?
-- is the conference home page in the Wayback Machine?
-- is the conference home page the same as the Wayback Machine snapshot from the time of the conference?
-- is there an indexing claim, e.g. "Ei Compendex, and Scopus", and which indexes are claimed?
-- is the series actually indexed, and where / how often?
-- cut&paste or curated?
-- WikiCFP, dblp, GND, CEUR-WS, Wikidata, ...
-- do we find the publications / proceedings?
-- where were
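
For the "when is the next CFP to be expected" observation in the list above, a rough sketch could look like this. It is an assumption-laden illustration, not an agreed method: the function names are hypothetical, the next event is extrapolated from the median gap between past events, and the CFP lead time (`cfp_lead_days`, default 180) is an arbitrary placeholder.

```python
from datetime import date, timedelta
from statistics import median


def predict_next_event(past_dates):
    """Extrapolate the next event date from the median gap between past events."""
    dates = sorted(past_dates)
    if len(dates) < 2:
        raise ValueError("need at least two past events to estimate a gap")
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return dates[-1] + timedelta(days=round(median(gaps)))


def expected_cfp_date(past_dates, cfp_lead_days=180):
    """Estimate when the CFP should appear; the lead time is a placeholder."""
    return predict_next_event(past_dates) - timedelta(days=cfp_lead_days)


# Hypothetical series with one event per year.
history = [date(2021, 6, 1), date(2022, 6, 1), date(2023, 6, 1)]
print(predict_next_event(history))
print(expected_cfp_date(history))
```

If the corpus has no entry for a series by the estimated CFP date, that would be a signal that the data did not arrive in time.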