import 'proceedings-article' and 'book-chapter' #198
Comments
As of 2022-01-21 there are 12343 instances of conference papers. At wikidata.bitplan.com you can analyze the availability of properties. The following query covers the properties but unfortunately times out on the Wikidata Query Service. https://qlever.cs.uni-freiburg.de/wikidata is down today, so I could not test it there.
# truly tabular aggregate query for
# Q23927052:conference paper
# generated by trulytabular.py version 0.4.7 on 2023-01-21T10:43:31.435852
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?conference_paperItem ?conference_paper
(MAX (?authorItem) AS ?authorItem_max)
(SAMPLE (?authorItem) AS ?authorItem_sample)
(GROUP_CONCAT (DISTINCT ?authorItem;SEPARATOR="|") AS ?authorItem_list)
(COUNT (DISTINCT ?authorItem) AS ?authorItem_count)
?author
(MAX (?page_s_) AS ?page_s__max)
(SAMPLE (?page_s_) AS ?page_s__sample)
(GROUP_CONCAT (DISTINCT ?page_s_;SEPARATOR="|") AS ?page_s__list)
(COUNT (DISTINCT ?page_s_) AS ?page_s__count)
(MAX (?DOI) AS ?DOI_max)
(SAMPLE (?DOI) AS ?DOI_sample)
(GROUP_CONCAT (DISTINCT ?DOI;SEPARATOR="|") AS ?DOI_list)
(COUNT (DISTINCT ?DOI) AS ?DOI_count)
(MAX (?language_of_work_or_nameItem) AS ?language_of_work_or_nameItem_max)
(SAMPLE (?language_of_work_or_nameItem) AS ?language_of_work_or_nameItem_sample)
(GROUP_CONCAT (DISTINCT ?language_of_work_or_nameItem;SEPARATOR="|") AS ?language_of_work_or_nameItem_list)
(COUNT (DISTINCT ?language_of_work_or_nameItem) AS ?language_of_work_or_nameItem_count)
?language_of_work_or_name
(MAX (?publication_date) AS ?publication_date_max)
(SAMPLE (?publication_date) AS ?publication_date_sample)
(GROUP_CONCAT (DISTINCT ?publication_date;SEPARATOR="|") AS ?publication_date_list)
(COUNT (DISTINCT ?publication_date) AS ?publication_date_count)
(MAX (?sponsorItem) AS ?sponsorItem_max)
(SAMPLE (?sponsorItem) AS ?sponsorItem_sample)
(GROUP_CONCAT (DISTINCT ?sponsorItem;SEPARATOR="|") AS ?sponsorItem_list)
(COUNT (DISTINCT ?sponsorItem) AS ?sponsorItem_count)
?sponsor
(MAX (?main_subjectItem) AS ?main_subjectItem_max)
(SAMPLE (?main_subjectItem) AS ?main_subjectItem_sample)
(GROUP_CONCAT (DISTINCT ?main_subjectItem;SEPARATOR="|") AS ?main_subjectItem_list)
(COUNT (DISTINCT ?main_subjectItem) AS ?main_subjectItem_count)
?main_subject
(MAX (?full_work_available_at_URL) AS ?full_work_available_at_URL_max)
(SAMPLE (?full_work_available_at_URL) AS ?full_work_available_at_URL_sample)
(GROUP_CONCAT (DISTINCT ?full_work_available_at_URL;SEPARATOR="|") AS ?full_work_available_at_URL_list)
(COUNT (DISTINCT ?full_work_available_at_URL) AS ?full_work_available_at_URL_count)
(MAX (?author_name_string) AS ?author_name_string_max)
(SAMPLE (?author_name_string) AS ?author_name_string_sample)
(GROUP_CONCAT (DISTINCT ?author_name_string;SEPARATOR="|") AS ?author_name_string_list)
(COUNT (DISTINCT ?author_name_string) AS ?author_name_string_count)
(MAX (?published_inItem) AS ?published_inItem_max)
(SAMPLE (?published_inItem) AS ?published_inItem_sample)
(GROUP_CONCAT (DISTINCT ?published_inItem;SEPARATOR="|") AS ?published_inItem_list)
(COUNT (DISTINCT ?published_inItem) AS ?published_inItem_count)
?published_in
(MAX (?NIOSHTIC_2_ID) AS ?NIOSHTIC_2_ID_max)
(SAMPLE (?NIOSHTIC_2_ID) AS ?NIOSHTIC_2_ID_sample)
(GROUP_CONCAT (DISTINCT ?NIOSHTIC_2_ID;SEPARATOR="|") AS ?NIOSHTIC_2_ID_list)
(COUNT (DISTINCT ?NIOSHTIC_2_ID) AS ?NIOSHTIC_2_ID_count)
(MAX (?on_focus_list_of_Wikimedia_projectItem) AS ?on_focus_list_of_Wikimedia_projectItem_max)
(SAMPLE (?on_focus_list_of_Wikimedia_projectItem) AS ?on_focus_list_of_Wikimedia_projectItem_sample)
(GROUP_CONCAT (DISTINCT ?on_focus_list_of_Wikimedia_projectItem;SEPARATOR="|") AS ?on_focus_list_of_Wikimedia_projectItem_list)
(COUNT (DISTINCT ?on_focus_list_of_Wikimedia_projectItem) AS ?on_focus_list_of_Wikimedia_projectItem_count)
?on_focus_list_of_Wikimedia_project
(MAX (?title) AS ?title_max)
(SAMPLE (?title) AS ?title_sample)
(GROUP_CONCAT (DISTINCT ?title;SEPARATOR="|") AS ?title_list)
(COUNT (DISTINCT ?title) AS ?title_count)
WHERE {
# instanceof Q23927052:conference paper
?conference_paperItem wdt:P31 wd:Q23927052.
# label
?conference_paperItem rdfs:label ?conference_paper.
FILTER (LANG(?conference_paper) = "en").
# author (P50)
OPTIONAL {
?conference_paperItem wdt:P50 ?authorItem.
?authorItem rdfs:label ?author.
FILTER (LANG(?author) = "en").
}
# page(s) (P304)
OPTIONAL {
?conference_paperItem wdt:P304 ?page_s_.
}
# DOI (P356)
OPTIONAL {
?conference_paperItem wdt:P356 ?DOI.
}
# language of work or name (P407)
OPTIONAL {
?conference_paperItem wdt:P407 ?language_of_work_or_nameItem.
?language_of_work_or_nameItem rdfs:label ?language_of_work_or_name.
FILTER (LANG(?language_of_work_or_name) = "en").
}
# publication date (P577)
OPTIONAL {
?conference_paperItem wdt:P577 ?publication_date.
}
# sponsor (P859)
OPTIONAL {
?conference_paperItem wdt:P859 ?sponsorItem.
?sponsorItem rdfs:label ?sponsor.
FILTER (LANG(?sponsor) = "en").
}
# main subject (P921)
OPTIONAL {
?conference_paperItem wdt:P921 ?main_subjectItem.
?main_subjectItem rdfs:label ?main_subject.
FILTER (LANG(?main_subject) = "en").
}
# full work available at URL (P953)
OPTIONAL {
?conference_paperItem wdt:P953 ?full_work_available_at_URL.
}
# author name string (P2093)
OPTIONAL {
?conference_paperItem wdt:P2093 ?author_name_string.
}
# published in (P1433)
OPTIONAL {
?conference_paperItem wdt:P1433 ?published_inItem.
?published_inItem rdfs:label ?published_in.
FILTER (LANG(?published_in) = "en").
}
# NIOSHTIC-2 ID (P2880)
OPTIONAL {
?conference_paperItem wdt:P2880 ?NIOSHTIC_2_ID.
}
# on focus list of Wikimedia project (P5008)
OPTIONAL {
?conference_paperItem wdt:P5008 ?on_focus_list_of_Wikimedia_projectItem.
?on_focus_list_of_Wikimedia_projectItem rdfs:label ?on_focus_list_of_Wikimedia_project.
FILTER (LANG(?on_focus_list_of_Wikimedia_project) = "en").
}
# title (P1476)
OPTIONAL {
?conference_paperItem wdt:P1476 ?title.
}
}
GROUP BY
?conference_paperItem
?conference_paper
?author
?language_of_work_or_name
?sponsor
?main_subject
?published_in
?on_focus_list_of_Wikimedia_project
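Since the full aggregate query times out, one possible workaround is to check property availability one property at a time. The sketch below only builds the text of such a per-property count query; the function name `build_property_count_query` and the variable names `?item`, `?value` and `?count` are assumptions made for illustration, and whether the smaller queries stay within the time limit has not been tested here.

```python
# Hypothetical helper: builds one small count query per property instead of
# one large aggregate query over all properties.
CONFERENCE_PAPER_QID = "Q23927052"  # conference paper, as used in the query above


def build_property_count_query(property_id: str,
                               class_qid: str = CONFERENCE_PAPER_QID) -> str:
    """Return a SPARQL query counting items of class_qid that have property_id."""
    # Accept only identifiers of the form P<digits>, e.g. "P356" for DOI.
    if not (property_id.startswith("P") and property_id[1:].isdigit()):
        raise ValueError(f"not a property id: {property_id!r}")
    return (
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "SELECT (COUNT(DISTINCT ?item) AS ?count)\n"
        "WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"
        f"  ?item wdt:{property_id} ?value .\n"
        "}\n"
    )


if __name__ == "__main__":
    # Print the count query for DOI (P356).
    print(build_property_count_query("P356"))
```

The resulting string can be pasted into any SPARQL endpoint that serves Wikidata; repeating it for P50, P304, P356 and so on gives the per-property coverage without the GROUP BY over label variables.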
I recently ran into some failures caused by the type of publication being imported (mainly proceedings-article and book-chapter). Here are some examples:
(None, ['unknown type: book-chapter', 'ISSN:[None] not found'], ValueError('can not create WDItemID with None'))
(None, ['unknown type: proceedings-article', 'ISSN:[None] not found'], ValueError('can not create WDItemID with None'))
I am not sure what the best mapping for book-chapter is, but Q23927052 seems reasonable for proceedings-article?
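The mapping discussed above could be sketched as a small lookup table. This is only an illustration, not the project's actual code: the names `PUBLICATION_TYPE_TO_QID`, `UnknownPublicationType` and `qid_for_type` are hypothetical, and book-chapter is deliberately left unmapped because its target item is still an open question.

```python
from typing import Optional

# Hypothetical lookup table: publication type string -> Wikidata QID to use
# as the "instance of" (P31) value.
PUBLICATION_TYPE_TO_QID = {
    "proceedings-article": "Q23927052",  # conference paper, as proposed above
    # "book-chapter" is intentionally missing: no mapping has been agreed yet.
}


class UnknownPublicationType(ValueError):
    """Raised when a publication type has no Wikidata mapping yet."""


def qid_for_type(publication_type: Optional[str]) -> str:
    """Return the QID for a publication type, or fail with a clear message."""
    if publication_type is None:
        raise UnknownPublicationType("publication type is missing")
    try:
        return PUBLICATION_TYPE_TO_QID[publication_type]
    except KeyError:
        # Fail early with the offending type instead of passing None onward.
        raise UnknownPublicationType(
            f"unknown type: {publication_type}"
        ) from None


if __name__ == "__main__":
    print(qid_for_type("proceedings-article"))
```

Failing at the lookup keeps the error tied to the unmapped type, rather than surfacing later as a failure to create an item ID from None.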