import 'proceedings-article' and 'book-chapter' #198

Open
Adafede opened this issue Oct 27, 2022 · 1 comment

Adafede commented Oct 27, 2022

I recently ran into some failures caused by the type of publication being imported (mainly proceedings-article and book-chapter). Here are some examples:

I am not sure what the best mapping for book-chapter would be, but Q23927052 (conference paper) seems reasonable for proceedings-article?
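To make the proposal concrete, here is a minimal sketch of what such a type mapping could look like. The names `CROSSREF_TYPE_TO_QID` and `resolve_instance_of` are hypothetical and not taken from the project's code; only the Q23927052 mapping comes from this issue, and book-chapter is deliberately left unmapped because no target has been agreed on.

```python
from typing import Dict, Optional

# Hypothetical lookup table: publication type -> Wikidata item used for "instance of" (P31).
CROSSREF_TYPE_TO_QID: Dict[str, Optional[str]] = {
    "proceedings-article": "Q23927052",  # conference paper, as proposed in this issue
    "book-chapter": None,  # no mapping decided yet
}


def resolve_instance_of(publication_type: str) -> Optional[str]:
    """Return the QID for a publication type, or None if no mapping is known."""
    return CROSSREF_TYPE_TO_QID.get(publication_type)


if __name__ == "__main__":
    for publication_type in ("proceedings-article", "book-chapter", "something-else"):
        print(publication_type, "->", resolve_instance_of(publication_type))
```

Returning `None` instead of raising would let the importer skip or flag unmapped types rather than fail, but that is a design choice for the maintainers.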


WolfgangFahl commented Jan 21, 2023

As of 2023-01-21 there are 12343 instances of conference paper (Q23927052). At wikidata.bitplan.com you can analyze the availability of properties.
The most common properties are:
[image not preserved]

The following query covers these properties but unfortunately times out on the Wikidata Query Service. https://qlever.cs.uni-freiburg.de/wikidata is down today, so I could not test it there.

# truly tabular aggregate query for 
# Q23927052:conference paper
# generated by trulytabular.py version 0.4.7 on 2023-01-21T10:43:31.435852
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?conference_paperItem ?conference_paper
  (MAX (?authorItem) AS ?authorItem_max)
  (SAMPLE (?authorItem) AS ?authorItem_sample)
  (GROUP_CONCAT (DISTINCT ?authorItem;SEPARATOR="|") AS ?authorItem_list)
  (COUNT (DISTINCT ?authorItem) AS ?authorItem_count)
  ?author
  (MAX (?page_s_) AS ?page_s__max)
  (SAMPLE (?page_s_) AS ?page_s__sample)
  (GROUP_CONCAT (DISTINCT ?page_s_;SEPARATOR="|") AS ?page_s__list)
  (COUNT (DISTINCT ?page_s_) AS ?page_s__count)
  (MAX (?DOI) AS ?DOI_max)
  (SAMPLE (?DOI) AS ?DOI_sample)
  (GROUP_CONCAT (DISTINCT ?DOI;SEPARATOR="|") AS ?DOI_list)
  (COUNT (DISTINCT ?DOI) AS ?DOI_count)
  (MAX (?language_of_work_or_nameItem) AS ?language_of_work_or_nameItem_max)
  (SAMPLE (?language_of_work_or_nameItem) AS ?language_of_work_or_nameItem_sample)
  (GROUP_CONCAT (DISTINCT ?language_of_work_or_nameItem;SEPARATOR="|") AS ?language_of_work_or_nameItem_list)
  (COUNT (DISTINCT ?language_of_work_or_nameItem) AS ?language_of_work_or_nameItem_count)
  ?language_of_work_or_name
  (MAX (?publication_date) AS ?publication_date_max)
  (SAMPLE (?publication_date) AS ?publication_date_sample)
  (GROUP_CONCAT (DISTINCT ?publication_date;SEPARATOR="|") AS ?publication_date_list)
  (COUNT (DISTINCT ?publication_date) AS ?publication_date_count)
  (MAX (?sponsorItem) AS ?sponsorItem_max)
  (SAMPLE (?sponsorItem) AS ?sponsorItem_sample)
  (GROUP_CONCAT (DISTINCT ?sponsorItem;SEPARATOR="|") AS ?sponsorItem_list)
  (COUNT (DISTINCT ?sponsorItem) AS ?sponsorItem_count)
  ?sponsor
  (MAX (?main_subjectItem) AS ?main_subjectItem_max)
  (SAMPLE (?main_subjectItem) AS ?main_subjectItem_sample)
  (GROUP_CONCAT (DISTINCT ?main_subjectItem;SEPARATOR="|") AS ?main_subjectItem_list)
  (COUNT (DISTINCT ?main_subjectItem) AS ?main_subjectItem_count)
  ?main_subject
  (MAX (?full_work_available_at_URL) AS ?full_work_available_at_URL_max)
  (SAMPLE (?full_work_available_at_URL) AS ?full_work_available_at_URL_sample)
  (GROUP_CONCAT (DISTINCT ?full_work_available_at_URL;SEPARATOR="|") AS ?full_work_available_at_URL_list)
  (COUNT (DISTINCT ?full_work_available_at_URL) AS ?full_work_available_at_URL_count)
  (MAX (?author_name_string) AS ?author_name_string_max)
  (SAMPLE (?author_name_string) AS ?author_name_string_sample)
  (GROUP_CONCAT (DISTINCT ?author_name_string;SEPARATOR="|") AS ?author_name_string_list)
  (COUNT (DISTINCT ?author_name_string) AS ?author_name_string_count)
  (MAX (?published_inItem) AS ?published_inItem_max)
  (SAMPLE (?published_inItem) AS ?published_inItem_sample)
  (GROUP_CONCAT (DISTINCT ?published_inItem;SEPARATOR="|") AS ?published_inItem_list)
  (COUNT (DISTINCT ?published_inItem) AS ?published_inItem_count)
  ?published_in
  (MAX (?NIOSHTIC_2_ID) AS ?NIOSHTIC_2_ID_max)
  (SAMPLE (?NIOSHTIC_2_ID) AS ?NIOSHTIC_2_ID_sample)
  (GROUP_CONCAT (DISTINCT ?NIOSHTIC_2_ID;SEPARATOR="|") AS ?NIOSHTIC_2_ID_list)
  (COUNT (DISTINCT ?NIOSHTIC_2_ID) AS ?NIOSHTIC_2_ID_count)
  (MAX (?on_focus_list_of_Wikimedia_projectItem) AS ?on_focus_list_of_Wikimedia_projectItem_max)
  (SAMPLE (?on_focus_list_of_Wikimedia_projectItem) AS ?on_focus_list_of_Wikimedia_projectItem_sample)
  (GROUP_CONCAT (DISTINCT ?on_focus_list_of_Wikimedia_projectItem;SEPARATOR="|") AS ?on_focus_list_of_Wikimedia_projectItem_list)
  (COUNT (DISTINCT ?on_focus_list_of_Wikimedia_projectItem) AS ?on_focus_list_of_Wikimedia_projectItem_count)
  ?on_focus_list_of_Wikimedia_project
  (MAX (?title) AS ?title_max)
  (SAMPLE (?title) AS ?title_sample)
  (GROUP_CONCAT (DISTINCT ?title;SEPARATOR="|") AS ?title_list)
  (COUNT (DISTINCT ?title) AS ?title_count)
WHERE {
  # instanceof Q23927052:conference paper
  ?conference_paperItem wdt:P31 wd:Q23927052.
  # label
  ?conference_paperItem rdfs:label ?conference_paper.  
  FILTER (LANG(?conference_paper) = "en").
  # author (P50)
  OPTIONAL { 
    ?conference_paperItem wdt:P50 ?authorItem. 
    ?authorItem rdfs:label ?author.
    FILTER (LANG(?author) = "en").
  }
  # page(s) (P304)
  OPTIONAL { 
    ?conference_paperItem wdt:P304 ?page_s_. 
  }
  # DOI (P356)
  OPTIONAL { 
    ?conference_paperItem wdt:P356 ?DOI. 
  }
  # language of work or name (P407)
  OPTIONAL { 
    ?conference_paperItem wdt:P407 ?language_of_work_or_nameItem. 
    ?language_of_work_or_nameItem rdfs:label ?language_of_work_or_name.
    FILTER (LANG(?language_of_work_or_name) = "en").
  }
  # publication date (P577)
  OPTIONAL { 
    ?conference_paperItem wdt:P577 ?publication_date. 
  }
  # sponsor (P859)
  OPTIONAL { 
    ?conference_paperItem wdt:P859 ?sponsorItem. 
    ?sponsorItem rdfs:label ?sponsor.
    FILTER (LANG(?sponsor) = "en").
  }
  # main subject (P921)
  OPTIONAL { 
    ?conference_paperItem wdt:P921 ?main_subjectItem. 
    ?main_subjectItem rdfs:label ?main_subject.
    FILTER (LANG(?main_subject) = "en").
  }
  # full work available at URL (P953)
  OPTIONAL { 
    ?conference_paperItem wdt:P953 ?full_work_available_at_URL. 
  }
  # author name string (P2093)
  OPTIONAL { 
    ?conference_paperItem wdt:P2093 ?author_name_string. 
  }
  # published in (P1433)
  OPTIONAL { 
    ?conference_paperItem wdt:P1433 ?published_inItem. 
    ?published_inItem rdfs:label ?published_in.
    FILTER (LANG(?published_in) = "en").
  }
  # NIOSHTIC-2 ID (P2880)
  OPTIONAL { 
    ?conference_paperItem wdt:P2880 ?NIOSHTIC_2_ID. 
  }
  # on focus list of Wikimedia project (P5008)
  OPTIONAL { 
    ?conference_paperItem wdt:P5008 ?on_focus_list_of_Wikimedia_projectItem. 
    ?on_focus_list_of_Wikimedia_projectItem rdfs:label ?on_focus_list_of_Wikimedia_project.
    FILTER (LANG(?on_focus_list_of_Wikimedia_project) = "en").
  }
  # title (P1476)
  OPTIONAL { 
    ?conference_paperItem wdt:P1476 ?title. 
  }
}
GROUP BY
  ?conference_paperItem 
  ?conference_paper

  ?author
  ?language_of_work_or_name
  ?sponsor
  ?main_subject
  ?published_in
  ?on_focus_list_of_Wikimedia_project
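
Since the full aggregate query times out, one possible workaround is to ask for one property at a time and only count how many conference papers carry it. The sketch below just builds those smaller query strings; it does not send them anywhere. The function names are hypothetical, the property IDs are copied from the query above, and whether these smaller queries finish within the service's time limit is an untested assumption.

```python
# Property IDs and labels copied from the comments in the generated query above.
PROPERTIES = {
    "P50": "author",
    "P304": "page(s)",
    "P356": "DOI",
    "P407": "language of work or name",
    "P577": "publication date",
    "P859": "sponsor",
    "P921": "main subject",
    "P953": "full work available at URL",
    "P2093": "author name string",
    "P1433": "published in",
    "P2880": "NIOSHTIC-2 ID",
    "P5008": "on focus list of Wikimedia project",
    "P1476": "title",
}

PREFIXES = (
    "PREFIX wd: <http://www.wikidata.org/entity/>\n"
    "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
)


def build_count_query(class_qid: str, property_id: str) -> str:
    """Build a SPARQL query counting instances of class_qid that have property_id."""
    return (
        f"{PREFIXES}"
        "SELECT (COUNT(DISTINCT ?item) AS ?count)\n"
        "WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid};\n"
        f"        wdt:{property_id} ?value.\n"
        "}\n"
    )


def build_all_queries(class_qid: str = "Q23927052") -> dict:
    """Return one count query per property, keyed by property ID."""
    return {pid: build_count_query(class_qid, pid) for pid in PROPERTIES}


if __name__ == "__main__":
    print(build_all_queries()["P50"])
```

Each generated string can be pasted into a SPARQL endpoint by hand. This gives per-property coverage counts but not the per-item tabular view that the full query aims for.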
