These scripts, written by Ettore Rizza (Université libre de Bruxelles), help extract place names and personal names from user queries. Developed in the context of a large-scale case study conducted at the Royal Library of Belgium, the scripts have been applied to a data set of 83,854 queries resulting from 29,812 visits to the online historical newspapers platform BelgicaPress over a 12-month period. By making use of information extraction methods, knowledge bases (KBs) and various authority files, our method aims to facilitate automated analysis of the content of user queries.
This case study is presented in a paper published in the Journal of Documentation: Anne Chardonnens, Ettore Rizza, Mathias Coeckelbergs, Seth van Hooland (2018), "Mining user queries with information extraction methods and linked data", Journal of Documentation, https://doi.org/10.1108/JD-09-2017-0133
As open data enthusiasts, we wanted to share our code with you. However, we have to admit that it can still be cryptic and is not as user-friendly as we would like. Please get in touch on Twitter (@Ettore_Rizza) if you plan to use it and have further questions or need some assistance!