
Ongoing BL development of code produced in 2022/23 by Isaac Dunford as part of a Digital Humanities Internship funded by the School of Humanities at the University of Southampton.


britishlibrary/Incunabula-Catalogue-Entry-Detection

 
 


Catalogue Entry Detection

This project investigates and implements different methods for detecting catalogue entries within printed catalogues. While printed catalogues are easy enough to digitise and convert into machine-readable data, dividing that data by catalogue entry requires converting the visual signifiers of divisions between entries (gaps in the printed page, large or upper-case headers, catalogue references) into machine-readable information.
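As a rough illustration of the kind of heuristic this involves, the sketch below groups lines of transcribed text into entries using two of the signifiers mentioned above: an upper-case header opens an entry and a catalogue reference closes it. It is not taken from the project's notebooks; the function names, the reference pattern and the sample lines used to exercise it are all assumptions made for the example.

```python
from __future__ import annotations

import re

# Hypothetical pattern for a catalogue reference such as "IB. 123".
# The reference format in the real catalogue may differ.
REFERENCE_PATTERN = re.compile(r"\b[A-Z]{1,3}\.\s?\d+[a-z]?\b")


def is_header(line: str) -> bool:
    """Treat a line as a header when it has at least three letters,
    all of them upper-case."""
    letters = [ch for ch in line if ch.isalpha()]
    return len(letters) >= 3 and all(ch.isupper() for ch in letters)


def split_entries(lines: list[str]) -> list[list[str]]:
    """Group lines into entries: a header starts a new entry and a
    line containing a catalogue reference ends the current one."""
    entries: list[list[str]] = []
    current: list[str] = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank lines carry no text of their own
        if is_header(text) and current:
            # A new header closes any entry that is still open.
            entries.append(current)
            current = []
        current.append(text)
        if REFERENCE_PATTERN.search(text):
            entries.append(current)
            current = []
    if current:
        entries.append(current)
    return entries
```

Calling `split_entries` on a list of text lines returns a list of entries, each a list of the lines it contains. Real catalogue pages are noisier than this sketch allows for, so the thresholds and the pattern would need tuning against the actual data.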

The data used is XML-formatted data derived from the 13-volume Catalogue of books printed in the 15th century now at the British Museum. The project was undertaken in support of Rossitza Atanassova's AHRC-RLUK Professional Practice Fellowship.

This project is the British Library maintained version of code produced in 2022/2023 by Isaac Dunford as part of a Digital Humanities Internship funded by the School of Humanities at the University of Southampton. Isaac's original code is at https://github.com/Southampton-Digital-Humanities/2023_Catalogue-Entry-Detection.

Isaac describes the work in his post on the British Library Digital Scholarship blog.

License

All data is provided by the British Library: text data under CC0 1.0 Universal Public Domain; images under CC-BY 4.0 International. Code is released under the MIT License.


Languages

  • Jupyter Notebook 98.3%
  • Python 1.7%