OCR-Parser

##Purpose

The purpose of this project is to create a program that will allow librarians/information professionals to simplify their workflow by automating certain aspects of the metadata extraction process. This program is meant mainly for digitized newspapers, and the testing is being done on papers for the North Carolina Digital Heritage Center.

Outsourcing page-level digitization is a cost-effective strategy, but it results in missing or error-prone metadata for the images returned from the vendor. The current solution at our institution is:

1. Create a template spreadsheet
2. Copy and fill relevant series or reel metadata
3. Open each image and manually extract metadata
4. Manually enter the data into the spreadsheet

This process is expensive, time-consuming, and error-prone.

This project makes some inroads towards automating the metadata harvesting process. It's a bit rough around the edges, but open to improvement as a community effort.
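
As a rough illustration of the kind of automation meant here, the sketch below pulls an issue date out of a page's OCR text with a regular expression and writes one spreadsheet row per page. It is not the project's actual code: the date pattern, the function names `extract_issue_date` and `write_metadata_rows`, the column names, and the sample text are all hypothetical and chosen only for the example.

```python
import csv
import io
import re

# Hypothetical pattern: matches dates such as "March 4, 1923" in OCR text.
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
DATE_PATTERN = re.compile(
    rf"\b({MONTHS})\s+(\d{{1,2}}),\s+(\d{{4}})\b", re.IGNORECASE
)


def extract_issue_date(ocr_text):
    """Return the first date found as (month, day, year), or None."""
    match = DATE_PATTERN.search(ocr_text)
    if match is None:
        return None
    month, day, year = match.groups()
    return month.title(), int(day), int(year)


def write_metadata_rows(pages, out_file):
    """Write one CSV row per page: file name plus any date found in its OCR text."""
    writer = csv.writer(out_file)
    writer.writerow(["file", "month", "day", "year"])
    for file_name, ocr_text in pages:
        date = extract_issue_date(ocr_text)
        # Leave the date columns empty when nothing could be extracted.
        writer.writerow([file_name, *(date or ("", "", ""))])


# Example with made-up OCR output for two page images.
sample_pages = [
    ("page_0001.txt", "THE DAILY EXAMPLE\nRaleigh, N. C., MARCH 4, 1923\nPrice 5 cents"),
    ("page_0002.txt", "no legible masthead on this page"),
]
buffer = io.StringIO()
write_metadata_rows(sample_pages, buffer)
print(buffer.getvalue())
```

Pages where no date is found still get a row with empty date columns, so a person can review and fill them in by hand rather than losing track of the image.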

##Usage

All code and instructions will be available from this GitHub repository.

To get started, please see our Getting Started guide (`Getting_Started.md`).

##Contributing

Once the initial project is realized, we'd love to have contributors help continue to automate the workflow.
Please see our Contributing Guidelines (`CONTRIBUTING.md`) for more information.

##Contributors

This project is maintained by Amber Sherman, Dave Pcolar, and Elizabeth Peele.

##License

Provided under the MIT License (see `LICENSE.md`).

