SAF Creator

SAFCreator is a desktop application written in Java. Its purpose is to prepare Simple Archive Format (SAF) archives for import into DSpace repositories. Several good tools exist for this purpose, and every use case is different. Many digital curators package their SAF archives with local custom scripts, but general-purpose tools can be very useful, especially when supplemented by such scripts. Other popular general-purpose SAF tools that may meet your needs include PySAF and SAFBuilder.

Deployment basics

Running SAFCreator requires a JVM.

If you prefer not to build from source, a compiled jar is provided at this link: https://github.com/jcreel/SAFCreator/raw/master/jarfile/SAFCreator-0.0.2-SNAPSHOT.one-jar.jar

Building SAFCreator requires Apache Maven. Build with "mvn clean package". Run (replacing the version as appropriate) with java -jar target/SAFCreator-0.0.2-SNAPSHOT.one-jar.jar

Usage instructions

Basically, you need a spreadsheet (a CSV file) with the metadata and references to the files. Each row represents one item, and each column a metadata field. This is a typical starting format for digital library metadata.

To reference the files, the CSV needs at least one special column headed “filename” or, if you want to specify the bundle, “bundle:ORIGINAL” (typically with ORIGINAL replaced by another bundle name). A plain “filename” heading defaults to the ORIGINAL bundle. You can have multiple columns for multiple bundles. In the cells under that heading, list the filenames of the bitstreams that belong to that item, separated by a double bar (||). You can also use subpaths relative to the top-level directory of the files, and * as a wildcard to include everything under a subpath.
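As a rough illustration of the column conventions described above, the following Python sketch maps a heading to its bundle and splits a cell on the double bar. It is not SAFCreator's own parsing code: the function names are hypothetical, and the exact-match, case-sensitive treatment of headings and the trimming of whitespace around filenames are assumptions.

```python
SEPARATOR = "||"
DEFAULT_BUNDLE = "ORIGINAL"


def bundle_for_header(header):
    """Return the bundle a heading refers to, or None for a metadata column.

    Assumption: headings are matched exactly and case-sensitively.
    """
    header = header.strip()
    if header == "filename":
        # A plain "filename" heading falls back to the ORIGINAL bundle.
        return DEFAULT_BUNDLE
    if header.startswith("bundle:"):
        # Everything after the "bundle:" prefix is taken as the bundle name.
        return header[len("bundle:"):]
    return None


def split_cell(cell):
    """Split a double-bar-delimited cell into its values, dropping empty parts."""
    return [part.strip() for part in cell.split(SEPARATOR) if part.strip()]


print(bundle_for_header("filename"))
print(bundle_for_header("bundle:THUMBNAIL"))
print(bundle_for_header("dc.title"))
print(split_cell("scans/page1.tif||scans/page2.tif"))
```

THUMBNAIL here is only an example bundle name; substitute whichever bundle your repository expects.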

The other headings are dc-style field labels, e.g. “dc.title” or “dc.description.abstract”. The cells in each of those columns hold the values (again, double bar || delimited) of that field for each item.
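If your metadata starts out in a script rather than a spreadsheet, a CSV in this layout could be generated along the following lines. This is a minimal sketch using Python's standard csv module; the item titles, abstracts, subjects, and file names are invented for illustration, and the “bundle:LICENSE” column is only an example of a non-default bundle heading.

```python
import csv
import io

SEPARATOR = "||"

# Hypothetical items; every value below is made up for illustration.
ITEMS = [
    {
        "dc.title": "First example item",
        "dc.description.abstract": "A short abstract.",
        "dc.subject": ["history", "maps"],
        "filename": ["item1/report.pdf", "item1/data.csv"],
        "bundle:LICENSE": ["item1/license.txt"],
    },
    {
        "dc.title": "Second example item",
        "dc.description.abstract": "Another abstract.",
        "dc.subject": ["geology"],
        "filename": ["item2/*"],
    },
]


def to_cell(value):
    """Join multiple values with the double bar; pass single strings through."""
    if isinstance(value, (list, tuple)):
        return SEPARATOR.join(value)
    return value


def build_csv(items):
    """Return CSV text with one row per item and one column per field."""
    fieldnames = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO(newline="")
    # restval="" leaves a cell empty when an item lacks that column.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    for item in items:
        writer.writerow({key: to_cell(value) for key, value in item.items()})
    return buffer.getvalue()


csv_text = build_csv(ITEMS)
print(csv_text)
```

Writing through the csv module rather than joining strings by hand means titles or abstracts that contain commas or quotes are escaped correctly.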

You can find a couple of example projects at https://github.com/jcreel/SAFCreator/tree/master/src/main/resources/SAF.

In SAFCreator, use the file picker to select the CSV file, select the directory containing the files referred to in the “filename” or bundle columns, and select the directory where you want to write the SAF. Then load the batch with the button, go to the validation tab and validate the batch, and finally return to the first tab to write the SAF.
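Before loading a large batch, it can be handy to check from a script that the files named in the CSV actually exist. The sketch below is an independent pre-check, not part of SAFCreator and not a substitute for its validation tab. The function name is hypothetical, the CSV is assumed to be UTF-8 encoded, and wildcard entries are simply skipped because their exact matching rules are not reproduced here.

```python
import csv
import tempfile
from pathlib import Path


def missing_files(csv_path, files_dir):
    """Return (row_number, filename) pairs for referenced files that are absent."""
    files_dir = Path(files_dir)
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Row 1 is the heading row, so data rows start at 2.
        for row_number, row in enumerate(reader, start=2):
            for header, cell in row.items():
                if header is None:
                    continue  # extra cells beyond the heading row
                if header != "filename" and not header.startswith("bundle:"):
                    continue  # ordinary metadata column
                for name in (cell or "").split("||"):
                    name = name.strip()
                    if not name or "*" in name:
                        continue  # empty or wildcard entry: not checked here
                    if not (files_dir / name).is_file():
                        missing.append((row_number, name))
    return missing


# Small self-contained demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "files").mkdir()
    (root / "files" / "present.pdf").write_text("placeholder", encoding="utf-8")
    csv_file = root / "batch.csv"
    csv_file.write_text(
        "dc.title,filename\n"
        "First item,present.pdf\n"
        "Second item,present.pdf||absent.pdf\n",
        encoding="utf-8",
    )
    report = missing_files(csv_file, root / "files")
    print(report)  # [(3, 'absent.pdf')]
```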

Character encoding issues on Windows

DSpace works best with everything encoded in UTF-8, but the JVM on a Windows box will default to the local encoding (this applies to Java versions before 18; JDK 18 and later default to UTF-8). This can be addressed by running Java with the -Dfile.encoding=UTF-8 flag. E.g. java -jar -Dfile.encoding=UTF-8 SAFCreator-0.0.2-SNAPSHOT.one-jar.jar

Thanks to Eric Pennington for providing this solution.
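The flag only helps if the CSV itself is really UTF-8. A quick way to check is to try decoding the file's bytes, as in this minimal Python sketch. The function name is hypothetical, and the sample bytes stand in for the contents of a CSV exported from a spreadsheet program.

```python
def first_non_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as error:
        return error.start
    return None


# The same text encoded two ways: UTF-8, and a common Windows local encoding.
utf8_bytes = "Résumé".encode("utf-8")
cp1252_bytes = "Résumé".encode("cp1252")

print(first_non_utf8_offset(utf8_bytes))    # None
print(first_non_utf8_offset(cp1252_bytes))  # 1
```

For a real file, pass in the result of reading it in binary mode, e.g. open(path, "rb").read(). A reported offset suggests the file should be re-saved as UTF-8 before the batch is loaded.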

For those in a hurry

Again, if you want a direct download to avoid building from source, you can grab it here: https://github.com/jcreel/SAFCreator/raw/master/jarfile/SAFCreator-0.0.2-SNAPSHOT.one-jar.jar