export_archivesspace_xml

This is a small Ruby (not Rails) project meant to run on Heroku. Its only purpose is to connect to our ArchivesSpace API once a week and export .ead.xml files to a location on S3, where various partners can harvest them. All the important code can be found in export_archivesspace_xml/lib/exporter.rb. More documentation can be found in the wiki.

Destination bucket

The files are uploaded to an S3 bucket, which is publicly accessible via a CNAME record at https://ead.sciencehistory.org/ .

Index file

index_page.rb creates a very simple index.html file in the bucket. The file allows our partners to use a variation on the following command to download all our EAD files (the pattern is quoted so the shell does not expand it locally): wget -r http://ead.sciencehistory.org -A '*.ead.xml'
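As an illustration of why a recursive wget works against the bucket, here is a minimal sketch of building such an index page. It is not the code in index_page.rb; the method name index_html and the page layout are assumptions made for this example, and it assumes file names are simple enough to be used directly as relative links.

```ruby
require "erb"

# Hypothetical helper: builds a bare-bones HTML page with one link per file,
# so that a recursive crawler such as wget can discover every EAD file.
def index_html(filenames)
  items = filenames.sort.map do |name|
    # Escape the name so characters like & or < cannot break the markup.
    escaped = ERB::Util.html_escape(name)
    %(<li><a href="#{escaped}">#{escaped}</a></li>)
  end

  <<~HTML
    <!DOCTYPE html>
    <html>
    <head><meta charset="utf-8"><title>EAD files</title></head>
    <body>
    <ul>
    #{items.join("\n")}
    </ul>
    </body>
    </html>
  HTML
end

puts index_html(["b.ead.xml", "a.ead.xml"])
```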

CloudFront distribution

We maintain a CloudFront distribution for the files at https://ead.sciencehistory.org/ . AWS details, including specifics about the SSL certificate, are documented on the wiki; see the wiki link above.

Infrastructure

We maintain a description of the app's infrastructure, such as S3 buckets, in Terraform (see the terraform directory).

Configuration

This is done via environment variables set on the Heroku project. Here are some of the important ones:

ArchivesSpace API settings

These allow the code to contact ArchivesSpace and download the EADs.

ARCHIVESSPACE_URL
ARCHIVESSPACE_EXPORT_USERNAME
ARCHIVESSPACE_EXPORT_PASSWORD
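
To show how these three settings could fit together, here is a minimal sketch of logging in to the ArchivesSpace API and requesting one EAD export. It is not the code in exporter.rb. It assumes ARCHIVESSPACE_URL points at the API base URL and that the username needs no URL escaping; the helper names, the repository id 2, and the resource id 123 are hypothetical.

```ruby
require "net/http"
require "uri"
require "json"

# Hypothetical helper: builds the login request. ArchivesSpace expects a POST
# to /users/:username/login with the password as a form parameter.
def login_request(base_url, username, password)
  uri = URI("#{base_url.chomp('/')}/users/#{username}/login")
  request = Net::HTTP::Post.new(uri)
  request.set_form_data("password" => password)
  request
end

# Hypothetical helper: builds the EAD export request for one resource.
# The session token from the login response goes in a request header.
def ead_request(base_url, session, repo_id, resource_id)
  uri = URI("#{base_url.chomp('/')}/repositories/#{repo_id}/resource_descriptions/#{resource_id}.xml")
  request = Net::HTTP::Get.new(uri)
  request["X-ArchivesSpace-Session"] = session
  request
end

# Sends a request built above and returns the Net::HTTPResponse.
def perform(request)
  uri = request.uri
  Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
end

# Example usage (makes real network calls, so it is left commented out):
#   base    = ENV.fetch("ARCHIVESSPACE_URL")
#   login   = perform(login_request(base,
#                                   ENV.fetch("ARCHIVESSPACE_EXPORT_USERNAME"),
#                                   ENV.fetch("ARCHIVESSPACE_EXPORT_PASSWORD")))
#   session = JSON.parse(login.body).fetch("session")
#   xml     = perform(ead_request(base, session, 2, 123)).body
```

Building the requests separately from sending them keeps the credentials handling easy to test without contacting a live server.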

S3 settings

These are needed so the code knows where to put the files.

AWS_BUCKET
AWS_REGION
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY

Note: The IAM permissions associated with this key pair in S3 are minimal: the code can only write files to the ead bucket.
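
As a rough illustration of how these settings are used, here is a minimal sketch of writing one EAD file to the bucket. It is not the project's actual upload code; the method name upload_ead and the object key are hypothetical. In real use the client would be an Aws::S3::Client from the aws-sdk-s3 gem, which reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the environment on its own.

```ruby
# Hypothetical helper: uploads one EAD document to the given bucket.
# `client` is any object that responds to put_object the way
# Aws::S3::Client does, which also makes the method easy to test.
def upload_ead(client, bucket:, key:, xml:)
  client.put_object(
    bucket: bucket,
    key: key,
    body: xml,
    content_type: "application/xml"
  )
end

# Example usage (requires the aws-sdk-s3 gem and real credentials,
# so it is left commented out):
#   require "aws-sdk-s3"
#   client = Aws::S3::Client.new(region: ENV.fetch("AWS_REGION"))
#   upload_ead(client,
#              bucket: ENV.fetch("AWS_BUCKET"),
#              key: "example.ead.xml",
#              xml: "<ead>...</ead>")
```

Because the key pair can only write to the bucket, a sketch like this needs nothing beyond put_object; listing or deleting objects would require broader permissions.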

Settings set by Heroku

We don't manage these; they're set by Heroku for our add-ons.

PAPERTRAIL_API_TOKEN

Scheduler add-on

The project does not include a web dyno, and relies on the Heroku Scheduler to spin up a nightly process.

EAD validation

bundle exec ruby run_check.rb will download each EAD file from the bucket, validate it against the EAD schema, and report any fatal errors.
