
Files_Directory_Importer

Julie Allinson edited this page Jun 30, 2017 · 4 revisions

Files Directory Importer

This is a simple importer for directories of files:

git clone http://github.com/ULCC/dart_hyku
cd dart_hyku
git checkout kfpub_shib

<set up the application as per the Hyku README>

Run:

bin/import_files_to_existing_objects <server> <path_to_csv_file> <path_to_directory> <depth>

For example:

bin/import_files_to_existing_objects localhost path/to/my/file.csv path/to/my/files 0

What it does

  • Given a CSV file with two columns, the script imports files from a directory or its sub-directories and attaches them to existing Works.

Instructions

CSV File

  • The CSV file must contain two columns and no header row.
  • Column one must contain the id of an existing Work to which you want to add FileSets/Files.
  • Column two must contain the name of a file or directory within the files directory provided (one per row).
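The CSV rules above can be checked before running an import. The following is a minimal sketch using Ruby's standard csv library. The helper name read_import_rows is hypothetical and not part of the importer, and the check covers only the file's shape, not whether the Work ids actually exist.

```ruby
require 'csv'

# Hypothetical helper (not part of the importer): parses a header-less,
# two-column CSV into [work_id, file_or_directory_name] pairs.
def read_import_rows(csv_text)
  CSV.parse(csv_text, headers: false).each_with_index.map do |row, index|
    # Reject rows that do not have exactly two non-empty cells.
    unless row.length == 2 && row.none? { |cell| cell.nil? || cell.strip.empty? }
      raise ArgumentError, "line #{index + 1}: expected exactly two non-empty columns"
    end
    row.map(&:strip)
  end
end

rows = read_import_rows("12345,file_to_ingest.pdf\n67890,directory_for_67890\n")
rows.each { |work_id, name| puts "#{work_id} -> #{name}" }
```

A malformed row raises an ArgumentError with its line number, so a bad file is caught before anything is ingested.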

Files Directory and Depth Parameter

The depth parameter tells the importer where the files to be ingested are:

  • A depth of 0 means that files are contained directly within the files directory. The filename will match the value in the second column of the csv file.
  • A depth of 1 means that the files are contained within a folder within the files directory. The folder name will match the value in the second column of the csv file.
  • A depth of 2 or more means that there are sub-directories beneath the folder name in the csv file.

For example:

Command:

bin/import_files_to_existing_objects localhost csv_file files_directory 0

CSV:

12345,file_to_ingest.pdf

File to ingest:

files_directory\file_to_ingest.pdf

Command:

bin/import_files_to_existing_objects localhost csv_file files_directory 1

CSV:

12345,directory_for_12345

Directory to ingest - all files in this directory are ingested to the object:

files_directory\directory_for_12345

Command:

bin/import_files_to_existing_objects localhost csv_file files_directory 2

CSV:

12345,directory_for_12345

Directory to ingest - all files in these directories are ingested to the object:

files_directory\directory_for_12345\01\
files_directory\directory_for_12345\02\
files_directory\directory_for_12345\03\
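The three depth cases above can be sketched in Ruby as follows. This is an illustration only and not the importer's actual code. The helper name files_for_entry is hypothetical, and it assumes that at depth n (1 or more) only files exactly n levels below the files directory are selected. The real importer may treat files at intermediate levels differently.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not the importer's real code): lists the files that
# one CSV entry would select for a given depth.
# Assumption: at depth n >= 1, only files exactly n levels below the files
# directory (inside the named folder) are selected.
def files_for_entry(files_directory, name, depth)
  # Depth 0: the CSV value is the file itself.
  return [File.join(files_directory, name)].select { |path| File.file?(path) } if depth.zero?

  # Depth 1 => name/*, depth 2 => name/*/*, and so on.
  pattern = File.join(files_directory, name, *Array.new(depth, '*'))
  Dir.glob(pattern).select { |path| File.file?(path) }.sort
end

# Build a throwaway directory tree mirroring the examples and list the matches.
Dir.mktmpdir do |files_directory|
  FileUtils.mkdir_p(File.join(files_directory, 'directory_for_12345', '01'))
  FileUtils.touch(File.join(files_directory, 'file_to_ingest.pdf'))
  FileUtils.touch(File.join(files_directory, 'directory_for_12345', 'cover.pdf'))
  FileUtils.touch(File.join(files_directory, 'directory_for_12345', '01', 'page.tif'))

  puts files_for_entry(files_directory, 'file_to_ingest.pdf', 0)
  puts files_for_entry(files_directory, 'directory_for_12345', 1)
  puts files_for_entry(files_directory, 'directory_for_12345', 2)
end
```

Note that the glob pattern is built from the directory and CSV values as given, so names containing glob characters such as * or [ would need escaping in a real implementation.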

The Code

The importer code can be found here:

lib/importer/directory_files_importer.rb
lib/importer/files_parser.rb 

Extending the importer

The importer could be extended:

  • to create new objects with minimal metadata
  • to add metadata to the files / filesets themselves (e.g. a fileset title, or visibility setting)