batch ingest of articles #196

Open
ebenenglish opened this issue Oct 29, 2019 · 0 comments

NOTE: This is a placeholder ticket; more detail will be added once we have a better sense of what article batches look like.

Descriptive Summary
An important use case is allowing users to upload article-level objects in a batch process.

Unresolved questions:

  • What do article batches look like? Is there a standard format we can assume?
  • Do we need separate ingest scripts for:
    • page images with article segmentation data (in accompanying XML?)
    • article images
  • How should the Page objects created during ingest be handled in search? Should we suppress Page objects when their component articles are available?

Rationale
We want to support adding large numbers of Article objects without having to upload them one at a time.

Expected behavior
As an admin user
I should be able to run a rake task on the command line of my repo app
And I should be able to specify the AdminSet to ingest content into
And I should be able to specify the visibility of the objects to be created
And I should be able to specify the depositor for the objects to be created
And I should be able to specify the location of the batch of article files on a file system
And when I execute the rake task
The application should perform the batch ingest
And it should create a NewspaperTitle object for each publication in the batch
And it should create NewspaperIssue objects for each issue in the batch and create the appropriate derivatives for each
And it should create NewspaperPage objects for each page in the batch and create the appropriate derivatives for each
And it should create NewspaperArticle objects for each article in the batch and create the appropriate derivatives for each
And I should be able to view the created objects in the front end UI
And the full text should be searchable
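
The options named in the user story (AdminSet, visibility, depositor, batch location) could be collected and validated before any objects are created. The following is a minimal sketch under stated assumptions, not the project's implementation: `IngestOptions`, `build_ingest_options`, the environment-variable names, and the `restricted` default are all hypothetical, and the visibility strings assume the values Hyrax conventionally uses.

```ruby
# Sketch only: every name, key, and default here is an assumption for illustration.
IngestOptions = Struct.new(:admin_set, :visibility, :depositor, :batch_path,
                           keyword_init: true)

# Assumed visibility strings (the values Hyrax conventionally uses).
ALLOWED_VISIBILITIES = %w[open authenticated restricted].freeze

# Build validated options from an ENV-like hash.
# Raises ArgumentError if a required option is missing or the visibility is unknown.
def build_ingest_options(env)
  missing = %w[ADMIN_SET DEPOSITOR BATCH_PATH].select { |key| env[key].to_s.strip.empty? }
  raise ArgumentError, "missing required option(s): #{missing.join(', ')}" unless missing.empty?

  # Hypothetical default: keep new objects private unless told otherwise.
  visibility = env.fetch('VISIBILITY', 'restricted')
  unless ALLOWED_VISIBILITIES.include?(visibility)
    raise ArgumentError, "unknown visibility: #{visibility}"
  end

  IngestOptions.new(
    admin_set: env['ADMIN_SET'],
    visibility: visibility,
    depositor: env['DEPOSITOR'],
    batch_path: File.expand_path(env['BATCH_PATH'])
  )
end
```

A rake task could call `build_ingest_options(ENV)` first and abort with the error message if an option is missing, so that a bad invocation fails before anything is written to the repository.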

Additional info
This ticket may end up overriding #123; further investigation is needed.

This ticket will be complete when:
The user story above is functional.
