Delete processing - delete doesn't mark deletes correctly #174

Open
patrickzurek opened this issue Sep 9, 2016 · 3 comments

Comments

@patrickzurek

JIRA issue created by: rcook
Originally opened: 2011-06-23 02:40 PM

Issue body:
(nt)

@patrickzurek

JIRA Comment by user: rcook
JIRA Timestamp: 2011-06-23 02:42 PM

Comment body:

Actually, to be totally sure, there is one more thing we should confirm: that the delete file contained records that should have matched records already in the repos.

-----Original Message-----
From: Cook, Randall
Sent: Thursday, June 23, 2011 10:39 AM
To: Anderson, Benjamin D; Arbelo, Ralph; 'John Brand'
Cc: 'Delis, Christopher'
Subject: RE: Deleted files on 137

Yuck, that's what I feared. Bug for Chris, I will open in FB.

-----Original Message-----
From: Anderson, Benjamin D
Sent: Thursday, June 23, 2011 10:37 AM
To: Cook, Randall; Arbelo, Ralph; 'John Brand'
Cc: 'Delis, Christopher'
Subject: RE: Deleted files on 137

According to the harvest, the records are all new (IDs that we haven't seen before and that are incrementally larger than the highest ID previously seen).

-----Original Message-----
From: Cook, Randall
Sent: Thursday, June 23, 2011 10:31 AM
To: Anderson, Benjamin D; Arbelo, Ralph; 'John Brand'
Cc: 'Delis, Christopher'
Subject: RE: Deleted files on 137

Copying Chris, though he is still on vacation.

Can you tell what is happening?

For example, if the repos has 1 million records, and we process a delete file that deletes 10,000 of those, then the total number of records in the repos stays at 1 million. Are the deleted records being successfully matched to records in the repos and marked deleted, or are we creating new records that are marked deleted? Or perhaps everything is working correctly but the logs are not correct?

-----Original Message-----
From: Anderson, Benjamin D
Sent: Thursday, June 23, 2011 10:25 AM
To: Cook, Randall; Arbelo, Ralph; 'John Brand'
Subject: RE: Deleted files on 137

I just harvested and got 2272 deleted records.

-----Original Message-----
From: Cook, Randall
Sent: Thursday, June 23, 2011 10:23 AM
To: Arbelo, Ralph; Anderson, Benjamin D; John Brand
Subject: RE: Deleted files on 137

Do you know if the records are marked as deletes? If the actual record is not marked as a deleted record, then you have to process them with a parameter that tells the OAI Toolkit to process them as deletes.

-----Original Message-----
From: Arbelo, Ralph
Sent: Thursday, June 23, 2011 10:07 AM
To: Anderson, Benjamin D; John Brand
Cc: Cook, Randall
Subject: Deleted files on 137

I processed one set of deletes (there are three files, one for each record type).

I ran the convertload_as_deleted.sh script to process them.

Import statistics summary: created 2272, updated: 0, skipped: 0, invalid: 4, deleted: 0, bib: 1789, auth: 1, holdings: 482 records. Invalid files: 0. It took 00:00:05.983. checkTime: 00:00:02.104. insertTime: 00:00:00.319. others: 00:00:03.560

I was expecting to see a lot more deleted than created, so this looks a little strange. However, I haven't done that much work with the deletes, so maybe this is typical.

Ralph
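
The summary line above can be checked mechanically. The following is a minimal sketch, not part of the OAI Toolkit: the function names `parse_summary` and `looks_like_unmatched_deletes` are hypothetical, and the regular expression assumes the summary format is exactly as quoted in this message. It flags a delete run in which every record was counted as created and none as updated or deleted.

```python
import re

# Summary line copied from the message above (timing fields omitted).
SUMMARY = (
    "Import statistics summary: created 2272, updated: 0, skipped: 0, "
    "invalid: 4, deleted: 0, bib: 1789, auth: 1, holdings: 482 records."
)

def parse_summary(line):
    """Extract the integer counters from an import summary line.

    Assumes each counter appears as 'name N' or 'name: N', as in the
    quoted log. This is a guess at the format, not a documented contract.
    """
    pairs = re.findall(
        r"(created|updated|skipped|invalid|deleted|bib|auth|holdings):? (\d+)",
        line,
    )
    return {name: int(value) for name, value in pairs}

def looks_like_unmatched_deletes(stats):
    """True when a delete run created records but matched none of them."""
    return stats["created"] > 0 and stats["updated"] == 0 and stats["deleted"] == 0

stats = parse_summary(SUMMARY)
suspicious = looks_like_unmatched_deletes(stats)
```

For the run quoted here, the bib, auth, and holdings counts add up to the created count, which is consistent with every record in the delete file being inserted as a new record.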

@patrickzurek

JIRA Comment by user: rcook
JIRA Timestamp: 2012-05-08 12:34 PM

Comment body:

Chris, is this still an open issue? Can you review and advise?

@patrickzurek

JIRA Comment by user: Chris Delis (cedelis)
JIRA Timestamp: 2012-05-08 12:41 PM

Comment body:

I think I would have to run my own test, since the scenario explained above doesn't tell me enough. Were 003s involved? If so, was care taken to ensure that the 001/003s matched? If they didn't match, then it would be acceptable to create new deleted records; if the 001/003s matched, then obviously that'd be a bug.
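
To make the two outcomes concrete, here is a minimal sketch of delete processing keyed on the 001/003 pair. It is an illustration under stated assumptions, not the OAI Toolkit's actual code: the `Record` class, `match_key`, and `apply_deletes` are hypothetical names, and the repository is modeled as an in-memory dictionary.

```python
from dataclasses import dataclass

@dataclass
class Record:
    control_number: str   # MARC 001
    org_code: str = ""    # MARC 003 (may be absent)
    deleted: bool = False

def match_key(rec):
    """Key used to match an incoming record against the repository."""
    return (rec.control_number.strip(), rec.org_code.strip())

def apply_deletes(repo, delete_batch):
    """Mark matching records deleted; create a deleted record otherwise.

    `repo` maps (001, 003) keys to Record objects (an assumed structure).
    """
    stats = {"matched": 0, "created": 0}
    for rec in delete_batch:
        key = match_key(rec)
        if key in repo:
            # Expected path: the existing record is flagged as deleted.
            repo[key].deleted = True
            stats["matched"] += 1
        else:
            # No 001/003 match: a new record is stored, already deleted.
            repo[key] = Record(rec.control_number, rec.org_code, deleted=True)
            stats["created"] += 1
    return stats

existing = Record("100", "NRU")
repo = {match_key(existing): existing}
batch = [Record("100", "NRU"), Record("100", "OCoLC")]
result = apply_deletes(repo, batch)
```

In this sketch the first delete matches on both fields and flags the existing record, while the second shares the 001 but carries a different 003, so it is stored as a new deleted record. A run that reports only created records would be a bug only if the incoming 001/003 pairs really were present in the repository.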
