WIP fix derive logic and redundant hash calc #351

Draft
wants to merge 1 commit into
base: master

Dobatymo
Contributor

This addresses #253 and #288

Counts all files (even those which might be skipped). This is consistent with the newly added behavior for file-level metadata.

BUG: if the last file is skipped, then a derive is never queued

Any idea how to solve this? Is there a way to queue a derive WITHOUT uploading a file?
The easiest way would be to upload everything, skipping whatever needs to be skipped, and then, after everything has completed, simply queue a derive with a separate call.
Then we could do away with this fragile file counting.

@JustAnotherArchivist
Contributor

> Is there a way to queue a derive WITHOUT uploading a file?

Related: #252

This would be nice to have in general. I wrote a script for myself last year because I didn't want to click around the web interface all the time: I use --no-derive both to avoid the two bugs you're trying to solve and to be sure everything is fine before initiating a derive, which can take a very long time. I'm not sure whether the script works for non-admin accounts, though, and it should definitely not be used as the basis for an implementation here because it emulates the website interaction, so I'm not going to link it.

@jjjake
Owner

jjjake commented Jun 1, 2020

Yes, it's possible to queue a derive task without uploading anything:

$ ia tasks jj-test-2020-05-14 --cmd derive.php

Python:

>>> r = item.derive()

@Dobatymo
Contributor Author

Dobatymo commented Jun 2, 2020

Great, I will use this instead of the counting logic.

@Dobatymo
Contributor Author

Dobatymo commented Jun 3, 2020

I removed all counting logic and just call derive() in the end. Do you think that is an acceptable solution?
The tests don't seem to like it, though. If you think we should choose this solution, I will have a look at the tests.
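
For illustration only, the "upload everything, then call derive() once" idea might look roughly like the sketch below. It is not the code in this PR. The helper name `upload_then_derive` and the `should_skip` callback are hypothetical, and the sketch assumes `item` behaves like an `internetarchive` `Item` whose `upload_file()` accepts `queue_derive` and which has the `derive()` method shown above. Skipping the derive when nothing was uploaded is also an assumption of this sketch.

```python
from typing import Callable, Iterable, List


def upload_then_derive(
    item,
    paths: Iterable[str],
    should_skip: Callable[[str], bool],
) -> List[str]:
    """Upload every non-skipped path with derive disabled, then queue one derive.

    `item` is assumed to expose upload_file(path, queue_derive=...) and derive().
    `should_skip` is a hypothetical callback that decides whether a file is skipped.
    """
    uploaded = []
    for path in paths:
        if should_skip(path):
            continue
        # No upload request queues a derive, so it does not matter which file is last.
        item.upload_file(path, queue_derive=False)
        uploaded.append(path)
    if uploaded:
        # One extra request after all uploads have finished.
        item.derive()
    return uploaded
```

The trade-off is the extra request at the end: the derive no longer depends on file order or on counting, but it is submitted separately from the uploads.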

@jjjake
Owner

jjjake commented Jun 3, 2020

No, I think the derive task should still be queued in the upload request. I'd like to avoid submitting an extra request to another API if possible.
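
One way to keep the derive inside the upload request while avoiding the "last file is skipped" bug could be to decide up front which files will actually be uploaded and attach the derive flag to the last of those. The sketch below is only an illustration under that assumption: `plan_uploads` and `should_skip` are hypothetical names, and it presumes the skip decision can be made before any upload starts, which may not match how the library decides to skip a file.

```python
from typing import Callable, List, Sequence, Tuple


def plan_uploads(
    paths: Sequence[str],
    should_skip: Callable[[str], bool],
) -> List[Tuple[str, bool]]:
    """Return (path, queue_derive) pairs for the files that will be uploaded.

    Only the last file that is really uploaded carries queue_derive=True,
    so a skipped trailing file cannot swallow the derive.
    """
    to_upload = [p for p in paths if not should_skip(p)]
    last = len(to_upload) - 1
    return [(p, i == last) for i, p in enumerate(to_upload)]
```

Each pair could then be passed on as `item.upload_file(path, queue_derive=flag)`, so no request to another API is needed.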

@JustAnotherArchivist
Contributor

@jjjake

> --cmd derive.php

Could this be added to the documentation? It only mentions make_dark.php and make_undark.php currently: https://archive.org/services/docs/api/tasks.html#supported-tasks. Maybe also an example in the ia tasks help.

removed all counting whatsoever and just queue the derive in the end

@soredake

What's the status of this?

@Dobatymo
Contributor Author

Dobatymo commented Nov 8, 2024

@soredake jjjake didn't like the approach of having a separate API call to queue the derive. Also, at the time the unit tests failed for some reason. I haven't looked at this PR for the past few years, and now there are merge conflicts.

4 participants