[minor] Add option for custom Internet Archive object ID #316

phaseloop · 2023-10-03T13:29:11Z

The Internet Archive object ID (the item URL) is computed from video metadata (title, ID, etc.). When downloading non-YouTube videos with invalid metadata, multiple videos can end up with the same URL, which causes an upload conflict. This PR adds a switch for setting the ID manually.
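
To make the idea concrete, here is a minimal sketch of the override, not the code from this PR. `derive_identifier`, `choose_identifier` and the `custom_id` parameter are hypothetical names, and the slug rule is an assumption rather than tubeup's actual logic.

```python
import re
from typing import Optional


def derive_identifier(meta: dict) -> str:
    """Hypothetical stand-in for building an item ID from metadata."""
    # Join whichever of title and id are present.
    parts = [str(meta[key]) for key in ("title", "id") if meta.get(key)]
    # Replace anything that is not identifier-safe with a dash.
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", "-".join(parts)).strip("-")
    return slug.lower()


def choose_identifier(meta: dict, custom_id: Optional[str] = None) -> str:
    """Use the manual ID when one is given, else fall back to metadata."""
    return custom_id if custom_id else derive_identifier(meta)


meta = {"title": "MOVIE1"}  # no usable video ID in the metadata
default_id = choose_identifier(meta)
manual_id = choose_identifier(meta, custom_id="tv-station-2023-10-03-news")
print(default_id)  # movie1
print(manual_id)   # tv-station-2023-10-03-news
```

Without the override the default behaviour is unchanged, so existing commands would keep producing the same IDs.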

brandongalbraith · 2023-10-03T14:02:00Z

Hey @phaseloop! Thanks for submitting this. Is there any way we could make it more deterministic? tubeup is frequently used in point-and-shoot mode, and someone would rarely check whether a non-standard item already exists before performing an upload. If we can handle the logic gracefully in tubeup to generate object IDs, we should, imho.
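
One possible reading of "more deterministic", sketched under assumptions: append a short hash of the source URL to the title slug, so the same video always maps to the same ID while different videos with the same title do not collide. Nobody in this thread proposed this exact scheme. `disambiguated_identifier` is a hypothetical name, and the `webpage_url` key is assumed to be present in the metadata.

```python
import hashlib
import re


def disambiguated_identifier(meta: dict) -> str:
    """Hypothetical: title slug plus a short hash of the source URL."""
    title = meta.get("title") or "untitled"
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", title).strip("-").lower()
    # The same URL always yields the same 8-character suffix.
    digest = hashlib.sha1(meta["webpage_url"].encode("utf-8")).hexdigest()[:8]
    return f"{slug}-{digest}"


first = {"title": "MOVIE1", "webpage_url": "https://example.com/videos/1"}
second = {"title": "MOVIE1", "webpage_url": "https://example.com/videos/2"}

id_first = disambiguated_identifier(first)
id_second = disambiguated_identifier(second)
print(id_first != id_second)                         # True
print(id_first == disambiguated_identifier(first))   # True
```

The trade-off is that re-uploading the same video from a different URL would create a second item, so this only helps when the URL is a stable key for the video.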

phaseloop · 2023-10-03T19:11:23Z

The current tubeup implementation handles this pretty well: it uses the movie name and ID, which is deterministic. My PR only addresses the case where you download videos from obscure sites that have broken metadata. For example, there is a TV station which uses "MOVIE1" as the title for every video. After the first upload to Internet Archive you can't upload anything more, because the object "movie1" already exists.
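
The collision can be reproduced in a few lines. This is a simulation under assumptions: `derive_identifier` and `upload` are hypothetical helpers, the in-memory set stands in for the items that already exist on Internet Archive, and the example URLs are made up.

```python
import re


def derive_identifier(meta: dict) -> str:
    """Hypothetical stand-in for building an item ID from metadata."""
    parts = [str(meta[key]) for key in ("title", "id") if meta.get(key)]
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", "-".join(parts)).strip("-")
    return slug.lower()


existing_items = set()  # stands in for identifiers already taken


def upload(meta: dict) -> str:
    """Pretend upload: refuse when the identifier is already in use."""
    identifier = derive_identifier(meta)
    if identifier in existing_items:
        return f"conflict: {identifier} already exists"
    existing_items.add(identifier)
    return f"uploaded: {identifier}"


# Two different videos, same useless title, no video ID in the metadata.
results = [
    upload({"title": "MOVIE1", "webpage_url": "https://example.com/a"}),
    upload({"title": "MOVIE1", "webpage_url": "https://example.com/b"}),
]
print(results)
# ['uploaded: movie1', 'conflict: movie1 already exists']
```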

vxbinaca · 2023-10-05T00:30:39Z

No, you need to talk to yt-dlp to get them to fix the extractors. The extractors are missing metadata grabbing. Adding this to Tubeup would allow people to bulk upload duplicate videos. Internet Archive is already pissed off at the amount of uploads, and this would make it worse.

Just be satisfied with the options we bake in when it comes to downloads and uploads.

vxbinaca · 2023-10-05T00:33:32Z

Is the item ID mangled because the site's extractor isn't grabbing metadata? Go edit the extractor so it grabs the metadata we need.

vxbinaca · 2023-10-05T00:44:05Z

Internet Archive has a way of searching for things on that site. If you allow metadata or the video to be edited by the user before upload, or in this case edit the way things are organized there, you will introduce chaos and break the way things are done.

If a site extractor in yt-dlp doesn't do metadata properly, go submit a PR to them and then we get the downstream benefit without the code cruft of having to hand edit uploads through flags.

Thank you @phaseloop, but if a site is broken - and I notice you didn't mention the site - talk to Puka at yt-dlp and get it fixed there. We are a middleman who eases mirroring, and the last few PRs have seemingly been meant to complicate and frustrate rips.

Please feel free to submit PRs in the future to fix bugs that are actually with Tubeup.

phaseloop · 2023-10-05T07:57:52Z

@vxbinaca - this is not about fixing yt-dlp, it is about metadata that is not physically there. Imagine a TV station naming each of their files "test1".

vxbinaca · 2023-10-05T15:10:57Z

> @vxbinaca - this is not about fixing yt-dlp, it is about metadata that is not physically there. Imagine a TV station naming each of their files "test1".

Tubeup does not download the video or gather metadata; it is merely a middleman that semi-automates the process. yt-dlp downloads the video and metadata, and Tubeup makes an item based on that metadata. If the metadata that Tubeup requires is lacking, then it is on yt-dlp to fix that extractor to get the metadata we require. This is called a division of labor.

Submit a PR to yt-dlp to fix that site's extractor so it scrapes the metadata you require.

The 'solution' you offer still won't fix the problem. It requires other users to lengthen their commands, adds code complexity to Tubeup that has to be maintained, and most of all opens Tubeup to abuse by allowing item identifiers to be changed. Archive.org has scripts that automatically move Tubeup uploads to a collection; allowing open-ended item names breaks this and gets me yelled at.

Your PR is ill-conceived and an incorrect 'fix' to a 'problem' that's not Tubeup's problem. Sorry, don't take it personally.

Talk to pukkandan about fixing the yt-dlp site extractor you need.

[minor] Add option for custom Internet Archive object ID

04a20e0

vxbinaca closed this Oct 5, 2023
