[minor] Add option for custom Internet Archive object ID #316
Hey @phaseloop! Thanks for submitting this. Is there any way we could make it more deterministic?
The current Tubeup implementation handles this pretty well: it tries to use the movie name and ID, which is deterministic. My PR only solves the issue you hit when downloading videos from obscure sites that have broken metadata. For example, there is a TV station which uses "MOVIE1" as the title for every video. After the first upload to Internet Archive you can't upload anything more, because the object "movie1" already exists.
No, you need to talk to yt-dlp to get them to fix the extractors. The extractors are not grabbing the metadata. Adding this into Tubeup would allow people to bulk upload duplicate videos. Internet Archive is already pissed off at the amount of uploads and this would make it worse. Just be satisfied with the options we bake in when it comes to downloads and uploads.
Is the item ID mangled because the site's extractor isn't grabbing metadata? Go edit the extractor so it grabs the metadata we need.
Internet Archive has a way of searching for things on that site. If you allow the metadata or the video to be edited by the user before upload, or in this case edit the way things are organized there, you will introduce chaos and break the way things are done. If a site extractor in yt-dlp doesn't do metadata properly, go submit a PR to them and then we get the downstream benefit without the code cruft of having to hand-edit uploads through flags. Thank you @phaseloop, but if a site is broken - and I notice you didn't mention the site - talk to Puka at yt-dlp and get it fixed there. We are a middleman who eases mirroring, and the last few PRs seem to complicate and frustrate rips. Please feel free to submit PRs in the future to fix bugs that are actually in Tubeup.
@vxbinaca - this is not about fixing yt-dlp, it is about metadata that is not physically there. Imagine a TV station naming each of their files "test1".
Tubeup does not download the video or gather metadata, it is merely a middleman that semi-automates it. Yt-dlp downloads the video and metadata, and Tubeup makes an item based on that metadata. If the metadata that Tubeup requires is lacking, then it is on yt-dlp to fix that extractor to get the metadata we require. This is called a division of labor. Submit a PR to yt-dlp to fix that site's extractor so it scrapes the metadata you require. The 'solution' you offer still won't fix the problem, requires other users to lengthen commands, adds code complexity to Tubeup that has to be maintained, and most of all opens Tubeup to abuse by allowing item identifiers to be changed. Archive.org has scripts that automatically move Tubeup uploads to a collection; allowing open-ended item names breaks this and gets me yelled at. Your PR is ill-conceived and an incorrect 'fix' to a 'problem' that's not Tubeup's problem. Sorry, don't take it personally. Talk to Pukkaden about fixing the yt-dlp site extractor you need.
The Internet Archive object ID (which becomes the item URL) is computed from the movie metadata (title, ID, etc.). When downloading non-YouTube videos with invalid metadata, multiple videos can end up with the same ID, which causes an upload conflict. This PR adds a switch to set the ID manually.
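
To make the conflict concrete, here is a minimal sketch of the idea, not Tubeup's actual code. The function names `sanitize` and `derive_identifier`, the `custom_id` parameter, and the exact way the metadata fields are combined are all assumptions made for illustration; only the general behaviour (ID derived from metadata, optional manual override) comes from the description above.

```python
import re
from typing import Optional


def sanitize(text: str) -> str:
    """Lowercase the text and replace runs of other characters with '-'."""
    cleaned = re.sub(r"[^a-z0-9._-]+", "-", text.lower())
    return cleaned.strip("-")


def derive_identifier(meta: dict, custom_id: Optional[str] = None) -> str:
    """Hypothetical helper: build an item ID from metadata unless one is given."""
    if custom_id:
        # Manual override, as proposed by this PR (parameter name is assumed).
        return sanitize(custom_id)
    # Assumed fallback order: use the video ID if present, else the title.
    parts = [meta.get("extractor"), meta.get("id") or meta.get("title")]
    return sanitize("-".join(p for p in parts if p))


# Two different videos from a site that reports the same title and no ID.
first = {"extractor": "generic", "title": "MOVIE1"}
second = {"extractor": "generic", "title": "MOVIE1"}

# Both map to the same identifier, so the second upload would conflict.
collision = derive_identifier(first) == derive_identifier(second)

# With a manual ID the second video gets its own item.
manual = derive_identifier(second, custom_id="station-2023-05-01-news")
```

In this sketch the override bypasses the metadata entirely, which is also what the objections in the thread are about: nothing stops a user from choosing arbitrary identifiers.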