File upload hangs for large zip file (>= ~1 GB) #3634
Comments
@tdilauro what if you upload via the SWORD API instead? Any difference? Sorry to hear about your trouble. |
@pdurbin I tried it with SWORD and got an internal server error. I have replaced the filename and DOI in the command below. $DV01_DVA contains the dataverseAdmin API key. I haven't used the SWORD API before, so please let me know if I'm "doing it wrong."
Command:
$ curl --insecure -u $DV01_DVA: --data-binary @filename_double.zip -H "Content-Disposition: filename=filename.zip" -H "Content-Type: application/zip" -H "Packaging: http://purl.org/net/sword/package/SimpleZip" https://archive.data.jhu.edu/dvn/api/data-deposit/v1.1/swordv2/edit-media/study/doi:10.xxxx/xx/XXXXXX
When the command completed, there was a "SWORD-{uuid}" file in the ${dataverse.files.directory}/sword directory. Its file size matched that of the file referenced above.
Response
Corresponding log messages
NB: The command ran from 17:32:03 until 17:33:35. The second entry might not be associated with this transaction, as it is logged at 17:33:50, 15 seconds after the curl command ended.
Any suggestions on configuration/logging changes to get a better idea of what's going on? |
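For anyone who wants to script this rather than call curl, here is a rough Python sketch of the same request using only the standard library. The header values are taken from the curl command in this comment; the function names `sword_headers` and `deposit` are made up for illustration, and this has not been run against a real Dataverse installation. It does not replicate `--insecure`, so certificate verification stays on.

```python
import base64
import os
import urllib.error
import urllib.request


def sword_headers(api_key, filename, size):
    """Build headers mirroring the curl flags quoted in this comment."""
    # curl's "-u KEY:" is HTTP Basic auth: API key as user, empty password.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return {
        "Authorization": f"Basic {token}",
        "Content-Disposition": f"filename={filename}",
        "Content-Type": "application/zip",
        "Packaging": "http://purl.org/net/sword/package/SimpleZip",
        # An explicit length lets the file be streamed instead of read into memory.
        "Content-Length": str(size),
    }


def deposit(url, api_key, path):
    """POST the zip at `path` to the edit-media URL; return the HTTP status."""
    headers = sword_headers(api_key, os.path.basename(path), os.path.getsize(path))
    with open(path, "rb") as body:
        req = urllib.request.Request(url, data=body, headers=headers, method="POST")
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            # Return the error code (e.g. the 500 seen here) rather than raising,
            # so the caller can decide how to follow up.
            return err.code
```

Returning the status code instead of raising keeps the caller in control of what to do when the server answers with an error page.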
@tdilauro - thanks for sending on the test file. I'll try it on this side and let you know what happens. |
@djbrooke, @pdurbin: An update. In spite of the error I receive above during the SWORD upload, the extracted zip file eventually appeared as a DataFile and the uploaded doubly zipped file disappeared from the SWORD directory. I'm gonna try to upload the other large files for this dataset. I'll report back. |
@djbrooke, @pdurbin: I ran a SWORD upload for the last three problem files. The response document is still the status 500 Internal Server Error page, but the container zip files land in the {datafiles}/sword directory and their contents eventually get converted into DataFiles and appear in the file inventory for the draft dataset. Here's the output of the run with a few things redacted and duplicate error responses summarized. We are out of the woods for now, but there's still the issue of the 500 response for SWORD and the hanging UI.
|
Thanks @tdilauro. I just checked on the file that I attempted to upload through the UI, and it doesn't appear that it was successful. Let me know if you had more success with SWORD. |
Also see #3645, where uploading large FITS files also hangs at the end of the upload meter. Closing that one as related, though it contains specific examples and a couple of other minor issues. |
@tdilauro out of curiosity, what if you had a workaround where you manually placed the large file in question on disk using scp or some other means and then ran an API endpoint to tell Dataverse to read the file and enter it into the Dataverse database? Any interest in this or is it too much of a hack? I'm only bringing this up because I think this endpoint was included in #3497 which was recently merged. We talked about leaving it in the branch, at least. 😄 |
@pdurbin That is not practical for us, especially with me no longer being on that team for support. I wrote a script and made some shell start-up changes to make it easier for our Data Management Consultants to do large file (and small file -- why not? :) deposits via SWORD API. @djbrooke We still get the 500 error mentioned above, but the datafiles in question do make it into the draft object. I've warned our consultants about the spurious message and that they should simply verify that the files appear online and that the checksums match. |
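The checksum comparison mentioned above could look something like the following Python sketch. The helper names are hypothetical, and MD5 as the default algorithm is an assumption; use whichever algorithm the installation reports for its files. The file is read in chunks so that multi-gigabyte uploads do not have to fit in memory.

```python
import hashlib


def file_checksum(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_match(path, expected_hex, algorithm="md5"):
    """Compare the local file's digest with the value shown for the DataFile."""
    return file_checksum(path, algorithm) == expected_hex.strip().lower()
```

Here `expected_hex` would be the checksum copied from the file's entry in the draft dataset.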
@tdilauro I'm going to close this but if you feel like the SWORD API workaround isn't sufficient, please let us know! |
Huh. I guess I said I was going to close this back in June and I never did. I'll go ahead and close it now, especially since we now have a new related issue over at #4433 that people can track. @tdilauro I'm not sure who's on the support team these days for the installation of Dataverse at Johns Hopkins but please pass along that we are happy to help them try to resolve any issues they're having. |
To enable upload of an intact target zip file, we encapsulate it in another zip file (the containing zip). So we are uploading a zip (target) within a zip (containing). When uploading a large one (~1 GB or larger), the UI seems to hang with the progress bar showing complete (behavior similar to that described in #2643, now consolidated under #2482). The unencapsulated target zip ends up in the ${dataverse.files.directory}/temp directory, but the upload seems not to complete and the target zip never appears in the uploaded files box below, so the files cannot be "saved". No error is reported in the UI.
This error was observed in Dataverse 4.6, but may occur in earlier releases.
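For readers unfamiliar with the double-zip packaging described above, here is a minimal Python sketch of how a target zip might be wrapped in a containing zip. The function name `wrap_zip` and the choice to store the inner file without recompression are illustrative assumptions, not a description of how the files in this report were actually built.

```python
import os
import zipfile


def wrap_zip(target_path, container_path):
    """Write `target_path` (itself a zip) as the only entry of a new zip."""
    # ZIP_STORED skips recompression, since the target is already compressed.
    # allowZip64 (the default) is stated explicitly because entries over
    # 4 GiB need the ZIP64 extensions.
    with zipfile.ZipFile(
        container_path, "w", compression=zipfile.ZIP_STORED, allowZip64=True
    ) as outer:
        outer.write(target_path, arcname=os.path.basename(target_path))
    return container_path
```

Unpacking the containing zip once yields the target zip unchanged, which is the point of the extra layer.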