Commit
Output a more helpful error message on malformed bulk upload CSV data.
Malformed metadata keys in a bulk upload CSV file previously led to a somewhat obscure ConnectionResetError. Now the offending key is named in an error message, which should greatly help users who encounter this problem.

The XML parser in the standard library does not properly verify tag names. I avoided the regular expression solution at first and chose lxml because the proper regex to verify XML tag names is rather complex. But as it turns out, the Internet Archive only allows a very limited subset of those characters in its metadata keys (`[A-Za-z][.-0-9A-Za-z_]+`). (Verified by trying to create keys containing all other Unicode characters from 0 to 0x10FFFF against the metadata API.)

See: https://archive.org/services/docs/api/metadata-schema/index.html#internet-archive-metadata
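As a rough illustration of the check described in this commit message, the sketch below validates CSV header names against the quoted key pattern and names the first offending key in the error. The helper names `find_malformed_keys` and `check_csv_header` are hypothetical and are not taken from the commit. The pattern also escapes the hyphen, which is an assumption about the intent: written unescaped, `.-0` inside a character class is read as a range from `.` to `0`.

```python
import re

# Assumed pattern: same character set as quoted in the commit message, but with
# the hyphen escaped so that ".-0" is not interpreted as a character range.
KEY_PATTERN = re.compile(r"[A-Za-z][.\-0-9A-Za-z_]+")


def find_malformed_keys(keys):
    """Return the keys that do not fully match KEY_PATTERN, in input order."""
    return [key for key in keys if KEY_PATTERN.fullmatch(key) is None]


def check_csv_header(fieldnames):
    """Raise ValueError naming the first malformed metadata key, if any."""
    bad = find_malformed_keys(fieldnames)
    if bad:
        raise ValueError(
            "Malformed metadata key in CSV header: {!r}".format(bad[0])
        )
```

Calling `check_csv_header(["identifier", "bad key"])` would raise a `ValueError` that mentions `'bad key'`, instead of failing later with a connection error. Note that this sketch relies on `fullmatch`, which is only available on Python 3.4 and later.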
Showing 4 changed files with 30 additions and 12 deletions.
@@ -1,8 +1,8 @@
requests>=2.9.1,<3.0.0
jsonpatch>=0.4
backports.csv
docopt>=0.6.0,<0.7.0
tqdm>=4.0.0
six>=1.13.0,<2.0.0
jsonpatch>=0.4
requests>=2.9.1,<3.0.0
schema>=0.4.0
backports.csv
setuptools
six>=1.13.0,<2.0.0
tqdm>=4.0.0
e7747a6
@maxz This change broke PY2 support for `ia upload --spreadsheet ...`. `re.fullmatch` does not exist in PY2. It's important that this works in PY2. Do you have an alternative that will work in PY2? Tests would also be nice.
e7747a6
I turned this off for now, until PY2 support can be added.
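One possible PY2-compatible alternative to `re.fullmatch` (which was added in Python 3.4) is to anchor a plain `re.match` at the end of the string. The sketch below is only an illustration of that idea, not the fix that was eventually adopted. The name `fullmatch_compat` is hypothetical.

```python
import re


def fullmatch_compat(pattern, string, flags=0):
    """Emulate re.fullmatch for a string pattern on interpreters that lack it."""
    # The non-capturing group keeps alternations such as "a|ab" anchored as a
    # whole. \Z matches only at the very end of the string, unlike $, which
    # also matches just before a trailing newline.
    return re.match("(?:" + pattern + r")\Z", string, flags)
```

With this helper, `fullmatch_compat(r"[A-Za-z][.\-0-9A-Za-z_]+", "title")` returns a match object, while a key such as `"my key"` or `"title\n"` yields `None`. Simple assertions like these could also serve as the starting point for the tests requested in the review.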