Unhandled exception if the IA response is an HTML 403 page rather than a JSON response #656

Open
msikma opened this issue Oct 26, 2024 · 1 comment

msikma commented Oct 26, 2024

This is a minor issue that I think only occurred because the IA is currently in the process of getting the site back up.

When running ia metadata 'win95-logo.sys' (link: https://archive.org/details/win95-logo.sys), the following unhandled exception occurs:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 975, in json
    return complexjson.loads(self.text, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/python@3.11/3.11.9/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/python@3.11/3.11.9/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/python@3.11/3.11.9/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/ia", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/internetarchive/cli/ia.py", line 171, in main
    sys.exit(ia_module.main(argv, session))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/internetarchive/cli/ia_metadata.py", line 203, in main
    item = session.get_item(identifier)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/internetarchive/session.py", line 253, in get_item
    item_metadata = self.get_metadata(identifier, request_kwargs) or {}
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/internetarchive/session.py", line 284, in get_metadata
    return resp.json()
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 979, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Logging the response reveals that the site sends an HTML response: https://gist.github.com/msikma/faa97e6509ec88754c325a66eb935650

The page itself states:

Item not available
The item is not available due to issues with the item's content.

This is probably a rare case that will vanish when things are back online properly, but it might be nice to handle the case.
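
One way this could be handled, as a rough sketch rather than a proposed patch: wrap the `resp.json()` call from the traceback and turn the decode failure into a clearer error. The names `MetadataResponseError` and `parse_metadata_response` are hypothetical and not part of `internetarchive`. The sketch assumes the response object behaves like a `requests` response (`json()`, `status_code`, `headers`) and relies on the JSON decode error raised by `requests` being a `ValueError` subclass.

```python
class MetadataResponseError(Exception):
    """Hypothetical error for a metadata response that is not valid JSON."""


def parse_metadata_response(resp):
    """Return the decoded JSON body, or raise a descriptive error.

    `resp` is assumed to be a requests-style response object exposing
    `json()`, `status_code`, and `headers`.
    """
    try:
        return resp.json()
    except ValueError as exc:
        # The body was not JSON, e.g. an HTML "Item not available" page.
        content_type = resp.headers.get("Content-Type", "unknown")
        raise MetadataResponseError(
            f"Expected a JSON response but got HTTP {resp.status_code} "
            f"with content type {content_type!r}"
        ) from exc
```

With something like this in place, the CLI could catch the one error type and print a short message instead of the double traceback.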

Owner

jjjake commented Oct 28, 2024

Thanks @msikma. Yes, this would be nice to handle.
