Create caching screenshot service for OA material #1

Open
10 tasks done
anjackson opened this issue Dec 10, 2018 · 5 comments

anjackson commented Dec 10, 2018

Add an endpoint that uses https://github.com/ukwa/webrender-api to render screenshots of OA items via pywb in proxy mode. It should cache the screenshots essentially permanently, and keep the list of URLs requested during each render so that the transcluded resources can be whitelisted.
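
As a rough illustration of the caching behaviour described above, here is a minimal sketch. It assumes the renderer is wrapped in a callable that returns the image bytes plus the list of URLs requested during rendering; that callable, the function name `cached_screenshot`, and the on-disk layout are all hypothetical and are not taken from webrender-api or this repository.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List, Tuple

# Hypothetical renderer interface: takes a URL and returns
# (image bytes, list of URLs requested while rendering).
Renderer = Callable[[str], Tuple[bytes, List[str]]]


def cached_screenshot(url: str, cache_dir: Path, render: Renderer) -> Tuple[bytes, List[str]]:
    """Return (image, requested_urls) for url, rendering only on a cache miss."""
    # Hash the URL so it can safely be used as a file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    image_path = cache_dir / f"{key}.png"
    urls_path = cache_dir / f"{key}.urls.json"

    # Cache hit: nothing ever expires, so the stored copy is returned as-is.
    if image_path.exists() and urls_path.exists():
        requested = json.loads(urls_path.read_text(encoding="utf-8"))
        return image_path.read_bytes(), requested

    # Cache miss: render once, then store the image and the URL list side by side.
    image, requested = render(url)
    cache_dir.mkdir(parents=True, exist_ok=True)
    image_path.write_bytes(image)
    urls_path.write_text(json.dumps(requested), encoding="utf-8")
    return image, requested
```

Keeping the URL list next to the image means the transclusions for a given screenshot can be looked up later without re-rendering.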

This has been made much easier by using an off-the-shelf caching IIIF server (Cantaloupe) as an intermediary. This now works reasonably well, see: ukwa/ukwa-services#24 (comment)
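
For context, an IIIF Image API request encodes the region, size, rotation, quality and format in the URL path, which is what lets a generic IIIF server do the resizing and caching. The sketch below only builds such a URL. The base URL and the idea of using the original page URL as the IIIF identifier are assumptions for illustration; the actual identifier scheme used with Cantaloupe here is not stated in this issue.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API URL: base/identifier/region/size/rotation/quality.format"""
    # The identifier must be percent-encoded so its slashes are not read as path separators.
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier, requesting a 300-pixel-wide rendition.
print(iiif_image_url("https://iiif.example.org/iiif/2", "http://example.org/", size="300,"))
```

Note that the `max` size keyword is defined in Image API 2.1 and 3.0; a server limited to 2.0 would expect `full` instead.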

Some work remains:

  • Remove banner by having separate pywb config for rendering.
  • Chase up T:73492 (all o' Twitter)
  • Review exposed API and tidy up.
  • Actually do thumbs/full/card sizes properly
  • Update pywb and add in IIIF API hooks for social cards.
  • Note switch to https://flask-restx.readthedocs.io/en/latest/
  • Extend Flask-RESTx to add logo etc as per this
  • Switch to general OG markup rather than proprietary Twitter tags, as per
  • Report pywb returning that 304. (???)
  • Implement test suite to cover new functionality.
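
To make the Open Graph item above a little more concrete, here is a minimal sketch that emits generic `og:` meta tags for a social card. The function name, parameters and example URLs are hypothetical; only the `og:title`, `og:type`, `og:url` and `og:image` property names come from the Open Graph protocol.

```python
from html import escape


def og_meta_tags(title, page_url, image_url, og_type="website"):
    """Return Open Graph <meta> tags as an HTML fragment, one tag per line."""
    properties = [
        ("og:title", title),
        ("og:type", og_type),
        ("og:url", page_url),
        ("og:image", image_url),
    ]
    # Escape values so quotes or angle brackets in titles cannot break the markup.
    return "\n".join(
        f'<meta property="{escape(name, quote=True)}" content="{escape(value, quote=True)}" />'
        for name, value in properties
    )


# Hypothetical page and card image URLs.
print(og_meta_tags("Example page", "https://example.org/page", "https://example.org/card.jpg"))
```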

anjackson commented Apr 4, 2019

So, started working.

anjackson commented

As per ukwa/ukwa-services#24 (comment), I added an IIIF server, which makes a lot of this much easier. There are still lots of details to tie off.

anjackson commented Dec 20, 2020

We also need to access the versions rendered at crawl time. The simplest approach is to use the screenshot: prefix, but PyWB doesn't support that at this time.

anjackson commented

I monkeypatched ukwa-pywb to allow the screenshot:http URL scheme through, but now we get a 451 because the access rights are not matched up. See ukwa/ukwa-pywb#62
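
For readers unfamiliar with the convention, the `screenshot:` prefix simply wraps the original URL, giving values such as `screenshot:http://example.org/`. The helpers below are a purely illustrative sketch of building and recognising such values; they are hypothetical and are not part of pywb or ukwa-pywb.

```python
SCREENSHOT_PREFIX = "screenshot:"


def to_screenshot_url(url):
    """Wrap an ordinary URL in the screenshot: prefix."""
    return SCREENSHOT_PREFIX + url


def split_screenshot_url(value):
    """Return (is_screenshot, original_url) for a possibly prefixed URL."""
    if value.startswith(SCREENSHOT_PREFIX):
        return True, value[len(SCREENSHOT_PREFIX):]
    return False, value


print(to_screenshot_url("http://example.org/"))
print(split_screenshot_url("screenshot:http://example.org/"))
```

Because the prefixed value is a distinct URL in its own right, any access rules keyed on the original URL would not automatically apply to it, which fits the 451 response described above.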

anjackson commented

This mess of ideas needs splitting into separate tickets. Deferring for now.

anjackson pushed a commit that referenced this issue Apr 24, 2023
Update push-to-docker-hub.yml