Evaluate integration of SolrWayback #70

Closed
12 tasks done
Tracked by #39
anjackson opened this issue Mar 2, 2022 · 3 comments

anjackson commented Mar 2, 2022

Rather than continuing to roll our own search and visualization tool, we should consider adopting SolrWayback and collaborating with the NAS team on it. We could start by making it available as an internal tool, within the W3ACT stack. For this, we need to:

  • Dockerise it in a way that allows it to work with our Solr indexes.
  • Cope with content_type being either single or multiValued.
  • Ensure required configuration options can be overridden from environment variables.
  • Tomcat should log the response from the server when calls to Solr fail: add an alternative logback config for Docker so logs go to the console rather than a file.
  • Implement HTTP and/or WebHDFS WARC record retrieval back-ends so it can work properly with our WARC store.
  • SolrWayback relies on url_norm for many queries, but our older indexes do not contain it. Can we work around this somehow?
  • To facilitate automated CI testing, switch to a test WARC as well as test Solr documents, i.e. use consistent records so we can search and replay from the test system.
  • Allow WAR name to be overridden on launch, so I can use act#solrwayback and hence get it to deploy in the right place.
  • Allow the playback engine to be overridden with an alternative. Added the ALT_PLAYBACK_PREFIX env var.
  • Also override the regex for the path, as our Solr has the name only, not the full path.
  • Complete proper byte range support ukwa-warc-server#12
  • Adjust SolrWayback so it can be deployed at an alternative path (e.g. /act/solrwayback).
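
The content_type item above can be illustrated with a small sketch. Depending on the schema, Solr returns a field either as a single string or as a list of strings, so any client code has to accept both shapes. The helper below is a hypothetical Python illustration only: `as_list` and the sample documents are assumptions made for this example and are not taken from SolrWayback, which is written in Java.

```python
from typing import Any, List


def as_list(value: Any) -> List[str]:
    """Normalise a Solr field value to a list of strings.

    Accepts a missing field (None), a single value, or a multiValued list.
    """
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return [str(v) for v in value]
    return [str(value)]


# Hypothetical documents, shaped like Solr JSON results from two schemas:
# one where content_type is single-valued, one where it is multiValued.
doc_single = {"id": "a", "content_type": "text/html"}
doc_multi = {"id": "b", "content_type": ["text/html", "application/xhtml+xml"]}

for doc in (doc_single, doc_multi):
    print(doc["id"], as_list(doc.get("content_type")))
```

Normalising at the point where the response is read means the rest of the code can always treat content_type as a list, whichever index it came from.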

It's possible we can't resolve some of these issues without re-indexing, in which case this will have to wait.

Notes on public use moved to #73

@anjackson

Now built as ukwa/solrwayback:docker-hub-action for testing. Should be configurable to talk to our Solr and WARC servers, and deployable as part of the W3ACT stack.
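
As a rough illustration of what "configurable" could mean for a container like this, here is a minimal sketch of layering environment variables over built-in defaults. It is not the SolrWayback implementation: the `DEFAULTS` keys, the placeholder URLs, and `load_config` are all hypothetical. Only the ALT_PLAYBACK_PREFIX variable name comes from the task list above.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical defaults; the real SolrWayback property names and values
# may differ. The URLs are placeholders, not real service addresses.
DEFAULTS: Dict[str, str] = {
    "solr_url": "http://solr.example.org/solr/collection",
    "warc_server_url": "http://warc-server.example.org",
    "alt_playback_prefix": "",
}


def load_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the defaults, overridden by any matching environment variable.

    Each key maps to the upper-cased variable name, so "alt_playback_prefix"
    is overridden by ALT_PLAYBACK_PREFIX.
    """
    env = os.environ if env is None else env
    config = dict(DEFAULTS)
    for key in DEFAULTS:
        override = env.get(key.upper())
        if override is not None:
            config[key] = override
    return config


if __name__ == "__main__":
    # Print the effective configuration for the current environment.
    for name, value in load_config().items():
        print(f"{name}={value}")
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.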

@anjackson

Based on this: https://stackoverflow.com/a/69275528/6689

I have modified the fork of SolrWayback to use relative paths in Vue and it seems to work. The Vue CLI documentation indicates that relative paths may not work with history.pushState routing. See the 'limitations' section of https://cli.vuejs.org/config/#publicpath

@anjackson

This works well enough to trial, so calling this done.
