forked from esmero/archipelago-documentation
Merge pull request esmero#200 from alliomeria/1.4.0
Documentation New Guides & Updates for 1.4.0 : round 1
Showing 25 changed files with 557 additions and 75 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
--- | ||
title: Archipelago 101 - Core Documentation Guides | ||
tags: | ||
- Archipelago 101 | ||
- Documentation | ||
--- | ||
|
||
|
||
# Archipelago 101: Core Documentation Guides | ||
|
||
Top 10 guides we recommend you review as you get started working with Archipelago: | ||
|
||
1. [Metadata in Archipelago](metadatainarchipelago.md): a long and worthwhile read that covers the fundamentals of Archipelago's architecture and approach to metadata and data | ||
|
||
2. [Strawberryfield Formatters](strawberryfield-formatters.md): overview of the general setup of an Archipelago Digital Object (ADO) page and the way your ADO JSON metadata and data are output | ||
|
||
3. [Primer on Display Modes & How to Create a Webform as an Input Method](webformsasinput.md): deeper look at Display Modes and Form Modes, two ways you'll be interacting with your ADOs most frequently | ||
|
||
4. [Twig Templates and Archipelago](metadatatwigs.md): a great place to dive into one of Archipelago's best loved feature areas | ||
|
||
5. [Archipelago Multi Importer](ami_index.md): all about Archipelago's batch ingest and update functionality | ||
|
||
6. [Search and Solr Overview](search_solr_index.md): for repositories, it's all about the search | ||
* [In-a-nutshell : JSON data to Strawberry Keyname Providers to Solr](search_solr_index.md#in-a-nutshell-json-data-to-strawberry-keyname-providers-to-solr): essential overview of the pipeline from JSON data into and out of Solr | ||
* [Strawberry Key Name Providers, Solr Field, and Facet Configuration](strawberry_key_name_providers.md): fundamental information for site administrators | ||
|
||
7. [Advanced Batch Find and Replace](find_and_replace.md): targeted batch updates for your ADO metadata | ||
|
||
8. [Strawberry Runners Post-Processing Configuration](strawberryrunners.md): background post-processing defaults and options for all your file transformation and data indexing needs | ||
|
||
9. [Archipelago Local Deployment Guide](archipelago-deployment-readme.md): get your own local Archipelago up and running in about 15 minutes | ||
|
||
10. [Archipelago Presentations, Events, and Additional Resources](presentations_events.md): features recordings and links to different Archipelago workshops, conference presentations, and other helpful references | ||
|
||
___ | ||
|
||
Thank you for reading! Please contact us on our [Archipelago Commons Google Group](https://groups.google.com/forum/#!forum/archipelago-commons) with any questions or feedback. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
--- | ||
title: IIIF Content Search API Integration | ||
tags: | ||
- IIIF | ||
- IIIF Server Settings Form | ||
- IIIF Content Search API | ||
- Solr | ||
- Solr Fields | ||
- Solr Index | ||
--- | ||
|
||
# IIIF Content Search API Integration | ||
|
||
Beginning in release 1.3.0 and now fully mature in 1.4.0, Archipelago features IIIF Content Search API integration with attendant default configurations and settings. | ||
|
||
Through a non-trifling amount of code and maths, Archipelago speaks the IIIF Content Search API language using data from your Archipelago's Digital Objects, to enable you to search within Mirador (or other supported viewers) for specific hits within OCR, VTT file, or manually created textual annotations. | ||
|
||
Please also see the related [IIIF Server Settings Form](iiif_server_settings.md), and Strawberry Runners guides for [Reviewing and adjusting the `pager` and `ocr` Post-Processor operations](strawberryrunners_pager_ocr.md) and [Reviewing and Adjusting the `subtitle` Post-Processor operations](strawberryrunners_subtitle.md). | ||
|
||
## 1. IIIF Manifest Templates | ||
|
||
First, Archipelago's default IIIF Manifest templates explicitly declare, in the `service` key, that they support all three versions of the IIIF Content Search API. | ||
|
||
```JSON | ||
"service": [ | ||
{ | ||
"id": "{{ baseurl }}iiifcontentsearch/v2/do/{{ node.uuid.value }}/metadatadisplayexposed/iiifmanifest/mode/advanced/page/0", | ||
"type": "SearchService2" | ||
}, | ||
{ | ||
"id": "{{ baseurl }}iiifcontentsearch/v1/do/{{ node.uuid.value }}/metadatadisplayexposed/iiifmanifest/mode/advanced/page/0", | ||
"type": "SearchService1", | ||
"@context": "http://iiif.io/api/search/1/context.json", | ||
"profile": "http://iiif.io/api/search/1/search" | ||
}, | ||
{ | ||
"@id": "{{ baseurl }}iiifcontentsearch/v1/do/{{ node.uuid.value }}/metadatadisplayexposed/iiifmanifest/mode/advanced/page/0", | ||
"@context": "http://iiif.io/api/search/0/context.json", | ||
"profile": "http://iiif.io/api/search/0/search" | ||
} | ||
], | ||
``` | ||
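
To confirm from the outside that a rendered manifest really advertises these search services, a short script can read its `service` key. The sketch below is a hypothetical helper, not part of Archipelago: the function name, the `example.org` host, and the shortened UUID are illustrative assumptions, and it expects a manifest that has already been rendered to JSON (so the Twig placeholders are replaced by real values).

```python
def declared_search_services(manifest: dict) -> list[str]:
    """Return the ids of the Content Search services listed under 'service'."""
    services = manifest.get("service", [])
    # Be lenient: a single service may be given as an object instead of a list.
    if isinstance(services, dict):
        services = [services]
    ids = []
    for entry in services:
        # The newer entries in the template use 'id', the oldest one uses '@id'.
        service_id = entry.get("id") or entry.get("@id")
        if service_id and "/iiifcontentsearch/" in service_id:
            ids.append(service_id)
    return ids


# Illustrative, already-rendered fragment (hypothetical host and UUID).
sample_manifest = {
    "service": [
        {
            "id": "https://example.org/iiifcontentsearch/v2/do/1111/"
                  "metadatadisplayexposed/iiifmanifest/mode/advanced/page/0",
            "type": "SearchService2",
        },
        {
            "@id": "https://example.org/iiifcontentsearch/v1/do/1111/"
                   "metadatadisplayexposed/iiifmanifest/mode/advanced/page/0",
            "profile": "http://iiif.io/api/search/0/search",
        },
    ]
}

print(declared_search_services(sample_manifest))
```

An empty list would suggest that the manifest does not declare any search endpoint in its `service` key.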
|
||
## 2. API Endpoints Exposure | ||
|
||
Next, the default Exposed Metadata Display API endpoints (generated from the IIIF templates) provide the specific structure needed for the IIIF Content Search API. Each endpoint URL tells Archipelago which template produced the manifest, which IIIF Content Search API version to use, whether the mode is simple or advanced, and the UUID of the Archipelago Digital Object being searched (the one that contains the raw data feeding the template, or at least its top-level/parent object). | ||
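
As a rough illustration of how these pieces fit into one URL, the following sketch assembles a search request using the path layout from the default manifest template. The function name, the `example.org` host, and the sample UUID are assumptions made for this example; the `q` query parameter comes from the IIIF Content Search API specification rather than from this guide.

```python
from urllib.parse import quote, urlencode

ALLOWED_VERSIONS = {"v1", "v2"}
ALLOWED_MODES = {"simple", "advanced"}


def content_search_url(base_url, node_uuid, query,
                       endpoint="iiifmanifest", version="v2",
                       mode="advanced", page=0):
    """Build a Content Search URL following the documented path pattern."""
    if version not in ALLOWED_VERSIONS:
        raise ValueError(f"version must be one of {sorted(ALLOWED_VERSIONS)}")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    if page < 0:
        raise ValueError("page must be 0 or greater")
    path = (
        f"{base_url.rstrip('/')}/iiifcontentsearch/{version}"
        f"/do/{quote(node_uuid)}"
        f"/metadatadisplayexposed/{quote(endpoint)}"
        f"/mode/{mode}/page/{page}"
    )
    # 'q' carries the search term, as defined by the IIIF Content Search API.
    return f"{path}?{urlencode({'q': query})}"


# Hypothetical host and UUID, for illustration only.
print(content_search_url("https://example.org/", "1111-2222", "new york"))
```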
|
||
## 3. Pathway into and out of the Solr Index | ||
|
||
Then, Archipelago's backend recreates the ADO's IIIF manifest using this data (essentially repeating what the client did before), but uses JMESPath expressions to extract only what is needed. It inverts the structure so that IIIF Image IDs become the "top keys", each referencing its canvases and, if present, their `#xywh` selectors (for the annotation text). | ||
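
The "flipping" step can be pictured with a small sketch. This is not Archipelago's implementation (which relies on JMESPath and its own templates); it is a plain-Python approximation that assumes an IIIF Presentation 3 style manifest in which each painting annotation carries an image `service` with an `id` and a string `target`. The function name and sample data are hypothetical.

```python
def image_ids_to_targets(manifest: dict) -> dict[str, list[str]]:
    """Invert a manifest: image service id -> canvas targets that use it."""
    index: dict[str, list[str]] = {}
    for canvas in manifest.get("items", []):
        for page in canvas.get("items", []):
            for annotation in page.get("items", []):
                body = annotation.get("body", {})
                bodies = body if isinstance(body, list) else [body]
                # The target may carry an #xywh selector; fall back to the canvas id.
                target = annotation.get("target", canvas.get("id"))
                for item in bodies:
                    if not isinstance(item, dict):
                        continue
                    for service in item.get("service", []):
                        image_id = service.get("id") or service.get("@id")
                        if image_id:
                            index.setdefault(image_id, []).append(target)
    return index


# Hypothetical one-canvas manifest, for illustration only.
sample_manifest = {
    "items": [{
        "id": "https://example.org/do/1111/canvas/p1",
        "type": "Canvas",
        "items": [{
            "type": "AnnotationPage",
            "items": [{
                "type": "Annotation",
                "motivation": "painting",
                "body": {
                    "type": "Image",
                    "service": [{"id": "https://example.org/iiif/3/img1",
                                 "type": "ImageService3"}],
                },
                "target": "https://example.org/do/1111/canvas/p1#xywh=0,0,100,200",
            }],
        }],
    }]
}

print(image_ids_to_targets(sample_manifest))
```

With the image IDs as keys, a backend can look up which canvas (and region) a given OCR'd image belongs to without walking the whole manifest again.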
|
||
Using this transformed data, Archipelago's backend search can be limited to the OCR generated from those images only (important, as Archipelago repositories can contain millions of OCR'd documents). Archipelago's internal search then natively returns the relevant hits within the specified ADO via the [Bavarian State Library’s Solr OCR highlight plugin](https://github.com/dbmdz/solr-ocrhighlighting/). These hits are reprocessed into IIIF-compliant (W3C) annotations and mapped back to results as “canvases with images”. | ||
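
To make the last step more concrete, here is a minimal sketch of turning one OCR hit into a W3C-style annotation that targets a region of a canvas through an `#xywh` fragment. It is an approximation under stated assumptions, not Archipelago's code: the function name, the sample identifiers, and the default `motivation` value are hypothetical, and real hits may need their coordinates scaled to the canvas dimensions first.

```python
def hit_to_annotation(annotation_id, canvas_id, text,
                      x, y, width, height, motivation="painting"):
    """Wrap one OCR hit as an annotation targeting a canvas region."""
    return {
        "id": annotation_id,
        "type": "Annotation",
        # Assumption: adjust the motivation to what your viewer expects.
        "motivation": motivation,
        "body": {
            "type": "TextualBody",
            "value": text,
            "format": "text/plain",
        },
        # Media fragment selector: x, y, width, height on the canvas.
        "target": f"{canvas_id}#xywh={x},{y},{width},{height}",
    }


# Hypothetical identifiers and coordinates, for illustration only.
annotation = hit_to_annotation(
    "https://example.org/annotation/1",
    "https://example.org/do/1111/canvas/p1",
    "archipelago",
    120, 340, 210, 36,
)
print(annotation["target"])
```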
|
||
## Things to keep in mind | ||
|
||
- To keep this performant, Archipelago uses two levels of caches, which are invalidated automatically whenever any "ingredient" they depend on is modified. | ||
|
||
- Archipelago can also tell the backend to use a "different" template than the one used at the front (Mirador), allowing you to define which "canvases" are searchable. This is not a normal use case, but still a valid one. And you can, per resource, have complex logic and/or different Viewers, even on a one by one basis. | ||
|
||
### Acknowledgements | ||
|
||
Archipelago's developers would like to extend our gratitude to our community, especially to [Mike](https://github.com/digitaldogsbody) and [Johannes](https://github.com/jbaiter) for their work and help, and everyone else in the IIIF and repository communities for all the amazing tools, viewers, specs and cookbook examples. | ||
|
||
___ | ||
|
||
Thank you for reading! Please contact us on our [Archipelago Commons Google Group](https://groups.google.com/forum/#!forum/archipelago-commons) with any questions or feedback. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
--- | ||
title: IIIF Server Settings Form Default Settings | ||
tags: | ||
- IIIF | ||
- IIIF Server Settings Form | ||
- IIIF Content Search API | ||
- Solr | ||
- Solr Fields | ||
- Solr Index | ||
--- | ||
|
||
# IIIF Server Settings Form Default Settings | ||
|
||
The IIIF Server Settings Form is used to configure different IIIF-related settings used throughout your Archipelago environment. We strongly advise keeping the default settings intact. The necessary [Solr Fields](strawberry_key_name_providers.md#creating-a-solr-field) listed below should be set up by default. | ||
|
||
You can find the IIIF Configuration Form: | ||
|
||
- Through the `Manage` menu > `Configuration` > `Archipelago` > `Configure Strawberry Runners Post Processors` | ||
- Directly at `/admin/config/archipelago/iiif` | ||
|
||
![IIIF Server Settings Form](images/iiif_server_settings_form.png) | ||
|
||
On the IIIF Server Settings Form page, you will see the following: | ||
|
||
|
||
1. Note that these 'IIIF Server configuration URLs are used as defaults for field formatters using IIIF, but can be overridden on a one by one basis when setting up your formatters for each Display Mode.' | ||
|
||
2. Base URL of your IIIF Media Server public accessible from the Outside World. | ||
- Please provide a publicly accessible IIIF server URL. This URL will be used for AJAX and JS calls. Trailing Slashes will be removed. | ||
- Set to `http://localhost:8183/iiif/2` by default. | ||
- We do not recommend changing this selection. | ||
|
||
3. Base URL of your IIIF Media Server accessible from inside this Webserver. | ||
- Please provide an internal IIIF server URL. This URL will be used by internal server calls and needs to be locally accessible by your server, e.g. 127.0.0.1 or a local Docker alias. Trailing slashes will be removed. | ||
- Set to `http://esmero-cantaloupe:8182/iiif/2` by default. | ||
- We do not recommend changing this selection. | ||
|
||
4. Checkbox to 'Enable IIIF Content Search API V1 and V2 endpoints'. | ||
- Checked by default in later (1.4.0+) versions of Archipelago. | ||
- See the [related (and essential) IIIF Manifest snippet shared here](iiif-content-search.md#1-iiif-manifest-templates) | ||
- APIs are accessible at the following path: `/iiifcontentsearch/{version}/do/{node_uuid}/metadatadisplayexposed/{metadataexposeconfig_entity}/mode/{mode}/page/{page}` with: | ||
- `{version}`: one of [v1,v2] | ||
- `{node_uuid}`: the UUID of the ADO whose Manifest you want to search inside | ||
- `{metadataexposeconfig_entity}`: the machine name of the exposed Metadata Display endpoint used to render the Manifest that is calling the API (e.g. iiifmanifest) | ||
- `{mode}`: one of [simple,advanced]. Advanced is the smartest choice. Simple is faster, but requires your Canvas ids to follow exactly this pattern: `http(s)://domain.ext/do/{node_uuid}/{file_uuid}/canvas/{internal_to_the_file_sequence_order}` | ||
- `{page}`: 0 to N, depending on the number of results. By default, please use 0 | ||
|
||
5. Checkbox to 'Only allow searches inside a Manifest If the Manifest itself (for an ADO) defines the Search Endpoints as a Service' | ||
- Checked by default in later (1.4.0+) versions of Archipelago. | ||
- If enabled we will double check if the calling IIIF Manifest defines the Endpoint(s) in the `service` key. If unchecked any Manifest will be searchable by calling an API URL directly. | ||
|
||
6. IIIF Content Search API: field(s) that holds Parent Nodes | ||
- Strawberry Flavor Data Source Search API Fields that can be used to connect a Strawberry Flavor to a Parent ADO. | ||
- Default specified fields are: `Strawberryfield Flavor Datasource >> SBF Parent ID` and `Strawberryfield Flavor Datasource >> SBF Parent Node >> isPartOf >> ID` | ||
|
||
7. Strawberry Runner processors that should be searched against for visual highlights. | ||
- e.g Strawberry Flavor Data might have been generated by the "ocr" strawberry runners processor. A comma separated list of processors (machine names) that generated miniOCR. | ||
- Default is: `ocr` | ||
- If you are using the [Strawberry Runners `pager` and `ocr` post-processors](strawberryrunners_pager_ocr.md), you should always keep this enabled. | ||
|
||
8. Strawberry Runner processors that should be searched against for time based media. | ||
- e.g Strawberry Flavor Data might have been generated by the "subtitle" strawberry runners processor. These will have time based fragments and will match IIIF Annotations with motivation supplementing and target the time based media on the parent Canvas. A comma separated list of processors (machine names) that generated time based transcripts encoded as miniOCR. | ||
- Default is: `subtitle` | ||
|
||
9. Check to 'Target the VTT Supplementing Annotation' | ||
- If enabled (aligned with the specs), the target of a hit result will point to the supplementing Annotation that contains the VTT file in its body. If not, it will point to the Canvas that contains a Media Resource in its body (less precise but more compatible with Viewers). | ||
- If you are using the [Strawberry Runners `subtitle` post-processor](strawberryrunners_subtitle.md), you should always keep this enabled. | ||
|
||
10. Strawberry Runner processors that should be searched against plain text extractions. | ||
- e.g Strawberry Flavor Data might have been generated by the "text" strawberry runners processor. These will not have coordinates but will match IIIF Annotations with motivation supplementing and target the whole canvas. A comma separated list of processors (machine names) that generated time based transcripts encoded as miniOCR. | ||
- Default is: `text` | ||
- If you are using the [Strawberry Runners `subtitle` post-processor](strawberryrunners_subtitle.md), you should always keep this enabled. | ||
|
||
11. IIIF Content Search API: field(s) that hold the URI of the File that produced the Searchable content | ||
- Strawberry Flavor Data Source Search API Fields that hold the URI of the File that generated its content. | ||
- Default specified fields are: `Strawberryfield Flavor Datasource >> Parent File`, `Strawberryfield Flavor Datasource >> SBF source or related URI/URL`, and `Strawberryfield Flavor Datasource >> Parent File >> URI` | ||
|
||
12. IIIF Content Search API: Max Results per Page | ||
- Default is: `25` | ||
|
||
13. IIIF Content Search API: Max allowed characters/length for a Search term | ||
- Default is: `64` | ||
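
A few of the settings above translate directly into simple checks. The sketch below shows, under stated assumptions, how a client or test script might verify that a Canvas id fits the `simple` mode pattern, split a comma separated processor list, and respect the default maximum search term length of `64`. All function names are hypothetical, and the regular expression assumes standard 36-character UUIDs and a numeric sequence order, which the guide does not state explicitly.

```python
import re

MAX_TERM_LENGTH = 64  # default from "Max allowed characters/length for a Search term"

# Assumption: UUIDs are 36 characters (hex digits and hyphens) and the
# sequence order is a non-negative integer.
SIMPLE_CANVAS_ID = re.compile(
    r"^https?://[^/]+/do/"
    r"(?P<node_uuid>[0-9a-fA-F-]{36})/"
    r"(?P<file_uuid>[0-9a-fA-F-]{36})/"
    r"canvas/(?P<sequence>\d+)$"
)


def is_simple_mode_canvas_id(canvas_id: str) -> bool:
    """True if the Canvas id follows the pattern required by 'simple' mode."""
    return SIMPLE_CANVAS_ID.match(canvas_id) is not None


def parse_processors(value: str) -> list[str]:
    """Split a comma separated list of processor machine names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def term_is_allowed(term: str, max_length: int = MAX_TERM_LENGTH) -> bool:
    """True if the search term is non-empty and within the length limit."""
    return 0 < len(term.strip()) <= max_length


# Hypothetical values, for illustration only.
canvas = ("https://example.org/do/123e4567-e89b-12d3-a456-426614174000/"
          "123e4567-e89b-12d3-a456-426614174001/canvas/1")
print(is_simple_mode_canvas_id(canvas))
print(parse_processors("ocr, subtitle"))
print(term_is_allowed("archipelago"))
```

If your Canvas ids do not match this pattern, the `advanced` mode remains the safer choice.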
|
||
___ | ||
|
||
Return to the main [Strawberry Runners](strawberryrunners.md) or the [Archipelago Documentation main page](index.md). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,15 @@ | ||
# Archipelago Commons Intro | ||
|
||
Archipelago Commons, or simply Archipelago, is an Open Source Digital Objects Repository / DAM Server Architecture based on the popular CMS [`Drupal 9/10`](https://www.drupal.org) and released under [`GLP V.3 License`](https://www.gnu.org/licenses/gpl-3.0.txt). Archipelago is developed and supported at the [Metropolitan New York Library Council (METRO)](https://metro.org). | ||
Archipelago Commons, or simply Archipelago, is an Open Source Digital Objects Repository / DAM Server Architecture based on the popular CMS [`Drupal 9/10+`](https://www.drupal.org) and released under [`GPL V.3 License`](https://www.gnu.org/licenses/gpl-3.0.txt). Archipelago is developed and supported at the [Metropolitan New York Library Council (METRO)](https://metro.org). | ||
|
||
Archipelago is a mix of deeply integrated custom-coded Drupal modules (made with care by us) and a curated and well-configured Drupal instance, running under a discrete and well-planned set of service containers. Learn more about the different [`Software Services`](devops.md) used by Archipelago. | ||
Archipelago is a mix of deeply integrated custom-coded Drupal modules (made with care by us, the [Digital Services Team and METRO](https://metro.org/digital-services)) and a curated and well-configured Drupal instance, running under a discrete and well-planned set of complementary additional service containers. You can learn more about the different [Software Services used by Archipelago here](devops.md), and [Archipelago's unique approach to Metadata here](metadatainarchipelago.md). | ||
|
||
Archipelago's primary focus is to serve the greater [`GLAM community`](https://en.wikipedia.org/wiki/GLAM_(industry_sector)) by providing a flexible, consistent, and unified way of describing, storing, linking, exposing metadata and media assets. We respect identities and existing workflows. We endeavor to design Archipelago in ways that empower communities of every size and shape. | ||
Archipelago's primary focus is to serve the greater [`GLAM community`](https://en.wikipedia.org/wiki/GLAM_(industry_sector)) (libraries, archives, museums, universities and colleges, cultural heritage organizations) by providing a flexible, consistent, and unified way of describing, storing, linking, and exposing the metadata and media assets that make up rich repository collections all around our shared beautiful world. We respect identities and existing workflows, and we endeavor to design Archipelago in ways that empower communities of every size, shade, and shape. | ||
|
||
Finally, Archipelago tries to stay humble, slim, and nimble in nature with a small codebase full of inline comments and `@todos`. All of our work is driven by a clear and [concise but thoughtful planned technical roadmap --updated in tandem with new releases](https://github.com/esmero/archipelago-deployment/issues/243). | ||
|
||
We recommend you start with the [Core Documentation Guides listed here](101_guides_list.md) as you begin your Archipelago explorations. | ||
___ | ||
|
||
Thank you for reading! Please contact us on our [Archipelago Commons Google Group](https://groups.google.com/forum/#!forum/archipelago-commons) with any questions or feedback. | ||
|
||
Finally, Archipelago tries to stay humble, slim, and nimble in nature with a small code base full of inline comments and `@todos`. All of our work is driven by a clear and [concise but thoughtful planned technical roadmap --updated in tandem with new releases](https://github.com/esmero/archipelago-deployment/issues/243). |