diff --git a/docs/101_guides_list.md b/docs/101_guides_list.md new file mode 100644 index 00000000..11d45899 --- /dev/null +++ b/docs/101_guides_list.md @@ -0,0 +1,37 @@ +--- +title: Archipelago 101 - Core Documentation Guides +tags: + - Archipelago 101 + - Documentation +--- + + +# Archipelago 101: Core Documentation Guides + +Top 10 guides we recommend you review as you get started working with Archipelago: + +1. [Metadata in Archipelago](metadatainarchipelago.md): a long and worthwhile read that covers the fundamentals of Archipelago's architecture and approach to metadata and data + +2. [Strawberryfield Formatters](strawberryfield-formatters.md): overview of the general setup of an Archipelago Digital Object (ADO) page and the way your ADO JSON metadata and data are output + +3. [Primer on Display Modes & How to Create a Webform as an Input Method](webformsasinput.md): deeper look at Display Modes and Form Modes, two ways you'll be interacting with your ADOs most frequently + +4. [Twig Templates and Archipelago](metadatatwigs.md): a great place to dive into one of Archipelago's best loved feature areas + +5. [Archipelago Multi Importer](ami_index.md): all about Archipelago's batch ingest and update functionality + +6. [Search and Solr Overview](search_solr_index.md): for repositories, it's all about the search + * [In-a-nutshell : JSON data to Strawberry Keyname Providers to Solr](search_solr_index.md#in-a-nutshell-json-data-to-strawberry-keyname-providers-to-solr): essential overview of the pipeline from JSON data into and out of Solr + * [Strawberry Key Name Providers, Solr Field, and Facet Configuration](strawberry_key_name_providers.md): fundamental information for site administrators + +7. [Advanced Batch Find and Replace](find_and_replace.md): targeted batch updates for your ADO metadata + +8. 
[Strawberry Runners Post-Processing Configuration](strawberryrunners.md): background post-processing defaults and options for all your file transformation and data indexing needs + +9. [Archipelago Local Deployment Guide](archipelago-deployment-readme.md): get your own local Archipelago up and running in about 15 minutes + +10. [Archipelago Presentations, Events, and Additional Resources](presentations_events.md): features recordings and links to different Archipelago workshops, conference presentations, and other helpful references + +___ + +Thank you for reading! Please contact us on our [Archipelago Commons Google Group](https://groups.google.com/forum/#!forum/archipelago-commons) with any questions or feedback. diff --git a/docs/find_and_replace.md b/docs/find_and_replace.md index ab9a6590..eb8a1395 100644 --- a/docs/find_and_replace.md +++ b/docs/find_and_replace.md @@ -91,6 +91,14 @@ After reviewing the 'Important Notes & Workflow Recommendations' below, please s The Actions available through Archipelago's Advanced Batch Find and Replace can potentially have repository-wide effects. It is strongly recommended that you proceed with caution when executing any of the available Actions. + +!!! warning "Adding New Facets" + + The default Facets available through Archipelago's Advanced Batch Find and Replace have an important configuration selection made on each individual Facet. For every [new Facet you add](strawberry_key_name_providers.md) for Find and Replace, you need to select the checkboxes for both the 'VBO batch handler' settings to use the `VBO Batch Facet processor`, and the selection within the 'VBO batch handler settings' to `Use URL based facets in VBO Batches`. You need to make sure these are selected so that the "visible" list/count of objects you filter using a Facet is respected during actual VBO process execution of batch changes you make for any Find and Replace Actions. 
+ Also, please be aware that Drupal's VBO does not pass a "limit" (unless your VIEW is actually configured to "SHOW" a defined number of results, which most users will never use). Because of that, when you run a VBO-based action, the default batch limitation will be set to the Search API/Solr defined Limit. You can view this Limit information at +'~yoursite/admin/config/search/search-api/server/esmero_solr/edit', under the Advanced Tab. This means that if you set a Limit of 100 in your Search API/Solr defined Limit, then see 1000 objects in your Find and Replace results and select all 1000 results for batch change operations, only 100 changes will be processed when you run your Find and Replace action. For now, Archipelago has no way to work around this VBO-related behavior (feel free to open an ISSUE; perhaps a way can be found!). + + ## Simulation Mode Before executing any of the available Find and Replace Actions, the best-practice workflow recommendation is to **always** first run in Simulation Mode: diff --git a/docs/iiif-content-search.md b/docs/iiif-content-search.md new file mode 100644 index 00000000..3124e200 --- /dev/null +++ b/docs/iiif-content-search.md @@ -0,0 +1,67 @@ +--- +title: IIIF Content Search API Integration +tags: + - IIIF + - IIIF Server Settings Form + - IIIF Content Search API + - Solr + - Solr Fields + - Solr Index +--- + +# IIIF Content Search API Integration + +Beginning in release 1.3.0 and now fully mature in 1.4.0, Archipelago features IIIF Content Search API integration with attendant default configurations and settings. + +Through a non-trifling amount of code and maths, Archipelago speaks the IIIF Content Search API language using data from your Archipelago's Digital Objects, to enable you to search within Mirador (or other supported viewers) for specific hits within OCR, VTT file, or manually created textual annotations. 
+ +Please also see the related [IIIF Server Settings Form](iiif_server_settings.md), and Strawberry Runners guides for [Reviewing and adjusting the `pager` and `ocr` Post-Processor operations](strawberryrunners_pager_ocr.md) and [Reviewing and Adjusting the `subtitle` Post-Processor operations](strawberryrunners_subtitle.md). + +## 1. IIIF Manifest Templates + +First, Archipelago's default IIIF Manifest templates explicitly state that they support the 3 versions of IIIF Content Search APIs in the 'service' key. + +```JSON +"service": [ + { + "id": "{{ baseurl }}iiifcontentsearch/v2/do/{{ node.uuid.value }}/metadatadisplayexposed/iiifmanifest/mode/advanced/page/0", + "type": "SearchService2" + }, + { + "id": "{{ baseurl }}iiifcontentsearch/v1/do/{{ node.uuid.value }}/metadatadisplayexposed/iiifmanifest/mode/advanced/page/0", + "type": "SearchService1", + "@context": "http://iiif.io/api/search/1/context.json", + "profile": "http://iiif.io/api/search/1/search" + }, + { + "@id": "{{ baseurl }}iiifcontentsearch/v1/do/{{ node.uuid.value }}/metadatadisplayexposed/iiifmanifest/mode/advanced/page/0", + "@context": "http://iiif.io/api/search/0/context.json", + "profile": "http://iiif.io/api/search/0/search" + } + ], +``` + +## 2. API Endpoints Exposure + +Next, in the default Exposed Metadata Endpoints API Endpoints (generated from the IIIF templates), Archipelago provides the specific structure needed for the IIIF Content Search API. Archipelago passes the data about “the template containing it”, the IIIF API version, whether the mode is simple or advanced, and the Archipelago Digital Object resource UUID we are searching against (the one that contains the RAW data feeding the template, or at least the Top level/parent one of that). + +## 3. 
Pathway into and out of the Solr Index + +Then, Archipelago's backend recreates an ADO's IIIF manifest using this data (basically repeating what the client did before), but uses JMESPATHs to extract just what is needed, flipping the order of the structure and putting IIIF Image IDs as "top keys" referencing canvases and their #xywh selectors (for the annotation text), if present. + +Using this transformed data, Archipelago's backend search can be limited to OCR generated only by those images (importantly, as Archipelago repositories can contain millions of OCR'd documents). Archipelago's internal search then returns natively, via the [Bavarian State Library’s Solr OCR highlight plugin](https://github.com/dbmdz/solr-ocrhighlighting/), the relevant hits within a specified ADO. These are then reprocessed to be IIIF compliant (W3C) annotations and then reverted back to results as “canvases with images”. + +## Things to keep in mind + +- To make this performant, Archipelago uses two levels of caches that get invalidated automatically whenever any "ingredient" used is modified. + +- Archipelago can also tell the backend to use a "different" template than the one used at the front (Mirador), allowing you to define which "canvases" are searchable. This is not a normal use case, but still a valid one. And you can, per resource, have complex logic and/or different Viewers, even on a one by one basis. + +### Acknowledgements + +Archipelago's developers would like to extend our gratitude to our community, especially to [Mike](https://github.com/digitaldogsbody) and [Johannes](https://github.com/jbaiter) for their work and help, and everyone else in the IIIF and repository communities for all the amazing tools, viewers, specs and cookbook examples. + +___ + +Thank you for reading! Please contact us on our [Archipelago Commons Google Group](https://groups.google.com/forum/#!forum/archipelago-commons) with any questions or feedback. 
+ diff --git a/docs/iiif_server_settings.md b/docs/iiif_server_settings.md new file mode 100644 index 00000000..648e431c --- /dev/null +++ b/docs/iiif_server_settings.md @@ -0,0 +1,86 @@ +--- +title: IIIF Server Settings Form Default Settings +tags: + - IIIF + - IIIF Server Settings Form + - IIIF Content Search API + - Solr + - Solr Fields + - Solr Index +--- + +# IIIF Server Settings Form Default Settings + +The IIIF Server Settings Form is used to configure different IIIF related settings used throughout your Archipelago environment. We strongly advise keeping the default settings intact. The necessary [Solr Fields](strawberry_key_name_providers.md#creating-a-solr-field) listed below should be set up by default. + +You can find the IIIF Configuration Form: + +- Through the `Manage` menu > `Configuration` > `Archipelago` > `Configure Strawberry Runners Post Processors` +- Directly at `/admin/config/archipelago/iiif` + +![IIIF Server Settings Form](images/iiif_server_settings_form.png) + +On the IIIF Server Settings Form page, you will see the following: + + +1. Note that these 'IIIF Server configuration URLs are used as defaults for field formatters using IIIF, but can be overridden on a one by one basis when setting up your formatters for each Display Mode.' + +2. Base URL of your IIIF Media Server public accessible from the Outside World. + - Please provide a publicly accessible IIIF server URL. This URL will be used for AJAX and JS calls. Trailing Slashes will be removed. + - Set to `http://localhost:8183/iiif/2` by default. + - We do not recommend changing this selection. + +3. Base URL of your IIIF Media Server accessible from inside this Webserver. + - Please provide Internal IIIF server URL. This URL will be used by Internal Server calls and needs to be locally accessible by your server, e.g 127.0.0.1 or an local Docker alias. Trailing Slashes will be removed. + - Set to `http://esmero-cantaloupe:8182/iiif/2` by default. 
+ - We do not recommend changing this selection. + +4. Checkbox to 'Enable IIIF Content Search API V1 and V2 endpoints'. + - Checked by default in later (1.4.0+) versions of Archipelago. + - See the [related (and essential) IIIF Manifest snippet shared here](iiif-content-search.md#1-iiif-manifest-templates) + - APIs are accessible at the following path: "/iiifcontentsearch/{version}/do/{node_uuid}/metadatadisplayexposed/{metadataexposeconfig_entity}/mode/{mode}/page/{page}" with: + - {version} one of [v1,v2] + - {node_uuid} the UUID of the ADO whose Manifest you want to search inside + - {metadataexposeconfig_entity} the machine name of the exposed Metadata Display endpoint used to render the Manifest that is calling the API (e.g iiifmanifest) + - {mode} one of [simple,advanced]. Advanced is the smartest choice. Simple is faster, but requires your Canvas ids to be exactly in this pattern http(s)://domain.ext/do/{node_uuid}/{file_uuid}/canvas/{internal_to_the_file_sequence_order} + - {page} 0 to N depending on the Number of results. By default please use 0 + +5. Checkbox to 'Only allow searches inside a Manifest If the Manifest itself (for an ADO) defines the Search Endpoints as a Service' + - Checked by default in later (1.4.0+) versions of Archipelago. + - If enabled we will double check if the calling IIIF Manifest defines the Endpoint(s) in the `service` key. If unchecked any Manifest will be searchable by calling an API URL directly. + +6. IIIF Content Search API: field(s) that holds Parent Nodes + - Strawberry Flavor Data Source Search API Fields that can be used to connect a Strawberry Flavor to a Parent ADO. + - Default specified fields are: `Strawberryfield Flavor Datasource >> SBF Parent ID` and `Strawberryfield Flavor Datasource >> SBF Parent Node >> isPartOf >> ID` + +7. Strawberry Runner processors that should be searched against for visual highlights. + - e.g Strawberry Flavor Data might have been generated by the "ocr" strawberry runners processor. 
A comma separated list of processors (machine names) that generated miniOCR. + - Default is: `ocr` + - If you are using the [Strawberry Runners `pager` and `ocr` post-processors](strawberryrunners_pager_ocr.md), you should always keep this enabled. + +8. Strawberry Runner processors that should be searched against for time based media. + - e.g Strawberry Flavor Data might have been generated by the "subtitle" strawberry runners processor. These will have time based fragments and will match IIIF Annotations with motivation supplementing and target the time based media on the parent Canvas. A comma separated list of processors (machine names) that generated time based transcripts encoded as miniOCR. + - Default is: `subtitle` + +9. Check to 'Target the VTT Supplementing Annotation' + - If enabled (aligned with the specs) the target of a hit result will point to the supplementing Annotation containing in its body the VTT file. If not the Canvas containing in its body a Media Resource (less precise but more compatible with Viewers) + - If you are using the [Strawberry Runners `subtitle` post-processor](strawberryrunners_subtitle.md), you should always keep this enabled. + +10. Strawberry Runner processors that should be searched against plain text extractions. + - e.g Strawberry Flavor Data might have been generated by the "text" strawberry runners processor. These will not have coordinates but will match IIIF Annotations with motivation supplementing and target the whole canvas. A comma separated list of processors (machine names) that generated time based transcripts encoded as miniOCR. + - Default is: `text` + - If you are using the [Strawberry Runners `subtitle` post-processor](strawberryrunners_subtitle.md), you should always keep this enabled. + +11. IIIF Content Search API: field(s) that hold the URI of the File that produced the Searchable content + - Strawberry Flavor Data Source Search API Fields that hold the URI of the File that generated its content. 
+ - Default specified fields are: `Strawberryfield Flavor Datasource >> Parent File`, `Strawberryfield Flavor Datasource >> SBF source or related URI/URL`, and `Strawberryfield Flavor Datasource >> Parent File >> URI` + +12. IIIF Content Search API: Max Results per Page + - Default is: `25` + +13. IIIF Content Search API: Max allowed characters/length for a Search term + - Default is: `64` + +___ + +Return to the main [Strawberry Runners](strawberryrunners.md) or the [Archipelago Documentation main page](index.md). diff --git a/docs/images/ado-type-to-view-mode-mapping.png b/docs/images/ado-type-to-view-mode-mapping.png new file mode 100644 index 00000000..b5b7c289 Binary files /dev/null and b/docs/images/ado-type-to-view-mode-mapping.png differ diff --git a/docs/images/display-modes-2024.png b/docs/images/display-modes-2024.png new file mode 100644 index 00000000..d943f8cd Binary files /dev/null and b/docs/images/display-modes-2024.png differ diff --git a/docs/images/forms-modes-2024.png b/docs/images/forms-modes-2024.png new file mode 100644 index 00000000..e1f3b47a Binary files /dev/null and b/docs/images/forms-modes-2024.png differ diff --git a/docs/images/iiif_server_settings_form.png b/docs/images/iiif_server_settings_form.png new file mode 100644 index 00000000..8120dada Binary files /dev/null and b/docs/images/iiif_server_settings_form.png differ diff --git a/docs/images/manage-display-2024.png b/docs/images/manage-display-2024.png new file mode 100644 index 00000000..754f712d Binary files /dev/null and b/docs/images/manage-display-2024.png differ diff --git a/docs/images/manage-display-coll.png b/docs/images/manage-display-coll.png new file mode 100644 index 00000000..2ccd19cc Binary files /dev/null and b/docs/images/manage-display-coll.png differ diff --git a/docs/images/manage-form-display-2024.png b/docs/images/manage-form-display-2024.png new file mode 100644 index 00000000..abb392ed Binary files /dev/null and b/docs/images/manage-form-display-2024.png 
differ diff --git a/docs/images/managing-display-modes-2024.png b/docs/images/managing-display-modes-2024.png new file mode 100644 index 00000000..eb4adf1d Binary files /dev/null and b/docs/images/managing-display-modes-2024.png differ diff --git a/docs/images/sbr_subtitle.png b/docs/images/sbr_subtitle.png new file mode 100644 index 00000000..67fa12f7 Binary files /dev/null and b/docs/images/sbr_subtitle.png differ diff --git a/docs/images/strawberryrunnershome_updated.png b/docs/images/strawberryrunnershome_updated.png new file mode 100644 index 00000000..df50db79 Binary files /dev/null and b/docs/images/strawberryrunnershome_updated.png differ diff --git a/docs/index.md b/docs/index.md index c1f4fd26..956e24af 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,9 +1,15 @@ # Archipelago Commons Intro -Archipelago Commons, or simply Archipelago, is an Open Source Digital Objects Repository / DAM Server Architecture based on the popular CMS [`Drupal 9/10`](https://www.drupal.org) and released under [`GLP V.3 License`](https://www.gnu.org/licenses/gpl-3.0.txt). Archipelago is developed and supported at the [Metropolitan New York Library Council (METRO)](https://metro.org). +Archipelago Commons, or simply Archipelago, is an Open Source Digital Objects Repository / DAM Server Architecture based on the popular CMS [`Drupal 9/10+`](https://www.drupal.org) and released under [`GPL V.3 License`](https://www.gnu.org/licenses/gpl-3.0.txt). Archipelago is developed and supported at the [Metropolitan New York Library Council (METRO)](https://metro.org). -Archipelago is a mix of deeply integrated custom-coded Drupal modules (made with care by us) and a curated and well-configured Drupal instance, running under a discrete and well-planned set of service containers. Learn more about the different [`Software Services`](devops.md) used by Archipelago. 
+Archipelago is a mix of deeply integrated custom-coded Drupal modules (made with care by us, the [Digital Services Team and METRO](https://metro.org/digital-services)) and a curated and well-configured Drupal instance, running under a discrete and well-planned set of complementary additional service containers. You can learn more about the different [Software Services used by Archipelago here](devops.md), and [Archipelago's unique approach to Metadata here](metadatainarchipelago.md). -Archipelago's primary focus is to serve the greater [`GLAM community`](https://en.wikipedia.org/wiki/GLAM_(industry_sector)) by providing a flexible, consistent, and unified way of describing, storing, linking, exposing metadata and media assets. We respect identities and existing workflows. We endeavor to design Archipelago in ways that empower communities of every size and shape. +Archipelago's primary focus is to serve the greater [`GLAM community`](https://en.wikipedia.org/wiki/GLAM_(industry_sector)) (libraries, archives, museums, universities and colleges, cultural heritage organizations) by providing a flexible, consistent, and unified way of describing, storing, linking, exposing metadata and media assets that make up rich repository collections all around our shared beautiful world. We respect identities and existing workflows, and we endeavor to design Archipelago in ways that empower communities of every size, shade, and shape. + +Finally, Archipelago tries to stay humble, slim, and nimble in nature with a small codebase full of inline comments and `@todos`. All of our work is driven by a clear and [concise but thoughtful planned technical roadmap --updated in tandem with new releases](https://github.com/esmero/archipelago-deployment/issues/243). + +We recommend you start with the [Core Documentation Guides listed here](101_guides_list.md) as you begin your Archipelago explorations. +___ + +Thank you for reading! 
Please contact us on our [Archipelago Commons Google Group](https://groups.google.com/forum/#!forum/archipelago-commons) with any questions or feedback. -Finally, Archipelago tries to stay humble, slim, and nimble in nature with a small code base full of inline comments and `@todos`. All of our work is driven by a clear and [concise but thoughtful planned technical roadmap --updated in tandem with new releases](https://github.com/esmero/archipelago-deployment/issues/243). diff --git a/docs/inthewild.md b/docs/inthewild.md index 9b6cd960..990d78ff 100644 --- a/docs/inthewild.md +++ b/docs/inthewild.md @@ -4,15 +4,15 @@ Explore Archipelago instances running free across digital realms. !!! note - _*Please be aware that some of the following Archipelago instances are still brewing and these links may change. Stay tuned for future updates to live production sites when available._ + _*Please be aware that some of the following Archipelago instances are still brewing and these links may change._ ## METRO + Archipelago The Archipelagos listed below are supported by the [Digital Services Team at the Metropolitan New York Library Council](https://metro.org/digital-services). 🧑‍🌾 🐝 🍓 -- [Archipelago Playground](http://play.archipelago.nyc) and [Studio Site](https://studio.archipelago.nyc/) - - METRO's public Archipelago playground to experiment, learn, and evaluate. +- [Archipelago (Early/Legacy) Playground](http://play.archipelago.nyc) and [Studio Site](https://studio.archipelago.nyc/) + - METRO's public (play..) and internal (studio..) Archipelago playgrounds to experiment, learn, and evaluate. 
- [Barnard College](https://digitalcollections.barnard.edu/) @@ -30,6 +30,9 @@ The Archipelagos listed below are supported by the [Digital Services Team at the - [Olin College Library Phoenix Files](https://phoenixfiles.olin.edu) - *Early adopter - live since Summer 2020 + +- [New York State Archives](https://www.archives.nysed.gov) Finding Aids Discovery Portal + - *Migration and development kicked off Spring 2024 - [New York State COVID-19 Personal History Initiative](https://www.nyspersonalhistory.com) @@ -46,7 +49,6 @@ The Archipelagos listed below are supported by the [Digital Services Team at the From all around our beautiful shared world. 🏡 🏫 🏛️ - [Amherst College](https://acdc.amherst.edu) - - Migration to Archipelago began Spring 2022 - [Association Montessori Internationale](https://montessori-ami.org/) - Development of Archipelago environment began Summer 2022; Launch of new site Spring 2024 @@ -63,7 +65,7 @@ From all around our beautiful shared world. 🏡 🏫 🏛️ - [Virtual Tour Santuario Paola](http://archipelago.byterfly.eu/do/5aea0a3f-cf03-40cc-9611-924dea1fd806) - [University of Edinburgh Libraries](https://www.ed.ac.uk/information-services/library-museum-gallery) - - _Development of Archipelago environment began Summer 2022_ + - _Development of Archipelago environment began late 2022/3_ ## We should be here diff --git a/docs/presentations_events.md b/docs/presentations_events.md index f98b43e7..e22375b3 100644 --- a/docs/presentations_events.md +++ b/docs/presentations_events.md @@ -6,6 +6,23 @@ [METRO's Digital Services Team](https://metro.org/digital-services) facilitated many different internal training sessions throughout 2020-2022. If you and your team need access to any of these sessions that were recorded, please [contact us](mailto:repositorysupport@metro.org). Thank you! 
+## 2024 + +- Archipelago Summer Workshop Series : [more details coming soon](https://groups.google.com/g/archipelago-commons/c/T3im4-gp8og/m/62HL9RMoAQAJ) + +- IIIF Annual Conference (June 2024) + - [Creating a Better Balance: Respectful Reuse & ML/AI Tags in IIIF Manifests](https://docs.google.com/presentation/d/18rggHeFld7HOJefmVc6ku_7M5edLfgfZkpnQZiagSOA/edit?usp=share_link). Allison Sherrick, Diego Pino Navarro. + - [Transdimensional mutations for audio/moving media annotations in IIIF Content Search](https://docs.google.com/presentation/d/1qtVirRG5_4z2RHg6RH5cCu4dwxU1v-IH93lSrK-fZ34/edit?usp=share_link). Allison Sherrick, Diego Pino Navarro. + - 📺 Recordings will be made available on the [IIIF Youtube channel](https://www.youtube.com/@IIIF-Consortium) in early Summer 2024. + +- Open Repositories Annual Conference (June 2024): + - [Hybrid ML/AI driven search as cataloging aid in Archipelago Commons](https://tinyurl.com/hybrydMLOR2024). Diego Pino Navarro. + - [Creating a better balance: the need for tools and practices to combat AI harvests and resource flooding in repository environments](https://tinyurl.com/AIBOTSOR2024). Diego Pino Navarro, Allison Sherrick. + - Collaboration Across Borders. Jessica Barlow, Lisa Lamont, Matt Ferrill, Hilario Castillo Castillo, Kristofer Patrón Soberano. + - Copies of the slides for all presentations will be uploaded to the [Open Repositories Zenodo](https://zenodo.org/communities/openrepos/records?q=&l=list&p=1&s=10&sort=newest) in early Summer 2024. + +- Public Release of IOI's InfraFinder Tool (Spring 2024): [Archipelago Commons on Infrafinder](https://infrafinder.investinopen.org/solutions/archipelago-commons) + ## 2023 - IIIF Search API and Dynamic/evolving Manifest Generation: Facing the Unknown. Diego Alberto Pino Navarro, Allison Sherrick. @@ -16,6 +33,8 @@ - [Working with Open-Schema JSON in Archipelago. 
Allison Sherrick, Diego Pino Navarro, Martha Tenney, Joanna DiPasquale, Corinne Chatnik.](https://osf.io/dx3fm/) - [Slaying the Migration Dragon: Approaches to Navigating an Open Source System Migration. Lisa McFall, Sarah Walden McGowan, Brenden McCarthy, Shay Foley.](https://osf.io/aymhd/) +- [Archipelago 1.3.0 Release Announcement (October 31, 2023)](https://groups.google.com/g/archipelago-commons/c/zvJOVzC1WnQ/m/7A-vW5HBBgAJ) + - IIIF Annual Conference (June 2023) - [Experimental IIIF Kitchen using Archipelago. Pino Navarro, Diego; Sherrick, Allison.](https://tinyurl.com/apiiif2023) - [Mapping an Engineer Through IIIF. Monger, Jenifer J.; McCarthy, Brenden; Pino Navarro, Diego; Sherrick, Allison.](https://tinyurl.com/2x9mshx5) @@ -50,21 +69,6 @@ - Formation of the Archipelago Working Group (April 2022) - In the Spring of 2022, METRO supported the creation of a select group of both early adopters and longtime members of the Archipelago community to provide a dedicated space for Archipelago power users to build upon their demonstrated use-explorations, contribute further to the platform and have a direct influence on roadmap code, direction, and timeline. This group will also work on documentation needs, use cases and outreach (including public showcases, trainings/workshops, and other events). -??? 
info "Archipelago Working Group Members" - - - Giancarlo Birello at CNR Italy - - Jennifer Palmentiero at SENYLRC - - Brenden McCarthy at RPI - - Lisa McFall at Hamilton College - - Megan Tyne at Association Montessori Internationale  - - Carl Jones at MIT Libraries - - Martha Tenney at Barnard College Library - - David Bass / Max Bronsema at Western Washington University - - Sarah Walden McGowan at Amherst College - - Prashanth B at Vipassana Research Institute - - Ianthe Sutherland at University at Edinburgh - - Corinne Chatnik at Union College - - [Toward Empathetic Digital Repositories: An Interview with Diego Pino Navarro (January 2022)](https://digitalcommons.lsu.edu/jcdl/vol2/iss1/1/) ## 2021 diff --git a/docs/strawberryfield-formatters.md b/docs/strawberryfield-formatters.md index c77fd0f9..c2bece53 100644 --- a/docs/strawberryfield-formatters.md +++ b/docs/strawberryfield-formatters.md @@ -22,44 +22,35 @@ Once the page loads the `Default` View mode is automatically selected. However, #### How to find and configure which View mode is Default per Media type -The **ADO Type to View mode Mapping** page tells the ADOs which View mode to use by default per Media type. This page can be accessed at `yoursite//admin/config/archipelago/viewmode_mapping` +The **ADO Type to View Mode Mapping** page tells the ADOs which View mode to use by default per Media type. You can find the ADO Type to View Mode Mapping Form at `~yoursite/admin/config/archipelago/viewmode_mapping` -??? info "Formatters Shipped with Archipelago" +![ADO Type to View Mode Mapping Form](images/ado-type-to-view-mode-mapping.png) + +??? info "Strawberryfield Formatters Shipped and Configured with Archipelago" 1. Default 2. Collection listing 3. Digital Object Full View - 4. Digital Object with 3D Viewer - 5. Digital Object with A/V Player - 6. Digital Object with Book Reader - 7. Digital Object with Mirador Viewer - 8. Digital Object with Pannellum Panorama + 4. 
Digital Object Image Only for Carousel + 5. Digital Object with 3D Viewer + 6. Digital Object with Audio Player + 7. Digital Object with Book Reader + 8. Digital Object with Mirador Viewer 9. Digital Object with PDF Viewer - 10. Digital Object with Replay.web Webarchive Player - 11. Digital Object with Replay.web Webarchive with Navbars - 12. Digital Object with Video Player + 10. Digital Object with Pannellum Panorama + 11. Digital Object with Video Player + 12. Digital Object with Replay.web WARC Replay.web Widget 13. Digital Object with thumbnail and abstract + 14. Digital Object with thumbnail for Grid + _For Digital Object Collections/Compound Objects/CreativeWorkSeries_ + 15. Digital Object Collection with Mirador Viewer + 16. Digital Object Creative Work Series with Mirador Viewer + + +??? info "Default View Modes" -??? info "Default View Mode Mappings by Media Type" - - |JSON (Media) Type | View Mode Name | - |----------|----------------------------------| - |1. Video | Digital Object with Video Player | - |2. 3DModel | Digital Object with 3D Viewer | - |3. Photograph| Digital Object Full View | - |4. Thesis | Digital Object with PDF Viewer | - |5. Panorama | Digital Object with Pannellum Panorama | - |6. Book | Digital Object with Book Reader | - |7. Podcast | Digital Object with A/V Player | - |8. Collection | Collection Listing | - |9. Article | Digital Object with PDF Viewer | - |10. Map | Digital Object with Mirador Viewer | - |11. MusicRecording | Digital Object with A/V Player | - |12. Sculpture | Digital Object with 3D Viewer | - |13. VisualArtwork | Digital Object with Video Player | - |14. Painting | Digital Object with Mirador Viewer | - |15. WebPage | Digital Object with Replay.web Webarchive Player | - |16. 
PanoramaTour | Digital Object with Pannellum Panorama | + You can see the full list of [Default View Modes included in Archipelago here](/webformsasinput.md#view-mode), and you can access your Archipelago's ADO Type to View Mode Mapping at `~yoursite/admin/config/archipelago/viewmode_mapping`. + ![Selecting Digital Object Full View](images/strawberryfield-formatters/03_default-managedisplay.jpg) diff --git a/docs/strawberryrunners.md b/docs/strawberryrunners.md index dd9a6da9..a1d9ebbf 100644 --- a/docs/strawberryrunners.md +++ b/docs/strawberryrunners.md @@ -15,7 +15,8 @@ Archipelago's [Strawberry Runners (SBR)](https://github.com/esmero/strawberry_ru The default Archipelago SBR post-processor configurations include operations that: - perform page-based HOCR/OCR for image and pdf-based ADOs, send the output to the Search API, and use Natural Language Processing to extract entities from the output - extract text from pages within a Webarchives File and send the output to the Search API -- convert WARC format Webarchives Files into WACZ format and attach the new WACZ file to the original source ADO to complement the WARC original +- convert WARC format Webarchives Files into WACZ format and attach the new WACZ file to the original source ADO to complement the WARC original +- extract textual values from subtitle/transcript VTT files and generate time/space transmuted OCR SBR actions can be chained and nested to enable ordered operations, such as first extract individual pages in an ordered sequence and then run HOCR/OCR across the individual pages. @@ -28,13 +29,14 @@ You can access the Strawberry Runners Settings: On the Strawberry Runners Settings page, you will see the Archipelago default post processor configurations (unless modified). -![Strawberry Runners Home](images/strawberryrunnershome.png) +![Strawberry Runners Home](images/strawberryrunnershome_updated.png) 1. 
The `pager` action uses the 'Post processor that extracts/generates Ordered Sequences of files/pages/children using Files present in an ADO' plugin. 2. Nested one level in, the `ocr` action uses the 'Post processor that Runs OCR/HORC against files' plugin. The `ocr` operations will be executed after the completion of the `pager` operations. 3. The `wacz_page_extractor` action uses the 'Post processor that extracts/generates Indexed Page Content from WACZ files in an ADO' plugin. 4. Nested one level in, the `webpage` action uses the 'Post processor that Indexes WACZ Frictionless data Search Index to Search API' plugin. The `webpage` operations will be executed after the completion of the `wacz_page_extractor` operations. 5. The `warc_to_wacz` action uses the 'Post processor that uses a System Binary to process * files' operations. +6. The `subtitle` action extracts textual values from subtitle/transcript VTT files and generates time/space transmuted OCR. This transmuted OCR can be used to search within a time-based video or audio file's corresponding subtitle/transcript VTT file(s), then navigate to the matching time of the video or audio file within a media viewer. ## Reviewing and Adjusting the default Post-Processors @@ -45,6 +47,7 @@ Please see the following guides for: - [Adjusting the `pager` and `ocr` operations](strawberryrunners_pager_ocr.md) - [Adjusting the `wacz_page_extractor` and `webpage` operations](strawberryrunners_webpage_text.md) - [Adjusting the `warc_to_wacz` operation](strawberryrunners_wacz_binary.md) +- [Adjusting the `subtitle` operation](strawberryrunners_subtitle.md) ## Triggering Post-Processing Actions Manually @@ -56,3 +59,6 @@ You can use Archipelago's [Find and Replace](find_and_replace.md) to first selec Archipelago also includes the `Post processor that writes/reads Frictionless Data Packages` plugin. Please keep a lookout for future documentation related to using this plugin. 
+___ + +Return to the [Archipelago Documentation main page](index.md). diff --git a/docs/strawberryrunners_pager_ocr.md b/docs/strawberryrunners_pager_ocr.md index 7f8e6f03..be6cd4fa 100644 --- a/docs/strawberryrunners_pager_ocr.md +++ b/docs/strawberryrunners_pager_ocr.md @@ -177,6 +177,7 @@ In the `pager` settings, you will see several different configuration options. 23. Timeout in seconds for this process. - 900 - If the process runs out of time it can still be processed again. + 24. Order or execution in the global chain. - 0 diff --git a/docs/strawberryrunners_subtitle.md b/docs/strawberryrunners_subtitle.md new file mode 100644 index 00000000..790070b7 --- /dev/null +++ b/docs/strawberryrunners_subtitle.md @@ -0,0 +1,77 @@ +--- +title: Strawberry Runners Post-Processing +tags: + - Strawberry Runners + - Subtitles + - VTT + - HOCR + - OCR + - Post-processing + - Background Processing +--- + +# Reviewing and adjusting the `subtitle` Post-Processor operations + +As stated on the [Strawberry Runners overview page](strawberryrunners.md), the `subtitle` action extracts textual values from subtitle/transcript VTT files and generates time/space transmuted OCR. This transmuted OCR can be used to search within a time-based video or audio file's corresponding subtitle/transcript VTT file(s), then navigate to the matching time of the video or audio file within a media viewer. + +We strongly recommend using caution when making any adjustments to the default `subtitle` configurations, as this may result in unexpected issues with the transmuted OCR values in your Solr Index. Also, the `subtitle` Strawberry Runner Post-Processor needs to be used with the corresponding default IIIF Configuration Form Settings. + +## Subtitle Settings + +To review or adjust the configurations for the `subtitle` operation, select `Edit` from the `Operations` menu.
+ +In the `subtitle` settings, you will see the following configuration options: + +![Strawberry Runners Pager](images/sbr_pager.png) + +1. Label: + - Label for this Processor, which should be a unique machine-readable name + - Can only contain lowercase letters, numbers, and underscores + - We do not recommend changing this Label from the default `subtitle`. + +2. Strawberry Runner Post Processor Plugin: + - The `Post processor that extracts/generates Ordered Sequences of files/pages/children using Files present in an ADO` should be selected. + - We do not recommend changing this Plugin selection. + +3. Checkbox to mark this processor plugin as active + - We recommend keeping this checked as `active` at all times, but you may wish to temporarily disable this if you are performing certain types of administrative review tasks such as running large test ingests where you plan on deleting the ADOs before a final ingest. + - If you accidentally uncheck this and need to re-trigger the `subtitle` action, you can use Archipelago's [Find and Replace](find_and_replace.md) to first select a specific group of Digital Objects you wish to target for Post-Processing, then select the `Trigger Strawberrry Runners process/reprocess for Archipelago Digital Objects content item` from the [Find and Replace](find_and_replace.md) `Actions menu`. + +4. ADO type(s) to limit this processor to: + - A single ADO type or a comma delimited list of ADO types that qualify to be Processed. + - Leave empty to apply to all ADOs. If you do not provide any specific ADO types here, the processor will be applied for all ADOs with the JSON keys selected in the next step. + - Default ADO types specified are: 'Document,Book,Article' + - You may wish to add additional types of document/multiple-paged ADOs to this list that are custom to your Archipelago environment. + +5.
The JSON key that contains the desired source files: + - By default, the `as:image` and `as:document` keys are selected. + - We do not recommend changing this selection. + +6. Mimetypes(s) to limit this Processor to: + - A single Mimetype or a comma separated list of mimetypes that qualify to be Processed. + - Leave empty to apply to any file. + - Default mimetypes are: 'application/pdf,image/tiff,image/jpeg,image/jp2' + +7. Within the ADO's metadata, the JSON key that contains the language in ISO639-3 (3 letter) format to be used for OCR/NLP processing via Tesseract. + - Default JSON key specified is: 'language_iso639_3' + +8. Please provide a default language in ISO639-3 (3 letter) format. If none is provided we will use 'eng'. + - Default language specified is: 'eng' + +9. Timeout in seconds for this process. + - If the process runs out of time it can still be processed again. + - Default selection is: 10 + +10. Order or execution in the global chain. + - Default selection is: 0 + +### Related IIIF Configuration Form and IIIF Content Search Guides + +* [IIIF Configuration Form](iiif_server_settings.md) +* [IIIF Content Search API Overview](iiif-content-search.md) + +___ + +Thank you for reading! Please contact us on our [Archipelago Commons Google Group](https://groups.google.com/forum/#!forum/archipelago-commons) with any questions or feedback. + +Return to the main [Strawberry Runners](strawberryrunners.md) page or the [Archipelago Documentation main page](index.md).
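As a rough illustration of the Mimetype filter and the language fallback described in the `subtitle` settings above, the logic could be sketched in Python as follows. This is not Archipelago's actual implementation (Strawberry Runners is a Drupal/PHP module); the function names `mimetype_qualifies` and `resolve_language` are hypothetical, and the handling of list values in the metadata is an assumption made for the example.

```python
def mimetype_qualifies(mimetype: str, allowed_csv: str) -> bool:
    """Return True when a file's Mimetype passes the comma separated filter."""
    allowed = [item.strip() for item in allowed_csv.split(",") if item.strip()]
    # An empty setting means "apply to any file".
    return not allowed or mimetype in allowed


def resolve_language(ado_metadata: dict,
                     language_key: str = "language_iso639_3",
                     default_language: str = "") -> str:
    """Pick the 3 letter ISO639-3 code to use, falling back to 'eng'."""
    value = ado_metadata.get(language_key)
    # Assumption: the key may hold one string or a list of strings;
    # for a list, this sketch simply takes the first entry.
    if isinstance(value, list):
        value = value[0] if value else None
    if isinstance(value, str) and len(value.strip()) == 3:
        return value.strip().lower()
    # No usable value in the ADO: use the configured default, else 'eng'.
    return default_language.strip() or "eng"


if __name__ == "__main__":
    defaults = "application/pdf,image/tiff,image/jpeg,image/jp2"
    print(mimetype_qualifies("image/tiff", defaults))
    print(resolve_language({"language_iso639_3": "spa"}))
    print(resolve_language({}))
```

Keeping the fallback in one place mirrors the settings form: the per-ADO JSON key wins when present, then the configured default, then 'eng'.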
diff --git a/docs/strawberryrunners_wacz_binary.md b/docs/strawberryrunners_wacz_binary.md index 80acd99e..76eb9908 100644 --- a/docs/strawberryrunners_wacz_binary.md +++ b/docs/strawberryrunners_wacz_binary.md @@ -1,5 +1,5 @@ --- -title: Strawberry Runners Post-Processing +title: Reviewing and adjusting the `warc_to_wacz` Post-Processor operation tags: - Strawberry Runners - WACZ @@ -11,7 +11,68 @@ tags: # Reviewing and adjusting the `warc_to_wacz` Post-Processor operation -_This page is under construction. Please stay tuned for further updates and thank you for your patience as we continue to brew up more documentation._ +As stated on the [Strawberry Runners overview page](strawberryrunners.md), the `warc_to_wacz` action uses the 'Post processor that uses a System Binary to process * files' plugin. + +## WARC to WACZ Settings + +To review or adjust the configurations for the `warc_to_wacz` operation, select `Edit` from the `Operations` menu. + +In the `warc_to_wacz` settings, you will see the following configuration options: + +1. Label: + - Label for this Processor, which should be a unique machine-readable name + - Can only contain lowercase letters, numbers, and underscores + - We do not recommend changing this Label from the default `warc_to_wacz`. + +2. Strawberry Runner Post Processor Plugin: + - The `Post processor that uses a System Binary to process * files` should be selected. + - We do not recommend changing this Plugin selection. + +3. Checkbox to mark this processor plugin as active + - We recommend keeping this checked as `active` at all times, but you may wish to temporarily disable this if you are performing certain types of administrative review tasks such as running large test ingests where you plan on deleting the ADOs before a final ingest.
+ - If you accidentally uncheck this and need to re-trigger the `warc_to_wacz` action, you can use Archipelago's [Find and Replace](find_and_replace.md) to first select a specific group of Digital Objects you wish to target for Post-Processing, then select the `Trigger Strawberrry Runners process/reprocess for Archipelago Digital Objects content item` from the [Find and Replace](find_and_replace.md) `Actions menu`. + +4. The JSON key that contains the desired source files: + - By default, only the `as:document` key is selected. + - We do not recommend changing this selection. + +5. Mimetypes(s) to limit this Processor to: + - A single Mimetype or a comma separated list of mimetypes that qualify to be Processed. + - Default mimetype for this Processor is: 'application/warc' + +!!! warning "Advanced Settings" + + We do not recommend making changes to the following settings unless you are the System Administrator. + + +6. The system path to the binary that will be executed by this processor. + - A full system path to a binary present in the same environment your PHP runs + - Default is: '/usr/bin/wacz' + +7. Any additional argument your executable binary requires. + - Any arguments your binary requires to run. Use %file as replacement for the file if the executable requires the filename to be passed under a specific argument. Use %outfile if the binary is intended to generate a new file and the output is going to be a file entity. If you know the extension please add it in the form of %outfile.extension + - Default is: 'create %file -t -o %outfile.wacz' + +8. The expected and desired output of this processor. + - If the output is just data and "One or more Files" is selected all data will be dumped into a file and handled as such. + - Default is set to 'One or more Files' + +9. Where and how the output will be used.
+ - Default is set to 'A new file to be attached to the source ADO' + - The 'As Input for another processor Plugin' selection will only have an effect if another Processor is set up to consume this output. + - Other optional selections include: + - In the same Source Metadata, as a child structure of each Processed file + - In the same Source Metadata but inside its own, top level, "as:flavour" subkey based on the given machine name of the current plugin + - A new file to be attached to the source ADO + - As Input for another processor Plugin + - In a Search API Document using the Strawberryfield Flavor Data Source (e.g used for HOCR highlight) + +10. Timeout in seconds for this process. + - Default is set to: 99 + - If the process runs out of time it can still be processed again. + +11. Order or execution in the global chain. + - Default is set to: 0 ___ diff --git a/docs/strawberryrunners_webpage_text.md b/docs/strawberryrunners_webpage_text.md index e996c07e..01f38d7d 100644 --- a/docs/strawberryrunners_webpage_text.md +++ b/docs/strawberryrunners_webpage_text.md @@ -1,5 +1,5 @@ --- -title: Strawberry Runners Post-Processing +title: Reviewing and adjusting the `wacz_page_extractor` and `webpage` Post-Processor operations tags: - Strawberry Runners - Fulltext Search @@ -11,7 +11,110 @@ tags: # Reviewing and adjusting the `wacz_page_extractor` and `webpage` Post-Processor operations -_This page is under construction. Please stay tuned for further updates and thank you for your patience as we continue to brew up more documentation._ +As stated on the [Strawberry Runners overview page](strawberryrunners.md), the `wacz_page_extractor` action uses the 'Post processor that extracts/generates Indexed Page Content from WACZ files in an ADO' plugin. Nested one level in, the `webpage` action uses the 'Post processor that Indexes WACZ Frictionless data Search Index to Search API' plugin.
The `webpage` operations will be executed after the completion of the `wacz_page_extractor` operations. + +## Wacz Page Extractor Settings + +To review or adjust the configurations for the `wacz_page_extractor` operation, select `Edit` from the `Operations` menu. + +In the `wacz_page_extractor` settings, you will see the following configuration options: + +1. Label: + - Label for this Processor, which should be a unique machine-readable name + - Can only contain lowercase letters, numbers, and underscores + - We do not recommend changing this Label from the default `wacz_page_extractor`. + +2. Strawberry Runner Post Processor Plugin: + - The `Post processor that extracts/generates Indexed Page Content from WACZ in an ADO` should be selected. + - We do not recommend changing this Plugin selection. + +3. Checkbox to mark this processor plugin as active + - We recommend keeping this checked as `active` at all times, but you may wish to temporarily disable this if you are performing certain types of administrative review tasks such as running large test ingests where you plan on deleting the ADOs before a final ingest. + - If you accidentally uncheck this and need to re-trigger the `wacz_page_extractor` (and corresponding nested `webpage` action), you can use Archipelago's [Find and Replace](find_and_replace.md) to first select a specific group of Digital Objects you wish to target for Post-Processing, then select the `Trigger Strawberrry Runners process/reprocess for Archipelago Digital Objects content item` from the [Find and Replace](find_and_replace.md) `Actions menu`. + +4. The JSON key that contains the desired source files: + - By default, only the `as:document` key is selected. + - We do not recommend changing this selection. + +5. Mimetypes(s) to limit this Processor to: + - A single Mimetype or a comma separated list of mimetypes that qualify to be Processed.
+ - Default mimetype for this Processor is: 'application/vnd.datapackage+zip' + +6. ADO type(s) to limit this processor to: + - A single ADO type or a comma delimited list of ADO types that qualify to be Processed. + - Leave empty to apply to all ADOs. If you do not provide any specific ADO types here, the processor will be applied for all ADOs with the JSON keys selected in the next step. + - Default ADO type specified for this Processor: 'WebPage' + +7. The expected and desired output of this processor: + - Only option for this Processor is JSON output + - Default is set to: `Data/Values that can be serialized to JSON` + +8. The queue to use for this processor: + - The primary queue will be executed in realtime while the Secondary will be executed in background + - Default selection is for the 'Secondary queue in background' + +9. Timeout in seconds for this process. + - Default is set to: 300 + - If the process runs out of time it can still be processed again. + +10. Order or execution in the global chain. + - Default is set to: 7 + +## Webpage Settings + +To review or adjust the configurations for the `webpage` operation, select `Edit` from the `Operations` menu. + +In the `webpage` settings, you will see the following configuration options: + +1. Label: + - Label for this Processor, which should be a unique machine-readable name + - Can only contain lowercase letters, numbers, and underscores + - We do not recommend changing this Label from the default `webpage`. + +2. Strawberry Runner Post Processor Plugin: + - The `Post processor that Indexes WACZ Frictionless data Search Index to Search API` should be selected. + - We do not recommend changing this Plugin selection. + +3.
Checkbox to mark this processor plugin as active + - We recommend keeping this checked as `active` at all times, but you may wish to temporarily disable this if you are performing certain types of administrative review tasks such as running large test ingests where you plan on deleting the ADOs before a final ingest. + - If you accidentally uncheck this and need to re-trigger the `webpage` action, you can use Archipelago's [Find and Replace](find_and_replace.md) to first select a specific group of Digital Objects you wish to target for Post-Processing, then select the `Trigger Strawberrry Runners process/reprocess for Archipelago Digital Objects content item` from the [Find and Replace](find_and_replace.md) `Actions menu`. + +4. ADO type(s) to limit this processor to: + - A single ADO type or a comma delimited list of ADO types that qualify to be Processed. + - Leave empty to apply to all ADOs. If you do not provide any specific ADO types here, the processor will be applied for all ADOs with the JSON keys selected in the next step. + - Default ADO type specified for this Processor: 'WebPage' + +5. The expected and desired output of this processor: + - Only option for this Processor is JSON output + - If the output is just data and "One or more Files" is selected all data will be dumped into a file and handled as such. + - Default is set to: `Data/Values that can be serialized to JSON` + +6. Where and how the output will be used. + - The 'As Input for another processor Plugin' selection will only have an effect if another Processor is set up to consume this output. + - Default selection is: 'In a Search API Document using the Strawberryfield Flavor Data Source (e.g used for HOCR highlight)' + +7. The queue to use for this processor: + - The primary queue will be executed in realtime while the Secondary will be executed in background + - Default selection is for the 'Secondary queue in background' + +8.
Checkbox option to 'Use NLP to extract entities from Text' + - If checked, Full text will be processed for Natural language Entity extraction using Polyglot + - Default is to have this option checked + +9. The URL location of your NLP64 server. + - Defaults to http://esmero-nlp:6400 + +10. Which method(NER) to use + - The NER NLP method to use to extract Agents, Places and Sentiment. + - Default selection: 'Polyglot (faster)' + - Alternative selection: 'spaCy (more accurate)' + +11. Timeout in seconds for this process. + - Default is set to: 25 + - If the process runs out of time it can still be processed again. + +12. Order or execution in the global chain. + - Default is set to: 0 ___ diff --git a/docs/webformsasinput.md b/docs/webformsasinput.md index c93fbe57..cc1b0e7f 100644 --- a/docs/webformsasinput.md +++ b/docs/webformsasinput.md @@ -1,5 +1,5 @@ --- -title: How to Create a Webform as an Input Method for Archipelago Digital Objects (ADO) +title: Primer on Display Modes & How to Create a Webform as an Input Method for Archipelago Digital Objects (ADO) tags: - Webform - Form Mode @@ -10,11 +10,9 @@ tags: - Handler --- -# How to Create a Webform as an Input Method for Archipelago Digital Objects (ADO) / Primer on Display Modes +# Primer on Display Modes & How to Create a Webform as an Input Method for Archipelago Digital Objects (ADO) -Drupal 8/9 provides a lot of out-of-the-box functionality to setup the way Content Entities (Nodes or in our case ADOs) are exposed to users with the proper credentials. That functionality lives under the "Display Modes" and can be accessed at `yoursite/admin/structure/display-modes`. - -![Display Modes](images/display-modes.jpg) +Drupal 9/10 provides a lot of out-of-the-box functionality to set up the way Content Entities (Nodes or in our case ADOs) are exposed to users with the proper credentials. That functionality lives under the "Display Modes" >> and can be accessed at `yoursite/admin/structure/display-modes`.
In a few quick words, The Display Mode Concept covers: formatting your Content Entities and their associated Fields so when a user lands on a Content Page, they are displayed in a certain, hopefully pleasing, way and also how users with proper Credentials can fill inputs/edit values for each `field` a Content Entity provides. @@ -29,11 +27,13 @@ First, formatting output (basically building the front facing page for each cont The main difference, other than their purpose (Output v/s Input) is that, on View Modes, the settings you apply to each field are associated to "Formatters" and on Form Modes, the settings you apply to each field are connected to "Widgets". -So, resuming, this is what lives under the Concept of a "Display Mode": +*Please note that this guide features some older screenshots using earlier versions of the Archipelago/Drupal Administrative Theming. Please pardon any jumps between themes.* + +So, resuming, this is what lives under the Concept of a "Display Mode"... ## View Mode -![See all your View Modes](images/view-modes.jpg) +![Display Modes - View Modes](images/display-modes-2024.png) - Each field attached to a Content Entity can have a Formatter applied and most of them have configuration options. - Formatters do one thing right: they take the raw, stored value and make it "visible" inside Drupal. @@ -41,9 +41,32 @@ So, resuming, this is what lives under the Concept of a "Display Mode": - E.g A Node title/Label will have a Title formatter with the option of just displaying a text or a text with a link to the entity. - More Complex and fun Fields, like the ones of type `SBF` will provide a large list of possible `Formatters`, like IIIF driven viewers, Video formatters, Metadata Display (Twig template driven) ones, etc. This is because a SBF type of field has much more than just a text value, it contains a full graph of metadata and properties, inclusive links to Files and provenance metadata.
+### All of the Default View Modes Bundled in Archipelago + +??? info "Click to see the full list of Default View Modes" + * Collection listing + * Digital Object Collection with Mirador Viewer + * Digital Object Creative Work Series with Mirador Viewer + * Digital Object Full View + * Digital Object Image Only for Carousel + * Digital Object Oral History with Multiple Media + * Digital Object with 3D Viewer + * Digital Object with Audio Player + * Digital Object with Book Reader + * Digital Object with Mirador Viewer + * Digital Object with Pannellum Panorama + * Digital Object with PDF Viewer + * Digital Object with thumbnail and abstract + * Digital Object with thumbnail for Grid + * Digital Object with Video Player + * Digital Object with WARC Replay.web Widget + * Search index + * Search result highlighting input + * _Plus Drupal default View Modes: RSS, Teaser, and Token_ + ## Form Mode -![Form Modes](images/form-modes.jpg) +![Form Modes](images/forms-modes-2024.png) - Each field attached to a Content Entity can have a Widget applied and most of them have configuration options. - Widgets do one thing right: they expose some type of Form/UI interaction that allows a user to input data into the Entity, under that specific field. And of course they make sure that what you input is validated and saved (if good) correctly. @@ -56,11 +79,11 @@ So, resuming, this is what lives under the Concept of a "Display Mode": If you chose a widget other than the raw JSON, the widget will take the raw JSON to build, massage and enrich the data so that it can be presented in a visual format by the SBF. This is because a SBF type of field has much more than just a text value. It contains a full graph of metadata and properties, inclusive links to Files and provenance metadata, which for example allows us to use an Upload field directly in the attached/configured webform. - Form modes also have an additional benefit. Each one can have fine grained permissions. 
That way you can have many different Form Modes, but allow only certain ones to be visible, or usable by users of a given Drupal Role. -### I think i get this...but how can i use this knowledge now? +### I think I get this...but how can I use this knowledge now? Good question! So, to enable, configure, and customize these Display Modes you have to navigate to your `Content Type` Configuration page in your running Archipelago. This is found at `/admin/structure/types`. Note: the way things are named in Drupal can be confusing to even the most deeply committed Drupal user, so bear in mind some terms will change. Feel free to read and re-read. -![Display Mode Managment for Content Types](images/managing-display-modes.jpg) +![Display Mode Management for Content Types](images/managing-display-modes-2024.png) You can see that for every existent Content Type, there is a drop down menu with options: @@ -69,7 +92,12 @@ You can see that for every existent Content Type, there is a drop down menu with ## Manage Display -![Manage Display](images/manage-display.jpg) +For Digital Objects: +![Manage Display Digital Object](images/manage-display-2024.png) + +For Digital Object Collections and Compounds/Creative Work Series +![Manage Display Digital Object Collections and Compounds/Creative Work Series](images/manage-display-coll.png) + On the top you will see all your View Modes Listed, with the `Default` one selected and expanded. The Table that follows has one row per Field attached/part of this Content Type. Some of the fields are part of the Content Type itself, in this case Digital Object (bundled) and some other ones are common to every Content Entity derived from a Node. 
@@ -97,7 +125,7 @@ You can play with this, experiment and change some settings to get more comforta ## Manage Form Display -![Manage Display](images/manage-form-display.jpg) +![Manage Display](images/manage-form-display-2024.png) On the top you will see all your Form Modes Listed, with the `Default` one selected and expanded. The Table that follows has one row per Field attached/part of this Content Type. The list of fields here is shorter, the SBF CopyFields are not present because all data goes really only into real fields. Also some other, display only ones (meaning you cannot modify them) will not appear here. Again, some of the fields are part of the Content Type itself, in this case Digital Object (bundled) and some other ones are common to every Content Entity derived from a Node. diff --git a/mkdocs.yml b/mkdocs.yml index e5e1a4c5..baf49bea 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -2,7 +2,7 @@ site_name: Archipelago Documentation site_url: https://docs.archipelago.nyc repo_url: https://github.com/esmero/archipelago-documentation repo_name: archipelago-documentation -edit_uri: commits/1.3.0/docs/ +edit_uri: commits/1.4.0/docs/ site_description: Project documentation for Archipelago.
site_author: The Digital Services Team at Metropolitan New York Library Council extra_css: @@ -10,6 +10,7 @@ extra_css: nav: - Home: index.md - Archipelago 101: + - Core Documentation Guides: 101_guides_list.md - Archipelago's Philosophy & Guiding Principles: ourtake.md - Strawberryfields Forever: strawberryfields.md - Metadata in Archipelago: metadatainarchipelago.md @@ -44,25 +45,29 @@ nav: - Site Administration & Configuration: - Strawberryfield Formatters: strawberryfield-formatters.md - Creating Display Modes: createdisplaymodes.md + - Primer on Display Modes: webformsasinput.md + - IIIF Server Settings: iiif_server_settings.md - Archipelago's File Persistence Strategy: archifilepersistencestrategy.md - Strawberry Runners Background/Post-Processing: - strawberryrunners.md - Pager and OCR Post-processor: strawberryrunners_pager_ocr.md - Webpage Text Post-processor: strawberryrunners_webpage_text.md - WACZ Binary Post-processor: strawberryrunners_wacz_binary.md + - Subtitle Post-processor: strawberryrunners_subtitle.md - Search & Solr: - search_solr_index.md - Strawberry Key Name Providers, Solr Field, and Facet Configuration: strawberry_key_name_providers.md - Advanced Search: search_advanced.md - Search Within Collections: search-within-collection.md + - IIIF Content Search: iiif-content-search.md - Fragaria Redirects: fragaria.md - Embargo & Access Restrictions: embargo.md - Content and Metadata Tools: - Ingesting Your First Object: firstobject.md - Webforms in Archipelago: - webforms.md - - How to Create a Webform as an Input Method for Archipelago Digital Objects (ADO): webformsasinput.md - - Customizing Webforms (Modifying allowable file extensions): modifyingfileextensionsinwebform.md + - How to Create a Webform as an Input Method: webformsasinput.md + - Modifying allowable file extensions: modifyingfileextensionsinwebform.md - Archipelago Custom Webform Elements: customwebformelements.md - Using Archipelago's Webform LoD from CSV attached to an ADO 
suggest: webformLoDfromCSV.md - Find and Replace: