Authority Control

Nicholas Kontovas (professional account) edited this page Jul 20, 2023 · 12 revisions

Updating additions files

Before the Fihrist web site can be re-indexed each week, the authority files must be updated. This is so that authority entries are created for all the new key attributes added to TEI records for works, persons, and subjects by cataloguers over the past week.

This is done by the person running the re-indexing, not by the editor of the authority files, but it is included here to give a complete picture of the process. The update is performed by a script. The script finds any author/editor/persName tags with a new VIAF-based ID in their key attribute (e.g. key="person_12345678") and adds them to the persons_additions.xml file in the authority folder. Likewise, new LCSH-based IDs in key attributes of term elements are added to subjects_additions.xml. Finally, any blank key attributes (i.e. key="") for persons or titles of works have a unique local ID generated for them, and these are also added to the appropriate *_additions.xml file.

  1. Pull the latest changes from GitHub to your local copy.
  2. cd processing
  3. ./update-authority-files.sh
  4. If the script reports "XQuery failed" then look in update-authority-files.log. Most likely a TEI record (or possibly one of the authority files) is broken. Inform the person who last edited it, then try again when they've fixed it.
  5. Commit and push the modified authority files to GitHub.

The additions files are effectively a queue of new entries awaiting editorial review, as described below.
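To illustrate the kind of queueing described above, here is a minimal Python sketch. It is not the project's script (which the error message suggests is XQuery-based); the function name `queue_new_keys`, the `known_ids` parameter and the tag-to-file mapping are assumptions made for illustration, and the generation of local IDs for blank keys is deliberately left out because the source does not describe how it works.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from element name to the additions file that queues its IDs.
QUEUE_FOR_TAG = {
    "author": "persons_additions.xml",
    "editor": "persons_additions.xml",
    "persName": "persons_additions.xml",
    "term": "subjects_additions.xml",
}

def queue_new_keys(tei_xml, known_ids):
    """Group non-blank key values that are not yet known, by additions file."""
    root = ET.fromstring(tei_xml)
    queued = {}
    for elem in root.iter():
        local_name = elem.tag.split("}")[-1]  # drop any namespace prefix
        target = QUEUE_FOR_TAG.get(local_name)
        key = elem.get("key", "").strip()
        # Skip other elements, blank keys, and IDs already in the authority files.
        if target is None or key == "" or key in known_ids:
            continue
        ids = queued.setdefault(target, [])
        if key not in ids:
            ids.append(key)
    return queued
```

Here `known_ids` would be the set of xml:id values already present in the base and additions files; how that set is collected is left open.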

Reviewing additions files

There are currently three additions files: persons, works and subjects. These contain new system-generated identifiers, or identifiers imported from VIAF, as @xml:id values. For example:

  • person xml:id="person_f7927" or person xml:id="person_120703673" for personal names
  • bibl xml:id="work_22333" for titles
  • item xml:id="subject_sh85062204" for subject headings

Entries in the additions files include at least one comment (displayed in green) giving the relative file path(s) to the TEI record(s) containing the relevant element. For example:

  • !-- ../collections/the%20university%20of%20manchester%2FPersian_MS_500.xml#Persian_MS_500%2Ditem1%2Ditem40 --

Entries may also include comments suggesting possible duplication, for example:

  • !-- POSSIBLE DUPLICATE OF: persons_base.xml#person_f7891 --
  • !-- POSSIBLE DUPLICATE OF: works_base.xml#work_20170 --
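When reviewing many entries it can help to separate the two kinds of comment and to decode the percent-encoded paths. The following is a rough sketch using only the Python standard library; the function name `read_entry_comments` is hypothetical, and the sample entry pairs an ID and a path from the examples above purely for illustration.

```python
from urllib.parse import unquote
from xml.dom import minidom

DUPLICATE_PREFIX = "POSSIBLE DUPLICATE OF:"

def read_entry_comments(entry_xml):
    """Split an entry's comments into decoded record paths and duplicate hints."""
    doc = minidom.parseString(entry_xml)
    paths, duplicates = [], []
    for node in doc.documentElement.childNodes:
        if node.nodeType != node.COMMENT_NODE:
            continue
        text = node.data.strip()
        if text.startswith(DUPLICATE_PREFIX):
            duplicates.append(text[len(DUPLICATE_PREFIX):].strip())
        else:
            paths.append(unquote(text))  # %20 -> space, %2F -> /, %2D -> -
    return paths, duplicates

# Illustrative entry, not copied from a real additions file.
entry = (
    '<person xml:id="person_f7927">'
    '<!-- ../collections/the%20university%20of%20manchester%2FPersian_MS_500.xml'
    '#Persian_MS_500%2Ditem1%2Ditem40 -->'
    '<!-- POSSIBLE DUPLICATE OF: persons_base.xml#person_f7891 -->'
    '</person>'
)
paths, duplicates = read_entry_comments(entry)
# paths      -> ['../collections/the university of manchester/Persian_MS_500.xml#Persian_MS_500-item1-item40']
# duplicates -> ['persons_base.xml#person_f7891']
```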

Updating key attributes in TEI records

This is a manual process. Open the additions file and the base file for the elements for which new @keys have been generated (i.e. persons_additions and persons_base, or works_additions and works_base). Retrieve the TEI record by placing the cursor in the comment (relative path) in the additions file and pressing Ctrl+Enter (on Windows).

It's best to have only three files open: additions, base and TEI record.

The @key for new names and titles in the TEI record is likely to be empty: persName key="" or title key=""

Copy the value of the xml:id (from the element in the additions file) and paste it into the empty @key (of the respective element) in the TEI record.

For example, copy person_f7927 from person xml:id="person_f7927" in the additions file and paste it into the empty @key in the TEI record file: persName key="person_f7927" or author key="person_f7927"

Ensure that no @key for the same person or work is left empty, as this will create ambiguities! All encoded occurrences of the same person (or work), regardless of variations in form or language, must contain the value corresponding to the xml:id in the additions file.
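As a quick check before saving, a short script can list the elements in a TEI record that still have an empty @key. This is only a sketch: the function name `find_empty_keys` and the list of checked element names are assumptions, and it cannot tell whether two differently spelled names refer to the same person, so the final judgement remains manual.

```python
import xml.etree.ElementTree as ET

# Element names assumed to carry @key for persons and works.
CHECKED_TAGS = ("persName", "author", "editor", "title")

def find_empty_keys(tei_xml):
    """Return (element name, text) for every checked element whose key is empty."""
    root = ET.fromstring(tei_xml)
    empty = []
    for elem in root.iter():
        local_name = elem.tag.split("}")[-1]  # drop any namespace prefix
        # Elements with no key attribute at all are not reported.
        if local_name in CHECKED_TAGS and elem.get("key") == "":
            empty.append((local_name, "".join(elem.itertext()).strip()))
    return empty
```

An empty result means every checked element that has a key attribute also has a value in it; it does not confirm that the value is the right one.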

Once the new identifier has been entered in the TEI record:

  1. Save the TEI record.
  2. Grab the entire element (i.e. person, bibl or item respectively) from the additions file.
  3. Paste the element into the base file (press Enter twice to leave an empty line between elements).

Note: persons with VIAF IDs should have the identifier entered in the record at the time of creation. The entries from the persons_additions file can therefore be transferred to the persons_base file without further editing of the TEI record file. Items from the subjects_additions file can also be grabbed and pasted into the subjects_base file without further editing of the TEI record file, unless duplications or ambiguities are detected. bibl IDs from the works_additions file must always be entered manually in the TEI record file before the bibl element is pasted into the works_base file.

Deduplicating

When the same person or work is represented more than once in the respective base file, or has been given a new identifier in the additions file, the entry must be deduplicated. For persons, give precedence to the entry that contains a VIAF ID. If all duplicates have local IDs (for example, works), keep the one represented in the largest number of records in the base file. Grab the duplicate entries and paste them into the deletions file with an @sameAs pointing to the ID that remains in the base file, for example:

  • person xml:id="person_f7881" sameAs="persons_base.xml#person_f7872"

Transfer any additional variants of the name (or title, for works) to the base file. The relative path comments may also be moved to the base file. Open the TEI record(s) containing @keys with the deleted IDs and overwrite these with the ID that remains in the base file.
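The final step, overwriting deleted IDs in TEI records, can be sketched in Python as below. This is an illustration under assumptions rather than an existing project tool: the function names are hypothetical, the deletions file is assumed to hold entries with xml:id and sameAs attributes as in the example above, and only double-quoted key attributes are handled.

```python
import re
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def build_key_map(deletions_xml):
    """Map each deleted ID to the surviving ID named in its sameAs attribute."""
    root = ET.fromstring(deletions_xml)
    mapping = {}
    for elem in root.iter():
        old_id = elem.get(XML_ID)
        same_as = elem.get("sameAs")
        if old_id and same_as:
            # "persons_base.xml#person_f7872" -> "person_f7872"
            mapping[old_id] = same_as.split("#")[-1]
    return mapping

def rewrite_keys(tei_text, mapping):
    """Swap deleted IDs for kept IDs in key="..." attributes, leaving other text as is."""
    def swap(match):
        return 'key="%s"' % mapping.get(match.group(1), match.group(1))
    # The lookbehind avoids touching attributes whose names merely end in "key".
    return re.sub(r'(?<![\w:.-])key="([^"]*)"', swap, tei_text)
```

Working on the raw text rather than re-serialising the parsed record keeps comments, whitespace and namespace prefixes in the TEI file unchanged, which makes the resulting diff easy to review before committing.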

Disambiguating

People with multiple VIAF/LC entries

Occasionally, you might come across a person entity with more than one authority entry in the Library of Congress database. Often, this duplication has been copied by VIAF, from which Fihrist generally derives its ID numbers for people. For example, the person محمد بن علي سباهي زاده appears in one VIAF entry under the Arabic spelling of his name and in a separate VIAF entry under the Turkish spelling. This seems to replicate a mistake made by LC, which similarly has two entries, one for "Sibāhī Zādah, Muḥammad ibn ʻAlī, -1589" and another for "Sipâhîzâde Mehmed, -1589".

Fihrist only uses IDs based on VIAF ID numbers for the sake of convenience. In fact, it doesn't really matter which ID the person entry in our authority files has as long as that ID is unique. Fihrist cataloguers faced with such a situation should therefore make a choice between the numbers of the two existing VIAF entries.

If the person does not already exist in the persons_base.xml file, the choice is largely arbitrary; however, it may be a good idea to choose the number corresponding to the entry with the most information attached to it in VIAF, since this is the one least likely to be deleted if the two entries in VIAF and/or LC are ever merged.

Even if only one of these IDs is used, we can (and should) still record the other VIAF and LC entries in the Fihrist authority entry so that users are aware that these are the same person. Whether the person already exists as an entry in Fihrist or you have chosen one of the VIAF IDs to create that entry yourself, you should also add links to the alternate VIAF and LC entries. To do this, add links to the pages for those entries as new <item> tags within <list type="links"> under <note type="links"> within the <person> tag. Additionally, it may be a good idea to mark these with something to the effect of "(alternate)" in the text of the <title> tags for those links. An example can be seen below with the links to the alternate entries indicated as comments.

            <person xml:id="person_21607038" role="ant aut">
                <persName type="display">Sibāhī Zādah, Muḥammad ibn ʻAlī, d. 1589</persName>
                <persName type="variant">Sipāhīzādah</persName>
                <persName type="variant">Sipāhīzādah, Muḥammad ibn ʻAlī, d. 1589</persName>
                <persName type="variant">محمد بن علي سباهي زاده</persName>
                <note type="links">
                    <list type="links">
                        <item>
                            <ref target="https://viaf.org/viaf/21607038/">
                                <title>VIAF</title>
                            </ref>
                        </item>
                        <item>                                                           <!--ALTERNATE-->
                            <ref target="https://viaf.org/viaf/18164115083910052666/">   <!--ALTERNATE-->
                                <title>VIAF (alternate)</title>                          <!--ALTERNATE-->
                            </ref>                                                       <!--ALTERNATE-->
                        </item>                                                          <!--ALTERNATE-->
                        <item>
                            <ref target="http://id.loc.gov/authorities/names/n2007227637.html">
                                <title>LC</title>
                            </ref>
                        </item>
                        <item>                                                                    <!--ALTERNATE-->
                            <ref target="https://id.loc.gov/authorities/names/no2021151092.html"> <!--ALTERNATE-->
                                <title>LC (alternate)</title>                                     <!--ALTERNATE-->
                            </ref>                                                                <!--ALTERNATE-->
                        </item>                                                                   <!--ALTERNATE-->
                        <item>
                            <ref target="http://www.isni.org/isni/0000000040362656">
                                <title>ISNI</title>
                            </ref>
                        </item>
                    </list>
                </note>
                <!-- ../collections/cambridge%20university%2FOr_918.xml#Or_918%2Ditem1 -->
                <!-- ../collections/oxford%20university%2FMS_Bodl_Or_309.xml#MS_Bodl_Or_309%2Ditem1 -->
                <!-- ../collections/oxford%20university%2FMS_Bodl_Or_310.xml#MS_Bodl_Or_310%2Ditem1 -->
                <!-- ../collections/oxford%20university%2FMS_Marsh_596.xml#MS_Marsh_596%2Ditem1 -->
                <!-- ../collections/oxford%20university%2FMS_Pococke_302.xml#MS_Pococke_302%2Ditem1 -->
                <!-- ../collections/oxford%20university%2FMS_Turk_e_79.xml#MS_Turk_e_79%2Ditem1 -->
            </person>
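The same structure could also be produced programmatically. The sketch below uses Python's standard ElementTree module and assumes the nesting shown in the example above; the function name `add_alternate_link` is hypothetical, and in practice these links are simply typed into the authority file by hand.

```python
import xml.etree.ElementTree as ET

def add_alternate_link(person, url, label):
    """Append <item><ref target=url><title>label (alternate)</title></ref></item>."""
    # Reuse the person's namespace (if any) so new elements match their siblings.
    ns = person.tag[: person.tag.index("}") + 1] if person.tag.startswith("{") else ""
    links = person.find(f"{ns}note[@type='links']/{ns}list[@type='links']")
    if links is None:
        raise ValueError("person has no note/list of type 'links'")
    item = ET.SubElement(links, f"{ns}item")
    ref = ET.SubElement(item, f"{ns}ref", {"target": url})
    title = ET.SubElement(ref, f"{ns}title")
    title.text = f"{label} (alternate)"
    return item
```

For the person above, calling `add_alternate_link(person, "https://viaf.org/viaf/18164115083910052666/", "VIAF")` would add the second VIAF item. Note that the new item is appended at the end of the list rather than next to the primary VIAF link, and that ElementTree drops comments by default when parsing, so this approach is better suited to checking the structure than to rewriting the authority files.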