From 9bc5d57911030fb7c36de0a429978b3b70afff3b Mon Sep 17 00:00:00 2001 From: Yann Ryan <45659603+yann-ryan@users.noreply.github.com> Date: Thu, 25 Jan 2024 16:18:05 +0100 Subject: [PATCH] Update space-place-gazetteers.md --- en/drafts/originals/space-place-gazetteers.md | 101 ++++++++---------- 1 file changed, 45 insertions(+), 56 deletions(-) diff --git a/en/drafts/originals/space-place-gazetteers.md b/en/drafts/originals/space-place-gazetteers.md index 1a2f08ff6..041d6a962 100644 --- a/en/drafts/originals/space-place-gazetteers.md +++ b/en/drafts/originals/space-place-gazetteers.md @@ -25,17 +25,17 @@ doi: XX.XXXXX/phen0000 ## Lesson Overview -This lesson introduces you to gazetteers, which are spatial knowledge organization systems about places that record names, spatial footprints, and other characteristics that have historically been associated with any locale. The lesson will explain how to think about the concept of place, why gazetteers are useful for spatial history, how to use historical information to create a gazetteer, and how to enhance and share a gazetteer. +This lesson introduces you to digital gazetteers, which are spatial knowledge organization systems about places that record names, spatial footprints, and other characteristics that have historically been associated with any locale. The lesson will explain how to think about the concept of place, why gazetteers are useful for spatial history, how to use historical information to create a gazetteer, and how to enhance and share a gazetteer. -A well-structured gazetteer reflects the fact that places are conceptual entities, not simply names or points on maps. Any given place may have had multiple names in numerous languages over the course of history, potentially involving conflicts about who has the power to enforce any of those names. The spatial extents, names, and feature types (settlements, buildings, nations, mountains, and so on) of places frequently change over time. 
Gazetteers are essential resources for spatial history. Unlike maps, gazetteers can readily connect named spatial entities with one another and with modern locations, and gazetteers make it easy to annotate any identified place with information about any texts, events, people, or other places that have been associated with it. +A well-structured gazetteer reflects the fact that places are conceptual entities, not simply names or points on maps. Any given place may have had multiple names in numerous languages over the course of history, potentially involving conflicts about who has the power to enforce any of those names. The spatial extents, names, and feature types (settlements, buildings, nations, mountains, and so on) of places frequently change over time. This lesson focuses on digital gazetteers, which are essential resources for spatial history. Unlike maps, gazetteers can readily connect named spatial entities with one another and with modern locations, and gazetteers make it easy to annotate any identified place with information about any texts, events, people, or other places that have been associated with it. The term gazetteer also refers to certain printed historical documents such as geographical indexes, directories, and encyclopedias.[^1] This lesson is focused on their digital equivalents. ### Learning Outcomes -This lesson will demonstrate how to build gazetteers, starting with simple spreadsheets and then building them into linked open data resources to share with other projects. +This lesson will demonstrate how to build digital gazetteers, starting with simple spreadsheets and then building them into linked open data resources to share with other projects. At the end of this lesson, you will be able to: -- Define what a gazetteer is, understand the concept of place, and distinguish gazetteers from other forms of spatial information. 
+- Define what a gazetteer is, understand the concept of place, and distinguish gazetteers from other forms of spatial information. - Identify scenarios for which creating a gazetteer may be preferable to using a geographic information system. - Transform a historical text into a gazetteer. - Share a gazetteer with other platforms to enhance it and use it for analytical purposes. @@ -46,15 +46,15 @@ No coding experience is needed to complete this lesson. You should be comfortab ## Historical Example -This lesson will show you how to create a gazetteer based on the online [Itinerary of Benjamin of Tudela](https://depts.washington.edu/silkroad/texts/tudela.html), an English translation of a Hebrew-language itinerary composed by Benjamin of Tudela (1130-1173), a Jewish traveler who journeyed between the Iberian Peninsula and West Asia in the twelfth century. Benjamin transited through three hundred cities along his route, recording information about geography, ethnography, commerce, Jewish life, and Jewish-Muslim relations.[^1] This text is a major work of medieval geography and Jewish history. You will build a gazetteer of places that Benjamin visited. This example will teach you to extract place names from written historical texts and use them to build a succinct gazetteer. The waypoints along Benjamin’s journey are cities with synagogues, so the lesson will explain how to build a gazetteer that includes historic place names as well as other feature types that are important in the historical record. This fulfills two lesson components: first, why and how a scholar might choose to build a basic gazetteer, and second, how a gazetteer can support historical analysis. 
+This lesson will show you how to create a gazetteer based on the online [Itinerary of Benjamin of Tudela](https://depts.washington.edu/silkroad/texts/tudela.html), an English translation of a Hebrew-language itinerary composed by Benjamin of Tudela (1130-1173), a Jewish traveler who journeyed between the Iberian Peninsula and West Asia in the twelfth century. Benjamin transited through three hundred cities along his route, recording information about geography, ethnography, commerce, Jewish life, and Jewish-Muslim relations.[^2] This text is a major work of medieval geography and Jewish history. You will build a gazetteer of places that Benjamin visited. This example will teach you to extract place names from written historical texts and use them to build a succinct gazetteer. The waypoints along Benjamin’s journey are cities with synagogues, so the lesson will explain how to build a gazetteer that includes historic place names as well as other feature types that are important in the historical record. This fulfills two lesson components: first, why and how a scholar might choose to build a basic gazetteer, and second, how a gazetteer can support historical analysis. ## Background: Space, Place, Gazetteers, and Knowledge Organization Systems ### What is a Place? -What is a place? You might think that a place is simply a geographic location, but it is more helpful to think of a place as a concept. The geographer John Agnew has postulated that when we say something is a place, we are talking about three different ideas. First, any place has a specific *location*. It lies somewhere on the surface of the earth. Second, the place is a setting for social relations. A place is a *locale* that shapes values, attitudes, or behaviors. Any workplace, school, or prison is a locale. Finally, any given place evokes a unique *sense of place* for each of its denizens, evoking specific impressions and sensations of belonging or unbelonging. 
That is to say, a place is a location where memorable events have transpired.[^2] Cultural geographers tend to distinguish the concept of place, with its references to unique and distinctive settings for human activity, from that of space, which refers to the totality of all possible geographical expanses, many of which may exist regardless of whether they are sites of human meaning. +What is a place? You might think that a place is simply a geographic location, but it is more helpful to think of a place as a concept. The geographer John Agnew has postulated that when we say something is a place, we are talking about three different ideas. First, any place has a specific *location*. It lies somewhere on the surface of the earth. Second, the place is a setting for social relations. A place is a *locale* that shapes values, attitudes, or behaviors. Any workplace, school, or prison is a locale. Finally, any given place evokes a unique *sense of place* for each of its denizens, evoking specific impressions and sensations of belonging or unbelonging. That is to say, a place is a location where memorable events have transpired.[^3] Cultural geographers like Yi-fu Tuan tend to distinguish the concept of place, with its references to unique and distinctive settings for human activity, from that of space, which refers to the totality of all possible geographical expanses, many of which may exist regardless of whether they are sites of human meaning.[^4] -Many theorists of place describe the concept in historical and temporally dynamic terms. 
The Marxist feminist geographer Doreen Massey defines places as sites of "meeting up of history in space," where people with different relations to authority and security encounter one another.[^3] The anthropologist Tim Ingold emphasizes the fact that "places do not just have locations but histories," because they are networks of habitation where people’s pathways become entangled.[^4] The Black activist geographer Ruth Wilson Gilmore underscores the fact that struggles for social justice are always spatial, and thus they are always about processes of placemaking.[^5] For these scholars, place can never be distinguished from travel, activity, relations of power, and human interaction. With its focus on human activity, meaning, contestation, and change over time, place - the purview of names, lists, descriptions, and gazetteers - is often a more meaningful concept for spatial historians than space – the domain of maps, which cannot easily represent human interaction and meaning. Place is an essential concept for many types of historical analysis and data management. +Many theorists of place describe the concept in historical and temporally dynamic terms. The Marxist feminist geographer Doreen Massey defines places as sites of "meeting up of history in space," where people with different relations to authority and security encounter one another.[^5] The anthropologist Tim Ingold emphasizes the fact that "places do not just have locations but histories," because they are networks of habitation where peoples' pathways become entangled.[^6] The Black activist geographer Ruth Wilson Gilmore underscores the fact that struggles for social justice are always spatial, and thus they are always about processes of placemaking.[^7] For these scholars, place can never be distinguished from travel, activity, relations of power, and human interaction. 
With its focus on human activity, meaning, contestation, and change over time, place - the purview of names, lists, descriptions, and gazetteers - is often a more meaningful concept for spatial historians than space – the domain of maps, which cannot easily represent human interaction and meaning. Place is an essential concept for many types of historical analysis and data management. The set of values, institutions, and relationships associated with any given locale are multitudinous, dynamic and unstable. A place may change substantially in all its particulars even as it persists as a spatial entity. Names for places may coexist, or they may succeed each other after regime changes or major events. Constantinople (also known historically as Lygos, Byzantium, Nova Roma, Rūmiyyat al-Kubra, and other monikers) also took on the name Istanbul after the Ottoman conquest in the fifteenth century, though both names were used officially until 1928. When Dutch settlers colonized the "hilly island" at the mouth of the Hudson River that Lenape residents called Manahatta, they named their settlement Neuwe Amsterdam, which became New York in 1664 after the English took over the Dutch colony. Informally, people might also refer to the city by the 1807 term Gotham or the 1921 term Big Apple. If they are speaking or writing in Chinese, they would call it Niuyue (纽约). @@ -62,11 +62,14 @@ Conversely, places may retain stable names even as their spatial footprints chan ### Gazetteer or GIS? -The first task for anybody embarking on a digital spatial history project is to decide whether to begin with a dataset-based gazetteer, or a map-based geographic information system. A project emphasizing the conflicting, contested, and dynamic characteristics of places, as well as spatial information reflected in textual attestations, should begin with a gazetteer. A GIS is only the logical starting point for a spatial history project centered on geography and spatial relations *per se*. 
+The first task for anybody embarking on a digital spatial history project is to decide whether to begin with a dataset-based gazetteer, or a map-based geographic information system. A project emphasizing the conflicting, contested, and dynamic characteristics of places, as well as spatial information reflected in textual attestations, should begin with a gazetteer. An example of this would be the [*Heritage Gazetteer of Libya*](https://slsgazetteer.org/). This project aims to provide information about unique identifiers, locations, and monuments within modern Libya that were important to its history before 1950. The emphasis of the project is on compiling names and variants produced by the research of the Society for Libyan Studies. +A GIS is only the logical starting point for a spatial history project centered on geography and spatial relations *per se*. Both gazetteers and GIS are based on spatial data structured in particular formats. The focus of GIS is primarily on the projection of geospatial geometries in the form of points, lines, and polygons. An example GIS project would be the [*Bomb Sight: Mapping the WW2 bomb census*](http://bombsight.org/#17/51.50595/-0.10680) project, which aims to emphasize first and foremost the visualization of the targets of the Luftwaffe Blitz bombing raids in London from October 7, 1940 to June 6, 1941. While a gazetteer may also contain geographical information, its primary focus is on depicting other kinds of information about places and not merely points, lines, or polygons on a map base. Indeed, although geometry is necessary for making maps, the symbols on maps only tell a small part of the story of a place. The way to model rich and multivocal data about place making events, contestation and power, places as settings for social events, and to represent the sense of place, is with a gazetteer, not a map. 
Gazetteers are excellent for collecting information such as what a place has been called, by whom, why, and when; who has been there; what has occurred there; who has contended for authority over it; or what texts have referred to it. These are all questions that are of special interest to historians. Not every spatial project requires a map. In many cases, a gazetteer is a more useful way of capturing and analyzing historical spatial information. -In its simplest form, a gazetteer is an index or dictionary of place names. Gazetteers do not need to include geographic coordinates, though many do so to enable visualization of spatial data. Gazetteers often include a controlled vocabulary of information about feature types that describe places as well: whether a place is a settlement, a waypoint on a travel itinerary, or a geographical feature such as a mountain or river. A gazetteer, especially a historical one, is a kind of knowledge organization system (KOS). A KOS is a tool "that brings together related concepts and their names in a meaningful way, such that users of the KOS can easily comprehend the relationships represented."[^6] Historical gazetteers link discourses about a place or places over time. The shape and organization of the KOS is determined by the shared characteristics of the places that need to be modeled. +In its simplest form, a gazetteer is an index or dictionary of place names. Gazetteers do not need to include geographic coordinates, though many do so to enable visualization of spatial data. Gazetteers, thus, are not merely limited to the historical realm. They can be used to track the movements of a character or characters throughout a fictional realm, for example tracing Frodo's travels from The Shire to Mordor. 
+ +Gazetteers often include a controlled vocabulary of information about feature types that describe places as well: whether a place is a settlement, a waypoint on a travel itinerary, or a geographical feature such as a mountain or river. A gazetteer, especially a historical one, is a kind of knowledge organization system (KOS). A KOS is a tool "that brings together related concepts and their names in a meaningful way, such that users of the KOS can easily comprehend the relationships represented."[^8] Historical gazetteers link discourses about a place or places over time. The shape and organization of the KOS is determined by the shared characteristics of the places that need to be modeled. ### Considerations @@ -74,27 +77,28 @@ Based on the above discussion, it should be clear that the most important consid The author of a historical gazetteer that includes information about New York, the great metropolis situated at the mouth of the Hudson River, would do well to group information about its many names into one complex entity associated with a single ID number: Lenape Manahatta, Dutch Neuwe Amsterdam, British colonial New York, and Washington Irving’s 1807 coinage of Gotham. Grouping multiple names and attestations into a single gazetteer record allows for several affordances. First, it makes the gazetteer into a powerful thesaurus. Second, it makes it possible to map as much information as possible onto a single geographical referent. Third, it makes the gazetteer into a compelling and potentially decolonial work of history which, by collecting names and attestations together, tells a story of sovereignty, colonialism, and culture. Finally, it improves search and discovery, especially in the context of linked open data. -To be sure, it is a matter of personal and scholarly judgement and of research strategy to decide whether these names do indeed refer to a single place. 
After all, Manahatta was the name of an island, not an inhabited place, and that island today is the site of only one of the five boroughs of New York City. There is no objective way to decide whether to group these names together as references to a single place. In an ambiguous case like this, the consideration is simply whether one’s own research and visualization tasks would be enhanced more effectively by grouping these names together, or by leaving some of them separate and potentially specified as "relations" using the [GeoJSON-LD Linked Places format](https://github.com/LinkedPasts/linked-places-format) or another similar data format. Names and attestations that are grouped into a single entity are easiest to find and use together, but the decision to group disparate pieces of information together may come at the expense of precision, accuracy and nuance. Beyond human judgement, these questions are the domain of entity resolution, an open and unresolved topic in information science, natural language processing, and geoscience.[^7] - -### Data Standards: Linked Places GeoJSON and LP-TSV - -In a widely cited 2006 book, the geospatial librarian Linda Hill suggested that each entry in a well-structured gazetteer should include at least one name, at least one set of coordinates, and one or more feature types.[^8] For historians, it is often especially important to include modern place names if the name has changed, as well as a temporal range for when the older name was attested in a source. For those operating with multilingual sources or projects, it may also be important to note different names or transliterations for a given place, for example, Moscow (EN), Moskau (DE), Moscou (FR), Москва (RU). - -The [Linked Places GeoJSON Format](https://whgazetteer.org/tutorials/choosing/) is an interconnection standard for contributions of historical place data to linked open data projects. 
It permits temporal scoping of entire place records and temporal scoping of individual name variants, geometries, place types, and place relations, expressed either as timespans or as named time periods. It supports any number of names, geometries, and relations, as well as information about the sources of such assertions. LP-TSV is a delimited file format derived from Linked Places. It is intended for gazetteer developers whose data is relatively simple. For example, an LP-TSV row can include a timespan for an entire record, but does not permit temporal scoping of individual components of the record. LP and LP-TSV are widely used historical gazetteer data standards. The next section provides a hands-on example of how to model and build an LP-TSV compatible gazetteer from a historical text. +To be sure, it is a matter of personal and scholarly judgement and of research strategy to decide whether these names do indeed refer to a single place. After all, Manahatta was the name of an island, not an inhabited place, and that island today is the site of only one of the five boroughs of New York City. There is no objective way to decide whether to group these names together as references to a single place. In an ambiguous case like this, the consideration is simply whether one’s own research and visualization tasks would be enhanced more effectively by grouping these names together, or by leaving some of them separate and potentially specified as "relations" using the [GeoJSON-LD Linked Places format](https://github.com/LinkedPasts/linked-places-format) or another similar data format. Names and attestations that are grouped into a single entity are easiest to find and use together, but the decision to group disparate pieces of information together may come at the expense of precision, accuracy and nuance. A given gazetteer author or project team may choose to articulate disambiguation principles in order to assist with interoperability and reusability. 
Beyond human judgement, these questions are the domain of entity resolution, an open and unresolved topic in information science, natural language processing, and geoscience.[^9] Spatial historians, as well as information scientists interested in questions of temporality, have also begun to publish on this topic.[^10] ## Building a Gazetteer from a Historical Text Historians often work with detailed written texts such as memoirs or travelogues that may contain a wealth of spatial information. *The Itinerary of Benjamin of Tudela* is one such example of rich, descriptive historical text that can be mined for data for spatial research. -Benjamin of Tudela was a twelfth century Spanish Jewish traveler whose text describes his expedition and his interactions with different Jewish communities. A spatial historian interested in this text may want to discover where Benjamin of Tudela travelled on his grand journey, and how he interacted with Jewish communities in the locations he visited. These questions suggest the outline for a gazetteer spreadsheet. The authors of this tutorial recommend using either Microsoft Excel or Google Sheets for the process of creating a simple gazetteer that is compatible with the LP-TSV format. +Benjamin of Tudela was a twelfth century Spanish Jewish traveler whose text describes his expedition and his interactions with different Jewish communities. A spatial historian interested in this text may want to discover where Benjamin of Tudela traveled on his grand journey, and how he interacted with Jewish communities in the locations he visited. A scholar might also use this source as one of a large corpus of texts to examine questions about travel in the post-classical period, European exploration, or Eurasian Jewish studies. 
The places named in this itinerary could be cross-referenced with those named in other accounts from a similar period to see if there were certain stops that were more popular than others or to see if different travelers described the locations in the same ways. + +The structure of Benjamin's travelogue suggests the outline for a gazetteer spreadsheet. The authors of this tutorial recommend using either Microsoft Excel or Google Sheets for the process of creating a simple gazetteer that is compatible with the LP-TSV format. To begin, navigate to the section entitled "The Itinerary of Benjamin of Tudela" on the [web version of this text](https://depts.washington.edu/silkroad/texts/tudela.html#itinerary_1). ### Building Spreadsheet Fields -Our first task is to create the fields in a spreadsheet that we will populate with data from the historical text. Open Excel or whichever program is your preferred spreadsheet software. From the first paragraph of *The Itinerary*, we know that we need a column of place names that are travel stops. Start by creating a column called, "TravelStop." In keeping with good practices for making spreadsheets that may need to be shared or exported into other software, we will also include an ID number column. Insert a new column before the "TravelStop" one and fill in, "ID" for this column header. The next section of this tutorial, on Linked Open Data, will explain why it is also a gazetteer best practice to include a place type to describe the travel stops. For now, we will assume that all travel stops are some kind of inhabited place. +Our first task is to create the fields in a spreadsheet that we will populate with data from the historical text. Open Excel or whichever program is your preferred spreadsheet software. 
+ +In a widely cited 2006 book, the geospatial librarian Linda Hill suggested that each entry in a well-structured gazetteer should include at least one name, at least one set of coordinates, and one or more feature types.[^11] For historians, it is often especially important to include modern place names if the name has changed, as well as a temporal range for when the older name was attested in a source. For those operating with multilingual sources or projects, it may also be important to note different names or transliterations for a given place, for example, Moscow (EN), Moskau (DE), Moscou (FR), Москва (RU). + +From the first paragraph of *The Itinerary*, we know that we need a column of place names that are travel stops. Start by creating a column called, "TravelStop." In keeping with good practices for making spreadsheets that may need to be shared or exported into other software, we will also include an ID number column. Insert a new column before the "TravelStop" one and fill in, "ID" for this column header. The next section of this tutorial, on Linked Open Data, will explain why it is also a gazetteer best practice to include a place type to describe the travel stops. For now, we will assume that all travel stops are some kind of inhabited place. + +You can use whatever column headers you want for your own research, but we are using ones based on the [Linked Places format](https://github.com/LinkedPasts/linked-places-format/blob/main/tsv_0.4.md) for the ease of future data interoperability. The [Linked Places GeoJSON Format](https://whgazetteer.org/tutorials/choosing/) is an interconnection standard for contributions of historical place data to linked open data projects. It permits temporal scoping of entire place records and temporal scoping of individual name variants, geometries, place types, and place relations, expressed either as timespans or as named time periods. 
It supports any number of names, geometries, and relations, as well as information about the sources of such assertions. LP-TSV is a delimited file format derived from Linked Places. It is intended for gazetteer developers whose data is relatively simple. For example, an LP-TSV row can include a timespan for an entire record, but does not permit temporal scoping of individual components of the record. LP and LP-TSV are widely used historical gazetteer data standards. Using a standard from the start might be cumbersome at first, but it will save you lots of time later if you wish to share this project in a variety of ways. -You can use whatever column headers you want for your own research, but we are using ones based on the [Linked Places format](https://github.com/LinkedPasts/linked-places-format/blob/main/tsv_0.4.md) for the ease of future data interoperability. Using a standard from the start might be cumbersome at first, but it will save you lots of time later if you wish to share this project in a variety of ways. Based on our source material, including a controlled vocabulary for the type of place is smart for the dataset. The Linked Places format recommends the use of a Place Type. Include a column called "PlaceType." Using an established standard like this means that the data we create in this tutorial, or that you create for your own research informed by this tutorial, can be shared with other likeminded researchers to create new knowledge. We will thus also include a column for aat_type, another strongly recommended standardized form of attribute data that makes it easier to share historical spatial project data. Type in "aat_type" for one of the columns. @@ -104,16 +108,12 @@ Please add two other columns as well. We need a column that accounts for where w For now, your spreadsheet should look something like this table below. -