Update visualizations.html
gilliansmac92 committed Nov 20, 2024
1 parent e4acd87 commit a7ab4f2
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions visualizations.html
@@ -62,8 +62,8 @@ <h4>Methodology</h4>
<p>Phase one required turning qualitative data into quantitative data. We started by recording all of the pertinent data from the PDF version of the Leven and Melville Papers and transferring it into a spreadsheet, using the categories Id, Sender, Receiver, Location from, Location to, Latitude and Longitude, Type, and Date. This allowed us to parse the network data in the form of nodes and edges. The data was then cleaned using OpenRefine to split latitude, longitude, and keywords into separate columns, and to remove blank cells and duplicate rows. After creating a master spreadsheet, we created separate sheets for different visualizations, covering people, places, keywords, nodes, and edges (relationships). In this way we captured the data from all 599 letters contained in the digitized copy of the Leven and Melville Papers in a CSV file.</p>
<p>This large dataset of network letters allowed for exploratory data analysis and for testing different digital tools to identify the best representation of the relationships presented in the papers. One of the main tools we settled on was the programming language Python, whose many libraries extend the capabilities of the language and allow for complex visualizations of the network data. The most prominent of these was NetworkX, which we used to create network graphs and to apply the Girvan-Newman algorithm to detect communities within the network. The algorithm works by repeatedly removing the edge with the highest betweenness, that is, the edge lying on the most shortest paths between nodes in the network. The nodes are then given corresponding colors to highlight their community, enabling easier identification of groups in the network. This matters for understanding the network graph because a node with higher betweenness centrality has more control over the network, since more information passes through it. The implementation of these libraries and the creation of visuals were carried out in Jupyter Notebook, an open-source environment for interactive computing. In addition, we experimented with tools such as Leaflet, Flourish, and Gephi for further analysis of the letters.</p>
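The cleaning steps described above (splitting combined coordinate fields, removing duplicates and blank cells) can be sketched in pandas rather than OpenRefine; the column names and toy rows here are illustrative assumptions, not the project's actual schema.

```python
import pandas as pd
from io import StringIO

# Toy stand-in for the letter spreadsheet (not the real corpus data).
raw = StringIO(
    "Id,Sender,Receiver,LatLong\n"
    '1,Melville,Crawford,"56.34, -2.80"\n'
    '2,Melville,Crawford,"56.34, -2.80"\n'  # duplicate record
    '3,Hamilton,Melville,"55.95, -3.19"\n'
)
df = pd.read_csv(raw)

# Split the combined coordinate field into separate numeric columns.
parts = df["LatLong"].str.split(",", expand=True).astype(float)
df["Latitude"] = parts[0]
df["Longitude"] = parts[1]
df = df.drop(columns="LatLong")

# Drop duplicates (ignoring the Id column) and any rows with blank cells.
df = df.drop_duplicates(subset=[c for c in df.columns if c != "Id"]).dropna()
print(df)
```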
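The Girvan-Newman step described above can be sketched with NetworkX on a toy graph; the names and edges below are illustrative, not the actual letter network.

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

# Two small clusters joined by a single bridging edge.
G = nx.Graph()
G.add_edges_from([
    ("Melville", "Crawford"), ("Melville", "Hamilton"),
    ("Crawford", "Hamilton"), ("Stair", "Melville"),
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("Hamilton", "A"),  # bridge between the two clusters
])

# Girvan-Newman repeatedly removes the edge with the highest betweenness
# (the edge on the most shortest paths); the first split yields two communities.
communities = next(girvan_newman(G))
print([sorted(c) for c in communities])

# Node betweenness centrality: bridge nodes score high, leaves score zero.
bc = nx.betweenness_centrality(G)
```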
<h4>Degree Centrality</h4>
<figure>
<img align="right" style="margin: 10px" src="data/images/centered-nodes.png" width="450" height="500">
<p>Our exploration began with a degree centrality analysis of the letter corpus. Degree centrality counts the edges, or connections, of a single node; it is the simplest measure of a node's significance in the network, i.e. of where it is structurally important, and it often reveals the clusters that dominate the network. Unsurprisingly, Melville has the largest degree count at 499; he is followed by Crawford with a score of 87, Hamilton with 56, and John Dalrymple of Stair with 46.</p>
</figure>
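The degree-centrality ranking described above can be sketched with NetworkX; the graph below is a toy stand-in for the corpus, not the real data.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Melville", "Crawford"),
    ("Melville", "Hamilton"),
    ("Melville", "Stair"),
    ("Crawford", "Hamilton"),
])

# Raw degree = number of connections per node; NetworkX's
# degree_centrality() reports the normalized form (degree / (n - 1)).
by_degree = sorted(G.degree, key=lambda kv: kv[1], reverse=True)
print(by_degree)
```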
<h5>Most Connected Nodes</h5>
<p>Within this corpus we found 123 unique senders and receivers; each was assigned metadata in the form of their allegiance, cabinet position, and weight. We found 554 individual letters, packets, orders, sets of instructions, declarations, commissions, attestations, memorials, intercepted letters, and more. The corpus also contained 72 places of origin and destination, which were then geocoded. We argue, along the lines of Ruth and Sebastian Ahnert's suggestion, that network hubs correspond broadly with the centers of government.<sup>1</sup> This includes the somewhat symbolic hubs of the monarch and the principal secretaries. There are imbalances throughout the corpus, especially since more received letters survive than sent ones; a complete record of the correspondence would show less imbalance. The collection also contains intercepted letters, which might account for some of the anomalies in the network.</p>
