At a high level, we hope to analyze how pride correlates geographically with different economic and social factors. In particular, we've chosen to consider political inclinations and taxation. As queer people, we have "skin in the game," so to speak, and making our culture statistically relevant is existentially important in a world that operates more and more strictly in terms of information (specifically, information first passed through the sieve of institutions designed to ignore or actively eliminate us). We're interested in this topic because we believe that by applying data science, we can gain insight into queerness in the context of the United States' polarized geography.
Neither of us has a formal background in geography, although it interests us both. In a certain regard, geography has been sidelined as a subject since globalism became nearly universal, creating a world that, at least on its surface, seems completely interconnected. Despite this, understanding our local communities is more important than ever for marginalized groups. We hope that this project will enable us to reestablish that perspective on a more macro level.
The "gaybourhoods" data set we're using for our project was produced by The Pudding for their 2018 article "Men are from Chelsea, Women are from Park Slope." Although the article was published in 2018, the data was collected in 2015. The article states that its purpose is to properly quantify what it calls "gaybourhoods," which is "an overarching term to describe areas with a visible LGBTQ and queer presence." There are different reasons why quantifying gaybourhoods may or may not be desirable. For one, it gives us perspective on the geographic imprint of queer people on the United States. The data was collected from a composite of sources, including local pride organizations and the federal government.
To quantify gaybourhoods, writers and data analysts from The Pudding collected information on the number of tax-filing households classified as "same-sex," whether or not pride parades march through a given ZIP code, and the number of bars tagged as "gay bar" in the region on Yelp. This is, of course, a very limited way to quantify the diversity of the queer community, but it provides us with a base to work from.
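As a rough illustration of how three indicators like these could be folded into a single per-ZIP score, here is a minimal sketch. It is not The Pudding's actual method or weighting: the field names, the sample values, the min-max rescaling, and the equal-weight average are all assumptions made for this example.

```python
# Hypothetical per-ZIP records. The field names and values are invented
# for illustration and do not come from the real data set.
records = [
    {"zip": "00001", "same_sex_households": 120, "has_pride_parade": True, "gay_bars": 4},
    {"zip": "00002", "same_sex_households": 30, "has_pride_parade": False, "gay_bars": 0},
    {"zip": "00003", "same_sex_households": 75, "has_pride_parade": False, "gay_bars": 2},
]


def min_max(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is nothing to distinguish.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_scores(rows):
    """Average the three rescaled indicators with equal weights (an assumption)."""
    households = min_max([r["same_sex_households"] for r in rows])
    bars = min_max([r["gay_bars"] for r in rows])
    parades = [1.0 if r["has_pride_parade"] else 0.0 for r in rows]
    return {
        r["zip"]: (h + b + p) / 3
        for r, h, b, p in zip(rows, households, bars, parades)
    }


scores = composite_scores(records)
for zip_code, score in sorted(scores.items()):
    print(zip_code, round(score, 3))
```

Equal weighting is only the simplest choice. Any real analysis would need to justify how much a parade route should count relative to household or bar counts.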
- Nat Scott: Student of computer and environmental science
- Sami Almuallim: Student of computer and data science
Images coming soon.
- Men are from Chelsea, Women are from Park Slope
- The article for which the data was originally collected.
- The Gaybourhoods data set on GitHub
- Data set relating US ZIP codes to their coordinates
- Geographic distribution of taxes paid in the US
- County Presidential Election Returns 2000-2020
- The coordinates of each US county
- Most of the data here was irrelevant to us, so we deleted it with LibreOffice Calc so that we could upload the file to GitHub.