Project Plan (2/5/2020)

  1. Gather and clean data; filter it into dataframes
  2. General analysis of the Japanese word corpus; figure out what percentage of the words in the corpus are in katakana
  3. Filter out everything that is not katakana and analyze word length; try to come up with a metric for what shortens a word (I'm not sure what else I could analyze; come up with more ideas later)
  4. Maybe bring in Google n-gram data to track usage of these loanwords over time? Not sure if this is possible, but it's something I'm interested in
  5. Perhaps some sociolinguistic analysis based on the data. I'm interested in the idea of linguistic imperialism, and since many gairaigo are of English origin, an analysis of these loanwords could connect to it.
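
A rough sketch of how steps 2 and 3 might start, using only the Python standard library. It rests on some assumptions the plan does not state: the corpus is taken to be a plain list of word strings, and a word counts as "katakana" only if every character falls in the Unicode Katakana block (U+30A0 to U+30FF, which includes the long-vowel mark ー). The names `is_katakana`, `katakana_share`, and `length_distribution` are hypothetical helpers, and the sample words are made up for illustration.

```python
import unicodedata
from collections import Counter

# Assumption: the Unicode Katakana block defines what counts as katakana.
KATAKANA_START, KATAKANA_END = 0x30A0, 0x30FF


def normalize(word: str) -> str:
    # NFC composes voiced marks (e.g. ヒ + combining dakuten -> ビ),
    # so each kana counts as a single character.
    return unicodedata.normalize("NFC", word)


def is_katakana(word: str) -> bool:
    # True only for non-empty words made up entirely of katakana.
    word = normalize(word)
    return bool(word) and all(
        KATAKANA_START <= ord(ch) <= KATAKANA_END for ch in word
    )


def katakana_share(words) -> float:
    # Percentage of words in the corpus written entirely in katakana.
    words = list(words)
    if not words:
        return 0.0
    return 100.0 * sum(is_katakana(w) for w in words) / len(words)


def length_distribution(words) -> Counter:
    # Maps word length (in characters) to count, katakana words only.
    return Counter(len(normalize(w)) for w in words if is_katakana(w))


# Made-up sample corpus, for illustration only.
sample = ["テレビ", "猫", "コンピューター", "たべる", "パソコン"]
share = katakana_share(sample)
lengths = length_distribution(sample)
```

Mixed-script words (for example, katakana followed by kanji) are excluded under this definition; whether they should count is a decision the analysis would have to make.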