  1. Brainstorm aspects of your proposal [X]

  2. Develop metrics that will help inform the domain decisions [X]

  3. Get familiar with the content of potential datasets; understand what might inform your metrics and what gaps remain

  4. Consider best way to communicate metrics; for example:

    • Should you use time-series graphs? Density/heat-maps? You're certainly not limited in the number of visualizations you can include.
    • Should your report be at only one level of detail, or should you include a breakdown by sub-geography (neighborhood, district, etc.)?
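As a starting point for the time-series question above, here is a minimal sketch of a daily-cases chart with a 7-day rolling average smoothing line. The column names (`date`, `new_cases`) and the sample values are placeholders, not the real dataset.

```python
# Sketch: time-series of daily cases with a 7-day rolling average.
# Column names and values are placeholders for the real dataset.
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render off-screen (e.g., in a pipeline task)
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "date": pd.date_range("2021-11-01", periods=14, freq="D"),
    "new_cases": [120, 135, 150, 90, 80, 160, 170,
                  180, 175, 190, 110, 100, 210, 220],
})
# Rolling mean smooths the weekday/weekend reporting cycle
df["rolling_7d"] = df["new_cases"].rolling(7).mean()

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(df["date"], df["new_cases"], alpha=0.4, label="Daily cases")
ax.plot(df["date"], df["rolling_7d"], label="7-day average")
ax.set_xlabel("Date")
ax.set_ylabel("New cases")
ax.legend()
fig.savefig("daily_cases.png")
```

The same pattern extends to per-zipcode heat maps once the sub-geography question is settled.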
  5. Write proposal and develop wireframes [X]

    • Include boxes for metrics and sample prose on wireframes [X]
  6. Develop scripts to extract data from sources and load into PostgreSQL and/or BigQuery

    • Retrieve datasets:
      • Population (Zipcodes)
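An extract/load script for the population dataset could look like the sketch below. The URL and column names are placeholders, and SQLite stands in for PostgreSQL so the sketch runs anywhere; the real script would swap the connection for psycopg2/SQLAlchemy (or the BigQuery client).

```python
# Sketch of an extract/load step: pull a CSV of population by zipcode
# and load it into a database table. URL and column names are
# placeholders; SQLite stands in for PostgreSQL here.
import csv
import io
import sqlite3
import urllib.request

def extract(url: str) -> list:
    """Download a CSV and parse it into a list of row dicts."""
    with urllib.request.urlopen(url) as resp:
        text = resp.read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))

def load(rows: list, conn: sqlite3.Connection) -> int:
    """Create the population table (if needed) and upsert the rows."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS population (
            zipcode TEXT PRIMARY KEY,
            population INTEGER
        )
    """)
    conn.executemany(
        "INSERT OR REPLACE INTO population (zipcode, population) VALUES (?, ?)",
        [(r["zipcode"], int(r["population"])) for r in rows],
    )
    conn.commit()
    return len(rows)
```

Keeping extract and load as separate functions makes each one easy to call from an Airflow task later.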

  7. Create the structure for your Airflow pipeline and add your extract/load scripts to it

    • Deploy Airflow - create environment file (.yaml)
    • Extract processes:
      • Dataset 1 = Daily cases
      • Dataset 2 = Vaccinations
      • Dataset 3 = Hospitalizations
      • Dataset 4 = Deaths
    • Build and test on local machines, then move to GCP
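The pipeline structure for the four extract processes could be wired up roughly as below, assuming Airflow 2.x. The `dag_id`, task names, and the `extract_dataset` callable are placeholders; each task would call the corresponding real extract/load script.

```python
# Sketch of the DAG structure (Airflow 2.x assumed). Task names and the
# extract callable are placeholders for the four dataset extracts.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_dataset(name: str) -> None:
    # Placeholder: call the real extract/load script for `name` here.
    print(f"extracting {name}")

with DAG(
    dag_id="covid_dashboard_pipeline",
    start_date=datetime(2021, 11, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # One independent extract task per dataset; they can run in parallel.
    extracts = [
        PythonOperator(
            task_id=f"extract_{name}",
            python_callable=extract_dataset,
            op_kwargs={"name": name},
        )
        for name in ["daily_cases", "vaccinations", "hospitalizations", "deaths"]
    ]
```

Transformation and render tasks can later be chained downstream of these extracts with `>>` dependencies.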

  8. Deploy your pipeline to a cloud server (and document your deployment steps for when -- not if -- you forget them)

    • Entity Relationship diagram(?)

  9. Dive deeper into data

    • Experiment and develop queries for metrics, using tools such as PGAdmin, BigQuery, or Jupyter Notebooks
    • Note useful data transformations and queries
    • Transformations - two ways to do it:
      • SQL (Complex(?))
      • Python - may be easier
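To make the SQL-vs-Python trade-off concrete, here is the same toy rollup (weekly case totals per zipcode) sketched both ways. Table and column names (`daily_cases`, `zipcode`, `case_date`, `new_cases`) are assumptions, not the real schema.

```python
# Sketch comparing the two transformation routes on a toy rollup.
# Table/column names are placeholders for the real schema.
import pandas as pd

# SQL route (PostgreSQL flavor) -- run against the ingested table:
WEEKLY_SQL = """
    SELECT zipcode,
           date_trunc('week', case_date) AS week,
           SUM(new_cases) AS weekly_cases
    FROM daily_cases
    GROUP BY zipcode, week
"""

# Python route -- often easier to iterate on in a notebook:
def weekly_cases(df: pd.DataFrame) -> pd.DataFrame:
    """Sum daily new cases into weekly totals per zipcode."""
    return (
        df.set_index("case_date")
          .groupby("zipcode")["new_cases"]
          .resample("W")           # calendar weeks ending Sunday
          .sum()
          .reset_index(name="weekly_cases")
    )
```

A reasonable split: prototype in pandas, then port the stable queries to SQL so they run inside the warehouse.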
  10. Convert explorations into SQL and Python scripts to transform ingested data

  11. Experiment with visualizations of metrics

  12. Create "live mockup(s)" in HTML of dashboard page(s)

  13. Configure a GCS

  14. Convert mockup(s) to template(s)

  15. Create scripts to render template(s) for dashboard page(s)
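The render step (items 14-15) could be sketched as below, assuming Jinja2 as the template engine. The template string, metric names, and output path are all placeholders; the real template would come from the converted HTML mockups.

```python
# Sketch of the render step: fill an HTML template with computed
# metrics and write a static dashboard page. Jinja2 is assumed;
# template contents and metric names are placeholders.
from jinja2 import Template

TEMPLATE = Template("""
<html>
  <body>
    <h1>COVID Dashboard</h1>
    <p>Daily cases: {{ daily_cases }}</p>
    <p>Vaccinations: {{ vaccinations }}</p>
  </body>
</html>
""")

def render_dashboard(metrics: dict, path: str = "dashboard.html") -> str:
    """Render the template with the latest metrics and save the page."""
    html = TEMPLATE.render(**metrics)
    with open(path, "w") as f:
        f.write(html)
    return html
```

Run as the final pipeline task, this writes a page that can then be pushed to static hosting (e.g., the GCS configuration in item 13).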

Meeting 11/19/21: Action items:

  • Kelly will get extract processes for datasets up and running on local machine
  • Johnathan will get Airflow up and running on GCP for group to use
  • Lan will develop a timeline & look at Mjumbe's example project to see what we need to do in the future