injury_tables.Rmd

---
title: "Injury Outcomes"
date: "`r format(Sys.time(), '%d %B, %Y')`"
params:
  output_version: ''
output:
  html_document:
    toc: true
    toc_depth: 5
    toc_float: true
---

```{r setup, include=FALSE, echo=FALSE, results='hide'}
library(knitr)
library(pracma)
library(summarytools)
library(dplyr)
library(purrr)
library(flextable)
library(ggplot2)
library(tidyverse)
library(writexl)
library(cli)

# Script to summarise injury outcomes

# WARNING: Script assumes that the Baseline scenario corresponds to the known injury fatality counts in the city


#### It produces the following output documents:

### - html file containing the following information:

#   - fatality counts by strike and casualty mode for each scenario and city (incl lower and upper confidence interval limits where applicable); one table for each city, scenario and mean, upper or lower confidence interval values

#   - fatality counts giving lower, mean and upper bounds (where they exist) for each casualty mode by strike mode for each scenario and city; one table for each casualty mode, scenario and city

#   - tables showing the fatalities per billion km travelled for each city, scenario and mean, upper and lower confidence interval values (where applicable). 

#   - tables showing the number of deaths per 100,000 people for each city, scenario and mean, upper and lower confidence interval values (where applicable). 

#   - tables showing the fatalities per 100 million hours travelled for each city, scenario and mean, upper and lower confidence interval values (where applicable). 


### - output .csv files containing the following information. (Note that these files are also saved with their output_version in the file name):

#   - The inj_counts.csv file ('results/multi_city/inj/inj_counts.csv') contains the fatality counts (mean and upper and lower confidence interval boundaries if required) for each city, scenario and casualty and strike mode pair. It also includes the observed baseline fatality counts and the active travel fatality counts (the sum of cycle and walk casualty mode fatalities)

# - The injury_risks_per_billion_kms.csv file ('results/multi_city/inj/injury_risks_per_billion_kms.csv') gives the fatalities per billion km travelled for each city, scenario and casualty mode. It also includes the upper and lower confidence interval boundary values if required and active travel counts

# - The injury_risks_per_100k_pop.csv file ('results/multi_city/inj/injury_risks_per_100k_pop.csv') gives the fatalities per 100k people for each city, scenario and casualty mode. It also includes the upper and lower confidence interval boundary values if required and active travel counts 

# - The distances.csv file ('results/multi_city/inj/distances.csv') contains the population distances travelled per day for each city, scenario and mode

# - The injury_risks_per_100million_h.csv file ('results/multi_city/inj/injury_risks_per_100million_h.csv') gives the fatalities per 100 million hours travelled for each city, scenario and casualty mode. It also includes the upper and lower confidence interval boundary values if required and active travel counts

# - The inj_data_*output_version*.xlsx file (results/multi_city/inj/inj_data_*output_version*.xlsx) contains all injury output csv files


## The script performs the following steps, assuming that ITHIM-GLOBAL has been run and the ithim_objects object has been saved in "results/multi_city/io.rds":

# - read in ithim_objects and extract relevant information such as city and scenario names 
# - define various functions to summarise and print data and to re-name scenarios

# - loop through cities:
#   - loop through the mean injury values but also the upper and lower confidence boundary values if given:
#     - loop through scenarios incl baseline (original counts) and baseline predicted:
#       - create dataframe with nov and whw injury counts by casualty and strike mode pair (if only one
#         of nov or whw exists, only create dataframe with existing data), add row and column sums
#       - delete casualty and strike modes that do not exist in list of standardized modes
#       - print dataframe to html file
#       - create one list containing the predicted injury counts for each city, each scenario and for each scenario
#         both the mean and the upper and lower confidence interval limits are given if they exist
#       - create a dataframe with the original fatality counts for each casualty and strike mode pair
#         (adjusted by weight and injury reporting rate)
#       - create dataframe containing the mode distances for all scenarios including the predicted baseline
#       - create dataframes containing the population counts, mode speeds and durations travelled

# - create dataframe that contains the fatality counts for each city, scenario and casualty and strike mode pair.
#   It also includes the observed fatality counts and the active travel fatality counts (the sum of cycle and 
#   walk casualty mode fatalities) ('results/multi_city/inj/inj_counts.csv') 

# - For each city, scenario and casualty mode, print a table to the html file giving the fatality counts by strike mode
#   for the mean (and upper and lower confidence value boundaries) values

# - For each city and scenario give mean (and upper and lower confidence boundaries where required) fatalities per 
#   billion km travelled for each casualty mode. Create html tables. 

# - Create csv file ('injury_risks_per_billion_kms.csv') for fatalities per billion km travelled

# - Calculate the number of fatalities per 100,000 people for each city, scenario and mean, upper and lower confidence
#   interval values (where applicable). Print tables to .html file and also create a csv file ('injury_risks_per_100k_pop.csv')

# - Create dataframe containing all daily population mode distances for each city and scenario ('distances.csv')

# - Calculate tables showing the fatalities per 100 million hours for each city, scenario and mean, upper
#   and lower confidence interval values (where applicable). Print tables to .html file

# - Create a csv file with the fatalities per 100 million hours (injury_risks_per_100million_h.csv)

# - Export all injury datasets into a single excel file (inj_data.xlsx)


knitr::opts_chunk$set(comment=NA, prompt=FALSE, cache=FALSE, echo=F, message = F, warning = F, results='asis')

# read in table detailing the standardized modes
# smodes <- read_csv('data/global/modes/standardized_modes.csv')
smodes <- read_csv('results/multi_city/standardized_modes.csv')

```

```{js echo=FALSE}
window.location.href='#injury-outcome-normalized-to-annual-figures';
```


```{r loadLibraries, echo = F, message = F}
# set up 
st_options(bootstrap.css     = FALSE,       # Already part of the theme so no need for it
           plain.ascii       = FALSE,       # One of the essential settings
           style             = "rmarkdown", # Idem.
           dfSummary.silent  = TRUE,        # Suppresses messages about temporary files
           footnote          = NA,          # Keeping the results minimalistic
           subtitle.emphasis = FALSE)        # For the vignette theme, this gives much better results. Your mileage may vary
                 

```

```{r load_objects, results = "asis", echo=FALSE, message = F}
#output_version <- "v0.3"
#

if (!exists("output_version")){
  ## Get the current repo sha
  gitArgs <- c("rev-parse", "--short", "HEAD", ">", file.path("repo_sha"))
  # Use shell command for Windows as it's failing with system2 for Windows (giving status 128)
  if (.Platform$OS.type == "windows"){
    shell(paste(append("git", gitArgs), collapse = " "), wait = T)
  } else {
    system2("git", gitArgs, wait = T)
  }
  
  repo_sha <-  as.character(readLines(file.path("repo_sha")))
  output_version <- paste0(repo_sha, "_test_run")
}

# Assumes that multi_city_script.R has been run - read in ithim_objects saved as io_<output_version>.rds
io <- readRDS(paste0("results/multi_city/io_", output_version, ".rds"))

# Get names of cities from the io object
cities <- names(io)[!names(io) %in% c('scen_prop','ithim_run' )]

# Get number of scenarios
NSCEN <- length(names(io[[1]]$outcomes$whw)) - 1

# Get scenario names
scen_names <- names(io[[1]]$outcomes$whw)

# input parameter file name
input_parameter_file <<- io$ithim_run$input_parameter_file
  

# further model run information
compute_mode <- io$ithim_run$compute_mode 
timestamp_model <- io$ithim_run$timestamp
comments_model <- io$ithim_run$comment

# Remove extra bits of info - irrelevant outputs for injury related outputs
# This makes the io object lighter
for (city in cities){
  io[[city]][["trip_scen_sets"]] <- NULL
  io[[city]][["inj_distances"]] <- NULL
  io[[city]][["synth_pop"]] <- NULL
  io[[city]]$outcomes$mmets <- io[[city]]$outcomes$pm_conc_pp <- NULL
}

# define which decimal place to round to
round_to <- 1

# function to round and sum the data and then to print it to
# the html output document
sum_and_round_and_print <- function(data,text=''){
  data <- lapply(data, function(x)round(x,round_to))
  data <- lapply(data,function(x)rbind(x,Total=colSums(x)))
  for(city in cities) {
    print(kable(data[[city]], caption = paste(text, city)))
    cat("\n")
  }
}

# function to round and print data
round_and_print <- function(data,text=''){
  data <- lapply(data, function(x)round(x,round_to))
  for(city in cities) {
    print(kable(data[[city]], caption = paste(text, city)))
    cat("\n")
  }
}


# function to rename scenarios
get_qualified_scen_name <- function(cs){
  qualified_scen_name <- ""
  if (cs == 'baseline')
    qualified_scen_name <- 'Baseline predicted'
  else if(cs == "sc_walk")
    qualified_scen_name <- 'Walk'
  else if(cs == "sc_cycle")
    qualified_scen_name <- 'Cycling'
  else if(cs == "sc_car")
    qualified_scen_name <- 'Car'
  else if(cs == "sc_motorcycle")
    qualified_scen_name <- 'Motorcycling'
  else if(cs == "sc_bus")
    qualified_scen_name <- 'Bus'
  else if(cs == "Baseline predicted")
    qualified_scen_name <- 'Baseline predicted'
  
  return(qualified_scen_name)
}


# print model run information to the html document:
cat(
   cli::style_hyperlink(
      text = paste0("https://github.com/ITHIM/ITHIM-R/tree/", stringr::str_remove(output_version, "_test_run")),
      url = paste0("https://github.com/ITHIM/ITHIM-R/tree/", stringr::str_remove(output_version, "_test_run"))
   )
)
cat("  \n")
cat(paste0('Scenario: ', SCENARIO_INCREASE * 100, "%")) 
cat("  \n")
cat(paste0('Input Parameter version: ', io$ithim_run$input_parameter_file)) 
cat("  \n")
cat(paste0('Output version: ', output_version)) 
cat("  \n")
cat(paste0('Timestamp of model run: ', timestamp_model))
cat("  \n")
cat(paste0('Comments from model run: ', comments_model))
cat("  \n")


```
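
The scenario renaming done by `get_qualified_scen_name()` can also be expressed as a lookup in a named vector. The following is only a sketch of that alternative, not the function used in this report: `scen_labels` and `lookup_scen_name()` are hypothetical names, and the mapping simply copies the if/else chain above.

```r
# Hypothetical lookup-table variant of get_qualified_scen_name()
scen_labels <- c(
  baseline             = "Baseline predicted",
  sc_walk              = "Walk",
  sc_cycle             = "Cycling",
  sc_car               = "Car",
  sc_motorcycle        = "Motorcycling",
  sc_bus               = "Bus",
  "Baseline predicted" = "Baseline predicted"
)

lookup_scen_name <- function(cs) {
  # unknown scenario codes map to "", matching the if/else chain
  label <- unname(scen_labels[cs])
  ifelse(is.na(label), "", label)
}

lookup_scen_name(c("sc_walk", "baseline", "sc_unknown"))
```

Unlike the if/else version, this variant is vectorised, so it can be applied to a whole scenario column without `rowwise()`.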


# Injury outcome (normalized to annual figures)

Injury outcomes for each city and scenario, including the predicted and observed baseline figures

## Injury outcome for each scenario (incl predicted and observed baseline figures)
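
The chunk below decides which measures (mean, lower bound, upper bound) to loop over by inspecting the names of the baseline outcome list. The sketch here illustrates that idea with base R only, whereas the chunk itself uses `stringr::word()`. The vector `outcome_names` is made up for illustration and is an assumption about what those names look like when confidence bounds are present.

```r
# Hypothetical element names of io[[city]]$outcomes$whw$baseline
outcome_names <- c("whw", "whw_lb", "whw_ub", "nov", "nov_lb", "nov_ub")

# keep the who-hit-whom entries only
whw_names <- grep("whw", outcome_names, value = TRUE)

# second "_"-separated token; NA when there is no suffix (the mean)
suffix <- vapply(strsplit(whw_names, "_", fixed = TRUE),
                 function(parts) parts[2], character(1))

# label used in table captions and the name used to index the outcome list
measure <- ifelse(is.na(suffix), "mean", suffix)
whw_key <- ifelse(is.na(suffix), "whw", paste0("whw_", suffix))
```

With the names assumed above, `measure` holds "mean", "lb" and "ub", and `whw_key` holds the matching list names.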


```{r whw, results = 'asis', fig.width=7, fig.height=4, echo=FALSE, message = F}
# create dataframe with nov and whw injury counts by casualty and strike mode pair (if only one
# of nov or whw exists, only create dataframe with existing data), add row and column sums
# create distance, speed and duration dataframes

all_whw <- list()

city_df <- list()

for (city in cities) { # loop through all cities
  
  cat('\n')
  
  cat('###', stringr::str_to_title(stringr::str_replace(city, '_', ' ')), '\n')
  
  # loop through the mean values only or, if upper and lower confidence bounds are given, through
  # those as well;
  # 'pref' is NA for the mean and 'ub' / 'lb' for the upper and lower bounds
  for (pref in (word(grep("whw",names(io[[city]]$outcomes$whw$baseline), value = T), 2, sep = "_"))){ 
    # pref <- "ub"
    pref_name <- ""
    if (!is.na(pref))
      pref_name <- paste0("_", pref) # if pref is lb or ub e.g. 
    
    print(pref)
    
    whw_qn <- paste0("whw", pref_name)   # if upper and lower bounds are given, this takes one of whw, whw_ub, whw_lb
    nov_qn <- paste0("nov", pref_name)
    
    # loop through scenarios incl baseline predicted
    for (cs in names(io[[city]]$outcomes$whw)){ 
      # cs <- 'baseline'
      
      # if data only exists for nov but not for whw, only create data frame from nov part
      if (is.null(io[[city]]$outcomes$whw[[cs]][[whw_qn]])){ # if no 'whw' part exists
        if (!is.null(io[[city]]$outcomes$whw[[cs]][[nov_qn]])){ # but if 'nov' part exists
          
          td2 <- t(io[[city]]$outcomes$whw[[cs]][[nov_qn]]) %>% as.data.frame()
          td2$mode <- 'NOV'
          td2 <- td2 %>% dplyr::select(mode, names(.))
          td3 <- td2
        }
        
      }else{ # if whw exists
        td1 <- (io[[city]]$outcomes$whw[[cs]][[whw_qn]]) %>% as.data.frame() %>% tibble::rownames_to_column("mode") # whw fatalities table
        td3 <- td1
        if (!is.null(io[[city]]$outcomes$whw[[cs]][[nov_qn]])){ # if nov exists as well
          td2 <- t(io[[city]]$outcomes$whw[[cs]][[nov_qn]]) %>% as.data.frame()  # nov fatalities
          td2$mode <- 'NOV'
          
          td3 <- plyr::rbind.fill(td1, td2) # nov and whw fatalities
        }
      }
      td3 <- td3 %>% mutate(rowSum = rowSums(.[2:ncol(td3)], na.rm = T)) # add row sums
      td3 <- td3 %>% janitor::adorn_totals("row") # add column sums
      td3[, 2:ncol(td3)] <- round(td3[, 2:ncol(td3)], 2)
      
      qualified_scen_name <- get_qualified_scen_name(cs)
      
      cat('\n')
      
      cat('####', qualified_scen_name, ' ', ifelse(is.na(pref), "mean", pref), ' fatalities by cas and str mode', '\n')
      
      cat('\n')
      
      # remove column modes that do not appear in list of standardized modes
      td3 <- td3[,c(1, na.omit(match(smodes$exhaustive_list, colnames(td3))), ncol(td3))] 
      
      # remove rows with mode names that do not appear in list of standardized modes
      td3 <- td3[c(na.omit(match(smodes$exhaustive_list, td3$mode)), nrow(td3)),] 
      
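      # round counts (2 decimals below 1, whole numbers otherwise) and convert them to characters
      # (lambda syntax replaces the deprecated dplyr::funs())
      temp <- td3 %>% mutate_if(is.numeric, ~ case_when(. < 1 ~ round(., 2), . >= 1 ~ as.numeric(round(.)))) %>%
        mutate_if(is.numeric, ~ as.character(.))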
      
      ft <- flextable(temp)   # convert table in correct format for printing to html file
      ft <- add_header_row(ft, values = c('Strike Mode', 'Casualty Mode'), colwidths = c(1, ncol(temp) - 1))

      cat(knit_print(ft)) # print table to html file

      cat("\n")
      
      td <- td3
      names(td)[2:ncol(td)] <- paste(names(td)[2:ncol(td)], city, sep = "_") # append '_<city>' to the column names
      td$measure <- ifelse(is.na(pref), "mean", pref)
      
      city_df[[cs]][[city]][[unique(td$measure)]] <- td   # save all data in one list called city_df
      
      if (length(city_df[[cs]][[city]]) == 3) # if for each fatality sum we have mean, lb and ub
        # combine mean, lb and ub counts into one table for each scenario
        all_whw[[cs]][[city]] <- data.table::rbindlist(city_df[[cs]][[city]]) 
    }
  }
  
  # create df with location information
  cityname <- city
  if (city == cities[[1]]){
    location <- data.frame(city = cityname, country = io[[city]]$location$country, continent = io[[city]]$location$continent)
  } else {
    dummy_row <- data.frame(city = cityname, country = io[[city]]$location$country, continent = io[[city]]$location$continent)
    location <- rbind(location, dummy_row)
  }
  
  # create dataframe with original fatality counts adjusted by weight and injury_reporting_rate 
  # Baseline - whw
  injury_data_count <- as.data.frame(io[[city]]$injury_table$whw)
  injury_data_count <- injury_data_count %>% rename(value = count, str_mode = strike_mode) # rename columns
  
  # recalculate value based on weight and injury reporting rate
  injury_data_count$value <- injury_data_count$value / injury_data_count$weight / injury_data_count$injury_reporting_rate
  injury_data_count <- injury_data_count[c('str_mode', 'value', 'cas_mode')] # reorder columns and keep only str_mode, value and cas_mode
  
  # aggregate over age and gender, whose columns have been dropped
  injury_data_count2 <- injury_data_count %>% group_by(cas_mode, str_mode) %>% summarise(value = sum(value))
 
  
  # Baseline - nov
  injury_data_count_nov <- as.data.frame(io[[city]]$injury_table$nov)
  
  if( length(injury_data_count_nov) > 0){
    injury_data_count_nov <- injury_data_count_nov %>% rename(value = count, str_mode = strike_mode) # rename columns
    
    # recalculate value based on weight and injury reporting rate
    injury_data_count_nov$value <- injury_data_count_nov$value / injury_data_count_nov$weight / injury_data_count_nov$injury_reporting_rate
    # reorder columns and keep only str_mode, value and cas_mode
    injury_data_count_nov <- injury_data_count_nov[c('str_mode', 'value', 'cas_mode')] 

    # aggregate over age and gender, whose columns have been dropped
    injury_data_count_nov2 <- injury_data_count_nov %>% group_by(cas_mode, str_mode) %>% summarise(value = sum(value))
  
    # combine whw and nov data
    city_baseline_count <- rbind(injury_data_count2, injury_data_count_nov2)
  } else {
    city_baseline_count <- injury_data_count2
  }

  # add further city, scenario and location information
  city_baseline_count$city <- city
  city_baseline_count$scenario <- 'Baseline'
  city_baseline_count$country <- io[[city]]$location$country
  city_baseline_count$continent <- io[[city]]$location$continent
  
  # create one dataframe for all cities
  if (city == cities[[1]]){
    baseline_counts <- city_baseline_count
  } else {
    baseline_counts <- rbind(baseline_counts, city_baseline_count)
  }
  
  baseline_counts$measure <- 'mean'
  
  
  ###### create casualty distance df
  dummy_distances <- io[[city]]$true_dist
  
  dummy_distances2 <- pivot_longer(dummy_distances, cols = -c(stage_mode), names_to = 'scenario', values_to = 'mode_distance')
  
  baseline_pred <- dummy_distances2 %>% filter(scenario == 'baseline' | scenario == 'Baseline')
  baseline_pred$scenario <- 'Baseline'
  
  # update scenario names
  dummy_distances2 <- dummy_distances2 |> rowwise() |> mutate(scenario = get_qualified_scen_name(scenario))

  dummy_distances3 <- rbind(dummy_distances2, baseline_pred)
  
  dummy_distances3$city <- city
  
  if (city == cities[1]){
    true_distances <- dummy_distances3
  } else {
    true_distances <- rbind(true_distances,dummy_distances3)
  }
  
  
  ####### create population df
  pop_df <- data.frame(model_population = sum(io[[city]]$demographic$population))
  pop_df$city <- city
  
  if (city == cities[1]){
    all_pop_df <- pop_df
  } else {
    all_pop_df <- rbind(all_pop_df, pop_df)
  }
  
  # create speed dataframe
  speed_city <- io[[city]]$vehicle_inventory %>% dplyr::select(-c( "PM_emission_inventory",  "CO2_emission_inventory"))
  speed_city$city <- cityname
  
  if (city == cities[1]){
    speed_df <- speed_city
  } else {
    speed_df <- rbind(speed_df, speed_city)
  }
  

}

# calculate duration based on distance and speed
duration <- left_join(true_distances, speed_df, by = c('city','stage_mode' ))
duration$mode_duration <- duration$mode_distance / duration$speed
duration <- duration %>% dplyr::select(-c(mode_distance, speed))

true_distances <- true_distances %>% rename(cas_mode = stage_mode)

```
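
The observed baseline counts in the chunk above are obtained by dividing each recorded count by its weight and by the injury reporting rate, and then summing over age and gender. The base R sketch below illustrates that adjustment on made-up records. The data frame `injuries` and its values are hypothetical, and only the column names are taken from the injury table.

```r
# Made-up injury records; column names follow io[[city]]$injury_table$whw
injuries <- data.frame(
  cas_mode              = c("pedestrian", "pedestrian", "cycle"),
  strike_mode           = c("car", "car", "bus"),
  count                 = c(4, 6, 3),
  weight                = c(2, 2, 3),
  injury_reporting_rate = c(1, 1, 0.5)
)

# adjust each record by its weight and reporting rate
injuries$value <- injuries$count / injuries$weight / injuries$injury_reporting_rate

# sum over the remaining records of each casualty / strike mode pair
baseline <- aggregate(value ~ cas_mode + strike_mode, data = injuries, FUN = sum)
baseline
```

In this toy input the two pedestrian and car records contribute 2 and 3 fatalities, so the pair totals 5, and the single cycle and bus record is scaled from 3 to 2.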


```{r whw_for_all_cs, results = 'asis', fig.width=7, fig.height=4, echo=FALSE, message = F}


# create one dataframe for all cities and scenarios (incl original baseline counts) containing one column with fatality counts (and columns for city, strike mode, cas mode, scenario, distances, location)
# calculate active_travel fatality counts (where cas mode is either walk or cycle) and add to fatality dataframe
# write csv file 


st <- list()
file_list <- list()
td <- NULL
inj_counts_list <- list()
for (cs in names(all_whw)){ # loop through scenarios
  # cs <- names(all_whw)[1]
  
  # join list elements of strike and cas mode fatality predictions for all cities for the given scenario
  td <- all_whw[[cs]] %>% purrr::reduce(full_join, by = c("mode", "measure")) %>% as.data.frame() %>% dplyr::select(mode, sort(names(.))) 
  
  td[is.na(td)] <- 0 # set na values to 0
  
  # remove duplicate values
  td <- (td[!duplicated(td), ])
  
  # re-arrange order of rows
  td <- rbind(td %>% dplyr::filter(!mode %in% c("Total", "NOV")) %>% arrange(mode), td %>% filter(mode == 'NOV'), td %>% filter(mode == 'Total')) 
  
  td[is.na(td)] <- 0
  
  td <- td %>% mutate_if(is.numeric, round, 2)
  
  backup_td <- td
  
  # re-name auto_rickshaw as ar
  colnames(td) = gsub("auto_rickshaw", "ar", colnames(td))
  
  # create df with 1 column containing all the fatality counts
  inj_counts <- reshape2::melt(td) 
  
  # create one city and one strike mode column
  col_split <- stringr::str_split(inj_counts$variable, "_", simplify = TRUE, n = 2) 
  
  # tidy and rename some of the columns
  inj_counts <- cbind(inj_counts, col_split)
  names(inj_counts)[5] <- 'strike_mode'
  names(inj_counts)[6] <- 'city'
  inj_counts <- inj_counts %>% dplyr::select(-variable)
  inj_counts <- inj_counts %>% dplyr::filter(strike_mode != "rowSum")
  inj_counts <- inj_counts %>% rename(str_mode = mode, cas_mode = strike_mode)
  
  # re-name 'ar' to auto_rickshaw again ('_' in 'auto_rickshaw' would have caused issues when splitting the city and strike mode column)
  if (nrow(inj_counts[inj_counts$cas_mode == 'ar',]) > 0){
    inj_counts[inj_counts$cas_mode == 'ar',]$cas_mode <- "auto_rickshaw"
  }
  
  if (nrow(inj_counts[inj_counts$str_mode == 'ar',]) > 0){
    inj_counts[inj_counts$str_mode == 'ar',]$str_mode <- "auto_rickshaw"
  }
  
  inj_counts$str_mode <- factor(inj_counts$str_mode, levels = unique(inj_counts$str_mode)) # assign factors
  
  scen <- 'Scenario'
  
  # rename scenarios
  qualified_scen_name <- get_qualified_scen_name(cs)
  
  inj_counts$scenario <- qualified_scen_name
  
  if (length(inj_counts_list) == 0)
    inj_counts_list <- inj_counts
  else
    inj_counts_list <- rbind(inj_counts_list, inj_counts)
  

  qual_name <- paste(qualified_scen_name, scen)
  st[[cs]] <- format(td, scientific = F)
  
}


# add country and continent information
inj_counts_list <- left_join(inj_counts_list, location, by = 'city')


inj_counts_list <- rbind(inj_counts_list, baseline_counts)

# convert strike mode names to lower case
inj_counts_list$str_mode <- tolower(inj_counts_list$str_mode)


# add mode distances
inj_counts_list <- left_join(inj_counts_list, true_distances, by = c('city', 'cas_mode','scenario'))


# combine active travel modes
active_travel <- inj_counts_list %>% filter(cas_mode == 'cycle' | cas_mode == 'pedestrian')
active_travel <- active_travel %>% group_by(str_mode, measure, city, scenario, country, continent) %>% summarise(
                                  value = sum(value), mode_distance = sum(mode_distance), cas_mode ='active_travel')

# add active travel modes to inj_counts_list
inj_counts_list <- rbind(inj_counts_list, active_travel)

# add totals for each cas_mode
inj_counts_list <- inj_counts_list %>% filter(str_mode != 'Total' & str_mode != 'total')
inj_counts_list_total <- inj_counts_list %>% group_by(measure, cas_mode, city, scenario, country, continent, mode_distance) %>%
                                        summarise(str_mode = 'Total', value = sum(value))
inj_counts_list <- rbind(inj_counts_list, inj_counts_list_total)


# write to csv (this call writes the file without the output version number in its name)
readr::write_csv(inj_counts_list, paste0('results/multi_city/inj/inj_counts.csv'))
```
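
The chunk above recovers the mode and the city from combined column names such as `car_<city>` by splitting at the first underscore, which is why `auto_rickshaw` is temporarily shortened to `ar`. The sketch below shows the same idea with base R regular expressions instead of `stringr::str_split()`. The column names in `wide_names` are invented for illustration.

```r
# Invented wide-format column names of the form <mode>_<city>
wide_names <- c("car_bogota", "auto_rickshaw_delhi", "pedestrian_buenos_aires")

# shorten the mode name that itself contains "_" so that the first "_"
# always separates the mode from the city
safe_names <- gsub("auto_rickshaw", "ar", wide_names)

mode_name <- sub("_.*$", "", safe_names)     # text before the first "_"
city_name <- sub("^[^_]*_", "", safe_names)  # text after the first "_"

# restore the full mode name
mode_name[mode_name == "ar"] <- "auto_rickshaw"

data.frame(mode = mode_name, city = city_name)
```

Splitting only at the first underscore keeps city names that contain underscores intact, at the cost of needing a workaround for every mode name that contains one.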


## Casualty mode fatality counts for all case cities 

```{r whw_print, results = 'asis', fig.width=7, fig.height=4, echo=FALSE, message = F}

### Tables for each casualty mode, giving fatalities by strike mode -> print to html file

list_temp <- list()
index <- 1
scen_names <- unique(inj_counts_list$scenario)

# loop through scenarios
for (i in 1:length(scen_names)){
  sn <- scen_names[i]
  cat('###', sn, ' scenario', '\n')
  
  temp <- inj_counts_list %>% filter(scenario == sn) %>% dplyr::select(cas_mode)
  
  cas_mode <- unique(temp$cas_mode)
  
  for (j in 1:length(cas_mode)){ # loop through casualty modes
    cm <- cas_mode[j]
    
    # filter by scenario and cas mode, add separate columns for each strike mode
    temp1 <- inj_counts_list %>% filter(scenario == sn & cas_mode == cm) %>% spread(value = value, key = str_mode)
    
    # rename nov column to NOV 
    if ('nov' %in% names(temp1)){
      temp1 <- temp1 %>% rename(NOV = nov)
      }

    # keep first three columns and strike modes in given list
    temp1 <- temp1[,c(1, 2, 3, na.omit(match(smodes$exhaustive_list, colnames(temp1))))]
    
    # round fatality counts
    temp1 <- temp1 %>% mutate_if(is.numeric, .funs = funs(case_when( . < 1 ~ round(., 2) ,  . >= 1 ~  as.numeric(round(.))))) %>%  mutate_if(is.numeric, ~as.character(.))
    
    list_temp[[index]] <- temp1
    
    index <- index + 1
    cat('\n')
    cat('#### Casualty mode: ', stringr::str_to_title(cm), '\n')
    
    # updated formatting 
    ft <- flextable(temp1)
    ft <- add_header_row(ft, values = c(' ', 'strike mode'), colwidths = c(3, ncol(temp1) - 3))
    ft <- merge_v(ft, j = 1:3, part = 'body')
    
    cat('\n')
    
    cat(knit_print(ft))
    
    cat('\n')
    
  }
  
  cat('\n')
  
}
```
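
The fatality tables use a two-tier rounding rule: values below 1 keep two decimals, and values of 1 or more are rounded to whole numbers before being converted to text. The helper below is a hypothetical base R restatement of that rule. The name `format_count()` does not appear in the script, which applies the rule with `case_when()` inside `mutate_if()`.

```r
# Hypothetical helper restating the rounding rule used for the printed tables
format_count <- function(x) {
  rounded <- ifelse(x < 1, round(x, 2), round(x))
  as.character(rounded)
}

format_count(c(0.123, 1.4, 12.6))
```

Keeping two decimals for small counts avoids printing rare casualty and strike mode pairs as 0.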


### Fatalities per billion km travelled

```{r inj_100k, results = "asis", echo = F, message = F}

# create html tables of fatalities per billion km travelled
# also store the annual distance travelled per capita for the later per-population calculations

require(tibble)
overall_el <- list() # To save all cities results
overall_dd <- list() # To save all cities results

city_df <- list()
for (city in cities){ # loop through cities
  
  # loop through mean (and upper and lower confidence interval boundary values where applicable)
  for (pref in (word(grep("whw",names(io[[city]]$outcomes$whw$baseline), value = T), 2, sep = "_"))){ # NA, 'ub', 'lb'
    
    el <- list() # To save city specific results
    # pref <- "ub"
    pref_name <- ""
    if (!is.na(pref))
      pref_name <- paste0("_", pref)
    
    print(pref)
    
    whw_qn <- paste0("whw", pref_name)
    nov_qn <- paste0("nov", pref_name)
    
    
    for (cs in names(io[[city]]$outcomes$whw)){ # loop through scenarios
      # cs <- 'baseline'
      
        whw_nov_list <- word(names(io[[city]]$outcomes$whw$baseline), 1, sep = "_") %>% unique()
        whw_nov_list <- whw_nov_list[whw_nov_list != 'combined']
      
      if (length(whw_nov_list) == 2){ # if there are whw and nov matrices
         # td1: Number of fatalities in nov for each casualty mode
        td1 <- io[[city]]$outcomes$whw[[cs]][[nov_qn]] %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
        # td2: Number of fatalities in whw for each casualty mode
        td2 <- colSums(io[[city]]$outcomes$whw[[cs]][[whw_qn]]) %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
        # td3: Number of fatalities: sum of nov and whw for each casualty mode
        td3 <- full_join(td2, td1, by = 'mode') %>% mutate(count = rowSums(.[2:3], na.rm = T)) %>% dplyr::select(-c('count.x', 'count.y'))
     
        
      }else if(length(whw_nov_list) == 1 && 'whw' %in% whw_nov_list){ # if there are only whw matrices
        # sum of fatalities for each casualty mode
        td3 <- colSums(io[[city]]$outcomes$whw[[cs]][[whw_qn]]) %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
       
      
        }else if(length(whw_nov_list) == 1 && 'nov' %in% whw_nov_list){ # if there are only nov matrices
        # sum of fatalities for each casualty mode
        td3 <- io[[city]]$outcomes$whw[[cs]][[nov_qn]] %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
        
        
      }
      # td4: distance travelled per mode
      td4 <- io[[city]]$true_dist %>% filter(stage_mode %in% td3$mode) %>% dplyr::select(stage_mode, all_of(cs)) %>% as.data.frame()
      
      if (length(el) == 0){
        el <- td4 %>% dplyr::select(stage_mode)
        dd <- el
      }
      # td4: merge total fatalities to distance
      td4 <- full_join(td4, td3 %>% dplyr::select(mode, count) %>% rename(stage_mode = mode), by = 'stage_mode') 
      
      # get updated scenario names
      var <- get_qualified_scen_name(cs)
      
      names(td4)[2] <- var
      
      td4[, 2] <- as.numeric(td4[,2])

      
      # create total
      total <- td4 %>% group_by() %>% summarise(stage_mode = 'Total', 
                                                                  x = sum(td4[,2]),
                                                                  y = sum(td4[,3]))
      colnames(total) <- colnames(td4)
      
      # add active travel mode
      # filter out pedestrian and cycle trips, calculate sum of counts and distances and add to existing data
      active_travel <- td4 %>% filter(stage_mode == 'pedestrian' | stage_mode == 'cycle')
      active_travel <- active_travel %>% group_by() %>% summarise(stage_mode = 'active_travel', 
                                                                  x = sum(active_travel[,2]),
                                                                  y = sum(active_travel[,3]))
      colnames(active_travel) <- colnames(td4)
      
      td4 <- rbind(td4, active_travel, total) # add total and active travel
      

      td5 <- td4 # save copy for later risk per population calculation
      
      # Compute risk per bn km travelled
      td4[, 2] <- round((td4[,3] / ( td4[,2] * 365)) * 
                          1000000000, 4)
      
      names(td4)[3] <- paste(names(td4)[2], names(td4)[3], sep = "_")
      
      el <- full_join(el, td4, by = c('stage_mode')) # output for fatalities per bn km travelled
      
      # add city's population
      pop <- sum(io[[city]]$demographic$population)
     
      # Compute annual distance travelled per capita and keep the risk per bn km alongside it
      td6 <- td5
      td6[[paste0(var,"_risk")]] <- td4[[2]] # extract the risk column as a vector rather than a one-column data frame
      td6[[paste0(var,"_percapita")]] <- td6[[var]] * 365 / pop # annual distance travelled per capita
      names(td6)[3] <- paste(names(td6)[2], names(td6)[3], sep = "_")
      dd <- full_join(dd, td6, by = 'stage_mode') 
    }
    
    # remove count columns
    # el <- el %>% dplyr::select(-contains('count'))
    
    # round numbers 
    el2 <- el %>% mutate_if(is.numeric, round, digits = round_to)
    
    # print fatalities per bn km travelled to .html document
    print(kable(el2, caption = paste(city, ifelse(is.na(pref), "mean", pref))))
    
    td <- el #%>% dplyr::select(-contains('count'))
    names(td)[2:ncol(td)] <- paste(names(td)[2:ncol(td)], city, sep = "_")
    td$measure <- ifelse(is.na(pref), "mean", pref)
    dd$measure <- ifelse(is.na(pref), "mean", pref)
      
    city_df[["td"]][[city]][[unique(td$measure)]] <- td
    city_df[["dd"]][[city]][[unique(td$measure)]] <- dd
      
    if (length(city_df[["td"]][[city]]) == 3){
      overall_el[[city]] <- data.table::rbindlist(city_df[["td"]][[city]])
      overall_dd[[city]] <- data.table::rbindlist(city_df[["dd"]][[city]])
    }
    
    
    cat("\n")
  }
}
```
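
The risk shown in the tables above divides the annual fatality count by the annual distance travelled (daily distance times 365) and scales the result to one billion km. The function below is a minimal sketch of that calculation. The name `risk_per_bn_km()` and the input values are illustrative assumptions, not taken from any city's results.

```r
# Hypothetical helper: fatalities per billion km travelled
# annual_fatalities: fatalities per year; daily_km: population distance per day in km
risk_per_bn_km <- function(annual_fatalities, daily_km) {
  annual_fatalities / (daily_km * 365) * 1e9
}

# 73 fatalities per year on 2 million km per day, i.e. 730 million km per year
risk_per_bn_km(annual_fatalities = 73, daily_km = 2e6)
```

For these made-up inputs the result is about 100 fatalities per billion km. Both arguments can be vectors, so the same function could be applied to a whole column of modes.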


```{r inj_100k_all_cities, results = "asis"}

# create csv file per billion km travelled

require(tibble)
td <- overall_el %>% purrr::reduce(full_join, by = c("stage_mode", "measure")) %>% as.data.frame()
td[is.na(td)] <- 0
td <- td %>% dplyr::select(stage_mode, sort(names(.)))

injury_risks_b <- td |> dplyr::select(contains("mode") | !contains("count"))

# change shape and update some column names
injury_risks_lng <- reshape2::melt(injury_risks_b)
col_split <- stringr::str_split(injury_risks_lng$variable, "_", simplify = TRUE, n = 2)
injury_risks_lng <- cbind(injury_risks_lng, col_split)
names(injury_risks_lng)[ncol(injury_risks_lng) - 1] <- 'scenario'
names(injury_risks_lng)[ncol(injury_risks_lng)] <- 'city'
injury_risks_lng <- injury_risks_lng %>% dplyr::select(-variable)
injury_risks_lng$scenario <- as.character(injury_risks_lng$scenario)

rd <- rename(injury_risks_lng, mode = stage_mode)
#rd <- rd %>% filter(mode != 'Total')

# add country and continent information
rd <- left_join(rd, location, by = 'city')


count_m <- td |> dplyr::select(contains("mode") | contains("count") | contains("measure"))

# change shape and update some column names
count_m <- reshape2::melt(count_m)
count_m <- count_m |> mutate(variable = str_replace_all(variable, "count_", ""))
col_split <- stringr::str_split(count_m$variable, "_", simplify = TRUE, n = 2)
count_m <- cbind(count_m, col_split)
names(count_m)[ncol(count_m) - 1] <- 'scenario'
names(count_m)[ncol(count_m)] <- 'city'
count_m <- count_m %>% dplyr::select(-variable)
count_m$scenario <- as.character(count_m$scenario)

count_m <- rename(count_m, mode = stage_mode, count = value)
#rd <- rd %>% filter(mode != 'Total')

# add country and continent information
# count_m <- left_join(count_m, location, by = 'city') |> rename(count = value)

rd <- left_join(rd, count_m, by = c("mode", "measure", "scenario", "city"))

# get true_distances into correct format
true_distances <- true_distances  %>% rename(mode = cas_mode)

# calculate total distances
true_distances_total <- true_distances %>% group_by(city,scenario) %>% summarise(
                                              mode_distance = sum(mode_distance), mode = 'Total')

# calculate active travel distances
true_distances_at <- true_distances %>% filter(mode == 'cycle' | mode == 'pedestrian')
true_distances_at <- true_distances_at %>% group_by(city,scenario) %>% summarise(
                                              mode_distance = sum(mode_distance), mode = 'active_travel')


# add active travel and total distances
true_distances <- rbind(true_distances, true_distances_at ,true_distances_total)


# add distance information 
rd <- left_join(rd, true_distances, by = c('mode', 'city','scenario'))


## add baseline counts 
# aggregate baseline counts by cas_mode
baseline_counts_agg <- baseline_counts %>% group_by(city,cas_mode, country, continent, measure, scenario) %>% summarise(value = sum(value)) %>% rename(mode = cas_mode)


# add total
baseline_counts_agg_total <- baseline_counts_agg %>% group_by(city, country, continent, measure, scenario) %>% summarise(value = sum(value))
baseline_counts_agg_total$mode <- 'Total'

baseline_counts_agg <- rbind(baseline_counts_agg, baseline_counts_agg_total)

# merge baseline count with distance
baseline_counts_km <- left_join(baseline_counts_agg, true_distances, by = c('mode', 'city', 'scenario'))


# add active travel 
active_travel <- baseline_counts_km %>% filter(mode == 'cycle' | mode == 'pedestrian')
active_travel <- active_travel %>% group_by(city, scenario, country, continent, measure) %>% summarise(
                                   value = sum(value), mode_distance = sum(mode_distance), mode = 'active_travel')

baseline_counts_km <- rbind(baseline_counts_km, active_travel)


# calculate value per distance
baseline_counts_km$value <- round((baseline_counts_km$value /(baseline_counts_km$mode_distance * 365))*1000000000,4)

rd <- plyr::rbind.fill(rd, baseline_counts_km)

# write to csv (only the file without the output version number is written here)
readr::write_csv(rd, 'results/multi_city/inj/injury_risks_per_billion_kms.csv')
```


### Fatalities per 100,000 people

```{r, results = "asis"}
require(tibble)
overall_el_normalized <- list()
city_df <- list()

for (city in cities){ # loop through cities
  
  print(city)
  
  # loop through mean and upper and lower confidence interval boundaries if they exist
  for (pref in (word(grep("whw",names(io[[city]]$outcomes$whw$baseline), value = T), 2, sep = "_"))){ 
    
    el <- list() # To save city specific results
    pref_name <- ""
    if (!is.na(pref))
      pref_name <- paste0("_", pref)
    
    print(pref)
    
    whw_qn <- paste0("whw", pref_name)
    nov_qn <- paste0("nov", pref_name)
    
    for (cs in names(io[[city]]$outcomes$whw)){ # loop through scenarios
      
      whw_nov_list <- word(names(io[[city]]$outcomes$whw$baseline), 1, sep = "_") %>% unique()
      whw_nov_list <- whw_nov_list[whw_nov_list != 'combined']
      
      if (length(whw_nov_list) == 2){ # if both whw and nov outputs exist
        # extract nov part and get into correct format
        td1 <- (io[[city]]$outcomes$whw[[cs]][[nov_qn]]) %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
        # extract whw part and get into correct format
        td2 <- colSums((io[[city]]$outcomes$whw[[cs]][[whw_qn]])) %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
        # join nov and whw parts
        td3 <- full_join(td2, td1, by = 'mode') %>% mutate(count = rowSums(.[2:3], na.rm = T)) %>% dplyr::select(-c('count.x', 'count.y'))
        
      }else if(length(whw_nov_list) == 1 && 'whw' %in% whw_nov_list){ # if only whw outputs exist
        # extract whw part and get into correct format
        td3 <- colSums((io[[city]]$outcomes$whw[[cs]][[whw_qn]])) %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
        
      }else if(length(whw_nov_list) == 1 && 'nov' %in% whw_nov_list){ # if only nov outputs exist
        # extract nov part and get into correct format
        td3 <- (io[[city]]$outcomes$whw[[cs]][[nov_qn]]) %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
        
      }
      
      # extract the stage modes for which travel distances are available
      td4 <- io[[city]]$true_dist %>% filter(stage_mode %in% td3$mode) %>% dplyr::select(stage_mode)
      
      if (length(el) == 0){ # initialise the results table with the stage modes only
        el <- td4 %>% dplyr::select(stage_mode)
      }
      
      # add the combined nov and whw counts to the stage modes
      td4 <- full_join(td4, td3 %>% dplyr::select(mode, count) %>% rename(stage_mode = mode), by = 'stage_mode')
      
      var <- get_qualified_scen_name(cs)
      
      names(td4)[2] <- var
    
      
      ## add active travel mode
      # filter out pedestrian and cycle rows, sum their counts and add the result to the existing data
      active_travel <- td4 %>% filter(stage_mode == 'pedestrian' | stage_mode == 'cycle')
      active_travel <- active_travel %>% group_by() %>% summarise(stage_mode = 'active_travel', 
                                                                  x = sum(active_travel[,2]))
      colnames(active_travel) <- colnames(td4)
      
      # add the active travel row (the total is added after the scenario loop)
      td4 <- rbind(td4, active_travel)

      
      td5 <- td4
      
      # calculate fatalities per 100,000 people
      td4[, 2] <- round(td4[, 2] / sum(io[[city]]$demographic$population) * 100000, 2)
      
      el <- full_join(el, td4, by = 'stage_mode')
    }

    # calculate total
    el_no_at <- el %>% filter(!stage_mode == 'active_travel')
    el_total <- el_no_at %>% ungroup() %>% janitor::adorn_totals(c('row'))
   
    el <- rbind(el, el_total %>% filter(stage_mode == 'Total'))
    
    # round numbers
    el2 <- el %>% mutate_if(is.numeric, round, digits = round_to)
    
    # print fatalities per 100,000 people to html, labelling each table with its measure
    print(kable(el2, caption = paste(city, ifelse(is.na(pref), "mean", pref))))
    
    # get into correct format and add to city_df list
    td <- el
    names(td)[2:ncol(td)] <- paste(names(td)[2:ncol(td)], city, sep = "_")
    td$measure <- ifelse(is.na(pref), "mean", pref)

    city_df[["td"]][[city]][[unique(td$measure)]] <- td

    # create one table containing the mean and, where applicable, the upper and lower limits;
    # rebuilt on every pass so that cities with a mean only are also included
    overall_el_normalized[[city]] <- data.table::rbindlist(city_df[["td"]][[city]])
    
    cat("\n")
  }
}

# get fatalities per 100,000 people into correct format
td <- overall_el_normalized %>% purrr::reduce(full_join, by = c("stage_mode", "measure")) %>% as.data.frame()
td[is.na(td)] <- 0
td <- td %>% dplyr::select(stage_mode, sort(names(.)))
td <- rename(td, mode = stage_mode)
#td <- td %>% filter(!mode == 'Total')

# get fatalities per 100,000 people into correct format where all scenarios are in the same column
injury_risks_per_100k <- reshape2::melt(td)
col_split <- stringr::str_split(injury_risks_per_100k$variable, "_", simplify = TRUE, n = 2)
injury_risks_per_100k <- cbind(injury_risks_per_100k, col_split)
names(injury_risks_per_100k)[5] <- 'scenario'
names(injury_risks_per_100k)[6] <- 'city'
injury_risks_per_100k <- injury_risks_per_100k %>% dplyr::select(-variable)
injury_risks_per_100k$scenario <- as.character(injury_risks_per_100k$scenario)


# add distance, population and location information
injury_risks_per_100k <- left_join(injury_risks_per_100k, true_distances, by = c('mode', 'city','scenario'))
injury_risks_per_100k <- left_join(injury_risks_per_100k, all_pop_df, by = c('city'))
injury_risks_per_100k <- left_join(injury_risks_per_100k, location, by = 'city')

# add counts but first divide by population
baseline_counts_pop <- left_join(baseline_counts_agg, pop_df, by = c('city'))


# add active travel 
active_travel <- baseline_counts_pop %>% filter(mode == 'cycle' | mode == 'pedestrian')
active_travel <- active_travel %>% group_by(city, scenario, country, continent, measure, model_population 
                                            ) %>% summarise(
                                   value = sum(value), 
                                   mode ='active_travel')

baseline_counts_pop <- rbind(baseline_counts_pop, active_travel)


baseline_counts_pop$value <- round(baseline_counts_pop$value /baseline_counts_pop$model_population *  100000, 2)
baseline_counts_pop <- left_join(baseline_counts_pop, true_distances, by = c('city', 'mode','scenario'))

injury_risks_per_100k <- rbind(injury_risks_per_100k, baseline_counts_pop)

# write to csv (only the file without the output version number is written here)
readr::write_csv(injury_risks_per_100k %>% filter(scenario != 'Total'), 'results/multi_city/inj/injury_risks_per_100k_pop.csv')
```


```{r }

### True distances in a single file

# create dataframe containing all mode distances for each city and scenario
true_distances_city <- list()
for (city in cities) { # Loop for each city
  true_distances_city[[city]] <- io[[city]]$true_dist %>% 
    pivot_longer(!stage_mode, names_to = "scenario", values_to = "distance") %>% 
    mutate(city = city)
}
true_distances_df <- bind_rows(true_distances_city)


# calculate total distances
true_distances_df_total <- true_distances_df %>% group_by(city,scenario) %>% summarise(
                                              distance = sum(distance), stage_mode = 'Total')

# add active travel distances
true_distances_df_at <- true_distances_df %>% filter(stage_mode == 'cycle' | stage_mode == 'pedestrian')
true_distances_df_at <- true_distances_df_at %>% group_by(city, scenario) %>% summarise(stage_mode = 'active_travel', distance = sum(distance))

true_distances_df <- rbind(true_distances_df, true_distances_df_at, true_distances_df_total) 


# write to csv (only the file without the output version number is written here)
readr::write_csv(true_distances_df, 'results/multi_city/inj/distances.csv')
```


### Fatalities per 100 million hours travelled

```{r, results = "asis"}
require(tibble)
overall_el <- list() # To save all cities results
overall_dd <- list() # To save all cities results

city_df <- list()
for (city in cities){ # loop through cities
  
  # loop through mean and upper and lower confidence interval boundary values where required
  for (pref in (word(grep("whw",names(io[[city]]$outcomes$whw$baseline), value = T), 2, sep = "_"))){ 
    
    el <- list() # To save city specific results
    # pref <- "ub"
    
    # set up names
    pref_name <- ""
    if (!is.na(pref))
      pref_name <- paste0("_", pref)
    
    print(pref)
    
    whw_qn <- paste0("whw", pref_name)
    nov_qn <- paste0("nov", pref_name)
    
    for (cs in names(io[[city]]$outcomes$whw)){ # loop through scenarios
      # cs <- 'baseline'
      
      whw_nov_list <- word(names(io[[city]]$outcomes$whw$baseline), 1, sep = "_") %>% unique()
      whw_nov_list <- whw_nov_list[whw_nov_list != 'combined']
      
      
      if (length(whw_nov_list) == 2){ # if there exist nov and whw matrices
        # td1: Number of fatalities in nov
        td1 <- io[[city]]$outcomes$whw[[cs]][[nov_qn]] %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
        # td2: Number of fatalities in whw
        td2 <- colSums(io[[city]]$outcomes$whw[[cs]][[whw_qn]]) %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
        # td3: Number of fatalities: sum of nov and whw
        td3 <- full_join(td2, td1, by = 'mode') %>% mutate(count = rowSums(.[2:3], na.rm = T)) %>% dplyr::select(-c('count.x', 'count.y'))

        
      }else if(length(whw_nov_list) == 1 && 'whw' %in% whw_nov_list){ # if there exist only whw matrices
        
        # td3: number of fatalities for whw only
        td3 <- colSums(io[[city]]$outcomes$whw[[cs]][[whw_qn]]) %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")
        
        
      }else if(length(whw_nov_list) == 1 && 'nov' %in% whw_nov_list){ # if there exist only nov matrices
        
        # td3: number of fatalities for nov only
        td3 <- io[[city]]$outcomes$whw[[cs]][[nov_qn]] %>% as.data.frame() %>% rownames_to_column() %>% rename(mode = rowname) %>% rename_at(2, ~"count")

      }
      # td4: duration travelled per mode
      #td4 <- io[[city]]$dur %>% filter(stage_mode %in% td3$mode) %>% dplyr::select(stage_mode, cs) %>% as.data.frame()
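      # `!!city` forces the loop variable; a bare `city == city` compares the column with itself and keeps every city
      td4 <- duration %>% filter(stage_mode %in% td3$mode & scenario == get_qualified_scen_name(cs) & 
                                   city == !!city) %>% dplyr::select(stage_mode, mode_duration) 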
      
      if (length(el) == 0){
        el <- td4 %>% dplyr::select(stage_mode)
        dd <- el
      }
      # td4: merge total fatalities to duration
      td4 <- full_join(td4, td3 %>% dplyr::select(mode, count) %>% rename(stage_mode = mode), by = 'stage_mode') 
      
      var <- get_qualified_scen_name(cs)
      
      # rename mode duration column as scenario column
      names(td4)[2] <- var
      td4[, 2] <- as.numeric(unlist(td4[,2]))
      td4[, 3] <- as.numeric(unlist(td4[,3]))
      
      
      # create total
      total <- td4 %>% group_by() %>% summarise(stage_mode = 'Total', 
                                                                  x = sum(td4[,2]),
                                                                  y = sum(td4[,3]))
      colnames(total) <- colnames(td4)
      
      
      # add active travel mode
      # filter out pedestrian and cycle rows, sum their durations and counts and add the result to the existing data
      active_travel <- td4 %>% filter(stage_mode == 'pedestrian' | stage_mode == 'cycle')
      active_travel <- active_travel %>% group_by() %>% summarise(stage_mode = 'active_travel', 
                                                                   x = sum(active_travel[,2]),
                                                                  y = sum(active_travel[,3]))
      
      colnames(active_travel) <- colnames(td4)
      
      # add active travel and total
      td4 <- rbind(td4, active_travel, total)
      
      
      # Compute risk per 100 million hours travelled
      td4[, 2] <- round((td4[,3] / ( td4[,2] * 365)) *
                          100000000, 4)

      names(td4)[3] <- paste(names(td4)[2], names(td4)[3], sep = "_")


      # remove count columns
      td4 <- td4 %>% dplyr::select(-contains('count'))
      
      # add this scenario's risk column to the city table
      el <- full_join(el, td4, by = c('stage_mode'))
    }
    
    # round numbers
    el2 <- el %>% mutate_if(is.numeric, round, digits = round_to)
    
    # print fatalities per 100 million hours travelled to html document (count columns were removed above)
    print(kable(el2, caption = paste(city, ifelse(is.na(pref), "mean", pref))))
    
    
    td <- el # %>% dplyr::select(-contains('count'))
    
    # add city name to column names
    names(td)[2:ncol(td)] <- paste(names(td)[2:ncol(td)], city, sep = "_")
    # add 'measure' columns
    td$measure <- ifelse(is.na(pref), "mean", pref)
 
    # add to city_df output list  
    city_df[["td"]][[city]][[unique(td$measure)]] <- td
 
      
    # combine all measures available so far, so that cities with a mean only
    # (no lower and upper confidence bounds) are also included
    overall_el[[city]] <- data.table::rbindlist(city_df[["td"]][[city]])
    
    
    cat("\n")
  }
}
```


```{r, results = "asis"}

# Change in deaths due to injury per 100 million hours across cities - produce csv file

#require(tibble)

# initialise dataframe 
td <- overall_el %>% purrr::reduce(full_join, by = c("stage_mode", "measure")) %>% as.data.frame()
td[is.na(td)] <- 0
td <- td %>% dplyr::select(stage_mode, sort(names(.)))

injury_risks_100mil_hours <- td

# get data into correct format
injury_risks_lng_hours <- reshape2::melt(injury_risks_100mil_hours)
col_split <- stringr::str_split(injury_risks_lng_hours$variable, "_", simplify = TRUE, n = 2)
injury_risks_lng_hours <- cbind(injury_risks_lng_hours, col_split)
names(injury_risks_lng_hours)[5] <- 'scenario'
names(injury_risks_lng_hours)[6] <- 'city'
injury_risks_lng_hours <- injury_risks_lng_hours %>% dplyr::select(-variable)
injury_risks_lng_hours$scenario <- as.character(injury_risks_lng_hours$scenario)

rd_hours <- rename(injury_risks_lng_hours, mode = stage_mode)
#rd_hours <- rd_hours %>% filter(mode != 'Total')

# add country and continent information
rd_hours <- left_join(rd_hours, location, by = 'city')


# add distance information
rd_hours <- left_join(rd_hours, true_distances, by = c('city','mode','scenario'))

# get duration into correct format
duration <- duration  %>% rename(mode = stage_mode)

# add total mode duration
duration_total <- duration %>% group_by(scenario, city) %>% summarise(mode = 'Total',
                                                                      mode_duration = sum(mode_duration))

# add active travel mode duration
duration_at <- duration %>% filter(mode == 'pedestrian' | mode == 'cycle')
duration_at <- duration_at %>% group_by(scenario, city) %>% summarise(mode = 'active_travel',
                                                                      mode_duration = sum(mode_duration))
# add total and active travel duration to duration table
duration <- rbind(duration, duration_at, duration_total)


# add duration information
rd_hours <- left_join(rd_hours, duration, by = c('mode', 'city','scenario'))

# add baseline counts - join with duration information
baseline_counts_dur <- left_join(baseline_counts_agg, duration, by = c('mode', 'city', 'scenario'))

# add active travel to baseline counts
active_travel <- baseline_counts_dur %>% filter(mode == 'cycle' | mode == 'pedestrian')
active_travel <- active_travel %>% group_by(city, scenario, country, continent, measure ) %>% summarise(
                                   value = sum(value), mode_duration = sum(mode_duration), 
                                   mode ='active_travel')

baseline_counts_dur <- rbind(baseline_counts_dur, active_travel)

# calculate fatalities per 100mil h travelled for baseline counts
baseline_counts_dur$value <- round((baseline_counts_dur$value /(baseline_counts_dur$mode_duration * 365))*100000000,4)
baseline_counts_dur <- left_join(baseline_counts_dur,true_distances, by = c('city', 'mode','scenario'))

# add baseline rate to predicted rates
rd_hours <- rbind(rd_hours, baseline_counts_dur )

# write to csv (only the file without the output version number is written here)
readr::write_csv(rd_hours, 'results/multi_city/inj/injury_risks_per_100million_h.csv')
```


```{r}
# Export injury datasets as a single excel file
path_inj_wb <- 'results/multi_city/inj/inj_data.xlsx'
writexl::write_xlsx(list(inj_counts = inj_counts_list, #%>% filter(str_mode != 'Total'),
                         injury_risks_billion_kms = rd, #%>% filter(mode != 'Total'),
                         injury_risks_100k_people  = injury_risks_per_100k, # %>% filter(mode != 'Total' & scenario != 'Total'),
                         injury_risks_100mil_h = rd_hours, #%>% filter(mode != 'Total'),
                         true_distances = true_distances_df),
                    path = path_inj_wb)
```