`forecast_opts()` or `add_horizon()` #867

sbfnk · 2024-11-25T10:26:10Z

Just surfacing a discussion mixed up with other discussions in #346 (comment)

In order to address #640 we need to decide how to specify accumulation for future data. Options on the table are (for an example of weekly forecasts):

obs |>
  estimate_infections(forecast = forecast_opts(horizon = 14, accumulate = 7))

or

obs |>
  add_horizon(horizon = 14, accumulate = 7) |>
  estimate_infections()

My opinion is that option 2 is more elegant ("future dates are just another form of missing data and the model doesn't actually need to know when the present is"), but option 1 has the advantage of being more in line with what we've been doing in the package so far.
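To make option 2 concrete, here is a minimal sketch of what an `add_horizon()`-style helper might do. Nothing here is an existing function: the name `add_horizon_sketch()`, the `confirm` column, and the convention that a logical `accumulate` column marks days whose modelled counts are carried forward to the next reporting day are all assumptions for illustration. The sketch only appends future rows and does not fill gaps in the past data.

```r
# Hypothetical helper (not part of any package): append `horizon` future days
# with missing counts so that the forecast is "just more missing data".
# Assumes `obs` has a Date column `date` and a numeric column `confirm`.
add_horizon_sketch <- function(obs, horizon, accumulate = 1) {
  last_date <- max(obs$date)
  future <- data.frame(
    date = last_date + seq_len(horizon),
    confirm = NA_real_
  )
  # Assumed convention: TRUE means "accumulate this day onto the next
  # reporting day"; FALSE marks a reporting day (every `accumulate`-th day).
  future$accumulate <- seq_len(horizon) %% accumulate != 0
  # Existing rows are left as they are; add the column only if it is absent.
  if (!"accumulate" %in% names(obs)) {
    obs$accumulate <- FALSE
  }
  rbind(obs[, c("date", "confirm", "accumulate")], future)
}

# Three weekly observations, extended by two weeks of weekly forecasts.
obs <- data.frame(
  date = as.Date("2024-11-04") + 7 * 0:2,
  confirm = c(120, 135, 150)
)
extended <- add_horizon_sketch(obs, horizon = 14, accumulate = 7)
```

With `horizon = 14` and `accumulate = 7`, the 7th and 14th future rows are the only future reporting days; the other twelve are flagged for accumulation.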

seabbs · 2024-11-25T14:19:39Z

Option 2 is how epinowcast works and I agree it is more elegant. I'm in favour of it I think. I might also be in favour of some attempt to detect the accumulation in the current data to avoid having to respecify it but that might be dangerous.

Here the only reason we might want to know when the forecast starts is to allocate time points to estimate, estimate from partial data, and forecast, which I think currently uses the supplied data rather than being based on the presence of data.

Something to note that just occurred to me is have we made a breaking change in the accumulation PR in that before the predicted cases were always on the daily scale and now they are always on the accumulated scale? That is perhaps no bad thing but maybe we want to supply some way to get back to daily scale forecasts at a later date.

sbfnk · 2024-11-25T14:45:15Z

> Here the only thing we might want to know when a forecast is is for the allocation of things to estimate, estimate from partial data, and forecast which I think currently uses the supplied data vs being based on the presence of data.

Hmm, yes this is true and would be an argument against add_horizon() (as then we also have to specify somewhere the date of making the forecast).
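As a rough illustration of why the forecast date would have to be supplied separately under `add_horizon()`, the sketch below labels dates relative to an explicit forecast date. The helper `label_periods()`, its `partial_days` argument, and the exact label strings are assumptions made for this example, not the package's implementation.

```r
# Hypothetical helper: label each date as estimate, estimate based on partial
# data, or forecast, relative to an explicitly supplied forecast date.
# `partial_days` (assumed) is the number of most recent days, up to and
# including the forecast date, treated as only partially observed.
label_periods <- function(dates, forecast_date, partial_days = 0) {
  labels <- ifelse(
    dates > forecast_date,
    "forecast",
    ifelse(
      dates > forecast_date - partial_days,
      "estimate based on partial data",
      "estimate"
    )
  )
  factor(
    labels,
    levels = c("estimate", "estimate based on partial data", "forecast")
  )
}

dates <- as.Date("2024-11-18") + -3:2
periods <- label_periods(
  dates,
  forecast_date = as.Date("2024-11-18"),
  partial_days = 2
)
```

If the future rows are just missing data, nothing in `dates` alone distinguishes a gap in the past from the start of the forecast, which is why `forecast_date` has to come from somewhere.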

> Something to note that just occurred to me is have we made a breaking change in the accumulation PR in that before the predicted cases were always on the daily scale and now they are always on the accumulated scale?

I don't think so - what's reported as predicted cases is imputed_reports which uses the untruncated, unaccumulated reports.

sbfnk · 2024-11-25T15:45:18Z

> I might also be in favour of some attempt to detect the accumulation in the current data to avoid having to respecify it

We could do that if all the gaps are equal. Same with the initial_accumulate argument.
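A minimal sketch of the "all gaps are equal" check, using base R only. The helper name `detect_accumulation()` is made up for this example, and it assumes `dates` holds the dates of the non-missing observations.

```r
# Hypothetical helper: infer a common reporting interval (in days) from the
# dates of non-missing observations. Returns NA if there are fewer than two
# distinct dates or if the gaps between them are not all equal.
detect_accumulation <- function(dates) {
  gaps <- as.integer(diff(sort(unique(dates))))
  if (length(gaps) == 0 || length(unique(gaps)) != 1) {
    return(NA_integer_)
  }
  gaps[1]
}

weekly <- as.Date("2024-11-04") + 7 * 0:3
irregular <- as.Date("2024-11-04") + c(0, 7, 10, 21)

detect_accumulation(weekly)    # 7
detect_accumulation(irregular) # NA
```

Returning `NA` rather than guessing keeps the "dangerous" case explicit: with irregular gaps the user would still have to specify the accumulation themselves.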

seabbs · 2024-11-26T11:32:45Z

> I don't think so - what's reported as predicted cases is

So we would need to change this for forecasting or distinguish accumulation for forecasting vs for inference in order to get accumulation to show up in outputs?

sbfnk · 2024-11-26T11:40:39Z

> So we would need to change this for forecasting

We could change it but now that we're accumulating after truncating it means we'd then also show truncation (potentially a good thing for comparing to data, potentially confusing if it's interpreted as final cases). I think this would make most sense.

seabbs · 2024-11-26T11:45:58Z

My view is that we want to either treat accumulation of the forecast as a standalone feature, separate from accumulation of the data in the likelihood, or clearly communicate what is a posterior prediction of the data and what is a prediction of some latent quantity (i.e. the unaccumulated data). I think we probably can't make the breaking change where accumulation and truncation show up in the current forecast output, as a lot of people have grown used to interpreting that as a forecast of a latent (potentially observed in the future) thing.
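One way to picture accumulation of the forecast as a standalone step is as post-processing of daily (latent) predictions, leaving the existing daily output untouched. The sketch below assumes a vector of daily values for a single trajectory; `accumulate_forecast()` and its arguments are hypothetical and not part of the package.

```r
# Hypothetical helper: sum daily predictions into consecutive blocks of `by`
# days, dated by the last day of each block. Trailing days that do not fill a
# complete block are dropped.
accumulate_forecast <- function(dates, daily, by = 7) {
  n_blocks <- length(daily) %/% by
  keep <- seq_len(n_blocks * by)
  data.frame(
    date = dates[by * seq_len(n_blocks)],
    # Each matrix column holds one block of `by` consecutive days.
    value = colSums(matrix(daily[keep], nrow = by))
  )
}

dates <- as.Date("2024-11-19") + 0:13
daily <- rep(c(10, 20), each = 7)
weekly_forecast <- accumulate_forecast(dates, daily, by = 7)
```

Applied per posterior sample, something like this would give accumulated forecasts on request while the default output remains the daily latent quantity people are used to.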

sbfnk · 2024-11-26T12:34:28Z

This makes sense. Perhaps the plot with accumulation and truncation could be done in a separate plot_fit() function or the like.

It does come back to the issue raised in #640 though and how we can improve the plot which currently could be wrongly interpreted to mean a poor fit to / underestimate of the data.

seabbs · 2024-11-27T10:49:31Z

> we can improve the plot which currently could be wrongly interpreted to mean a poor fit to / underestimate of the data.

One no-code option is to clearly document exactly what is being plotted here.

I like the idea of introducing new, more clearly scoped plotting tools as the solution and maybe gradually deprecating what is in

sbfnk changed the title ~~forecast_opts() or `add_horizon()~~ forecast_opts() or add_horizon() Nov 26, 2024
