A forecasting/predicting example would be useful #15

Closed
SebastianCallh opened this issue Nov 7, 2022 · 6 comments
Labels: documentation (Improvements or additions to documentation), good first issue (Good for newcomers)

Comments

@SebastianCallh

Hi!

I read through your examples and let me say they are very nice. I specifically appreciate all the visualisations.
However, I noticed there are no examples of forecasting/predicting on new data once the model is fit. The GP regression example does forecasting by working with data of type Array{Union{Float64,Missing}}, but I suspect this reduces performance.

What is the recommended way to run a fitted model on new data?

@bvdmitri bvdmitri added good first issue Good for newcomers documentation Improvements or additions to documentation labels Nov 7, 2022
@ismailsenoz

ismailsenoz commented Nov 7, 2022

Hi @SebastianCallh ! It's good to hear that you are enjoying the examples. You are correct that there are no forecasting examples besides the GP regression example. Indeed, there is no unique solution to the forecasting problem within the RxInfer framework. Supplying missing values to the observations is just a way to obtain predictions. There is no recommended way to run a fitted model on new data as it heavily depends on the model, inference procedure, etc. However, we will upload more forecasting examples to illustrate the possible options and update the documentation accordingly. Thanks for pointing us to the issue.

I am curious what makes you think that data of type Array{Union{Float64,Missing}} will reduce performance. We discussed it among ourselves but could not work out why it would. It would be great if you could share why you suspect a performance reduction so we can address this issue.

@SebastianCallh
Author

SebastianCallh commented Nov 7, 2022

Thanks for the response, @ismailsenoz. You are of course right that posterior predictions are problem dependent, particularly when it comes to plotting. In my experience, it is fairly common for PPLs to offer utilities for posterior (and of course prior) predictive sampling, so I guess that's where I'm coming from with my question. Without such utilities, one might have to re-implement the mechanism of the model outside of the @model function, which of course duplicates work.

About the performance impact of Array{Union{Float64,Missing}}: I am not an expert on the Julia compiler, but I am fairly certain Union{Float64,Missing} causes each value in the array to be boxed, so when performing operations on them the code has to follow pointers everywhere. See the example below for what I mean. Of course, I have not checked the RxInfer source code, so perhaps this does not apply here.

using BenchmarkTools

julia> a = vcat(missing, rand(9999));

julia> b = rand(10000);

julia> @btime sum(a)
  16.372 μs (0 allocations: 0 bytes)
missing

julia> @btime sum(skipmissing(a))
  8.686 μs (7 allocations: 112 bytes)
4982.93969474356

julia> @btime sum(b)
  983.278 ns (1 allocation: 16 bytes)
4966.982509542625
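
One way to probe the boxing concern without relying on timings is to ask Julia how it stores the element type. The sketch below is an illustration, not part of the original thread: it assumes a recent Julia 1.x and uses `Base.isbitsunion`, an internal, non-exported helper. According to Julia's documentation on isbits Union optimizations, small unions such as Union{Float64,Missing} are stored inline next to a type tag rather than boxed, so any slowdown would more plausibly come from checking the tag on every element than from pointer chasing; that interpretation is an assumption here, not a measured result.

```julia
# Hypothetical data mirroring the benchmark above: one missing entry followed by floats.
a = vcat(missing, rand(9999))   # Vector{Union{Missing, Float64}}
b = rand(10000)                 # Vector{Float64}

# Base.isbitsunion is an internal (non-exported) helper; it reports whether a Union
# of bits types is eligible for inline storage with a type tag instead of boxing.
inline_storage = Base.isbitsunion(eltype(a))

# Dropping the missing entries yields a plain Float64 vector again.
observed = collect(skipmissing(a))

println("eltype(a)        = ", eltype(a))
println("stored inline    = ", inline_storage)
println("eltype(observed) = ", eltype(observed))
```

If the missing entries are only placeholders for future predictions, keeping the observed values in a plain Float64 vector (as `observed` does here) is one way to keep numeric kernels on the fast path.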

@ismailsenoz

Thanks for illustrating your concern and warning us about a potential performance issue. Currently, ReactiveMP (the inference engine of RxInfer) does not allow any operation when a missing value occurs. Also, the message update rules for the factor nodes in a model need to be extended to return missing, as done in the GP regression example. Your point is valid when a user defines a rule that operates on an Array{Union{Float64,Missing}}: in that case there will be a performance decrease, as you pointed out. By default, ReactiveMP has no rules involving missing values. For RxInfer, the appeal of passing Missing is that it unifies learning, hyper-parameter tuning, and prediction in a single inference function call. I am sure we can come up with benchmarks and a better implementation for handling missing if we encounter performance degradation @bvdmitri @HoangMHNguyen.
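
The idea of treating missing observations as prediction requests within a single call can be sketched in plain Julia. This is a minimal illustration only and does not use or reproduce the RxInfer or ReactiveMP API: `fit_and_predict` is a hypothetical helper, and the conjugate Normal model with known observation variance (along with the default prior values) is an assumption chosen to keep the update in closed form.

```julia
# Hypothetical helper, not part of RxInfer or ReactiveMP: a conjugate Normal model
# with unknown mean and known observation variance. Observed entries update the
# posterior over the mean; `missing` entries are treated as prediction requests.
function fit_and_predict(y::AbstractVector{<:Union{Missing, Real}};
                         prior_mean = 0.0, prior_var = 100.0, obs_var = 1.0)
    precision = 1 / prior_var            # running posterior precision of the mean
    weighted  = prior_mean / prior_var   # running precision-weighted mean
    for yi in y
        ismissing(yi) && continue        # a missing value carries no information
        precision += 1 / obs_var
        weighted  += yi / obs_var
    end
    post_mean = weighted / precision
    post_var  = 1 / precision

    # Posterior predictive for each missing slot: Normal(post_mean, post_var + obs_var).
    missing_idx = findall(ismissing, y)
    predictions = [(index = i, mean = post_mean, var = post_var + obs_var)
                   for i in missing_idx]
    return (posterior = (mean = post_mean, var = post_var), predictions = predictions)
end

# Three observations followed by two slots to predict.
y = [1.0, 2.0, 3.0, missing, missing]
result = fit_and_predict(y)

println("posterior over the mean: ", result.posterior)
println("predictions: ", result.predictions)
```

The same call performs learning (from the observed entries) and prediction (for the missing ones), which mirrors the unification described above; a message-passing engine would do this per factor node rather than in one closed-form loop.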

We will keep this as an issue regarding the utilities for posteriors and try to address it.

@SebastianCallh
Author

Thank you for explaining, and for your work on RxInfer!

@albertpod
Member

The technical side of this issue was addressed in #51; however, an example for predictions is still missing.

@albertpod albertpod moved this to 👉 Assigned in RxInfer Oct 11, 2023
@bvdmitri bvdmitri added this to the RxInfer update Nov 28th milestone Nov 14, 2023
@albertpod albertpod moved this from 👉 Assigned to 📝 In progress in RxInfer Nov 17, 2023
@albertpod albertpod moved this from 📝 In progress to ❓ Under review in RxInfer Nov 17, 2023
@albertpod albertpod moved this from ❓ Under review to ✅ Done in RxInfer Nov 20, 2023
@albertpod
Member

Closed by #184


6 participants