Replies: 8 comments 1 reply
-
Thanks for your interest in the project. You can derive any metric of interest from the predicted distribution, e.g., mean, variance, quantiles, etc. Probably the easiest would be to use the samples that are drawn from the predicted distribution. In the following example, it is assumed you have a trained xgblss model.

```python
# Number of samples to draw from the predicted distribution
# (make sure to set this to a high number in case quantiles are needed)
n_samples = 10000

# Sample from the predicted distribution
pred_samples = xgblss.predict(dtest, pred_type="samples", n_samples=n_samples, seed=123)

# Calculate mean
mean_pred = pred_samples.mean(axis=1)

# Calculate standard deviation
std_pred = pred_samples.std(axis=1)

# Calculate quantiles
quantile_pred = pred_samples.quantile(q=[0.1, 0.5, 0.9], axis=1).T
```

Let me know in case of open questions.
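As a small usage sketch (assuming the objects created above; the column names here are just illustrative), you could collect these quantities into a single per-observation summary:

```python
import pandas as pd

# Combine the derived quantities into one per-observation summary table
# (column names are illustrative only)
summary = pd.DataFrame({
    "mean": mean_pred,
    "std": std_pred,
    "q10": quantile_pred[0.1],
    "q50": quantile_pred[0.5],
    "q90": quantile_pred[0.9],
})
```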
-
Thank you very much for your prompt response. I have a few queries regarding this, which are as follows: Can we use XGBoostLSS for an active learning process? As I understand it, we can use the difference between the upper and lower quantile as an uncertainty measure and, based on that, determine the data points with the highest uncertainty in my unknown dataset. However, after running this process iteratively, we have to stop at a certain point, once we achieve the required accuracy. Measuring this requires metrics for point predictions, such as RMSE or MAE, and also some connection with algorithms for point prediction (not prediction intervals), such as XGBoost. So, can we use XGBoost (or other such algorithms) together with XGBoostLSS, so that it provides uncertainty quantification for the active learning process (to include the most uncertain data in training) and, at the end, the final point prediction for unknown data (finally we need point predictions, not prediction intervals)?
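For example, the selection step I have in mind would look roughly like this (just a sketch; `dpool` for the unlabelled candidate data and `k` for the batch size are hypothetical names):

```python
# Sketch of the selection step: score unlabelled candidates by quantile spread
# (dpool and k are hypothetical names for the candidate DMatrix and batch size)
pool_samples = xgblss.predict(dpool, pred_type="samples", n_samples=10000, seed=123)

# Uncertainty score: width of the 10%-90% interval per candidate
q = pool_samples.quantile(q=[0.1, 0.9], axis=1).T
uncertainty = q[0.9] - q[0.1]

# Pick the k most uncertain candidates to label and add to the training set
query_idx = uncertainty.sort_values(ascending=False).index[:k]
```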
-
I am not exactly sure I properly understand your question. Can you be more specific, ideally with a step-by-step process that you would need to follow to solve your task? That'll help to answer the question.
-
Ok, let me elaborate point by point.
-
Since you can draw samples from the predicted distribution, you get point predictions (as shown above) as well as intervals via quantiles (as shown above). You can derive any quantity from the samples.
Since XGBoostLSS is trained on an objective function, while cross-validation and early stopping are evaluated on a metric function, you can use a measure that evaluates the conditional mean (such as RMSE or MSE) for early stopping of the model via the metric function. Is this what you are asking?
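For illustration, a minimal sketch of such a point-prediction check, assuming `y_test` holds the held-out targets (hypothetical name):

```python
import numpy as np

# Point prediction = mean of the samples drawn from the predicted distribution
pred_samples = xgblss.predict(dtest, pred_type="samples", n_samples=10000, seed=123)
point_pred = pred_samples.mean(axis=1)

# Point-prediction metrics (y_test is assumed to hold the true targets)
rmse = np.sqrt(np.mean((y_test - point_pred) ** 2))
mae = np.mean(np.abs(y_test - point_pred))
```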
-
Yes, this is what I was asking. Many thanks for the answer. I still have one doubt: whether I can execute an active learning process using this.
-
What you are describing sounds interesting. I have never used XGBoost or XGBoostLSS for active learning, but you are invited to give it a go.
Given that you would use a metric function for early stopping that evaluates the conditional mean (e.g., MSE), I am not sure how this affects the estimation of all distributional parameters (for the Normal, the model estimates mu and sigma) or the quality of the uncertainty estimates. Hence, there is no guarantee that the estimated distributional parameters are close to the "true" ones. Since the metric function currently evaluates the NLL for cross-validation and early stopping, do you need assistance in replacing it with the MSE?
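As a rough sketch of what I mean (the `train` call, `params`, `dtrain`, `dvalid`, and `y_valid` are assumed names and signatures, not verified against the library), you could also select the number of boosting rounds manually by validation RMSE of the conditional mean rather than the NLL:

```python
import numpy as np

# Sketch only: pick the number of boosting rounds by validation RMSE of the
# conditional mean instead of the NLL-based early stopping.
# Training call and parameter names below are assumptions.
best_rmse, best_rounds = np.inf, None
for num_boost_round in [50, 100, 200, 400]:
    xgblss.train(params, dtrain, num_boost_round=num_boost_round)
    val_samples = xgblss.predict(dvalid, pred_type="samples", n_samples=1000, seed=123)
    rmse = np.sqrt(np.mean((y_valid - val_samples.mean(axis=1)) ** 2))
    if rmse < best_rmse:
        best_rmse, best_rounds = rmse, num_boost_round
```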
-
Any update on this? Can we close it?
-
Is it possible to do point prediction using XGBoostLSS? If it can only predict a prediction interval, then after uncertainty quantification, how could we know whether the predictions are improving? Most importantly, along with the uncertainty quantification we need the point prediction of the parameters in our research. So, any help regarding this problem would be very helpful.