The optimization step : set_epsilon #46

yuvrajiro · 2023-05-30T06:08:37Z

The set_epsilon function in the original COBRA implementation uses grid-search cross-validation to pick the best $\epsilon$ on the dataset X_epsilon, y_epsilon, fitting and predicting on that same data. I do not think this is the correct way to do it. I have read the original COBRA paper carefully, as well as the authors' implementation in their CRAN package; an archived version is available here

The problem with the current implementation is that it uses a single, general $\epsilon$. A careful look at the paper shows that the parameter is $\epsilon_l$, not $\epsilon$: it is specific to the selected $D_l$, so if $D_l$ changes, the optimal $\epsilon_l$ changes too. For that reason, cross-validation may not work here.
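
For context, here is a sketch of the aggregation rule as I read it in the paper. The notation is my own transcription and may differ slightly from the original. With $M$ initial machines $r_{k,1}, \dots, r_{k,M}$ trained on $D_k$, the prediction at a point $x$ is a weighted average of the responses in $D_l$:

$$
T_n(x) = \sum_{(X_i, Y_i) \in D_l} W_{n,i}(x)\, Y_i,
\qquad
W_{n,i}(x) = \frac{\mathbf{1}\{\forall m:\ |r_{k,m}(X_i) - r_{k,m}(x)| \le \epsilon_l\}}{\sum_{(X_j, Y_j) \in D_l} \mathbf{1}\{\forall m:\ |r_{k,m}(X_j) - r_{k,m}(x)| \le \epsilon_l\}}
$$

The threshold $\epsilon_l$ is compared against machine predictions at the points of $D_l$, so the set of points that receive non-zero weight, and hence the best threshold, depends on which $D_l$ was drawn.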

In the original implementation on CRAN, the authors first form the sets $D_k$, $D_l$, $D_{val}$ and $D_{test}$, then obtain predictions on $D_{test}$ with the following steps:

1. Train the initial machines on $D_k$.
2. Get the predictions of the initial machines on $D_l$.
3. Set $\epsilon_l$ and $\alpha$ by grid search, minimizing the quadratic error on $D_{val}$.
4. Get the predictions on $D_{test}$ using the best $\epsilon_l$ chosen in step 3.
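
The steps above can be sketched roughly as follows. This is only a minimal illustration of my understanding, not the CRAN code or the pycobra API: the three toy machines, the helper names (`fit_mean`, `fit_line`, `fit_1nn`, `machine_preds`, `aggregate`), the synthetic data, the grids, and the choice to return 0.0 when no point of $D_l$ is retained are all assumptions made for the example.

```python
import random

# Toy "initial machines" on 1-D inputs (hypothetical stand-ins for real learners).
def fit_mean(X, y):
    mu = sum(y) / len(y)
    return lambda x: mu

def fit_line(X, y):
    n = len(X)
    mx, my = sum(X) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in X)
    slope = sum((a - mx) * (b - my) for a, b in zip(X, y)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_1nn(X, y):
    pairs = list(zip(X, y))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def machine_preds(machines, X):
    """One row per point, one column per machine."""
    return [[m(x) for m in machines] for x in X]

def aggregate(preds_l, y_l, preds_x, eps, alpha):
    """Average the y_i in D_l for which at least a fraction alpha of the
    machines predict within eps of their prediction at the query point."""
    need = alpha * len(preds_x) - 1e-9  # small tolerance for float rounding
    kept = [yi for row, yi in zip(preds_l, y_l)
            if sum(abs(r - p) <= eps for r, p in zip(row, preds_x)) >= need]
    # Returning 0.0 when nothing is retained is a choice made for this sketch.
    return sum(kept) / len(kept) if kept else 0.0

def mse(pred, y):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

# Synthetic data split into D_k, D_l, D_val, D_test (100 points each).
random.seed(0)
X = [random.uniform(0, 1) for _ in range(400)]
y = [3 * x + random.gauss(0, 0.1) for x in X]
(X_k, y_k), (X_l, y_l), (X_val, y_val), (X_test, y_test) = [
    (X[i:i + 100], y[i:i + 100]) for i in range(0, 400, 100)]

# Step 1: train the initial machines on D_k.
machines = [fit(X_k, y_k) for fit in (fit_mean, fit_line, fit_1nn)]

# Step 2: predictions of the initial machines on D_l.
preds_l = machine_preds(machines, X_l)

# Step 3: grid search for (eps_l, alpha) by quadratic error on D_val.
preds_val = machine_preds(machines, X_val)
eps_grid = [0.05, 0.1, 0.2, 0.5, 1.0]
alpha_grid = [k / len(machines) for k in range(1, len(machines) + 1)]
scores = {}
for eps in eps_grid:
    for alpha in alpha_grid:
        val_pred = [aggregate(preds_l, y_l, p, eps, alpha) for p in preds_val]
        scores[(eps, alpha)] = mse(val_pred, y_val)
best_eps, best_alpha = min(scores, key=scores.get)

# Step 4: predict on D_test with the values chosen in step 3.
test_preds = [aggregate(preds_l, y_l, p, best_eps, best_alpha)
              for p in machine_preds(machines, X_test)]
print(best_eps, best_alpha, mse(test_preds, y_test))
```

Note that `best_eps` is selected against a fixed `preds_l`; if $D_l$ is redrawn (or enlarged with $D_{val}$), step 3 would have to be repeated rather than reusing the earlier value.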

Let me know whether I have understood this correctly. I am working on merging $D_{val}$ into $D_l$ or $D_k$ so that more data points are available and the results improve; for that, $\epsilon_l$ needs to be adjusted accordingly. Let me know if anyone wants to collaborate.
