The `set_epsilon` function in the current COBRA implementation uses grid-search cross-validation to pick the best $\epsilon$ on the dataset `X_epsilon`, `y_epsilon`, fitting and predicting on that same data. I do not think this is the correct way to do it. I have read the original COBRA paper carefully, and also the authors' own implementation in their CRAN package; an archived version is available here
The problem with the current implementation is that it uses one general $\epsilon$. A careful look at the paper shows that the parameter is $\epsilon_l$, not $\epsilon$: it is specific to the selected $D_l$, so if $D_l$ changes, the optimal $\epsilon_l$ changes too. For that reason, cross-validation may not work here.
In the original implementation on CRAN, the authors first split the data into $D_{k}$, $D_{l}$, $D_{val}$ and $D_{test}$. To get a prediction on $D_{test}$, they then do the following steps:
1. Train the initial machines on $D_k$.
2. Get the predictions of the initial machines on $D_l$.
3. Set $\epsilon_l$ and $\alpha$ by grid search, choosing the pair with the lowest quadratic error on $D_{val}$.
4. Get the prediction on $D_{test}$ using the best $\epsilon_l$ chosen in step 3.
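To make the four steps above concrete, here is a minimal pure-Python sketch of the procedure as I read it. It is not the CRAN code and not the code in this repository. The helper names (`fit_linear`, `fit_knn`, `cobra_predict`, `tune`), the toy 1-D data, the two toy machines, the split sizes and the $\epsilon$ grid are all assumptions made for illustration. Returning 0 when no point of $D_l$ is retained is also just a fallback chosen for this sketch.

```python
import random

def fit_linear(xs, ys):
    """Toy machine 1: least-squares line on 1-D inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=5):
    """Toy machine 2: mean response of the k nearest training inputs."""
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def cobra_predict(x, machines, D_l, preds_l, eps, alpha):
    """Average the y_i of D_l on which at least `alpha` machines
    predict within `eps` of their prediction at x."""
    px = [m(x) for m in machines]
    picked = [y for (_, y), pl in zip(D_l, preds_l)
              if sum(abs(a - b) <= eps for a, b in zip(px, pl)) >= alpha]
    return sum(picked) / len(picked) if picked else 0.0  # fallback: assumption

def mse(points, predict):
    """Quadratic error of `predict` on a list of (x, y) pairs."""
    return sum((y - predict(x)) ** 2 for x, y in points) / len(points)

def tune(machines, D_l, D_val, eps_grid):
    """Steps 2 and 3: predict on D_l, then grid-search (eps_l, alpha) on D_val."""
    preds_l = [[m(x) for m in machines] for x, _ in D_l]
    best = None
    for eps in eps_grid:
        for alpha in range(1, len(machines) + 1):
            err = mse(D_val, lambda x: cobra_predict(
                x, machines, D_l, preds_l, eps, alpha))
            if best is None or err < best[0]:
                best = (err, eps, alpha)
    return best, preds_l

# Toy data: y = 2x + noise, split into D_k, D_l, D_val, D_test.
rng = random.Random(0)
data = [(x, 2 * x + rng.gauss(0, 0.1))
        for x in (rng.uniform(0, 1) for _ in range(200))]
D_k, D_l, D_val, D_test = data[:80], data[80:140], data[140:170], data[170:]

# Step 1: train the initial machines on D_k only.
machines = [fit_linear(*zip(*D_k)), fit_knn(*zip(*D_k))]

# Steps 2 and 3: eps_l and alpha are tied to this particular D_l.
eps_grid = [0.02, 0.05, 0.1, 0.2, 0.5]
(val_err, eps_l, alpha), preds_l = tune(machines, D_l, D_val, eps_grid)

# Step 4: predict on D_test with the chosen eps_l and alpha.
test_err = mse(D_test, lambda x: cobra_predict(
    x, machines, D_l, preds_l, eps_l, alpha))
print("eps_l:", eps_l, "alpha:", alpha, "test MSE:", test_err)
```

Note that `tune` never refits anything on $D_{val}$: the validation set is only used to score candidate $(\epsilon_l, \alpha)$ pairs against a fixed $D_l$, which is the difference from fitting and predicting on the same `X_epsilon`, `y_epsilon`.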
Let me know whether I have understood this correctly (or incorrectly). I am working on merging $D_{val}$ into $D_l$ or $D_k$, so that they get more data points and give better results; for that, $\epsilon_l$ needs to be changed accordingly. Let me know if anyone wants to collaborate.