I noticed that the model performance reported in your paper is very different from the performance in the original papers.
For example, MINER (Li et al. 2019) achieved AUC = 69.61 on the MIND-small dataset, but your reported performance is only AUC = 51.2.
This is also much lower than other works that reproduced the MINER model. For example, this paper reported a reproduced MINER AUC of 63.88.
In general, most GeneralRec models in your Table 1 achieve AUC < 52.00, which differs substantially from the performance reported in other papers.
Could you give any comments on this?
The data splits used in the other papers are most likely different from the one used by us. Neither the MINER paper nor the one you referenced explicitly mentions which split of the MIND dataset they use, so I assume they used the test portion, for which the labels are not publicly available. In contrast, as explained in our paper (Section 2.5), we use the MINDdev portion of the dataset as our test split, and further split MINDtrain into training and validation portions.
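The split described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the helper name, the 95/5 ratio, and the file paths are assumptions; only the behaviors.tsv column layout follows the public MIND release.

```python
import pandas as pd

def split_train_val(behaviors: pd.DataFrame, val_frac: float = 0.05, seed: int = 42):
    """Randomly split an impression log into training and validation portions.

    Hypothetical helper illustrating the split strategy: MINDdev is kept
    aside as the test set, so the validation set must be carved out of
    MINDtrain. The 5% validation fraction is an assumed value.
    """
    shuffled = behaviors.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_val = int(val_frac * len(shuffled))
    return shuffled.iloc[n_val:], shuffled.iloc[:n_val]

# Usage with the real dataset files (paths assumed):
# cols = ["impression_id", "user_id", "time", "history", "impressions"]
# train_log = pd.read_csv("MINDsmall_train/behaviors.tsv", sep="\t", names=cols)
# train_split, val_split = split_train_val(train_log)
# test_log = pd.read_csv("MINDsmall_dev/behaviors.tsv", sep="\t", names=cols)  # MINDdev as test
```

Because the held-out labels of the official MIND test set are not public, any paper evaluating on MINDdev instead of MINDtest is not directly comparable to numbers reported on the hidden test portion.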
Yes, I understand that different data splits can lead to some variance, but a 10+ AUC difference is too large. The dev and test portions come from the same dataset and should not shift dramatically.
Have you verified the performance by running the official code from the original paper (e.g., MINER) on your data splits?
Poseidondon added a commit to Poseidondon/newsreclib-ru that referenced this issue on Jun 11, 2024.