Adjustment of the validation of the number of target neighbors #353

JanekBlankenburg · 2022-08-19T13:16:19Z

Before the optimization starts, the parameters are validated. Lines 175 - 177 check whether the chosen k is valid for the training data. According to the definition of LMNN by Weinberger et al., each class must have at least k+1 elements, so that every data point has at least k target neighbors. The implementation, however, only requires self.n_neighbors <= required_k (the code actually tests the opposite condition in order to throw an error), where required_k is the number of elements of the smallest class. This check accepts k for a class that has exactly k elements, which shouldn't be the case.
As a result, a point in such a small class is selected as its own target neighbor. To determine the target neighbors, a distance matrix of all points within the class is computed. To prevent a point from being recognized as its own nearest neighbor, the diagonal of this matrix is set to infinity. If a class has only k elements, however, all elements of the class are chosen as target neighbors, including the current point itself (even though its distance to itself is infinite according to the distance matrix). Each point of such a class therefore effectively has one target neighbor fewer than points in classes with more training data, which can have unintended effects on the final transformation, depending on the dataset used.
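
The effect can be reproduced with a small standalone sketch. This is not the library's code: the helper name select_target_neighbors and the toy data are assumptions made for illustration, and the selection is approximated by an argsort over a within-class distance matrix whose diagonal is set to infinity.

```python
import numpy as np


def select_target_neighbors(X_class, k):
    """Hypothetical helper: indices of the k nearest same-class points per row."""
    # Pairwise Euclidean distances between all points of one class.
    diff = X_class[:, None, :] - X_class[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Mask the diagonal so a point is never its own *nearest* neighbor.
    np.fill_diagonal(dist, np.inf)
    # Take the k smallest distances per row. If the class has only k points,
    # the infinite self-distance is still among the k selected entries.
    return np.argsort(dist, axis=1)[:, :k]


k = 3
X_small = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])  # class with exactly k points
neighbors = select_target_neighbors(X_small, k)

# True for a point whose own index appears among its target neighbors.
self_selected = [i in neighbors[i] for i in range(len(X_small))]
print(self_selected)
```

With k = 2 on the same three points, no point selects itself, which matches the requirement of at least k+1 elements per class.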

To prevent this, it is sufficient to adjust the validation so that self.n_neighbors < required_k must hold.
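
A minimal sketch of the stricter check, assuming required_k is derived from the class counts as described above. The function name check_n_neighbors and the error message are hypothetical and do not reproduce the library's actual code or wording.

```python
import numpy as np


def check_n_neighbors(y, n_neighbors):
    """Hypothetical validation: require n_neighbors < size of the smallest class."""
    _, counts = np.unique(y, return_counts=True)
    required_k = counts.min()  # number of elements in the smallest class
    # Raise on the opposite of the valid condition n_neighbors < required_k.
    if n_neighbors >= required_k:
        raise ValueError(
            "n_neighbors must be smaller than the size of the smallest class "
            f"(got n_neighbors={n_neighbors}, smallest class size={required_k})"
        )


y = np.array([0, 0, 0, 1, 1, 1, 1])  # smallest class has 3 elements
check_n_neighbors(y, 2)              # accepted: every point has 2 other class members
```

Calling check_n_neighbors(y, 3) raises a ValueError here, whereas a check of the form n_neighbors <= required_k would accept it.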

Adjustment of the validation of the number of target neighbors to fit the original LMNN definition by Weinberger et al. and by that prevent unexpected behavior for small classes.

perimosocordiae · 2022-08-27T04:42:34Z

Thanks for sending a PR, @JanekBlankenburg! The test suite is showing some errors, some of which appear to be simple matters of changing the expected exception message, but some look more substantial. Can you take a look?

Adjustment of the validation of the number of target neighbors

e8af7e5
