Faster alternative to GapEncoder #943

jeromedockes · 2024-06-13T08:12:33Z

Problem Description

For encoding text/high-cardinality categories, at the moment we have MinHashEncoder, which only works well when the downstream learner is based on decision trees, and GapEncoder, which gives high-quality representations but is very slow. It would be good to have something similar to the GapEncoder but faster, maybe an SVD or scikit-learn's NMF.

Feature Description

An encoder that works similarly to GapEncoder but is faster, possibly at the cost of less interpretable topics or slightly reduced prediction performance.
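
To make the idea concrete, here is a minimal sketch of what such an encoder could look like, built only from existing scikit-learn pieces: character n-gram TF-IDF followed by a truncated SVD. This is an assumption about one possible design, not an existing skrub API and not necessarily what any linked issue proposes. The helper name `make_ngram_svd_encoder`, the n-gram range, and the number of components are all hypothetical choices for illustration.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline


def make_ngram_svd_encoder(n_components=3, random_state=0):
    """Hypothetical helper: char n-gram TF-IDF followed by truncated SVD.

    Not part of skrub; the parameter values are illustrative assumptions.
    """
    return make_pipeline(
        # Character n-grams within word boundaries, so that similar strings
        # share many features even when they differ by a few characters.
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
        # Project the sparse n-gram matrix to a small dense embedding.
        TruncatedSVD(n_components=n_components, random_state=random_state),
    )


categories = [
    "senior engineer",
    "junior engineer",
    "engineer II",
    "police officer",
    "police sergeant",
    "office manager",
]

encoder = make_ngram_svd_encoder(n_components=3)
embeddings = encoder.fit_transform(categories)  # one dense row per input string
print(embeddings.shape)
```

Swapping `TruncatedSVD` for `sklearn.decomposition.NMF` would give non-negative components, which may be closer to GapEncoder's topics, likely at some extra fitting cost. Unlike GapEncoder, the SVD components can be negative and are not expected to be as interpretable.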

jeromedockes · 2024-06-13T08:31:33Z

related: #139

jeromedockes · 2024-10-23T12:37:10Z

closing in favor of #1121

jeromedockes added the enhancement (New feature or request) label on Jun 13, 2024

jeromedockes closed this as completed Oct 23, 2024
