You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi! Thanks for publishing this software. It's quite helpful to potentially be able to use your library instead of re-implementing TFIDF myself. I am grateful for the time and attention you've given to this.
For my use-case, I am working with a large corpus of documents and trying to understand if I can use this library in a way which will have suitable performance.
Examples in this repo show examples of the form:
for term in terms:
for doc in docs:
score = compute_tfidf(term, doc)
where compute_tfidf is either TfIdfDefault::tfidf or MyTfIdfStrategy::tfidf
Is it true that the idf implementations exposed by this crate all require a O(n) linear iteration over the documents/corpus?
Is it possible to use the idf functions on their own, without going through tfidf?
Presented in pseudocode here, I would like to do the following:
for term in terms:
idf = compute_idf(term, docs)
for doc in docs:
score = compute_tf(term, doc) * idf
Concretely, I have tried to use the library in the following way, but ran into an error that I don't quite understand yet:
use tfidf::idf::{InverseFrequencySmoothIdf};use tfidf::tf::DoubleHalfNormalizationTf;use tfidf::{Tf,Idf};for term in terms {let idf = idf::InverseFrequencyIdf::idf(term, docs)for doc in docs {let tf = tf::DoubleHalfNormalizationTf::tf(term, doc)let tfidf = tf * idf
}}
Error 1:
Unresolved import `tfidf::Idf`
Error 2:
No function or associated item named `idf` found for struct `InverseFrequencySmoothIdf` in the current scope
Further exploration has led me to discover that this may be occurring simply because Idf isn't exposed. Would it be okay for me to submit a patch which modifies lib.rs to expose Idf?
Further exploration has led me to discover that this may be occurring simply because Idf isn't exposed. Would it be okay for me to submit a patch which modifies lib.rs to expose Idf?
Hi! That would be fine to submit that patch. I'm surprised that that trait isn't exposed already!
Is it true that the idf implementations exposed by this crate all require a O(n) linear iteration over the documents/corpus?
Just an FYI, I'm only passively maintaining this project. I'm not sure if I'm able to answer your question, since I haven't look at this code in a long time.
Hi! Thanks for publishing this software. It's quite helpful to potentially be able to use your library instead of re-implementing TFIDF myself. I am grateful for the time and attention you've given to this.
For my use-case, I am working with a large corpus of documents and trying to understand if I can use this library in a way which will have suitable performance.
Examples in this repo show examples of the form:
where
compute_tfidf
is eitherTfIdfDefault::tfidf
orMyTfIdfStrategy::tfidf
idf
implementations exposed by this crate all require a O(n) linear iteration over the documents/corpus?idf
functions on their own, without going throughtfidf
?Presented in pseudocode here, I would like to do the following:
Concretely, I have tried to use the library in the following way, but ran into an error that I don't quite understand yet:
Error 1:
Error 2:
Further exploration has led me to discover that this may be occurring simply because
Idf
isn't exposed. Would it be okay for me to submit a patch which modifieslib.rs
to exposeIdf
?by changing:
proposed:
The text was updated successfully, but these errors were encountered: