Hi :) Thanks for the great work. DiscSense deserves more recognition.
It reveals a lot of potential for discourse analysis, especially regarding the role of discourse markers in semantics.
As a researcher in a similar field, I see a great use case for DiscSense: understanding text semantics through simple token matching.
If the semantic labels presented in DiscSense are meaningful enough, such a semantic analysis system would be possible even without sophisticated BERT-like encoders involved.
However, you mention "Confidence" and "Prior" calculations.
I read your LREC paper, but I find it difficult to grasp conceptually what you mean by "Confidence" and "Prior".
How exactly are these values computed (the paper leaves this unclear to me)? And what do "Confidence" and "Prior" mean qualitatively?
I hope I can get some help here.
Hi, thank you for the kind words! I also think it can reveal dataset biases and connotations of markers.
I heavily relied on association-rule terminology, which is a bit old-fashioned now. I mine marker=>label rules in specific datasets. But labels are unbalanced, and one label can be dominant; if a label y is dominant, any marker=>label rule will look accurate. The prior is the probability of getting the label regardless of whether the discourse marker is present.
The confidence is the probability of the rule marker=>label being true in a dataset.
For example, in the CR dataset, if you encounter "sadly," the review has a 95.2% chance of being negative; that is the confidence of the sadly=>negative association in CR.
In the CR dataset, a review has a 21.8% chance of being negative in general, which is the prior for negative in CR. See Table 2 of the paper.
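For concreteness, here is a minimal Python sketch of these two quantities (my own illustration, not the DiscSense code; it assumes a dataset of (text, label) pairs and matches the marker by simple substring search):

```python
from typing import Iterable, List, Tuple

def prior(dataset: List[Tuple[str, str]], label: str) -> float:
    """P(label): fraction of examples carrying the label, regardless of any marker."""
    return sum(1 for _, y in dataset if y == label) / len(dataset)

def confidence(dataset: List[Tuple[str, str]], marker: str, label: str) -> float:
    """P(label | marker): among examples containing the marker, the fraction with the label."""
    with_marker = [(x, y) for x, y in dataset if marker in x.lower()]
    if not with_marker:
        return 0.0
    return sum(1 for _, y in with_marker if y == label) / len(with_marker)

# Toy usage (hypothetical examples, not CR itself):
reviews = [
    ("sadly, the battery died within a week", "negative"),
    ("sadly, it broke on arrival", "negative"),
    ("works great, highly recommend", "positive"),
    ("decent value for the price", "positive"),
]
print(prior(reviews, "negative"))                 # 0.5
print(confidence(reviews, "sadly,", "negative"))  # 1.0
```

A rule is informative when its confidence clearly exceeds the prior, e.g. 95.2% vs. 21.8% for sadly=>negative in CR.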