-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
frequency is inefficient and wrong #337
Comments
Another option to deal with correctness is to document the issue and throw an error when overflow is detected. A third is to change the type to use |
The big advantage of sticking with |
* Make `frequency` detect negative frequencies and frequency total overflow. * Rewrite `frequency` to build an `IntMap` to represent the frequency list, which greatly improves efficiency for long lists. Fixes hedgehogqa#337.
* Make `frequency` detect negative frequencies, frequency total overflow, and zero total frequency. * Rewrite `frequency` to build an `IntMap` to represent the frequency list, which greatly improves efficiency for long lists. Fixes hedgehogqa#337.
* Make `frequency` detect negative frequencies, frequency total overflow, and zero total frequency. * Rewrite `frequency` to build an `IntMap` to represent the frequency list, which greatly improves efficiency for long lists. * Bump `containers` lower bound to 0.5.11. Fixes hedgehogqa#337.
fwiw |
Correctness
frequency
is a little bit wrong because it relies on the sum of the frequencies being representable by anInt
, but does not document this. Sofrequency [(maxBound `quot` 2, x), ((maxBound `quot` 4) * 3, y), (2034, z)]
will do something hard to predict. The best fix is likely to use double-word integers in the calculations:This is almost certainly available in a library somewhere, but it wouldn't be too hard to roll our own if necessary.
Efficiency
The argument list is traversed for every value generated. When the list is long, that's really lousy. The solution is to build a tree representing the input list, making lookup much faster. A simple hack using
DW
above would use anIntMap
ofIntMap
s and some lookup functions likelookupLE
,lookupGE
, etc. There are almost certainly better specialized data structures available for the purpose, but I don't know if they'd be worth the trouble.The text was updated successfully, but these errors were encountered: