there are some places that i think could use improvement. this task is to list and improve them
accuracy:
combining histograms between leaf nodes might lose accuracy, because the leaf nodes are not necessarily using histograms with the same bucket size.
the "auto" histogram bucket size is based on the extents (min and max) of all data in a column. if you use "auto" and filter to a subset with a smaller range, the buckets will likely be too coarse for that subset and the histogram will be inaccurate. to work around this, set the hist bucket size manually or use a log hist
a large group by might have intermediate results pruned out during aggregation. the pruning limit is 1000 internal rows for a group of block specs (typically 4 - 8 blocks)
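to make the first accuracy point concrete, here is a minimal sketch of merging two fixed-width histograms whose bucket widths differ. the `Hist` type, `Add` and `MergeInto` are hypothetical names made up for illustration and are not the project's actual code; the midpoint rule is just one simple way to re-bucket, assumed here to show where the error comes from.

```go
package main

import "fmt"

// Hist is a hypothetical fixed-width histogram (not the project's real type).
// Bucket i covers [Min + i*Width, Min + (i+1)*Width).
type Hist struct {
	Min    float64
	Width  float64
	Counts []int64
}

// Add records n occurrences of v, clamping out-of-range values
// into the first or last bucket.
func (h *Hist) Add(v float64, n int64) {
	i := int((v - h.Min) / h.Width)
	if i < 0 {
		i = 0
	}
	if i >= len(h.Counts) {
		i = len(h.Counts) - 1
	}
	h.Counts[i] += n
}

// MergeInto folds src into dst. Each src bucket's whole count is assigned
// to the dst bucket containing the src bucket's midpoint, because the
// original values are no longer available. This is where accuracy is lost.
func MergeInto(dst, src *Hist) {
	for i, c := range src.Counts {
		if c == 0 {
			continue
		}
		mid := src.Min + (float64(i)+0.5)*src.Width
		dst.Add(mid, c)
	}
}

func main() {
	// leaf A uses buckets of width 10, leaf B uses buckets of width 25.
	a := &Hist{Min: 0, Width: 10, Counts: make([]int64, 10)}
	b := &Hist{Min: 0, Width: 25, Counts: make([]int64, 4)}

	a.Add(5, 1)
	for _, v := range []float64{26, 27, 28} {
		b.Add(v, 1) // all three land in B's [25, 50) bucket
	}

	MergeInto(a, b)
	// The three values belong in A's [20, 30) bucket, but the merge
	// places them in [30, 40), the bucket holding B's midpoint 37.5.
	fmt.Println(a.Counts) // prints [1 0 0 3 0 0 0 0 0 0]
}
```

in this sketch the merged histogram reports three values in [30, 40) even though none of the inputs were in that range, so percentiles computed from it would be shifted.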
safety:
writing a new block of data involves loading all data from the unfinished block and then re-saving it all (instead of appending). this is easier / safer with gob, but probably slower than appending
memory:
a large group by with log hists can blow up memory usage