Commit
Make notebook with plots for columns (#152)
* lower and upper more consistently
* one more
* handle bounds/bins/counts the same way
* lots of reactive dicts, but the UI has not changed
* data dump on the results page
* add a pragma: no cover
* reset widget values after checkbox change
* do not clean up values
* put tooltips in labels
* pull warning up to analysis panel. TODO: conditional
* move warning to bottom of list
* analysis definition JSON
* stubs for python
* stub a script on results page
* include column info in generated script
* closer to a runnable notebook
* stuck on split_by_weight... maybe a library bug?
* margin stubs
* format python identifiers correctly
* script has gotten longer: does not make sense to check for exact equality
* fix syntactic problems in generated code
* fill in columns, but still WIP
* fix column names; tests pass
* move confidence
* simplify download panel
* add markdown cells
* tidy up
* fix copy-paste of util functions
* sort the intervals
Showing 16 changed files with 290 additions and 163 deletions.
@@ -0,0 +1,57 @@
# These functions are used both in the application and in generated notebooks.


def make_cut_points(lower_bound, upper_bound, bin_count):
    """
    Returns one more cut point than the bin_count.
    (There are actually two more bins, extending to
    -inf and +inf, but we'll ignore those.)
    Cut points are evenly spaced from lower_bound to upper_bound.
    >>> make_cut_points(0, 10, 2)
    [0.0, 5.0, 10.0]
    """
    bin_width = (upper_bound - lower_bound) / bin_count
    return [round(lower_bound + i * bin_width, 2) for i in range(bin_count + 1)]


def interval_bottom(interval):
    """
    >>> interval_bottom("(10, 20]")
    10.0
    """
    return float(interval.split(",")[0][1:])


def df_to_columns(df):
    """
    Transform a Dataframe into a format that is easier to plot,
    parsing the interval strings to sort them as numbers.
    >>> import polars as pl
    >>> df = pl.DataFrame({
    ...     "bin": ["(-inf, 5]", "(10, 20]", "(5, 10]"],
    ...     "len": [0, 20, 10],
    ... })
    >>> df_to_columns(df)
    (('(-inf, 5]', '(5, 10]', '(10, 20]'), (0, 10, 20))
    """
    sorted_rows = sorted(df.rows(), key=lambda pair: interval_bottom(pair[0]))
    return tuple(zip(*sorted_rows))


def plot_histogram(histogram_df, error, cutoff):  # pragma: no cover
    """
    Given a Dataframe for a histogram, plot the data.
    """
    import matplotlib.pyplot as plt

    bins, values = df_to_columns(histogram_df)
    mod = (len(bins) // 12) + 1
    majors = [label for i, label in enumerate(bins) if i % mod == 0]
    minors = [label for i, label in enumerate(bins) if i % mod != 0]
    _figure, axes = plt.subplots()
    bar_colors = ["blue" if v > cutoff else "lightblue" for v in values]
    axes.bar(bins, values, color=bar_colors, yerr=error)
    axes.set_xticks(majors, majors)
    axes.set_xticks(minors, ["" for _ in minors], minor=True)
    axes.axhline(cutoff, color="lightgrey", zorder=-1)
    axes.set_ylim(bottom=0)
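
The tick-thinning arithmetic in `plot_histogram` can be checked without matplotlib. The following is a minimal sketch and not part of the commit: `split_tick_labels` and its `max_major` parameter are hypothetical names that only repeat the `mod` computation from the diff, and the sample labels are made up for illustration.

```python
def split_tick_labels(bins, max_major=12):
    """Hypothetical helper (not in the commit): split bin labels into major
    and minor tick labels using the same rule as plot_histogram."""
    # With fewer than max_major labels, mod is 1 and every label is major.
    mod = (len(bins) // max_major) + 1
    majors = [label for i, label in enumerate(bins) if i % mod == 0]
    minors = [label for i, label in enumerate(bins) if i % mod != 0]
    return majors, minors


# Made-up labels: 30 bins of width 1, formatted like the interval strings
# that the functions in the diff parse.
labels = [f"({i}, {i + 1}]" for i in range(30)]
majors, minors = split_tick_labels(labels)
print(len(majors), len(minors))
```

With 30 labels, `mod` is 3, so every third label gets a labeled major tick (10 in total) and the other 20 get unlabeled minor ticks. With fewer than 12 labels, `mod` is 1 and every bin is labeled.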