tweak exercise
nickeubank committed Mar 18, 2024
1 parent e22eb3a commit 32db274
Showing 9 changed files with 23 additions and 7 deletions.
Binary file modified docs/html/.doctrees/environment.pickle
Binary file not shown.
Binary file not shown.
@@ -69,7 +69,11 @@
"\n",
"Since we're comparing means in a continuous variable (expenditures) from two samples of households, we will use `TTestIndPower` in `statsmodels.stats.power`. Import this class and instantiate a new instance (for some reason this is class based, so you have to start off with a command like `my_power = TTestIndPower()`). \n",
"\n",
"Note that a common situation in data science is testing a difference in *proportions*, as in situations where your dependent variable is binary and each group's mean is the share for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" For that reason, there's actually a full sub-class of power calculating tools for [proportions you can read about here.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple."
"Note that a common situation in data science is testing a difference in *proportions* between groups (e.g., across treatment arms). This situation arises when your dependent variable is binary, and so each group's mean is just the share of observations for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" \n",
"\n",
"For that reason, there's actually a full sub-class of power calculating tools for [proportions you should be aware of.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple. For example, you may wish to identify the sample size required to get confidence intervals of a given size using a tool like [confint_proportions_2indep](https://www.statsmodels.org/stable/generated/statsmodels.stats.proportion.confint_proportions_2indep.html#statsmodels.stats.proportion.confint_proportions_2indep).\n",
"\n",
"But the most common use of a power test remains evaluating whether one can reject a null hypothesis of no effect, so we'll start with that here."
]
},
{
@@ -69,7 +69,11 @@
"\n",
"Since we're comparing means in a continuous variable (expenditures) from two samples of households, we will use `TTestIndPower` in `statsmodels.stats.power`. Import this class and instantiate a new instance (for some reason this is class based, so you have to start off with a command like `my_power = TTestIndPower()`). \n",
"\n",
"Note that a common situation in data science is testing a difference in *proportions*, as in situations where your dependent variable is binary and each group's mean is the share for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" For that reason, there's actually a full sub-class of power calculating tools for [proportions you can read about here.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple."
"Note that a common situation in data science is testing a difference in *proportions* between groups (e.g., across treatment arms). This situation arises when your dependent variable is binary, and so each group's mean is just the share of observations for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" \n",
"\n",
"For that reason, there's actually a full sub-class of power calculating tools for [proportions you should be aware of.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple. For example, you may wish to identify the sample size required to get confidence intervals of a given size using a tool like [confint_proportions_2indep](https://www.statsmodels.org/stable/generated/statsmodels.stats.proportion.confint_proportions_2indep.html#statsmodels.stats.proportion.confint_proportions_2indep).\n",
"\n",
"But the most common use of a power test remains evaluating whether one can reject a null hypothesis of no effect, so we'll start with that here."
]
},
{
2 changes: 1 addition & 1 deletion docs/html/exercises/exercise_power_calculations.html

Large diffs are not rendered by default.

6 changes: 5 additions & 1 deletion docs/html/exercises/exercise_power_calculations.ipynb
@@ -69,7 +69,11 @@
"\n",
"Since we're comparing means in a continuous variable (expenditures) from two samples of households, we will use `TTestIndPower` in `statsmodels.stats.power`. Import this class and instantiate a new instance (for some reason this is class based, so you have to start off with a command like `my_power = TTestIndPower()`). \n",
"\n",
"Note that a common situation in data science is testing a difference in *proportions*, as in situations where your dependent variable is binary and each group's mean is the share for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" For that reason, there's actually a full sub-class of power calculating tools for [proportions you can read about here.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple."
"Note that a common situation in data science is testing a difference in *proportions* between groups (e.g., across treatment arms). This situation arises when your dependent variable is binary, and so each group's mean is just the share of observations for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" \n",
"\n",
"For that reason, there's actually a full sub-class of power calculating tools for [proportions you should be aware of.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple. For example, you may wish to identify the sample size required to get confidence intervals of a given size using a tool like [confint_proportions_2indep](https://www.statsmodels.org/stable/generated/statsmodels.stats.proportion.confint_proportions_2indep.html#statsmodels.stats.proportion.confint_proportions_2indep).\n",
"\n",
"But the most common use of a power test remains evaluating whether one can reject a null hypothesis of no effect, so we'll start with that here."
]
},
{
2 changes: 1 addition & 1 deletion docs/html/searchindex.js

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/html/sitemap.xml
@@ -1 +1 @@
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"><url><loc>http://unifyingdatascience.orgclass_schedule.html</loc></url><url><loc>http://unifyingdatascience.orgindex.html</loc></url><url><loc>http://unifyingdatascience.orggenindex.html</loc></url><url><loc>http://unifyingdatascience.orgsearch.html</loc></url></urlset>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"><url><loc>http://unifyingdatascience.orgexercises/exercise_power_calculations.html</loc></url><url><loc>http://unifyingdatascience.orgindex.html</loc></url><url><loc>http://unifyingdatascience.orggenindex.html</loc></url><url><loc>http://unifyingdatascience.orgsearch.html</loc></url></urlset>
6 changes: 5 additions & 1 deletion source/exercises/exercise_power_calculations.ipynb
@@ -69,7 +69,11 @@
"\n",
"Since we're comparing means in a continuous variable (expenditures) from two samples of households, we will use `TTestIndPower` in `statsmodels.stats.power`. Import this class and instantiate a new instance (for some reason this is class based, so you have to start off with a command like `my_power = TTestIndPower()`). \n",
"\n",
"Note that a common situation in data science is testing a difference in *proportions*, as in situations where your dependent variable is binary and each group's mean is the share for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" For that reason, there's actually a full sub-class of power calculating tools for [proportions you can read about here.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple."
"Note that a common situation in data science is testing a difference in *proportions* between groups (e.g., across treatment arms). This situation arises when your dependent variable is binary, and so each group's mean is just the share of observations for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" \n",
"\n",
"For that reason, there's actually a full sub-class of power calculating tools for [proportions you should be aware of.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple. For example, you may wish to identify the sample size required to get confidence intervals of a given size using a tool like [confint_proportions_2indep](https://www.statsmodels.org/stable/generated/statsmodels.stats.proportion.confint_proportions_2indep.html#statsmodels.stats.proportion.confint_proportions_2indep).\n",
"\n",
"But the most common use of a power test remains evaluating whether one can reject a null hypothesis of no effect, so we'll start with that here."
]
},
{
Expand Down
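
Note on the exercise text changed in this commit: a minimal sketch of the two power calculations the notebook cell describes. It is not taken from the exercise itself. The effect size of 0.2, the baseline and treated shares (0.10 and 0.12), the 5% significance level, and the 80% power target are all assumed values chosen for illustration. The continuous-outcome case uses `TTestIndPower`, as the text suggests; the binary-outcome case uses `proportion_effectsize` with `NormalIndPower`, which is one reasonable route among the proportion tools and not necessarily the one the author has in mind.

```python
import numpy as np
from statsmodels.stats.power import NormalIndPower, TTestIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Assumed design parameters (illustrative, not from the exercise).
ALPHA = 0.05         # two-sided significance level
TARGET_POWER = 0.80  # probability of rejecting a false null

# Continuous outcome: two independent samples, effect size in
# standard-deviation units (Cohen's d). Leaving nobs1=None asks
# solve_power for the required size of the first group.
my_power = TTestIndPower()
n_ttest = float(np.squeeze(my_power.solve_power(
    effect_size=0.2,   # assumed standardized difference in means
    nobs1=None,
    alpha=ALPHA,
    power=TARGET_POWER,
    ratio=1.0,         # equal group sizes
    alternative="two-sided",
)))

# Binary outcome: convert two assumed shares into Cohen's h
# (arcsine-transformed difference), then solve with the normal test.
h = proportion_effectsize(0.12, 0.10)
n_prop = float(np.squeeze(NormalIndPower().solve_power(
    effect_size=h,
    nobs1=None,
    alpha=ALPHA,
    power=TARGET_POWER,
    ratio=1.0,
    alternative="two-sided",
)))

print(f"t-test, d = 0.2: about {np.ceil(n_ttest):.0f} households per group")
print(f"proportions, 10% vs 12%: about {np.ceil(n_prop):.0f} users per group")
```

Both calls return the number of observations needed in the first group; with `ratio=1.0` the second group is the same size. Because the standard deviation of a binary variable is pinned down by its mean, the proportions case needs only the two shares and no separate variance estimate.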
