tweak exercise

nickeubank · Mar 18, 2024 · 32db274
1 parent e22eb3a
commit 32db274
Showing 9 changed files with 23 additions and 7 deletions.
diff --git a/docs/html/.doctrees/environment.pickle b/docs/html/.doctrees/environment.pickle
diff --git a/docs/html/.doctrees/exercises/exercise_power_calculations.doctree b/docs/html/.doctrees/exercises/exercise_power_calculations.doctree
diff --git a/docs/html/.doctrees/nbsphinx/exercises/exercise_power_calculations.ipynb b/docs/html/.doctrees/nbsphinx/exercises/exercise_power_calculations.ipynb
@@ -69,7 +69,11 @@
     "\n",
     "Since we're comparing means in a continuous variable (expenditures) from two samples of households, we will use `TTestIndPower` in `statsmodels.stats.power`. Import this class and instantiate a new instance (for some reason this is class based, so you have to start of with a command like `my_power = TTestIndPower()`). \n",
     "\n",
-    "Note that a common situation in data science is testing a difference in *proportions*, as in situations where your dependent variable is binary and each group's mean is the share for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" For that reason, there's actually a full sub-class of power calculating tools for [proportions you can read about here.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple."
+    "Note that a common situation in data science is testing a difference in *proportions* between groups (e.g., across treatment arms). This situation arises when your dependent variable is binary, and so each group's mean is just the share of observations for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" \n",
+    "\n",
+    "For that reason, there's actually a full sub-class of power calculating tools for [proportions you should be aware of.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple. For example, you may wish to identify the sample size required to get confidence intervals of a given size using a tool like [confint_proportions_2indep](https://www.statsmodels.org/stable/generated/statsmodels.stats.proportion.confint_proportions_2indep.html#statsmodels.stats.proportion.confint_proportions_2indep).\n",
+    "\n",
+    "But the most common use of a power test remains evaluating whether one can reject a null hypothesis of no effect, so we'll start with that here."
    ]
   },
   {

diff --git a/docs/html/_sources/exercises/exercise_power_calculations.ipynb.txt b/docs/html/_sources/exercises/exercise_power_calculations.ipynb.txt
@@ -69,7 +69,11 @@
     "\n",
     "Since we're comparing means in a continuous variable (expenditures) from two samples of households, we will use `TTestIndPower` in `statsmodels.stats.power`. Import this class and instantiate a new instance (for some reason this is class based, so you have to start of with a command like `my_power = TTestIndPower()`). \n",
     "\n",
-    "Note that a common situation in data science is testing a difference in *proportions*, as in situations where your dependent variable is binary and each group's mean is the share for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" For that reason, there's actually a full sub-class of power calculating tools for [proportions you can read about here.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple."
+    "Note that a common situation in data science is testing a difference in *proportions* between groups (e.g., across treatment arms). This situation arises when your dependent variable is binary, and so each group's mean is just the share of observations for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" \n",
+    "\n",
+    "For that reason, there's actually a full sub-class of power calculating tools for [proportions you should be aware of.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple. For example, you may wish to identify the sample size required to get confidence intervals of a given size using a tool like [confint_proportions_2indep](https://www.statsmodels.org/stable/generated/statsmodels.stats.proportion.confint_proportions_2indep.html#statsmodels.stats.proportion.confint_proportions_2indep).\n",
+    "\n",
+    "But the most common use of a power test remains evaluating whether one can reject a null hypothesis of no effect, so we'll start with that here."
    ]
   },
   {

diff --git a/docs/html/exercises/exercise_power_calculations.html b/docs/html/exercises/exercise_power_calculations.html
diff --git a/docs/html/exercises/exercise_power_calculations.ipynb b/docs/html/exercises/exercise_power_calculations.ipynb
@@ -69,7 +69,11 @@
     "\n",
     "Since we're comparing means in a continuous variable (expenditures) from two samples of households, we will use `TTestIndPower` in `statsmodels.stats.power`. Import this class and instantiate a new instance (for some reason this is class based, so you have to start of with a command like `my_power = TTestIndPower()`). \n",
     "\n",
-    "Note that a common situation in data science is testing a difference in *proportions*, as in situations where your dependent variable is binary and each group's mean is the share for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" For that reason, there's actually a full sub-class of power calculating tools for [proportions you can read about here.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple."
+    "Note that a common situation in data science is testing a difference in *proportions* between groups (e.g., across treatment arms). This situation arises when your dependent variable is binary, and so each group's mean is just the share of observations for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" \n",
+    "\n",
+    "For that reason, there's actually a full sub-class of power calculating tools for [proportions you should be aware of.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple. For example, you may wish to identify the sample size required to get confidence intervals of a given size using a tool like [confint_proportions_2indep](https://www.statsmodels.org/stable/generated/statsmodels.stats.proportion.confint_proportions_2indep.html#statsmodels.stats.proportion.confint_proportions_2indep).\n",
+    "\n",
+    "But the most common use of a power test remains evaluating whether one can reject a null hypothesis of no effect, so we'll start with that here."
    ]
   },
   {

diff --git a/docs/html/searchindex.js b/docs/html/searchindex.js
diff --git a/docs/html/sitemap.xml b/docs/html/sitemap.xml
@@ -1 +1 @@
-<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"><url><loc>http://unifyingdatascience.orgclass_schedule.html</loc></url><url><loc>http://unifyingdatascience.orgindex.html</loc></url><url><loc>http://unifyingdatascience.orggenindex.html</loc></url><url><loc>http://unifyingdatascience.orgsearch.html</loc></url></urlset>
+<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"><url><loc>http://unifyingdatascience.orgexercises/exercise_power_calculations.html</loc></url><url><loc>http://unifyingdatascience.orgindex.html</loc></url><url><loc>http://unifyingdatascience.orggenindex.html</loc></url><url><loc>http://unifyingdatascience.orgsearch.html</loc></url></urlset>
diff --git a/source/exercises/exercise_power_calculations.ipynb b/source/exercises/exercise_power_calculations.ipynb
@@ -69,7 +69,11 @@
     "\n",
     "Since we're comparing means in a continuous variable (expenditures) from two samples of households, we will use `TTestIndPower` in `statsmodels.stats.power`. Import this class and instantiate a new instance (for some reason this is class based, so you have to start of with a command like `my_power = TTestIndPower()`). \n",
     "\n",
-    "Note that a common situation in data science is testing a difference in *proportions*, as in situations where your dependent variable is binary and each group's mean is the share for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" For that reason, there's actually a full sub-class of power calculating tools for [proportions you can read about here.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple."
+    "Note that a common situation in data science is testing a difference in *proportions* between groups (e.g., across treatment arms). This situation arises when your dependent variable is binary, and so each group's mean is just the share of observations for whom the binary variable is 1. This comes up a lot with apps and websites — e.g., \"clicked an ad,\" \"subscribed,\" \"made a purchase.\" \n",
+    "\n",
+    "For that reason, there's actually a full sub-class of power calculating tools for [proportions you should be aware of.](https://www.statsmodels.org/stable/stats.html#proportion) Basically, because the standard deviation of a binary variable is just $\\sqrt{p * (1-p)}$, power calculations become really simple. For example, you may wish to identify the sample size required to get confidence intervals of a given size using a tool like [confint_proportions_2indep](https://www.statsmodels.org/stable/generated/statsmodels.stats.proportion.confint_proportions_2indep.html#statsmodels.stats.proportion.confint_proportions_2indep).\n",
+    "\n",
+    "But the most common use of a power test remains evaluating whether one can reject a null hypothesis of no effect, so we'll start with that here."
    ]
   },
   {
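
Note on the added text: the claim that power calculations for proportions "become really simple" because the standard deviation of a binary variable is $\sqrt{p(1-p)}$ can be illustrated with a minimal sketch. It uses only the Python standard library and the textbook normal-approximation formula with equal group sizes. The function names `binary_sd` and `n_per_group` and the example rates (10% vs. 12%) are assumptions made for illustration; this is not the method `statsmodels` uses internally, and its results may differ somewhat.

```python
from math import ceil, sqrt
from statistics import NormalDist


def binary_sd(p: float) -> float:
    """Standard deviation of a 0/1 variable whose mean (share of 1s) is p."""
    return sqrt(p * (1 - p))


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate observations needed per arm for a two-sided test of p1 vs. p2.

    Hypothetical helper: normal approximation, equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    # Each arm's variance comes straight from its proportion.
    variance_sum = binary_sd(p1) ** 2 + binary_sd(p2) ** 2
    return ceil((z_alpha + z_power) ** 2 * variance_sum / (p1 - p2) ** 2)


# Illustrative values only: baseline conversion of 10%, hoped-for 12%.
print(n_per_group(0.10, 0.12))
```

Because the variance is pinned down by the proportions themselves, the only inputs are the two rates, the significance level, and the target power; no separate estimate of the outcome's standard deviation is needed, unlike the continuous-outcome case handled by `TTestIndPower`.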