Merge pull request #4 from ihmeuw-msca/addfunctions
Addfunctions
mbi6245 authored Sep 16, 2024
2 parents 4e45ec1 + a444c1f commit b406e79
Showing 7 changed files with 201 additions and 263 deletions.
1 change: 1 addition & 0 deletions .github/workflows/deploy.yml
@@ -4,6 +4,7 @@ on:
push:
tags:
- "v[0-9]+.[0-9]+.[0-9]+"
workflow_dispatch:

permissions:
contents: write
@@ -28,7 +28,18 @@ in various state counties. The data may look something like the following,

and our goal is to find the percentage change in the prevalence of cancer with its appropriate SE.

The first step is to import the required function from the distrx package.
Since we have counts and distrx expects mean/standard error (SE), we must first convert the data
appropriately. Counts data is common at IHME, so a function is provided to return sample mean and
SE given incidence count and sample size. We can import it and save the necessary variables like so.

.. code-block:: python

    from distrx import process_counts

    mu_x, sigma_x = process_counts(cases_1, sample_1)
    mu_y, sigma_y = process_counts(cases_2, sample_2)

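For intuition, the conversion from counts to a mean and SE can be sketched as below. This is only an illustration under the assumption that each count is binomial, so the sample mean is the proportion ``c / n`` and its SE is ``sqrt(p * (1 - p) / n)``. The helper name ``counts_to_mean_se`` and the example numbers are hypothetical, and this is not the distrx implementation of ``process_counts``, which may differ.

```python
import math


def counts_to_mean_se(cases, sample_size):
    """Hypothetical helper (not part of distrx): binomial proportion and its SE."""
    p_hat = cases / sample_size  # sample mean of a 0/1 outcome
    se = math.sqrt(p_hat * (1.0 - p_hat) / sample_size)  # binomial standard error
    return p_hat, se


# Made-up example: 20 cases observed in a sample of 100
mu_demo, sigma_demo = counts_to_mean_se(20, 100)
```

In practice, ``process_counts`` should be preferred, since it is the function the package provides for this step.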
Then, we can import the required function from the distrx package.

.. code-block:: python
@@ -39,13 +50,12 @@ transform you would like to apply to your data. In this case, it is the followin

.. code-block:: python
mu_tx, sigma_tx = transform_bivariate(c_x=df["cases_1"],
n_x=df["sample_1"],
c_y=df["cases_2"],
n_y=df["sample_2"],
mu_tx, sigma_tx = transform_bivariate(mu_x=mu_x,
sigma_x=sigma_x,
mu_y=mu_y,
sigma_y=sigma_y,
transform="percentage_change")
``mu_tx`` and ``sigma_tx`` are simply the percentage change for each county and their corresponding
standard errors, respectively. ``sigma_tx`` has already been scaled by the appropriate sample size, so
we **should not** scale it additionally with some function of the sample size to obtain a
confidence interval.
standard errors, respectively. If a CI for the mean is desired, simply use
``mu_tx +/- Q * sigma_tx``.
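The interval ``mu_tx +/- Q * sigma_tx`` could be computed as in the following sketch. It assumes that ``Q`` is the standard normal quantile for the desired confidence level (about 1.96 for 95%), which the text above does not state explicitly. The helper ``normal_ci`` and the example values are hypothetical and not part of distrx.

```python
from statistics import NormalDist


def normal_ci(mu_tx, sigma_tx, level=0.95):
    """Hypothetical helper: normal-approximation CI around a transformed mean."""
    # Two-sided quantile, e.g. level=0.95 gives the 97.5th percentile (about 1.96)
    q = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mu_tx - q * sigma_tx, mu_tx + q * sigma_tx


# Made-up example: a 10% change with a transformed SE of 0.02
lower, upper = normal_ci(0.10, 0.02)
```

Because ``sigma_tx`` is already a standard error, no further division by a function of the sample size is applied here.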
4 changes: 2 additions & 2 deletions docs/user_guide/index.rst
@@ -5,8 +5,8 @@ User guide
:hidden:
:numbered:

simple_transformations
percentage_change
univariate
bivariate

.. note::

@@ -16,11 +16,11 @@ order Taylor expansion of the transformation function.
Example: Log Transform
----------------------

Suppose that we have some means and standard errors (SEs) of systolic blood pressure (SBP) from
Suppose that we have some means and standard deviations (SDs) of systolic blood pressure (SBP) from
several different samples. The data may look something like the following,

.. csv-table::
:header: mean, se, n
:header: mean, SD, n
:widths: 10, 10, 10
:align: center

@@ -30,9 +30,18 @@ several different samples. The data may look something like the following,
124, 15, 226
134, 7, 509

and our goal is to obtain the appropriate SEs for the data after applying the log transform.
and our goal is to obtain the appropriate standard errors (SEs) for the mean after applying the log
transform.

The first step is to import the required function from the distrx package.
Since we are interested in the transformed SEs and *not* the transformed SDs, we must provide the
SEs to distrx. **If you already have SEs and are performing the same task, you should skip this
step!**

.. code-block:: python

    # The SE of the mean is the SD divided by the square root of the sample size
    df["SE"] = df["SD"] / df["n"] ** 0.5

Now, import the appropriate function from distrx.

.. code-block:: python
@@ -44,10 +53,9 @@ transform you would like to apply to your data. In this case, it is the followin
.. code-block:: python
mu_tx, sigma_tx = transform_univariate(mu=df["means"],
sigma=df["se"],
n=df["n"],
sigma=df["SE"],
transform="log")
``mu_tx`` and ``sigma_tx`` are simply the means with the transformation function applied and their
corresponding standard errors, respectively. ``sigma_tx`` has already been scaled by :math:`\sqrt{n}`,
so we **should not** scale it by the square root of the sample size to obtain a confidence interval.
appropriately transformed standard errors, respectively. If a CI for the mean is desired, simply
use ``mu_tx +/- Q * sigma_tx``.
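To see what the log transform does to an SE, here is a minimal sketch of the first-order Taylor (delta method) approximation, under which the SE of ``log(mean)`` is roughly ``SE / mean``. The helper ``log_delta_method`` is hypothetical. The order of expansion and any edge-case handling in distrx itself may differ, so treat this as an approximation for intuition rather than the package's implementation.

```python
import math


def log_delta_method(mean, sd, n):
    """Hypothetical helper: first-order delta method for the log transform."""
    se = sd / math.sqrt(n)   # SE of the mean from SD and sample size
    mu_log = math.log(mean)  # transformed mean
    se_log = se / mean       # |d/dx log(x)| = 1/x, evaluated at the mean
    return mu_log, se_log


# One row of the example table above: mean 124, SD 15, n 226
mu_log, se_log = log_delta_method(124, 15, 226)
```

If SEs are already available, the first line of the helper should be skipped, just as the conversion step above is skipped.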
109 changes: 83 additions & 26 deletions simulations.ipynb

Large diffs are not rendered by default.
