diff --git a/README.md b/README.md
index 0a74c17..9ed136e 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,6 @@
 cf-pandas
 ==============================
+[![Binder](https://img.shields.io/static/v1.svg?logo=Jupyter&label=Binder&message=Binder&color=blue&style=for-the-badge)](https://mybinder.org/v2/gh/axiom-data-science/cf-pandas/HEAD?labpath=docs%2Fdemo_overview.ipynb)
 [![Build Status](https://img.shields.io/github/workflow/status/axiom-data-science/cf-pandas/Tests?logo=github&style=for-the-badge)](https://github.com/axiom-data-science/cf-pandas/actions)
 [![Code Coverage](https://img.shields.io/codecov/c/github/axiom-data-science/cf-pandas.svg?style=for-the-badge)](https://codecov.io/gh/axiom-data-science/cf-pandas)
 [![License:MIT](https://img.shields.io/badge/License-MIT-green.svg?style=for-the-badge)](https://opensource.org/licenses/MIT)
@@ -9,6 +10,7 @@ cf-pandas
 
 [![Python Package Index](https://img.shields.io/pypi/v/cf-pandas.svg?style=for-the-badge)](https://pypi.org/project/cf-pandas)
 
+
 an accessor for pandas objects that interprets CF attributes. Most if not all of the logic and structure is directly from [cf-xarray](https://github.com/xarray-contrib/cf-xarray).
 
 --------
diff --git a/docs/demo_overview.md b/docs/demo_overview.md
new file mode 100644
index 0000000..b2a8e31
--- /dev/null
+++ b/docs/demo_overview.md
@@ -0,0 +1,90 @@
+---
+jupytext:
+  text_representation:
+    extension: .md
+    format_name: myst
+    format_version: 0.13
+    jupytext_version: 1.14.0
+kernelspec:
+  display_name: Python 3 (ipykernel)
+  language: python
+  name: python3
+---
+
+# How to use `cf-pandas`
+
+The main use of `cf-pandas` currently is selecting a variable from a `pandas DataFrame` using the accessor and a custom vocabulary; column names are searched for a match to the vocabulary's regular expressions. Several classes and utilities support this functionality; they are used internally but are also helpful for other packages.
+
+```{code-cell} ipython3
+import cf_pandas as cfp
+import pandas as pd
+```
+
+## Select variable
+
++++
+
+### Create custom vocabulary
+
+More information about custom vocabularies and the `Vocab` class is available here: https://cf-pandas.readthedocs.io/en/latest/demo_vocab.html
+
+You can make regular expressions for your vocabulary by hand or use the `Reg` class in `cf-pandas` to do so.
+
+```{code-cell} ipython3
+# initialize class
+vocab = cfp.Vocab()
+
+# define a regular expression to represent your variable
+reg = cfp.Reg(include="salinity", exclude="soil", exclude_end="_qc")
+
+# Make an entry to add to your vocabulary
+vocab.make_entry("salt", reg.pattern(), attr="standard_name")
+
+vocab
+```
+
+### Get some data
+
+```{code-cell} ipython3
+# Some data
+url = "https://files.stage.platforms.axds.co/axiom/netcdf_harvest/basis/2013/BE2013_/data.csv.gz"
+df = pd.read_csv(url)
+df
+```
+
+### Access variable
+
+Refer to the column of data you want by the nickname described in your custom vocabulary.
+
+You can do this with a context manager, especially if you are using more than one vocabulary:
+
+```{code-cell} ipython3
+with cfp.set_options(custom_criteria=vocab.vocab):
+    print(df.cf["salt"])
+```
+
+Or you can set one for use generally in this kernel:
+
+```{code-cell} ipython3
+cfp.set_options(custom_criteria=vocab.vocab)
+df.cf["salt"]
+```
+
+## Other utilities
+
++++
+
+### Access all CF Standard Names
+
+```{code-cell} ipython3
+sn = cfp.standard_names()
+sn[:5]
+```
+
+### Use vocabulary to match any list
+
+This is the logic under the hood of the `cf-pandas` accessor that selects which column matches a variable nickname according to the custom vocabulary. It comes from `cf-xarray` almost exactly. It is available as a separate function because it is useful in other scenarios too. Here we filter the standard names just found using our custom vocabulary from above.
+
+```{code-cell} ipython3
+cfp.match_criteria_key(sn, "salt", vocab.vocab)
+```
diff --git a/docs/demo_vocab.md b/docs/demo_vocab.md
index dcd9c1b..e370eda 100644
--- a/docs/demo_vocab.md
+++ b/docs/demo_vocab.md
@@ -138,3 +138,20 @@ vocab2.make_entry("other_variable_nickname", "match_that_string", attr="standard
 
 vocab1 + vocab2
 ```
+
+## Use the `Reg` class to write regular expressions
+
+We used simple exact-matching regular expressions above, but anything more complicated can be hard to write by hand. You can use the `Reg` class in `cf-pandas` to write regular expressions with several options, as demonstrated in more detail in [another doc page](https://cf-pandas.readthedocs.io/en/latest/demo_reg.html), and briefly here.
+
+```{code-cell} ipython3
+# initialize class
+vocab = cfp.Vocab()
+
+# define a regular expression to represent your variable
+reg = cfp.Reg(include="temperature", exclude="air", exclude_end="_qc", include_start="sea")
+
+# Make an entry to add to your vocabulary
+vocab.make_entry("temp", reg.pattern(), attr="standard_name")
+
+vocab
+```
diff --git a/docs/index.rst b/docs/index.rst
index 5775de6..51a6457 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -6,9 +6,17 @@
 Welcome to cf-pandas's documentation!
 =====================================
 
+Installation
+------------
+
+To install from PyPI:
+  >>> pip install cf-pandas
+
 
 .. toctree::
    :maxdepth: 2
+   :hidden:
+   :caption: Documentation
 
    demo_reg.md
    demo_vocab.md
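
Note on the matching step documented in `docs/demo_overview.md` above: the idea that a nickname selects whichever names match its regular expression can be sketched with the standard library alone. This is a minimal illustration, not the `cf-pandas` implementation. The helper `match_names` is hypothetical, the nested-dict shape of `criteria` is an assumption about what a vocabulary looks like, and the pattern is hand-written with lookarounds to mimic the `include="salinity"`, `exclude="soil"`, `exclude_end="_qc"` options; it is not claimed to be what `Reg.pattern()` returns.

```python
import re

# Hand-written pattern (an assumption, not the output of cfp.Reg):
#   (?!.*soil)     -> name must not contain "soil"
#   (?=.*salinity) -> name must contain "salinity"
#   (?<!_qc)$      -> name must not end with "_qc"
SALT_PATTERN = r"^(?!.*soil)(?=.*salinity).*(?<!_qc)$"

# Assumed vocabulary shape: nickname -> {attribute: regular expression}
criteria = {"salt": {"standard_name": SALT_PATTERN}}


def match_names(names, key, criteria):
    """Hypothetical helper: return the names matching any pattern stored under `key`."""
    patterns = list(criteria.get(key, {}).values())
    return [name for name in names if any(re.search(p, name) for p in patterns)]


names = [
    "sea_water_practical_salinity",
    "soil_salinity",
    "sea_water_salinity_qc",
    "sea_water_temperature",
]
matches = match_names(names, "salt", criteria)
print(matches)
```

Keeping the patterns in a plain dictionary means the same vocabulary can be applied to column names, standard names, or any other list of strings, which is the reuse the docs describe for `match_criteria_key`.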