diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md index 2764ce0c5..c29622e00 100644 --- a/.github/ISSUE_TEMPLATE/bug_report.md +++ b/.github/ISSUE_TEMPLATE/bug_report.md @@ -1,29 +1,29 @@ --- name: Bug report about: Create a report to help us improve +labels: bug +title: "BUG: " --- -**Describe the bug** -A clear and concise description of what the bug is. +# Description +A clear and concise description of what the bug is, including a description +of what you expected the outcome to be. -**To Reproduce** +# To Reproduce this bug: Steps to reproduce the behavior: 1. Go to '...' 2. Click on '....' 3. Scroll down to '....' 4. See error -**Expected behavior** -A clear and concise description of what you expected to happen. +Consider including images or test files to help others reproduce the bug and +solve the problem. -**Screenshots** -If applicable, add screenshots to help explain your problem. - -**Desktop (please complete the following information):** - - OS: [e.g. iOS] - - Version [e.g. 22] +## Test configuration + - OS: [e.g., Hal] + - Version: [e.g., Python 3.47] - Other details about your setup that could be relevant -**Additional context** -Add any other context about the problem here. +# Additional context +Add any other context about the problem here, including expected behaviour. diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md index 066b2d920..d02da2ef4 100644 --- a/.github/ISSUE_TEMPLATE/feature_request.md +++ b/.github/ISSUE_TEMPLATE/feature_request.md @@ -1,17 +1,27 @@ --- name: Feature request about: Suggest an idea for this project +title: "ENH: " +labels: enhancement --- -**Is your feature request related to a problem? Please describe.** -A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] +# Description +A clear and concise description of the new feature or behaviour you would like. -**Describe the solution you'd like** +## Potential impact + +- Is the feature related to an existing problem? +- How critical is this feature to your workflow? +- How wide of an impact do you anticipate this enhancement having? +- Would this break any existing functionality? + +## Potential solution(s) A clear and concise description of what you want to happen. -**Describe alternatives you've considered** -A clear and concise description of any alternative solutions or features you've considered. +# Alternatives +A clear description of any alternative solutions or features you've considered. -**Additional context** -Add any other context or screenshots about the feature request here. +# Additional context +Add any other context or screenshots about the feature request here, potentially +including your operational configuration. diff --git a/.github/ISSUE_TEMPLATE/question.md b/.github/ISSUE_TEMPLATE/question.md new file mode 100644 index 000000000..463725bae --- /dev/null +++ b/.github/ISSUE_TEMPLATE/question.md @@ -0,0 +1,19 @@ +--- +name: Question +about: A question about this project +title: "QUEST: " +labels: question + +--- + +# Description +A clear and concise summary of your query. + +## Example code (optional) +If relevant, include sample code, images, or files so that others can understand +the full context of your question.
+ +## Configuration + - OS: [e.g., Hal] + - Version: [e.g., Python 3.47] + - Other details about your setup that could be relevant diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md index 692a10c51..75ee10d47 100644 --- a/.github/pull_request_template.md +++ b/.github/pull_request_template.md @@ -1,6 +1,6 @@ # Description -Addresses # (issue) +Addresses #(issue) Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required @@ -23,7 +23,10 @@ instructions so we can reproduce. Please also list any relevant details for your test configuration - Test A -- Test B + +``` +Test B +``` **Test Configuration**: * Operating system: Hal @@ -35,6 +38,7 @@ your test configuration - [ ] Make sure you are merging into the ``develop`` (not ``main``) branch - [ ] My code follows the style guidelines of this project - [ ] I have performed a self-review of my own code +- [ ] I have linted the files updated in this pull request - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings @@ -42,6 +46,8 @@ your test configuration - [ ] New and existing unit tests pass locally with my changes - [ ] Any dependent changes have been merged and published in downstream modules - [ ] Add a note to ``CHANGELOG.md``, summarizing the changes +- [ ] Update zenodo.json file for new code contributors -If this is a release PR, replace the first item of the above checklist with the release -checklist on the wiki: https://github.com/pysat/pysat/wiki/Checklist-for-Release +If this is a release PR, replace the first item of the above checklist with the +release checklist on the wiki: +https://github.com/pysat/pysat/wiki/Checklist-for-Release diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 39a113b76..266a1479c 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -1,5 +1,6 @@ -# This workflow will install Python dependencies, run tests and lint with a variety of Python versions -# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions +# This workflow will install Python dependencies, run tests and lint with a +# variety of Python versions. 
For more information see: +# https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions name: Documentation Check @@ -8,25 +9,22 @@ on: [push, pull_request] jobs: build: - runs-on: ubuntu-latest + runs-on: ["ubuntu-latest"] strategy: fail-fast: false matrix: - python-version: [3.9] + python-version: ["3.10"] name: Documentation tests steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - name: Set up Python ${{ matrix.python-version }} - uses: actions/setup-python@v4 + uses: actions/setup-python@v5 with: python-version: ${{ matrix.python-version }} - name: Install dependencies - run: | - python -m pip install --upgrade pip - pip install -r test_requirements.txt - pip install -r requirements.txt + run: pip install .[doc] - name: Set up pysat run: | diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 7f577605e..99e9bc565 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -1,5 +1,6 @@ -# This workflow will install Python dependencies, run tests and lint with a variety of Python versions -# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions +# This workflow will install Python dependencies, run tests and lint with a +# variety of Python versions. For more information see: +# https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions name: Pytest with Flake8 @@ -10,39 +11,48 @@ jobs: strategy: fail-fast: false matrix: - os: [ubuntu-latest, windows-latest] - python-version: ["3.9", "3.10"] + os: ["ubuntu-latest", "windows-latest", "macos-latest"] + python-version: ["3.10", "3.11", "3.12"] numpy_ver: ["latest"] + test_config: ["latest"] include: - - python-version: "3.8" - numpy_ver: "1.20" + # NEP29 compliance settings + - python-version: "3.9" + numpy_ver: "1.23" os: ubuntu-latest + test_config: "NEP29" + # Operational compliance settings - python-version: "3.6.8" numpy_ver: "1.19.5" os: "ubuntu-20.04" + test_config: "Ops" name: Python ${{ matrix.python-version }} on ${{ matrix.os }} with numpy ${{ matrix.numpy_ver }} runs-on: ${{ matrix.os }} steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - name: Set up Python ${{ matrix.python-version }} - uses: actions/setup-python@v4 + uses: actions/setup-python@v5 with: python-version: ${{ matrix.python-version }} - - name: Install NEP29/Operational dependencies - if: ${{ matrix.numpy_ver != 'latest'}} + - name: Install Operational dependencies + if: ${{ matrix.test_config == 'Ops'}} run: | - pip install --no-binary :numpy: numpy==${{ matrix.numpy_ver }} + pip install numpy==${{ matrix.numpy_ver }} + pip install -r requirements.txt + pip install -r test_requirements.txt + pip install . - - name: Install standard dependencies and pysat + - name: Install NEP29 dependencies + if: ${{ matrix.test_config == 'NEP29'}} run: | - python -m pip install --upgrade pip - pip install -r requirements.txt - python setup.py install + pip install numpy==${{ matrix.numpy_ver }} + pip install --upgrade-strategy only-if-needed .[test] - - name: Install requirements for testing setup - run: pip install -r test_requirements.txt + - name: Install standard dependencies + if: ${{ matrix.test_config == 'latest'}} + run: pip install .[test] - name: Set up pysat run: | @@ -56,9 +66,22 @@ jobs: run: flake8 . 
--count --exit-zero --max-complexity=10 --statistics - name: Test with pytest - run: pytest --cov=pysat/ + run: pytest - name: Publish results to coveralls env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} - run: coveralls --rcfile=setup.cfg --service=github + COVERALLS_PARALLEL: true + run: coveralls --rcfile=pyproject.toml --service=github + + finish: + name: Finish Coverage Analysis + needs: build + runs-on: ubuntu-latest + steps: + - name: Coveralls Finished + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + run: | + pip install --upgrade coveralls + coveralls --service=github --finish diff --git a/.github/workflows/pip_rc_install.yml b/.github/workflows/pip_rc_install.yml new file mode 100644 index 000000000..cfb33fd61 --- /dev/null +++ b/.github/workflows/pip_rc_install.yml @@ -0,0 +1,41 @@ +# This workflow will install Python dependencies and the latest RC of pysat from +# test pypi. This test should be manually run before a pysat RC is officially +# approved and versioned. For more information see: +# https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions + +name: Test install of latest RC from pip + +on: [workflow_dispatch] + +jobs: + build: + strategy: + fail-fast: false + matrix: + os: ["ubuntu-latest", "macos-latest", "windows-latest"] + python-version: ["3.11"] # Keep this version at the highest supported Python version + + name: Python ${{ matrix.python-version }} on ${{ matrix.os }} + runs-on: ${{ matrix.os }} + steps: + - uses: actions/checkout@v4 + - name: Set up Python ${{ matrix.python-version }} + uses: actions/setup-python@v5 + with: + python-version: ${{ matrix.python-version }} + + - name: Install standard dependencies + run: pip install -r requirements.txt + + - name: Install pysat RC + run: pip install --no-deps --pre -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ pysat + + - name: Set up pysat + run: | + mkdir pysatData + python -c "import pysat; pysat.params['data_dirs'] = 'pysatData'" + + - name: Check that install imports correctly + run: | + cd .. + python -c "import pysat; print(pysat.__version__)" diff --git a/.github/workflows/stats.yml b/.github/workflows/stats.yml index 6cde36c2f..9a15d0ae4 100644 --- a/.github/workflows/stats.yml +++ b/.github/workflows/stats.yml @@ -1,5 +1,6 @@ -# This workflow will install Python dependencies, run tests and lint with a variety of Python versions -# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions +# This workflow will install Python dependencies, run tests and lint with a +# variety of Python versions. 
For more information see: +# https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions name: Statistics of supported instruments @@ -9,26 +10,23 @@ on: jobs: build: - runs-on: ubuntu-latest + runs-on: ["ubuntu-latest"] strategy: fail-fast: false matrix: - python-version: ["3.9"] + python-version: ["3.11"] name: Summary of instrument libraries steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - name: Set up Python ${{ matrix.python-version }} - uses: actions/setup-python@v4 + uses: actions/setup-python@v5 with: python-version: ${{ matrix.python-version }} - name: Install dependencies run: | - pip install -r requirements.txt - # Need to install pysat before ecosystem - python setup.py install - pip install -r test_requirements.txt + pip install .[test] pip install `awk '{print $1}' ecosystem.txt` - name: Set up pysat diff --git a/.readthedocs.yml b/.readthedocs.yml index defa293da..125f0b3de 100644 --- a/.readthedocs.yml +++ b/.readthedocs.yml @@ -5,22 +5,20 @@ # Required version: 2 +build: + os: ubuntu-22.04 + tools: + python: "3.10" + # Build documentation in the docs/ directory with Sphinx sphinx: - configuration: docs/conf.py - -# Build documentation with MkDocs -#mkdocs: -# configuration: mkdocs.yml + configuration: docs/conf.py -# Optionally build your docs in additional formats such as PDF -formats: - - pdf - -# Optionally set the version of Python and requirements -# required to build your docs +# Declare the Python requirements required to build your docs +# This method includes a local build of the package python: - version: 3.7 - install: - - requirements: docs/requirements.txt - + install: + - method: pip + path: . + extra_requirements: + - doc diff --git a/.zenodo.json b/.zenodo.json index bdc218349..732c82808 100644 --- a/.zenodo.json +++ b/.zenodo.json @@ -17,7 +17,7 @@ ], "creators": [ { - "affiliation": "Stoneris LLC", + "affiliation": "Cosmic Studio", "name": "Stoneback, Russell", "orcid": "0000-0001-7216-4336" }, @@ -66,6 +66,11 @@ "affiliation": "@olist", "name": "Leite, Silvio", "orcid": "0000-0003-1707-7963" + }, + { + "affiliation": "NASA NPP", + "name": "Esman, Teresa", + "orcid": "0000-0003-0382-6281" } ] } diff --git a/ACKNOWLEDGEMENTS.md b/ACKNOWLEDGEMENTS.md new file mode 100644 index 000000000..729414ddd --- /dev/null +++ b/ACKNOWLEDGEMENTS.md @@ -0,0 +1,32 @@ +Funding +======= +The following institutions, missions, and programs have provided funding +for pysat development. + +Institutions +------------ + - The Catholic University of America (CUA) + - Cosmic Studio + - Defense Advanced Research Projects Agency (DARPA) Defense Sciences Office + - National Aeronautics and Space Administration (NASA) + - National Oceanic and Atmospheric Administration (NOAA) + - National Science Foundation (NSF) + - Office of Naval Research (ONR) + +Missions +-------- + - NOAA Constellation Observing System for Meteorology Ionosphere and Climate (COSMIC-2) + - NASA Ionospheric Connections Explorer (ICON) + - NASA Scintillation Observations and Response of the Ionosphere to Electrodynamics (SORTIE) + - NASA Scintillation Prediction Observations Research Task (SPORT) + +Programs +-------- + - NSF 125908, AGS-1651393 + - NASA NNX10AT02G, NNH20ZDA001N-LWS, 80NSSC18K120, and 80NSSC21M0180 + - NASA Space Precipitation Impacts (SPI) project at Goddard Space Flight Center through the Heliophysics Internal Science Funding Model. 
- Naval Research Laboratory N00173191G016 and N0017322P0744 + +Disclaimers +=========== +Any opinions or actions taken by the listed funding institutions are those of the institutions and do not necessarily reflect the views of the pysat development team or individual authors. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies. diff --git a/CHANGELOG.md b/CHANGELOG.md index 0d34d155c..7dc0ef87d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,6 +3,73 @@ Change Log All notable changes to this project will be documented in this file. This project adheres to [Semantic Versioning](https://semver.org/). +[3.2.0] - 2024-03-27 +-------------------- +* New Features + * Added tests for warnings, logging messages, and errors in the Instrument + clean method. + * Added Instrument loading test with padding and for times without data. + * Allow Instruments to define custom `concat_data` methods. + * Added `include` kwarg to `Instrument.concat_data` to expand allowed inputs. + * Added data kwarg to the Instrument class `__getitem__` method and reduced + memory usage in the `load` method. + * Added a hidden method to the Instrument class, `_get_epoch_name_from_data`, + to reduce code duplication. + * Added `__delitem__` to Meta and `drop` to MetaHeader and MetaLabels classes. + * Modified Meta to allow MetaHeader attribute access directly from Meta. + * Expanded `Meta.drop` to traverse attached MetaLabel and MetaHeader data. + * Added `__delitem__` and `drop` to Instrument and Constellation classes. + * Added options to customize the `pysat_ndtesting` instrument with a sample + rate and a shift in time. + * Added orbit number to `pysat_ndtesting`. + * Added the overwrite kwarg to `utils.registry.register_by_module`. + * Added unit tests for all functions in `utils.files`. + * Reduced code duplication in the `utils.files.parse_fixed_width_filenames` + and `utils.files.parse_delimited_filenames` functions. + * Added ability to set Meta data using `meta['data_var', 'label'] = value` + structure. + * Added test for loading multiple days of data. + * Added user-friendly warning when trying to load data when there are + no files at all to load. Situation currently raises an IndexError. + * Expanded `eval_warnings` to support testing against multiple warning types. +* Bug Fix + * Fixed `utils.files.parse_fixed_width_filenames` output for an empty file + list. + * Updated the parsing functions in `utils.files` to consider type specifiers + when identifying appropriate files in a directory. +* Maintenance + * Update link redirects in docs. + * Improved Instrument ValueError messages. + * Updated `Constellation.to_inst` method definition of coords, using dims + to combine common dimensions instead. + * Implement pyproject.toml to manage metadata. + * Updated docstring references to `pysat.utils.files` in other modules.
* Remove Sphinx cap. + * Update usage of whitespace and if statements (E275). + * Remove hacking cap. + * Removed deprecated `pysat_testing2d` instrument. + * Removed deprecated meta children info. + * Removed deprecated `pysat_testing_xarray` instrument. + * Removed deprecated `pysat_testing2d_xarray` instrument. + * Removed deprecated `instrument_test_class`. + * Removed deprecated `malformed_index` kwarg in test instruments. + * Removed deprecated `convert_timestamp_to_datetime` function. + * Removed deprecated `_test_download_travis` flag. + * Removed deprecated `freq` kwarg from `download`. + * Removed deprecated `use_header` kwarg from `load` and changed default + behaviour to `use_header=True`. + * Use temporary directories for files created during test_utils.py. + * Updated code file headers to be consistent and include NRL pub release. + * Added ACKNOWLEDGEMENTS.md, which includes the full institutional funding + list. + * Removed deprecated `labels` kwarg for `pysat.Instrument()`. + * Removed deprecated `utils.load_netcdf4` method. + * Removed deprecated `_filter_netcdf4_metadata` method. + * Removed deprecated usage of None for tag and inst_id. + * Removed deprecated kwarg behaviour for 'fname' in `to_netCDF4`. + * Added version cap for sphinx_rtd_theme. + * Used line-specific noqa statements for imports. + * Add `_new_tests` flag for packages to ignore select new tests. + * Add CI testing for Python 3.12. + [3.1.0] - 2023-05-31 -------------------- * New Features diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index bb9195cc8..b6d84d819 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -2,11 +2,17 @@ ## Our Pledge -In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation. +In the interest of fostering an open and welcoming environment, we as +contributors and maintainers pledge to making participation in our project and +our community a harassment-free experience for everyone, regardless of age, +body size, disability, ethnicity, gender identity and expression, level of +experience, nationality, personal appearance, race, religion, or sexual +identity and orientation.
## Our Standards -Examples of behavior that contributes to creating a positive environment include: +Examples of behavior that contributes to creating a positive environment +include: * Using welcoming and inclusive language * Being respectful of differing viewpoints and experiences @@ -16,31 +22,61 @@ Examples of behavior that contributes to creating a positive environment include Examples of unacceptable behavior by participants include: -* The use of sexualized language or imagery and unwelcome sexual attention or advances +* The use of sexualized language or imagery and unwelcome sexual attention or + advances * Trolling, insulting/derogatory comments, and personal or political attacks * Public or private harassment -* Publishing others' private information, such as a physical or electronic address, without explicit permission -* Other conduct which could reasonably be considered inappropriate in a professional setting +* Publishing others' private information, such as a physical or electronic + address, without explicit permission +* Other conduct which could reasonably be considered inappropriate in a + professional setting ## Our Responsibilities -Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior. +Project maintainers are responsible for clarifying the standards of acceptable +behavior and are expected to take appropriate and fair corrective action in +response to any instances of unacceptable behavior. -Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful. +Project maintainers have the right and responsibility to remove, edit, or +reject comments, commits, code, wiki edits, issues, and other contributions +that are not aligned to this Code of Conduct, or to ban temporarily or +permanently any contributor for other behaviors that they deem inappropriate, +threatening, offensive, or harmful. ## Scope -This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers. +This Code of Conduct applies both within project spaces and in public spaces +when an individual is representing the project or its community. Examples of +representing a project or community include using an official project e-mail +address, posting via an official social media account, or acting as an +appointed representative at an online or offline event. Representation of a +project may be further defined and clarified by project maintainers. ## Enforcement -Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at pysat.developers@gmail.com. The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. 
The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately. +Instances of abusive, harassing, or otherwise unacceptable behavior may be +reported by contacting the project team at pysat.developers@gmail.com. The +pysat project team will review and investigate all complaints, and will respond +in a way that it deems appropriate to the circumstances. The project team is +obligated to maintain confidentiality with regard to the reporter of an +incident. Further details of specific enforcement policies may be posted +separately. -Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership. +Project maintainers who do not follow or enforce the Code of Conduct in good +faith may face temporary or permanent repercussions as determined by other +members of the project's leadership. ## Attribution -This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version] +This Code of Conduct is adapted from the [Contributor Covenant][homepage], +version 1.4, available at +[https://contributor-covenant.org/version/1/4][version] -[homepage]: http://contributor-covenant.org -[version]: http://contributor-covenant.org/version/1/4/ +## FAQ + +For answers to common questions about this code of conduct, see +[https://www.contributor-covenant.org/faq][faq] + +[homepage]: https://contributor-covenant.org +[version]: https://contributor-covenant.org/version/1/4/ +[faq]: https://www.contributor-covenant.org/faq diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 4c1a95415..77be3f0fa 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -13,13 +13,27 @@ are generally held fortnightly. Short version ------------- -* Submit bug reports and feature requests at +* Submit bug reports, feature requests, and questions at [GitHub](https://github.com/pysat/pysat/issues) * Make pull requests to the ``develop`` branch +Issues +------ + +Bug reports, questions, and feature requests should all be made as GitHub +Issues. Templates are provided for each type of issue, to help you include +all the necessary information. + +Questions +^^^^^^^^^ + +Not sure how something works? Ask away! The more information you provide, the +easier the question will be to answer. You can also interact with the pysat +developers on our [slack channel](https://pysat.slack.com). + Bug reports ------------ +^^^^^^^^^^^ When [reporting a bug](https://github.com/pysat/pysat/issues) please include: @@ -31,12 +45,12 @@ include: * Detailed steps to reproduce the bug Feature requests and feedback ------------------------------ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -The best way to send feedback is to file an issue at -[GitHub](https://github.com/pysat/pysat/issues). +The best way to send feedback is to file an +[issue](https://github.com/pysat/pysat/issues). -If you are proposing a feature: +If you are proposing a new feature or a change in something that already exists: * Explain in detail how it would work. @@ -66,42 +80,57 @@ To set up `pysat` for local development: Now you can make your changes locally. - Tests for new instruments are performed automatically. Tests for custom - functions should be added to the appropriately named file in ``pysat/tests``. 
- For example, custom functions for the time utilities are tested in - ``pysat/tests/test_utils_time.py``. If no test file exists, then you should - create one. This testing uses pytest, which will run tests on any python - file in the test directory that starts with ``test``. Classes must begin - with ``Test``, and methods must begin with ``test`` as well. + Tests for new instruments are performed automatically. See discussion + [here](https://pysat.readthedocs.io/en/main/new_instrument.html#testing-support) + for more information on triggering these standard tests. + + Tests for custom functions should be added to the appropriately named file + in ``pysat/tests``. For example, custom functions for the time utilities are + tested in ``pysat/tests/test_utils_time.py``. If no test file exists, then + you should create one. This testing uses pytest, which will run tests on + any Python file in the test directory that starts with ``test``. Classes + must begin with ``Test``, and methods must begin with ``test`` as well. 4. When you're done making changes, run all the checks to ensure that nothing is broken on your local system, as well as check for flake8 compliance: ``` - pytest -vs --flake8 pysat + pytest ``` -5. Update/add documentation (in ``docs``), if relevant +5. You should also check for flake8 style compliance: + + ``` + flake8 . --count --select=D,E,F,H,W --show-source --statistics + ``` + + Note that pysat uses the `flake8-docstrings` and `hacking` packages to ensure + standards in docstring formatting. + +6. Update/add documentation (in ``docs``). Even if you don't think it's + relevant, check to see if any existing examples have changed. -6. Add your name to the .zenodo.json file as an author +7. Add your name to the .zenodo.json file as an author -7. Commit your changes: +8. Commit your changes: ``` git add . git commit -m "AAA: Brief description of your changes" ``` - Where AAA is a standard shorthand for the type of change (eg, BUG or DOC). + Where AAA is a standard shorthand for the type of change (e.g., BUG or DOC). `pysat` follows the [numpy development workflow](https://numpy.org/doc/stable/dev/development_workflow.html), see the discussion there for a full list of this shorthand notation. -8. Once you are happy with the local changes, push to Github: +9. Once you are happy with the local changes, push to GitHub: ``` git push origin name-of-your-bugfix-or-feature ``` Note that each push will trigger the Continuous Integration workflow. -9. Submit a pull request through the GitHub website. Pull requests should be - made to the ``develop`` branch. +10. Submit a pull request through the GitHub website. Pull requests should be + made to the ``develop`` branch. Note that automated tests will be run on + GitHub Actions, but these must be initialized by a member of the pysat team + for first-time contributors. Pull Request Guidelines For merging, you should: 1. Include an example for use 2. Add a note to ``CHANGELOG.md`` about the changes -3. Update the author list in ``zenodo.json`` if applicable -4. Ensure that all checks passed (current checks include Github Actions and Coveralls) +3. Update the author list in ``zenodo.json``, if applicable +4.
Ensure that all checks pass (current checks include GitHub Actions, + Coveralls, and ReadTheDocs) -If you don't have all the necessary Python versions available locally or -have trouble building all the testing environments, you can rely on -GitHub Actions to run the tests for each change you add in the pull -request. Because testing here will delay tests by other developers, -please ensure that the code passes all tests on your local system first. +If you don't have all the necessary Python versions available locally or have +trouble building all the testing environments, you can rely on GitHub Actions to +run the tests for each change you add in the pull request. Because testing here +will delay tests by other developers, please ensure that the code passes all +tests on your local system first. Project Style Guidelines These include: * All classes should have `__repr__` and `__str__` functions * Docstrings use `Note` instead of `Notes` * Try to avoid creating a try/except statement where except passes -* Use setup and teardown in test classes +* Use setup_method (or setup_class) and teardown_method (or teardown_class) in + test classes * Use pytest parametrize in test classes when appropriate * Use pysat testing utilities when appropriate * Provide testing class methods with informative failure statements and diff --git a/MANIFEST.in b/MANIFEST.in index 7c416b089..54745e981 100644 --- a/MANIFEST.in +++ b/MANIFEST.in @@ -1,8 +1,7 @@ -global-include *.py +recursive-include pysat *.py include *.md include *.txt include LICENSE -include pysat/version.txt include pysat/citation.txt prune docs global-exclude *.pdf diff --git a/README.md b/README.md index 30470e6cb..86886932a 100644 --- a/README.md +++ b/README.md @@ -10,20 +10,25 @@ [![Coverage Status](https://coveralls.io/repos/github/pysat/pysat/badge.svg?branch=main)](https://coveralls.io/github/pysat/pysat?branch=main) [![DOI](https://zenodo.org/badge/33449914.svg)](https://zenodo.org/badge/latestdoi/33449914) -The Python Satellite Data Analysis Toolkit (pysat) is a package providing a -simple and flexible interface for downloading, loading, cleaning, managing, -processing, and analyzing scientific measurements. Although pysat was initially -designed for in situ satellite observations, it now supports many different -types of ground- and space-based measurements. +The Python Satellite Data Analysis Toolkit (pysat) provides a simple and +flexible interface for robust data analysis from beginning to end - including +downloading, loading, cleaning, managing, processing, and analyzing data. +Pysat's plug-in design allows analysis support for any data, including +user-provided data sets. The pysat team provides a variety of plug-ins to +support public scientific data sets in packages such as pysatNASA, +pysatMadrigal, and more, available as part of the general [pysat ecosystem](https://github.com/pysat). Full [Documentation](http://pysat.readthedocs.io/en/latest/index.html) JGR-Space Physics [Publication](https://doi.org/10.1029/2018JA025297) +Pysat Ecosystem [Publication](https://www.frontiersin.org/articles/10.3389/fspas.2023.1119775/full) + [Citation Info](https://pysat.readthedocs.io/en/latest/citing.html) -Come join us on Slack! An invitation to the pysat workspace is available -in the 'About' section of the [pysat GitHub Repository.](https://github.com/pysat/pysat) +Come join us on Slack!
An invitation to the pysat workspace is available +in the 'About' section of the +[pysat GitHub Repository](https://github.com/pysat/pysat). Development meetings are generally held fortnightly. # Main Features @@ -49,47 +54,54 @@ Development meetings are generally held fortnightly. instruments to pysat # Installation -## Starting from scratch -* Python and associated packages for science are freely available. Convenient - science python package setups are available from https://www.python.org/, - [Anaconda](https://www.anaconda.com/distribution/), and other locations - (some platform specific). Anaconda also includes a developer environment that - works well with pysat. Core science packages such as numpy, scipy, matplotlib, - pandas and many others may also be installed directly via pip or your - favorite package manager. - -* Installation through pip + +The following instructions provide a guide for installing pysat and give some +examples of how to use the routines. + +## Prerequisites + +pysat uses common Python modules, as well as modules developed by and for the +Space Physics community. This module officially supports Python 3.6+. + +| Common modules | Community modules | | -------------- | ----------------- | | dask | netCDF4 | | numpy >= 1.12 | | | pandas | | | portalocker | | | pytest | | | scipy | | | toolz | | | xarray | | + + +## PyPI Installation ``` pip install pysat ``` -* Installation through github + +## GitHub Installation ``` git clone https://github.com/pysat/pysat.git -cd pysat -python setup.py install ``` -An advantage to installing through github is access to the development branches. -The latest bugfixes can be found in the `develop` branch. However, this branch -is not stable (as the name implies). We recommend using this branch in a -virtual environment or using `python setup.py develop`. + +Change directories into the repository folder, then build and install the +package with pip. For a local install, use the "--user" flag after "install". + ``` -git clone https://github.com/pysat/pysat.git -cd pysat -git checkout develop -python setup.py develop +cd pysat/ +python -m build . +pip install . ``` -* Note that pysat requires a number of packages for the install. - * dask - * netCDF4 - * numpy - * pandas - * portalocker - * scipy - * toolz - * xarray -* The first time the package is run, you will need to specify a directory to - store data. In python, run: + +# Using pysat + +* The first time pysat is run, you will need to specify a directory to store + the data. In Python, run: ``` pysat.params['data_dirs'] = 'path/to/directory/that/may/or/may/not/exist' ``` * Nominal organization of data is top_dir/platform/name/tag/inst_id/files + +Detailed examples and tutorials for using pysat are available in the +[documentation](http://pysat.readthedocs.io/en/latest/index.html). diff --git a/docs/api.rst b/docs/api.rst index b1768d984..89c4a4ddb 100644 --- a/docs/api.rst +++ b/docs/api.rst @@ -232,33 +232,6 @@ pysat_testing :members: -.. _api-pysat-testing_xarray: - -pysat_testing_xarray -^^^^^^^^^^^^^^^^^^^^ - -.. automodule:: pysat.instruments.pysat_testing_xarray - :members: - - -.. _api-pysat-testing2d: - -pysat_testing2d -^^^^^^^^^^^^^^^ - -.. automodule:: pysat.instruments.pysat_testing2d - :members: - - -.. _api-pysat-testing2d_xarray: - -pysat_testing2d_xarray -^^^^^^^^^^^^^^^^^^^^^^ - -.. automodule:: pysat.instruments.pysat_testing2d_xarray - :members: - - ..
_api-pysat-testmodel: pysat_testmodel diff --git a/docs/citing.rst b/docs/citing.rst index 0ff303c7f..a7c1c9467 100644 --- a/docs/citing.rst +++ b/docs/citing.rst @@ -6,7 +6,7 @@ Stoneback et al [2018] ``_ as well as the package ``_. Note that this DOI will always point to the latest version of the code. A list of DOIs for all versions can be found at the Zenodo page above. Depending on -usage, citation of the full ecosystem paper by Stoneback et al [2023] +usage, citation of the full ecosystem paper by Stoneback et al [2023] ``_ may also be appropriate. @@ -37,7 +37,7 @@ A simplified implementation of the citation. .. include:: ../pysat/citation.txt :literal: -Citing the publication: +Citing the publications: .. code:: @@ -56,6 +56,19 @@ Citing the publication: year = {2018} } + @article{Stoneback2023, + author = {Stoneback, R. A. and + Burrell, A. G. and + Klenzing, J. and + Smith, J.}, + doi = {10.3389/fspas.2023.1119775}, + journal = {Frontiers in Astronomy and Space Science}, + title = {The pysat ecosystem}, + volume = {10}, + year = {2023} + } + + To aid in scientific reproducibility, please include the version number in publications that use this code. This can be found by invoking :py:attr:`pysat.__version__`. diff --git a/docs/conf.py b/docs/conf.py index a64369fc3..6d20155b5 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -14,6 +14,7 @@ import os import sys +import pysat # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the @@ -21,7 +22,6 @@ # sys.path.insert(0, os.path.abspath('.')) sys.path.insert(0, os.path.abspath('..')) -import pysat pysat.params['data_dirs'] = '.' diff --git a/docs/dependency.rst b/docs/dependency.rst index 7f0495328..4b0251cb3 100644 --- a/docs/dependency.rst +++ b/docs/dependency.rst @@ -31,8 +31,8 @@ could look something like: | | |-- __init__.py | | `-- test_instruments.py | `-- __init__.py - |-- README.md - `-- setup.py + |-- pyproject.toml + `-- README.md The instruments folder includes a file for each instrument object. The @@ -53,7 +53,7 @@ suite of instrument tests to run on instruments. These are imported from the ``test_instruments.py`` file can be copied directly into the library, updating the instrument library name as indicated. -The ``setup.py`` file should include pysat as a dependency, as well as any +The ``pyproject.toml`` file should include pysat as a dependency, as well as any other packages required by the instruments. A more complicated structure could include analysis routines, @@ -84,8 +84,8 @@ The structure then could look like: | | |-- compare.py | | `-- contrast.py | `-- __init__.py - |-- README.md - `-- setup.py + |-- pyproject.toml + `-- README.md .. _pysat-dep-testinst: @@ -267,18 +267,6 @@ pysat_testing object with 1D data as a function of latitude, longitude, and altitude in a pandas format. Most similar to in situ data. -pysat_testing_xarray -^^^^^^^^^^^^^^^^^^^^ -:ref:`api-pysat-testing_xarray` returns a satellite-like object with 1D data as -a function of latitude, longitude, and altitude in a xarray format. - -pysat_testing2d -^^^^^^^^^^^^^^^ -:ref:`api-pysat-testing2d` is another satellite-like object that also returns -profile data as a function of altitude at some distance from the satellite. It -is similar to a Radio Occultation or other instruments that have altitude -profiles. 
- pysat_ndtesting ^^^^^^^^^^^^^^^^^^^^^^ :ref:`api-pysat-ndtesting` is a satellite-like object that returns all @@ -330,7 +318,7 @@ leap days. Tips and Tricks --------------- -Remember to include pysat as a dependency in your setup.py or setup.cfg file. +Remember to include pysat as a dependency in your pyproject.toml file. The CI environment will also need to be configured to install pysat and its dependencies. You may need to install pysat from github rather than pip if diff --git a/docs/funding.rst b/docs/funding.rst new file mode 100644 index 000000000..24864daf9 --- /dev/null +++ b/docs/funding.rst @@ -0,0 +1 @@ +.. include:: ../ACKNOWLEDGEMENTS.md diff --git a/docs/index.rst b/docs/index.rst index 6e8810cea..77e8df299 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -12,6 +12,7 @@ Welcome to pysat's documentation! introduction.rst ecosystem.rst citing.rst + funding.rst installation.rst quickstart.rst tutorial.rst @@ -23,3 +24,7 @@ Welcome to pysat's documentation! roadmap.rst faq.rst release_notes.rst + + +.. admonition:: DISTRIBUTION STATEMENT A: Approved for public release. Distribution is unlimited. diff --git a/docs/installation.rst b/docs/installation.rst index 423522513..f8013c5a8 100644 --- a/docs/installation.rst +++ b/docs/installation.rst @@ -6,7 +6,7 @@ Installation Python and associated packages for science are freely available. Convenient science python package setups are available from ``_, -`Anaconda `_, and other locations +`Anaconda `_, and other locations (some platform specific). Anaconda also includes a developer environment that works well with pysat. Core science packages such as numpy, scipy, matplotlib, pandas and many others may also be installed directly via the @@ -26,7 +26,7 @@ To use Anaconda's tools for creating a suitable virtual environment, .. _inst-standard: - + Standard installation --------------------- pysat may also be installed directly from the source repository on github:: git clone https://github.com/pysat/pysat.git cd pysat - python setup.py install --user + python -m build . + pip install --user . An advantage to installing through github is access to the development branches. The latest bugfixes can be found in the ``develop`` branch. However, this @@ -67,9 +68,10 @@ virtual environment and using:: git clone https://github.com/pysat/pysat.git cd pysat git checkout develop - python setup.py develop + python -m build . + pip install -e . -The use of `develop` rather than `install` in the setup command installs the -code 'in-place', so any changes to the software do not have to be reinstalled -to take effect. It is not related to changing the pysat working branch from -``main`` to ``develop`` in the preceeding line. +The use of `-e` in the install command installs the code 'in-place', so any +changes to the software do not have to be reinstalled to take effect. It is not +related to changing the pysat working branch from ``main`` to ``develop`` in the +preceding line. diff --git a/docs/instruments/testing_instruments.rst b/docs/instruments/testing_instruments.rst index 0fa1299c3..f92558e33 100644 --- a/docs/instruments/testing_instruments.rst +++ b/docs/instruments/testing_instruments.rst @@ -14,20 +14,6 @@ An instrument with satellite-like data as a function of latitude, longitude, and altitude in a pandas format. See :ref:`api-pysat-testing` for more details. -pysat_testing_xarray -^^^^^^^^^^^^^^^^^^^^ -An instrument with satellite-like data as a function of latitude, longitude, -and altitude in a xarray format.
See :ref:`api-pysat-testing_xarray` for more -details. - - -pysat_testing2d -^^^^^^^^^^^^^^^ -An instrument with satellite-like data like :py:mod:`pysat_testing`, but with a -2D data variable, 'profile', that is similar to Radio Occultation data. See -:ref:`api-pysat-testing2d` for more details. - - pysat_ndtesting ^^^^^^^^^^^^^^^ An instrument with satellite-like data like :py:mod:`pysat_testing` that @@ -35,13 +21,6 @@ also has an imager-like 3D data variable. See :ref:`api-pysat-ndtesting` for more details. -pysat_testing2d_xarray -^^^^^^^^^^^^^^^^^^^^^^ -An instrument with satellite-like data like :py:mod:`pysat_testing_xarray` that -also has an imager-like 3D data variable. See :ref:`api-pysat-testing2d_xarray` -for more details. - - pysat_testmodel ^^^^^^^^^^^^^^^ An instrument with model-like data that returns a 4D object as a function of diff --git a/docs/introduction.rst b/docs/introduction.rst index 0a34fe9b8..988b3dd98 100644 --- a/docs/introduction.rst +++ b/docs/introduction.rst @@ -6,14 +6,13 @@ Introduction Every scientific instrument has unique properties, though the general process for science data analysis is platform independent. This process can by described as: finding and downloading data, writing code to load the data, cleaning the -data to an appropriate level, and applying the specific analysis for a project, -and plotting the results. The Python Satellite Data Analysis Toolkit (pysat) -provides a framework to support this general process that builds upon these -commonalities. In doing so, pysat simplifies the process of using new -instruments, reduces data management overhead, and enables the creation of -instrument independent analysis routines. Although pysat was initially designed -for `in situ` satellite measurements, pysat has grown to support both -observational and modelled space science measurements. +data to an appropriate level, and applying the specific analysis for a project. +The Python Satellite Data Analysis Toolkit (pysat) provides a framework to +support the data analysis lifecycle. In doing so, pysat simplifies the process +of using new instruments, reduces data management overhead, and enables the +creation of instrument independent analysis routines. Although pysat was +initially designed for `in situ` satellite measurements, pysat has grown to +support both observational and modelled space science measurements. The newest incarnation of pysat has been pared down to focus on the core elements of our mission: providing a framework for data management and analysis. diff --git a/docs/new_instrument.rst b/docs/new_instrument.rst index 06f39603a..f810601a7 100644 --- a/docs/new_instrument.rst +++ b/docs/new_instrument.rst @@ -414,6 +414,9 @@ The load module method signature should appear as: commmonly specify the data set to be loaded - The :py:func:`load` routine should return a tuple with :py:attr:`data` as the first element and a :py:class:`pysat.Meta` object as the second element. + If there is no data to load, :py:attr:`data` should return an empty + :py:class:`pandas.DataFrame` or :py:class:`xarray.Dataset` and :py:attr:`meta` + should return an empty :py:class:`pysat.Meta` object. - For simple time-series data sets, :py:attr:`data` is a :py:class:`pandas.DataFrame`, column names are the data labels, rows are indexed by :py:class:`datetime.datetime` objects. @@ -546,10 +549,11 @@ If provided, :py:mod:`pysat` supports the definition and use of keywords for an instrument module so that users may define their preferred default values. 
A custom keyword for an instrument module must be defined in each function that will receive that keyword argument if provided by the user. All instrument -functions, :py:func:`init`, :py:func:`preprocess`, :py:func:`load`, -:py:func:`clean`, :py:func:`list_files`, :py:func:`list_remote_files`, and -:py:func:`download` support custom keywords. The same keyword may be used in -more than one function but the same value will be passed to each. +functions, :py:func:`init`, :py:func:`preprocess`, :py:func:`concat_data`, +:py:func:`load`, :py:func:`clean`, :py:func:`list_files`, +:py:func:`list_remote_files`, and :py:func:`download` support custom keywords. +The same keyword may be used in more than one function but the same value will +be passed to each. An example :py:func:`load` function definition with two custom keyword arguments. @@ -619,6 +623,13 @@ Cleans instrument for levels supplied in inst.clean_level. ``self`` is a :py:class:`pysat.Instrument` object. :py:func:`clean` should modify ``self`` in-place as needed; equivalent to a custom routine. +:py:func:`clean` is allowed to issue logging messages, warnings, and errors. If +the routine does this, be sure to test them by assigning the necessary +information to the :py:attr:`_clean_warn` attribute, described in +:ref:`rst_test-clean`. :py:func:`clean` may also +re-assign the cleaning level if appropriate. If you do this, be sure to raise a +logging warning, so that users are aware that this change is happening and why +the clean level they requested is not appropriate. list_remote_files ^^^^^^^^^^^^^^^^^ @@ -643,6 +654,25 @@ The user can search for subsets of files through optional keywords, such as: inst.remote_file_list(year=2019) inst.remote_file_list(year=2019, month=1, day=1) +concat_data +^^^^^^^^^^^ + +Combines data from multiple Instruments of the same type, used internally to +combine data from different load periods. The default method concatenates data +using the :py:attr:`inst.index` name. However, some data sets have multiple +different time indices along which data should be concatenated. In such cases +(e.g., TIMED-GUVI SDR-Imaging data from :py:mod:`pysatNASA`), a custom +:py:meth:`concat_data` method must be supplied. If available, this method +will be used instead of the default +:py:meth:`~pysat._instrument.Instrument.concat_data`, after the default +method handles the prepending of the data that needs to be combined. + +.. code:: python + + def concat_data(self, new_data, **kwargs): + # Perform custom concatenation here, updating self.data + return + Logging ------- @@ -725,7 +755,7 @@ will not be present in Input/Output operations. The standardized :py:mod:`pysat` tests are available in :py:mod:`pysat.tests.instrument_test_class`. The test collection in test_instruments.py imports this class, collects a list of all available -instruments (including potential :py:data:`tag`/:py:data:`inst_id` +instruments (including potential :py:attr:`tag`/:py:attr:`inst_id` combinations), and runs the tests using pytestmark. By default, :py:mod:`pysat` assumes that your instrument has a fully functional download routine, and will run an end-to-end test. If this is not the case, see the next section. Special Test Configurations --------------------------- +The following test attributes may or may not be necessary for your new +:py:class:`~pysat._instrument.Instrument`. The descriptions should provide +insight into when and how they should be used. + + +..
_rst_test-clean: + +Warnings in the Clean method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Another important test is for warnings and the re-setting of clean levels that +may come up when cleaning data. These may be specified using the +:py:attr:`_clean_warn` attribute, which should point to a dictionary whose +values are lists of four-element tuples. The first tuple element should +be 'logger', 'warning', or 'error', specifying the method through which the +warning is being reported. The second tuple element specifies either the logging +level (as a string) or the warning/error type (e.g., ``ValueError``). The third +tuple element provides the warning message as a string and the final element +provides the expected clean level after running the clean routine. The list +allows multiple types of warning messages to be tested for a given +:py:attr:`inst_id`, :py:attr:`tag`, and :py:attr:`clean_level` combination. + +.. code:: python + + # ------------------------------------------ + # Instrument test attributes + + _clean_warn = {inst_id: {tag: {'dusty': [ + ('logger', 'WARN', "I am a warning!", 'clean'), + ('warning', UserWarning, + 'I am a serious warning!', 'dusty'), + ('error', ValueError, "I'm an error", 'dusty')]} + for tag in inst_ids[inst_id]} + for inst_id in inst_ids.keys()} + + +.. _rst_test-nodownload: + No Download Available ^^^^^^^^^^^^^^^^^^^^^ @@ -846,7 +914,7 @@ but should not undergo automated download tests because it would require the user to save a password in a potentially public location. The :py:attr:`_password_req` flag is used to skip both the download tests and the download warning message tests, since a functional download routine is -present. +present. This flag defaults to ``False`` if not specified. .. code:: python @@ -857,4 +925,29 @@ present. inst_ids = {'': ['Level_1', 'Level_2']} _test_dates = {'': {'Level_1': dt.datetime(2020, 1, 1), 'Level_2': dt.datetime(2020, 1, 1)}} - _password_req = {'': {'Level_1': False}} + _password_req = {'': {'Level_1': True}} + +Updates to Instrument Suite Tests +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Sometimes new standard tests are added to pysat that ensure all data handling +features work as expected throughout the ecosystem. For example, pysat 3.2.0 +adds new tests for loading multiple days at a time or using a data pad. When +these tests require significant updates to an instrument, an additional flag +may be used to suppress them temporarily for specific instruments while those +updates are made. These new tests are run by default unless suppressed +using the :py:attr:`_new_tests` flag. + +.. code:: python + + platform = 'newsat' + name = 'data' + tags = {'Level_1': 'Level 1 data, fully compliant', + 'Level_2': 'Level 2 data, needs updates for padding'} + inst_ids = {'': ['Level_1', 'Level_2']} + _test_dates = {'': {'Level_1': dt.datetime(2020, 1, 1), + 'Level_2': dt.datetime(2020, 1, 1)}} + _new_tests = {'': {'Level_2': False}} + +The new tests are marked with a `@pytest.mark.new_tests` statement, and will be +re-evaluated at each minor version release. diff --git a/docs/quickstart.rst b/docs/quickstart.rst index eecb860a5..f1796aaea 100644 --- a/docs/quickstart.rst +++ b/docs/quickstart.rst @@ -67,7 +67,7 @@ installations.
print(inst.data) # Testing out the xarray installation - inst = pysat.Instrument('pysat', 'testing_xarray') + inst = pysat.Instrument('pysat', 'ndtesting') inst.load(2009, 1) print(inst.data) diff --git a/docs/requirements.txt b/docs/requirements.txt deleted file mode 100644 index eb1cf559d..000000000 --- a/docs/requirements.txt +++ /dev/null @@ -1,13 +0,0 @@ -docutils<0.18 -ipython -m2r2 -netCDF4 -numpydoc -packaging -pandas -portalocker -pytest -readthedocs-sphinx-search==0.1.1 -sphinx_rtd_theme==1.0.0 -sphinx==4.2.0 -xarray diff --git a/docs/tutorial/tutorial_basics.rst b/docs/tutorial/tutorial_basics.rst index 63fcb1bdb..1e1311801 100644 --- a/docs/tutorial/tutorial_basics.rst +++ b/docs/tutorial/tutorial_basics.rst @@ -439,13 +439,13 @@ Explorer `(ICON) `_. # Retrieve units using general labels dmsp.meta['ti', dmsp.meta.labels.units] - # Update units for ion temperature - dmsp.meta['ti'] = {dmsp.meta.labels.units: 'Kelvin'} + # Update units for ion temperature using direct assignment + dmsp.meta['ti', dmsp.meta.labels.units] = 'Kelvin' - # Update display name for ion temperature, using LaTeX notation + # Update display name for ion temp, using dict assignment and LaTeX notation dmsp.meta['ti'] = {dmsp.meta.labels.name: 'T$_i$'} - # Add new meta data + # Add new meta data for multiple labels using dict assignment dmsp.meta['new'] = {dmsp.meta.labels.units: 'unitless', dmsp.meta.labels.name: 'New display name'} @@ -518,15 +518,16 @@ the instrument PI, etc. Previous versions of pysat stored this data as custom attributes attached to the :py:class:`pysat.Instrument`, instead of keeping all metadata in the :py:class:`pysat.Meta` object. -To avoid breaking existing workflows, global metadata is only loaded into -:py:class:`pysat.MetaHeader` through completely internal pysat processes or -after setting the :py:data:`use_header` keyword argument. +Global metadata is loaded into the :py:class:`pysat.MetaHeader` by default, but +to avoid breaking existing workflows, this metadata may still be loaded directly +into the :py:class:`~pysat._instrument.Instrument` by setting the +:py:data:`use_header` keyword argument to ``False``. .. code:: python - # This will show: Metadata for 0 global attributes - dmsp.load(date=start, use_header=True) - print(dmsp.meta.header) + # This will raise a warning that future releases will require use of + # the MetaHeader class + dmsp.load(date=start, use_header=False) You can manually add global metadata the same way you would assign an attribute. diff --git a/docs/tutorial/tutorial_files.rst b/docs/tutorial/tutorial_files.rst index 067518cb0..04599168e 100644 --- a/docs/tutorial/tutorial_files.rst +++ b/docs/tutorial/tutorial_files.rst @@ -150,7 +150,7 @@ as fill, _FillValue, and FillVal. When writing files pysat processes metadata for both xarray and pandas before writing the file. For xarray, pysat leverages xarray's built-in file writing capabilities. For pandas, pysat interfaces with netCDF4 directly to translate -both 1D and higher dimensional data into netCDF4. +data into netCDF4. .. _tutorial-files-meta: diff --git a/docs/tutorial/tutorial_load.rst b/docs/tutorial/tutorial_load.rst index dfd13072f..76478a28d 100644 --- a/docs/tutorial/tutorial_load.rst +++ b/docs/tutorial/tutorial_load.rst @@ -67,7 +67,7 @@ pysat supports the use of two different data structures. You can either use a pandas `DataFrame `_, a highly capable class with labeled rows and columns, or an xarray -`DataSet `_ +`DataSet `_ for data sets with more dimensions.
The type of data class is flagged using the attribute :py:attr:`pysat.Instrument.pandas_format`. This is set to ``True`` if a :py:class:`pandas.DataFrame` is returned by the corresponding diff --git a/docs/tutorial/tutorial_v3_upgrade.rst b/docs/tutorial/tutorial_v3_upgrade.rst index aa45fcae0..3747e3078 100644 --- a/docs/tutorial/tutorial_v3_upgrade.rst +++ b/docs/tutorial/tutorial_v3_upgrade.rst @@ -3,7 +3,7 @@ Transition to v3.0 ------------------ -pysat release v3.0 introduces some backwards incompatible changes from +pysat release v3.0 introduced some backwards incompatible changes from v2.x to ensure a strong foundation for future development. Many of the changes needed to update existing pysat v2.x analysis codes are relatively trivial and relate to an updated restructuring of supporting pysat functions. However, @@ -11,7 +11,7 @@ there are some changes with how pysat stores package information as well as how pysat interacts with the local file system to find files that may require some setup work for systems with an existing pysat v2.x install. -pysat v3.0 now supports a single internal interface for storing and retrieving +pysat v3.0+ now supports a single internal interface for storing and retrieving package data that also makes it possible for users to set values for a variety of pysat defaults. pysat stores all of this information in the user's home directory under ``~/.pysat``. To get the most benefit from this internal @@ -25,12 +25,13 @@ See :ref:`tutorial-params` for more. is a string or list of strings for directories that pysat can use to store science data. -pysat v3.0 now supports more than one top-level directory to store science +pysat v3.0+ now supports more than one top-level directory to store science data as well as updates the default sub-directory structure for storing data. pysat v2.x employed an internal directory template of ``platform/name/tag`` -for organizing data while pysat v3.0 begins with a default of -``os.path.join(platform, name, tag, inst_id)``. Thus, by default, a pysat v3.0 install will -generally not find all existing data files that were managed by pysat v2.x. +for organizing data while pysat v3.0+ begins with a default of +``os.path.join(platform, name, tag, inst_id)``. Thus, by default, a pysat v3.0+ +install will generally not find all existing data files that were managed by +pysat v2.x. Additionally, support for individual instruments has been moved out of pysat and into a penumbra of supporting packages. 
These supporting packages must be diff --git a/pyproject.toml b/pyproject.toml new file mode 100644 index 000000000..7c1f9a39e --- /dev/null +++ b/pyproject.toml @@ -0,0 +1,94 @@ +[build-system] +requires = ["setuptools", "pip >= 10"] +build-backend = "setuptools.build_meta" + +[project] +name = "pysat" +version = "3.2.0" +description = "Supports science analysis across disparate data platforms" +readme = "README.md" +requires-python = ">=3.6" +license = {file = "LICENSE"} +authors = [ + {name = "Russell Stoneback, et al.", email = "pysat.developers@gmail.com"}, +] +classifiers = [ + "Development Status :: 5 - Production/Stable", + "Intended Audience :: Science/Research", + "Topic :: Scientific/Engineering :: Astronomy", + "Topic :: Scientific/Engineering :: Physics", + "Topic :: Scientific/Engineering :: Atmospheric Science", + "License :: OSI Approved :: BSD License", + "Natural Language :: English", + "Programming Language :: Python :: 3.6", + "Programming Language :: Python :: 3.9", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Operating System :: MacOS :: MacOS X", + "Operating System :: POSIX :: Linux", + "Operating System :: Microsoft :: Windows" +] +keywords = [ + "pysat", + "ionosphere", + "atmosphere", + "thermosphere", + "magnetosphere", + "heliosphere", + "observations", + "models", + "space", + "satellites", + "analysis" +] +dependencies = [ + "dask", + "netCDF4", + "numpy >= 1.12", + "pandas", + "portalocker", + "pytest", + "scipy", + "toolz", + "xarray" +] + +[project.optional-dependencies] +test = [ + "coveralls", + "flake8", + "flake8-docstrings", + "hacking >= 1.0", + "pysatSpaceWeather<0.1.0", + "pytest-cov", + "pytest-ordering" +] +doc = [ + "extras_require", + "ipython", + "m2r2", + "numpydoc", + "readthedocs-sphinx-search==0.3.2", + "sphinx", + "sphinx_rtd_theme >= 1.2.2, < 2.0.0" +] + +[project.urls] +Documentation = "https://pysat.readthedocs.io/en/latest/" +Source = "https://github.com/pysat/pysat" + +[tool.coverage.report] +omit = ["*/instruments/templates/*"] + +[tool.pytest.ini_options] +addopts = "--cov=pysat" +markers = [ + "all_inst", + "download", + "no_download", + "load_options", + "new_tests", + "first", + "second" +] diff --git a/pysat/__init__.py b/pysat/__init__.py index 122b541bc..35af794dc 100644 --- a/pysat/__init__.py +++ b/pysat/__init__.py @@ -34,10 +34,21 @@ """ +try: + from importlib import metadata + from importlib import resources + + if not hasattr(resources, 'files'): + # The `files` object was introduced in Python 3.9 + resources = None +except ImportError: + import importlib_metadata as metadata + resources = None + import logging import os -from portalocker import Lock +# Logger needs to be initialized before other modules are imported. 
logger = logging.getLogger(__name__)
handler = logging.StreamHandler()
formatter = logging.Formatter('%(name)s %(levelname)s: %(message)s')
@@ -45,12 +56,13 @@
logger.addHandler(handler)
logger.setLevel(logging.WARNING)

+# Import statements after this point require a noqa statement for flake8
+
 # Import and set user and pysat parameters object
-from pysat import _params
+from pysat import _params  # noqa: E402 F401

-# set version
-here = os.path.abspath(os.path.dirname(__file__))
-version_filename = os.path.join(here, 'version.txt')
+# Set version
+__version__ = metadata.version('pysat')

 # Get home directory
 home_dir = os.path.expanduser('~')
@@ -59,14 +71,22 @@
 pysat_dir = os.path.join(home_dir, '.pysat')

 # Set directory for test data
-test_data_path = os.path.join(here, 'tests', 'test_data')
+if resources is None:
+    test_data_path = os.path.join(os.path.realpath(os.path.dirname(__file__)),
+                                  'tests', 'test_data')
+    citation = os.path.join(os.path.realpath(os.path.dirname(__file__)),
+                            'citation.txt')
+else:
+    test_data_path = str(resources.files(__package__).joinpath('tests',
+                                                                'test_data'))
+    citation = str(resources.files(__package__).joinpath('citation.txt'))

 # Create a .pysat directory or parameters file if one doesn't exist.
 # pysat_settings did not exist pre v3 thus this provides a check against
 # v2 users that are upgrading. Those users need the settings file plus
 # new internal directories.
-if not os.path.isdir(pysat_dir) or \
-        (not os.path.isfile(os.path.join(pysat_dir, 'pysat_settings.json'))):
+settings_file = os.path.join(pysat_dir, 'pysat_settings.json')
+if not os.path.isdir(pysat_dir) or not os.path.isfile(settings_file):

     # Make a .pysat directory if not already present
     if not os.path.isdir(pysat_dir):
@@ -83,7 +103,7 @@
         os.mkdir(os.path.join(pysat_dir, 'instruments', 'archive'))

     # Create parameters file
-    if not os.path.isfile(os.path.join(pysat_dir, 'pysat_settings.json')):
+    if not os.path.isfile(settings_file):
         params = _params.Parameters(path=pysat_dir, create_new=True)

         print(''.join(("\nHi there! pysat will nominally store data in a ",
@@ -96,22 +116,20 @@
 # Load up existing parameters file
 params = _params.Parameters()

-# Load up version information
-with Lock(version_filename, 'r', params['file_timeout']) as version_file:
-    __version__ = version_file.read().strip()
-
-from pysat._files import Files
-from pysat._instrument import Instrument
-from pysat._meta import Meta
-from pysat._meta import MetaHeader
-from pysat._meta import MetaLabels
-from pysat._orbits import Orbits
-from pysat import instruments
-from pysat import utils
-
-# Import constellation separately
-from pysat._constellation import Constellation
+# Modules used by other imports need to be imported here first.
+from pysat import utils  # noqa: E402 F401
+
+# Import the remainder of the modules.
+from pysat._constellation import Constellation  # noqa: E402 F401
+from pysat._files import Files  # noqa: E402 F401
+from pysat._instrument import Instrument  # noqa: E402 F401
+from pysat._meta import Meta  # noqa: E402 F401
+from pysat._meta import MetaHeader  # noqa: E402 F401
+from pysat._meta import MetaLabels  # noqa: E402 F401
+from pysat._orbits import Orbits  # noqa: E402 F401
+from pysat import instruments  # noqa: E402 F401
+
 __all__ = ['instruments', 'utils']

-# Cleanup
-del here
+# Clean up
+del settings_file, resources
diff --git a/pysat/_constellation.py b/pysat/_constellation.py
index 0252c7275..bd8cd9d87 100644
--- a/pysat/_constellation.py
+++ b/pysat/_constellation.py
@@ -2,6 +2,9 @@
 # Full license can be found in License.md
 # Full author list can be found in .zenodo.json file
 # DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
 # ----------------------------------------------------------------------------
 """Class for Instrument constellations.
@@ -378,6 +381,23 @@ def is_plural(num):
         return output_str

+    def __delitem__(self, key):
+        """Delete a key by calling the `drop` method.
+
+        Parameters
+        ----------
+        key : str or list-like
+            A Constellation variable or list of variables.
+
+        Raises
+        ------
+        KeyError
+            If all key values are unavailable
+
+        """
+        self.drop(key)
+        return
+
     # -----------------------------------------------------------------------
     # Define all hidden methods
@@ -592,6 +612,61 @@ def date(self):
         else:
             return None

+    def drop(self, names):
+        """Drop variables (names) from the Constellation.
+
+        Parameters
+        ----------
+        names : str or list-like
+            String or list of strings specifying the variable names to drop
+
+        Raises
+        ------
+        KeyError
+            If none of the keys provided in `names` are found in any of the
+            member Instruments. If only a subset is missing, a logger
+            warning is issued instead.
+
+        """
+        # Ensure the input is list-like
+        names = pysat.utils.listify(names)
+
+        # Divide the names by instrument
+        good_inst_names = [list() for inst in self.instruments]
+        bad_names = list()
+        inst_strs = ['_'.join([attr for attr in [inst.platform, inst.name,
+                                                 inst.tag, inst.inst_id]
+                               if len(attr) > 0]) for inst in self.instruments]
+        for name in names:
+            got_name = False
+            for i, inst in enumerate(self.instruments):
+                if name in inst.variables:
+                    good_inst_names[i].append(name)
+                    got_name = True
+                elif name in self.variables and name.find(inst_strs[i]) > 0:
+                    good_inst_names[i].append(name.split("_{:s}".format(
+                        inst_strs[i]))[0])
+                    got_name = True
+
+            if not got_name:
+                bad_names.append(name)
+
+        # If there are no good names, raise a KeyError
+        if len(bad_names) == len(names):
+            raise KeyError('{:} not found in Constellation'.format(names))
+
+        # Drop names by instrument
+        for i, inst in enumerate(self.instruments):
+            if len(good_inst_names[i]) > 0:
+                inst.drop(good_inst_names[i])
+
+        # If there are some bad names, issue a logging warning
+        if len(bad_names) > 0:
+            pysat.logger.warning('{:} not found in Constellation'.format(
+                bad_names))
+
+        return
+
     @property
     def empty(self):
         """Boolean flag reflecting lack of data.
@@ -670,22 +745,24 @@ def to_inst(self, common_coord=True, fill_method=None): # Get the common coordinates needed for all data for cinst in self.instruments: if not cinst.pandas_format: - for new_coord in cinst.data.coords.keys(): - if new_coord not in coords.keys(): - coords[new_coord] = cinst.data.coords[new_coord] - elif new_coord != 'time': - # Two instruments have the same coordinate, if they - # are not identical, we need to establish a common - # range and resolution. Note that this will only - # happen if the coordinates share the same names. - if(len(coords[new_coord]) - != len(cinst.data.coords[new_coord]) - or coords[new_coord].values - != cinst.data.coords[new_coord].values): - coords[new_coord] = establish_common_coord( - [coords[new_coord].values, - cinst.data.coords[new_coord].values], - common=common_coord) + for new_coord in cinst.data.dims.keys(): + if new_coord in cinst.data.coords.keys(): + if new_coord not in coords.keys(): + coords[new_coord] = cinst.data.coords[new_coord] + elif new_coord != 'time': + # Two instruments have the same coordinate, if + # they are not identical, we need to establish + # a common range and resolution. Note that + # this will only happen if the coordinates + # share the same names. + if any([len(cinst.data.coords[new_coord]) + != len(coords[new_coord]), + cinst.data.coords[new_coord].values + != coords[new_coord].values]): + coords[new_coord] = establish_common_coord( + [coords[new_coord].values, + cinst.data.coords[new_coord].values], + common=common_coord) data = xr.Dataset(coords=coords) diff --git a/pysat/_files.py b/pysat/_files.py index 21203b136..5e0a505f7 100644 --- a/pysat/_files.py +++ b/pysat/_files.py @@ -1,7 +1,9 @@ -#!/usr/bin/env python # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- import copy @@ -46,8 +48,8 @@ class Files(object): `month`, `day`, etc. will be filled in as needed using python string formatting. The default file format structure is supplied in the instrument `list_files` routine. See - `pysat.files.parse_delimited_filenames` and - `pysat.files.parse_fixed_width_filenames` for more information. + `pysat.utils.files.parse_delimited_filenames` and + `pysat.utils.files.parse_fixed_width_filenames` for more information. (default=None) write_to_disk : bool If true, the list of Instrument files will be written to disk. diff --git a/pysat/_instrument.py b/pysat/_instrument.py index 744b6c302..812d6f769 100644 --- a/pysat/_instrument.py +++ b/pysat/_instrument.py @@ -2,7 +2,12 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- +"""Class for single instruments.""" + import copy import datetime as dt import errno @@ -75,8 +80,8 @@ class Instrument(object): `month`, `day`, etc. will be filled in as needed using python string formatting. The default file format structure is supplied in the instrument `list_files` routine. See - `pysat.files.parse_delimited_filenames` and - `pysat.files.parse_fixed_width_filenames` for more information. 
+ `pysat.utils.files.parse_delimited_filenames` and + `pysat.utils.files.parse_fixed_width_filenames` for more information. The value will be None if not specified by the user at instantiation. (default=None) temporary_file_list : bool @@ -90,10 +95,6 @@ class Instrument(object): of files found will be checked to ensure the filesizes are greater than zero. Empty files are removed from the stored list of files. (default=False) - labels : dict or NoneType - Dict where keys are the label attribute names and the values are tuples - that have the label values and value types in that order. If None uses - the Meta defaults. Deprecated, use `meta_kwargs` (default=None) meta_kwargs : dict or NoneType Dict to specify custom Meta initialization (default=None) custom : list or NoneType @@ -252,21 +253,13 @@ def __init__(self, platform=None, name=None, tag='', inst_id='', orbit_info=None, inst_module=None, data_dir='', directory_format=None, file_format=None, temporary_file_list=False, strict_time_flag=True, - ignore_empty_files=False, labels=None, meta_kwargs=None, + ignore_empty_files=False, meta_kwargs=None, custom=None, **kwargs): """Initialize `pysat.Instrument` object.""" - # Check for deprecated usage of None - if None in [tag, inst_id]: - warnings.warn(" ".join(["The usage of None in `tag` and `inst_id`", - "has been deprecated and will be removed", - "in 3.2.0+. Please use '' instead of", - "None."]), - DeprecationWarning, stacklevel=2) - # Set default tag, inst_id, and Instrument module - self.tag = '' if tag is None else tag.lower() - self.inst_id = '' if inst_id is None else inst_id.lower() + self.tag = tag.lower() + self.inst_id = inst_id.lower() self.inst_module = inst_module @@ -322,7 +315,7 @@ def __init__(self, platform=None, name=None, tag='', inst_id='', # Expected function keywords exp_keys = ['list_files', 'load', 'preprocess', 'download', - 'list_remote_files', 'clean', 'init'] + 'list_remote_files', 'clean', 'init', 'concat_data'] for fkey in exp_keys: func_name = _kwargs_keys_to_func_name(fkey) func = getattr(self, func_name) @@ -331,10 +324,9 @@ def __init__(self, platform=None, name=None, tag='', inst_id='', default_kwargs = _get_supported_keywords(func) # Expand the dict to include method keywords for load. - # TODO(#1020): Remove this if statement for the 3.2.0+ release + # TODO(#1020): Remove this if statement when `use_header` is removed if fkey == 'load': - meth = getattr(self, fkey) - default_kwargs.update(_get_supported_keywords(meth)) + default_kwargs['use_header'] = True # Confirm there are no reserved keywords present for kwarg in kwargs.keys(): @@ -409,11 +401,16 @@ def __init__(self, platform=None, name=None, tag='', inst_id='', # Check to make sure value is reasonable if self.file_format is not None: # Check if it is an iterable string - if(not isinstance(self.file_format, str) - or (self.file_format.find("{") < 0) - or (self.file_format.find("}") < 0)): - raise ValueError(''.join(['file format set to default, ', - 'supplied string must be iterable ', + if isinstance(self.file_format, str): + if any([self.file_format.find("{") < 0, + self.file_format.find("}") < 0]): + raise ValueError(''.join(['Supplied format string must be ', + 'iterable string with key ', + 'formatting [{', + self.file_format, '}]'])) + else: + raise ValueError(''.join(['Supplied format string must be ', + 'iterable string', '[{:}]'.format(self.file_format)])) # Set up empty data and metadata. 
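        # As a quick sketch of the check above (the file name templates here
        # are hypothetical examples), a compliant `file_format` must embed
        # python-style formatting keys, while a plain name does not:
        #
        #     'inst_{year:04d}{month:02d}{day:02d}.nc'  # accepted
        #     'inst_static_name.nc'  # raises the ValueError above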
@@ -425,12 +422,6 @@ def __init__(self, platform=None, name=None, tag='', inst_id='', # use Instrument definition of MetaLabels over the Metadata declaration. self.meta_kwargs = {} if meta_kwargs is None else meta_kwargs - if labels is not None: - warnings.warn("".join(["`labels` is deprecated, use `meta_kwargs`", - "with the 'labels' key instead. Support ", - "for `labels` will be removed in v3.2.0+"]), - DeprecationWarning, stacklevel=2) - self.meta_kwargs["labels"] = labels self.meta = pysat.Meta(**self.meta_kwargs) self.meta.mutable = False @@ -568,7 +559,8 @@ def __eq__(self, other): # required their own path for equality, string comparisons! partial_funcs = ['_init_rtn', '_clean_rtn', '_preprocess_rtn', '_list_files_rtn', '_download_rtn', - '_list_remote_files_rtn', '_load_rtn'] + '_list_remote_files_rtn', '_load_rtn', + '_concat_data_rtn'] # If the type is the same then check everything that is attached to # the Instrument object. Includes attributes, methods, variables, etc. @@ -742,14 +734,16 @@ def __str__(self): return output_str - def __getitem__(self, key): - """Access data in `pysat.Instrument` object. + def __getitem__(self, key, data=None): + """Access data in `pysat.Instrument` or provided data object. Parameters ---------- key : str, tuple, or dict Data variable name, tuple with a slice, or dict used to locate desired data. + data : pds.DataFrame, xr.Dataset, or NoneType + Desired data object to select from or None to use `data` attribute Raises ------ @@ -788,20 +782,22 @@ def __getitem__(self, key): inst[datetime1:datetime2, 'name1':'name2'] """ + if data is None: + data = self.data if self.pandas_format: if isinstance(key, str): - return self.data[key] + return data[key] elif isinstance(key, tuple): try: # Pass keys directly through - return self.data.loc[key[0], key[1]] + return data.loc[key[0], key[1]] except (KeyError, TypeError) as err1: # TypeError for single integer. KeyError for list, array, # slice of integers. Assume key[0] is integer # (including list or slice). try: - return self.data.loc[self.data.index[key[0]], key[1]] + return data.loc[data.index[key[0]], key[1]] except IndexError as err2: err_message = '\n'.join(("original messages:", str(err1), str(err2))) @@ -811,15 +807,15 @@ def __getitem__(self, key): else: try: # Integer based indexing - return self.data.iloc[key] + return data.iloc[key] except (TypeError, ValueError): # If it's not an integer, TypeError is thrown. If it's a # list, ValueError is thrown. - return self.data[key] + return data[key] else: - return self.__getitem_xarray__(key) + return self.__getitem_xarray__(key, data=data) - def __getitem_xarray__(self, key): + def __getitem_xarray__(self, key, data=None): """Access data in `pysat.Instrument` object with `xarray.Dataset`. 
Parameters
@@ -827,6 +823,8 @@
        key : str, tuple, or dict
            Data variable name, tuple with a slice, or dict used to locate
            desired data
+        data : xr.Dataset or NoneType
+            Desired data object to select from or None to use `data` attribute

        Returns
        -------
@@ -865,26 +863,35 @@
            inst[datetime1:datetime2, 'name']

        """
+        if data is None:
+            data = self.data

-        if 'Epoch' in self.data.indexes:
-            epoch_name = 'Epoch'
-        elif 'time' in self.data.indexes:
-            epoch_name = 'time'
-        else:
+        # Find the standard epoch index name(s)
+        epoch_names = self._get_epoch_name_from_data(data=data)
+        if len(epoch_names) == 0:
            return xr.Dataset(None)

+        # Find secondary time indexes that may need to be sliced
+        if len(data.indexes) > 1:
+            for ind in data.indexes.keys():
+                if all([ind != epoch_names[0],
+                        data.indexes[ind].dtype
+                        == data.indexes[epoch_names[0]].dtype]):
+                    epoch_names.append(ind)
+
        if isinstance(key, tuple):
            if len(key) == 2:
                # Support slicing time, variable name
                if isinstance(key[1], slice):
                    # Extract subset of variables before epoch selection.
-                    data_subset = self.data[self.variables[key[1]]]
+                    data_subset = data[self.variables[key[1]]]
                else:
                    # Extract single variable before epoch selection.
-                    data_subset = self.data[key[1]]
+                    data_subset = data[key[1]]

                # If the input is a tuple, `key[0]` must be linked to the epoch.
-                key_dict = {'indexers': {epoch_name: key[0]}}
+                key_dict = {'indexers': {epoch_name: key[0]
+                                         for epoch_name in epoch_names}}
                try:
                    # Assume key[0] is an integer
                    return data_subset.isel(**key_dict)
@@ -909,7 +916,7 @@
                    for i, dim in enumerate(self[var_name].dims):
                        indict[dim] = key[0][i]

-                    return self.data[var_name][indict]
+                    return data[var_name][indict]
            else:
                # Multidimensional indexing where the multiple dimensions are
                # not contained within another object
@@ -925,22 +932,37 @@
                for i, dim in enumerate(self[var_name].dims):
                    indict[dim] = key[i]

-                return self.data[var_name][indict]
+                return data[var_name][indict]
        else:
            try:
                # Grab a particular variable by name
-                return self.data[key]
+                return data[key]
            except (TypeError, KeyError, ValueError):
                # If that didn't work, likely need to use `isel` or `sel`
                # Link key to the epoch.
-                key_dict = {'indexers': {epoch_name: key}}
+                key_dict = {'indexers': {epoch_name: key
+                                         for epoch_name in epoch_names}}
                try:
                    # Try to get all data variables, but for a subset of time
                    # using integer indexing
-                    return self.data.isel(**key_dict)
+                    return data.isel(**key_dict)
                except (KeyError, TypeError):
                    # Try to get a subset of time, using label based indexing
-                    return self.data.sel(**key_dict)
+                    try:
+                        return data.sel(**key_dict)
+                    except KeyError as kerr:
+                        if str(kerr).find('Timestamp') >= 0 and len(
+                                epoch_names) > 0:
+                            # The problem is probably coming from a limited
+                            # time range in the ancillary epochs; remove them
+                            # from the selection
+                            pysat.logger.warning(
+                                ''.join(['Removing ', repr(epoch_names[1:]),
+                                         ' dimensions from data selection']))
+                            key_dict = {'indexers': {epoch_names[0]: key}}
+                            return data.sel(**key_dict)
+                        else:
+                            raise kerr

    def __setitem__(self, key, new_data):
        """Set data in `pysat.Instrument` object.
@@ -1021,30 +1043,6 @@
                # the rest of the keys are presumed to be metadata
                in_data = new.pop('data')

-            # TODO(#908): remove code below with removal of 2D pandas support.
-            if hasattr(in_data, '__iter__'):
-                if not isinstance(in_data, pds.DataFrame) and isinstance(
-                        next(iter(in_data), None), pds.DataFrame):
-                    # Input is a list_like of frames, denoting higher order data
-                    warnings.warn(" ".join(["Support for 2D pandas instrument",
-                                            "data has been deprecated and will",
-                                            "be removed in 3.2.0+. Please",
-                                            "either raise an issue with the",
-                                            "developers or modify the load",
-                                            "statement to use an",
-                                            "xarray.Dataset."]),
-                                  DeprecationWarning, stacklevel=2)
-
-                    if ('meta' not in new) and (key not in self.meta.keys_nD()):
-                        # Create an empty Meta instance but with variable names.
-                        # This will ensure the correct defaults for all
-                        # subvariables. Meta can filter out empty metadata as
-                        # needed, the check above reduces the need to create
-                        # Meta instances.
-                        ho_meta = pysat.Meta(**self.meta_kwargs)
-                        ho_meta[in_data[0].columns] = {}
-                        self.meta[key] = ho_meta
-
            # Assign data and any extra metadata
            self.data[key] = in_data
            self._update_data_types(key)
@@ -1057,13 +1055,16 @@
                new = {'data': new}
            in_data = new.pop('data')

-            if 'Epoch' in self.data.indexes:
-                epoch_name = 'Epoch'
-            elif 'time' in self.data.indexes:
-                epoch_name = 'time'
-            else:
+            epoch_names = self._get_epoch_name_from_data()
+            if len(epoch_names) == 0:
                raise ValueError(' '.join(('Unsupported time index name,',
                                           '"Epoch" or "time".')))
+            else:
+                if len(epoch_names) > 1:
+                    pysat.logger.error("".join(["Multiple standard time index ",
+                                                "names found, defaulting to ",
+                                                epoch_names[0]]))
+                epoch_name = epoch_names[0]

            if isinstance(key, tuple):
                # User provided more than one thing in assignment location
@@ -1092,8 +1093,14 @@
                    self.data[key] = in_data
                elif len(np.shape(in_data)) <= 1:
                    # If not an xarray input, but still iterable, then we
-                    # go through to process the 1D input
-                    if np.shape(in_data) == np.shape(self.index):
+                    # go through to process the input
+                    if key in self.variables and (
+                            np.shape(in_data) == np.shape(self.data[key])):
+                        # The ND input has the same shape as the current data
+                        # and can be assigned directly without adjusting the
+                        # dimensions. Only works with existing data.
+                        self.data[key] = (self.data[key].dims, in_data)
+                    elif np.shape(in_data) == np.shape(self.index):
                        # 1D input has the correct length for storage along
                        # 'Epoch'.
                        self.data[key] = (epoch_name, in_data)
@@ -1150,6 +1157,24 @@
        return

+    def __delitem__(self, key):
+        """Delete a key by calling the `drop` method.
+
+        Parameters
+        ----------
+        key : str or list-like
+            A meta data variable, label, or MetaHeader attribute, considered
+            in that order.
+
+        Raises
+        ------
+        KeyError
+            If all key values are unavailable
+
+        """
+        self.drop(key)
+        return
+
    def __iter__(self):
        """Load data for subsequent days or files.
@@ -1220,6 +1245,34 @@
    # -----------------------------------------------------------------------
    # Define all hidden methods

+    def _get_epoch_name_from_data(self, data=None):
+        """Get the standard epoch name used in this data object.
+
+        Parameters
+        ----------
+        data : pds.DataFrame, xr.Dataset, or NoneType
+            Desired data object to select from or None to use `data` attribute
+
+        Returns
+        -------
+        epoch_names : list
+            List of standard epoch names included in the data indexes
+
+        """
+        # Initialize output
+        epoch_names = []
+
+        # If no data is provided, use the Instrument attribute
+        if data is None:
+            data = self.data
+
+        if hasattr(data, 'indexes'):
+            for ename in ['Epoch', 'time']:
+                if ename in data.indexes:
+                    epoch_names.append(ename)
+
+        return epoch_names
+
    def _empty(self, data=None):
        """Determine whether or not data has been loaded.
@@ -1265,7 +1318,7 @@

        Returns
        -------
-        pds.Series
+        index : pds.Series
            Series containing the time indices for the Instrument data

        """
@@ -1274,14 +1327,17 @@
            data = self.data

        if self.pandas_format:
-            return data.index
+            index = data.index
        else:
-            if 'time' in data.indexes:
-                return data.indexes['time']
-            elif 'Epoch' in data.indexes:
-                return data.indexes['Epoch']
+            epoch_names = self._get_epoch_name_from_data(data=data)
+
+            if len(epoch_names) == 0:
+                index = pds.Index([])
            else:
-                return pds.Index([])
+                # Xarray preferred epoch name order is opposite
+                index = data.indexes[epoch_names[-1]]
+
+        return index

    def _pass_method(*args, **kwargs):
        """Empty default method for updatable Instrument methods."""
@@ -1310,24 +1366,24 @@
        methods
            init, preprocess, and clean
        functions
-            load, list_files, download, and list_remote_files
+            load, list_files, download, list_remote_files, and concat_data
        attributes
            directory_format, file_format, multi_file_day, orbit_info, and
            pandas_format
        test attributes
-            _test_download, _test_download_ci, and _password_req
+            _test_download, _test_download_ci, _new_tests, and _password_req

        """
        # Declare the standard Instrument methods and attributes
        inst_methods = {'required': ['init', 'clean'],
-                        'optional': ['preprocess']}
+                        'optional': ['preprocess', 'concat_data']}
        inst_funcs = {'required': ['load', 'list_files', 'download'],
                      'optional': ['list_remote_files']}
        inst_attrs = {'directory_format': None, 'file_format': None,
                      'multi_file_day': False, 'orbit_info': None,
                      'pandas_format': True}
        test_attrs = {'_test_download': True, '_test_download_ci': True,
-                      '_password_req': False}
+                      '_new_tests': True, '_password_req': False}

        # Set method defaults
        for mname in [mm for val in inst_methods.values() for mm in val]:
@@ -1396,10 +1452,12 @@
            raise ValueError(estr)

        if self.tag not in self.inst_module.inst_ids[self.inst_id]:
-            tag_str = ', '.join([tkey.__repr__() for tkey
-                                 in self.inst_module.inst_ids[self.inst_id]])
+            tag_id_str = repr(self.inst_module.inst_ids[self.inst_id]).replace(
+                "{", "'inst ID': ['tag'] combinations are: ")
+            tag_id_str = tag_id_str.replace("}", "")
            estr = ''.join(("'", self.tag, "' is not one of the supported ",
-                            'tags. Supported tags are: ', tag_str, '.'))
+                            "tags for inst ID ['", self.inst_id, "']. ",
+                            'Supported ', tag_id_str))
            raise ValueError(estr)

        # Assign the Instrument methods
@@ -1478,24 +1536,6 @@
            else:
                missing.append(iattr)

-        # Check and see if this instrument has deprecated _test_download_travis
-        # TODO(#807): Remove this check once _test_download_travis is removed.
-        if hasattr(self.inst_module, '_test_download_travis'):
-            local_attr = getattr(self.inst_module, '_test_download_travis')
-
-            # Test to see that this attribute is set for the desired
-            # `inst_id` and `tag`.
-            if self.inst_id in local_attr.keys():
-                if self.tag in local_attr[self.inst_id].keys():
-                    # Update the test attribute value
-                    setattr(self, '_test_download_ci',
-                            local_attr[self.inst_id][self.tag])
-                    warnings.warn(" ".join(["`_test_download_travis` has been",
-                                            "deprecated and will be replaced",
-                                            "by `_test_download_ci` in",
-                                            "3.2.0+"]),
-                                  DeprecationWarning, stacklevel=2)
-
        if len(missing) > 0:
            pysat.logger.debug(' '.join(['These Instrument test attributes',
                                         'kept their default values:',
@@ -1572,7 +1612,7 @@
        inc : dt.timedelta, int, or NoneType
            Increment of files or dates to load, starting from the
            root date or fid (default=None)
-        load_kwargs : dict
+        load_kwargs : dict or NoneType
            Dictionary of keywords that may be options for specific
            instruments. If None, uses `self.kwargs['load']`. (default=None)
@@ -1683,9 +1723,15 @@
        return data, mdata

-    def _load_next(self):
+    def _load_next(self, load_kwargs=None):
        """Load the next day's data (or file) without incrementing the date.

+        Parameters
+        ----------
+        load_kwargs : dict or NoneType
+            Dictionary of keywords that may be options for specific instruments.
+            If None, uses `self.kwargs['load']`. (default=None)
+
        Returns
        -------
        data : pds.DataFrame or xr.Dataset
@@ -1702,16 +1748,24 @@
        or the file. Looks for `self._load_by_date` flag.

        """
+        load_data_kwargs = {'inc': self.load_step, 'load_kwargs': load_kwargs}
+
        if self._load_by_date:
-            next_date = self.date + self.load_step
-            return self._load_data(date=next_date, inc=self.load_step)
+            load_data_kwargs['date'] = self.date + self.load_step
        else:
-            next_id = self._fid + self.load_step + 1
-            return self._load_data(fid=next_id, inc=self.load_step)
+            load_data_kwargs['fid'] = self._fid + self.load_step + 1

-    def _load_prev(self):
+        return self._load_data(**load_data_kwargs)
+
+    def _load_prev(self, load_kwargs=None):
        """Load the previous day's data (or file) without decrementing the date.

+        Parameters
+        ----------
+        load_kwargs : dict or NoneType
+            Dictionary of keywords that may be options for specific instruments.
+            If None, uses `self.kwargs['load']`. (default=None)
+
        Returns
        -------
        data : pds.DataFrame or xr.Dataset
@@ -1728,14 +1782,14 @@
        or the file. Looks for `self._load_by_date` flag.

        """
-        load_kwargs = {'inc': self.load_step}
+        load_data_kwargs = {'inc': self.load_step, 'load_kwargs': load_kwargs}

        if self._load_by_date:
-            load_kwargs['date'] = self.date - self.load_step
+            load_data_kwargs['date'] = self.date - self.load_step
        else:
-            load_kwargs['fid'] = self._fid - self.load_step - 1
+            load_data_kwargs['fid'] = self._fid - self.load_step - 1

-        return self._load_data(**load_kwargs)
+        return self._load_data(**load_data_kwargs)

    def _set_load_parameters(self, date=None, fid=None):
        """Set the necessary load attributes.
@@ -1855,91 +1909,9 @@
        return data, data_type, datetime_flag

-    def _filter_netcdf4_metadata(self, mdata_dict, coltype, remove=False,
-                                 export_nan=None):
-        """Filter metadata properties to be consistent with netCDF4.
-
-        .. deprecated:: 3.0.2
-            Moved to `pysat.utils.io.filter_netcdf4_metadata. This wrapper
-            will be removed in 3.2.0+.
-
-        Parameters
-        ----------
-        mdata_dict : dict
-            Dictionary equivalent to Meta object info
-        coltype : type
-            Data type provided by `pysat.Instrument._get_data_info`
-        remove : bool
-            Removes FillValue and associated parameters disallowed for strings
-            (default=False)
-        export_nan : list or NoneType
-            Metadata parameters allowed to be NaN (default=None)
-
-        Returns
-        -------
-        dict
-            Modified as needed for netCDf4
-
-        Warnings
-        --------
-        UserWarning
-            When data removed due to conflict between value and type
-
-        Note
-        ----
-        Remove forced to True if coltype consistent with a string type
-
-        Metadata values that are NaN and not listed in export_nan are removed.
-
-        See Also
-        --------
-        pysat.utils.io.filter_netcdf4_metadata
-
-        """
-        warnings.warn("".join(["`pysat.Instrument._filter_netcdf4_metadata` ",
-                               "has been deprecated and will be removed ",
-                               "in pysat 3.2.0+. Use `pysat.utils.io.",
-                               "filter_netcdf4_metadata` instead."]),
-                      DeprecationWarning, stacklevel=2)
-
-        if remove:
-            check_type = [self.meta.labels.fill_val, self.meta.labels.max_val,
-                          self.meta.labels.min_val]
-        else:
-            check_type = None
-
-        return pysat.utils.io.filter_netcdf4_metadata(self, mdata_dict, coltype,
-                                                      remove=remove,
-                                                      check_type=check_type,
-                                                      export_nan=export_nan)
-
    # -----------------------------------------------------------------------
    # Define all accessible methods

-    @property
-    def meta_labels(self):
-        """Provide Meta input for labels kwarg, deprecated.
-
-        Returns
-        -------
-        dict
-            Either Meta default provided locally or custom value provided
-            by user and stored in `meta_kwargs['labels']`
-
-        """
-        warnings.warn("".join(["Deprecated attribute, returns `meta_kwargs",
-                               "['labels']` or Meta defaults if not set. Will",
-                               " be removed in pysat 3.2.0+"]),
-                      DeprecationWarning, stacklevel=2)
-        if 'labels' in self.meta_kwargs.keys():
-            return self.meta_kwargs['labels']
-        else:
-            return {'units': ('units', str), 'name': ('long_name', str),
-                    'notes': ('notes', str), 'desc': ('desc', str),
-                    'min_val': ('value_min', (float, int)),
-                    'max_val': ('value_max', (float, int)),
-                    'fill_val': ('fill', (float, int, str))}
-
    @property
    def bounds(self):
        """Boundaries for iterating over instrument object by date or file.
@@ -2192,8 +2164,8 @@
                    if self.files.stop_date is not None:
                        # Ensure the start and stop times intersect with
                        # the file list
-                        if(start <= self.files.stop_date
-                           and stops[i] >= self.files.start_date):
+                        if all([start <= self.files.stop_date,
+                                stops[i] >= self.files.start_date]):
                            good_bounds.append(i)

                if len(good_bounds) > 0:
@@ -2349,7 +2321,7 @@
        return inst_copy

-    def concat_data(self, new_data, prepend=False, **kwargs):
+    def concat_data(self, new_data, prepend=False, include=None, **kwargs):
        """Concatenate data to self.data for xarray or pandas as needed.

        Parameters
        ----------
        new_data : pandas.DataFrame, xarray.Dataset, or list of such objects
        prepend : bool
            If True, assign new data before existing data; if False append
            new data (default=False)
+        include : int or NoneType
+            Index at which `self.data` should be included in `new_data` or None
+            to use `prepend` (default=None)
        **kwargs : dict
            Optional keyword arguments passed to pds.concat or xr.concat

@@ -2372,45 +2347,65 @@
        For xarray, `dim=Instrument.index.name` is passed along to xarray.concat
        except if the user includes a value for dim as a keyword argument.
+        Examples
+        --------
+        ::
+
+            # Concatenate data before and after the existing Instrument data
+            inst.concat_data([prev_data, next_data], include=1)
+
        """
+        # Add any concat_data kwargs
+        for ckey in self.kwargs['concat_data'].keys():
+            if ckey not in kwargs.keys():
+                kwargs[ckey] = self.kwargs['concat_data'][ckey]
+
        # Order the data to be concatenated in a list
        if not isinstance(new_data, list):
            new_data = [new_data]

-        if prepend:
-            new_data.append(self.data)
+        if include is None:
+            if prepend:
+                new_data.append(self.data)
+            else:
+                new_data.insert(0, self.data)
        else:
-            new_data.insert(0, self.data)
+            new_data.insert(include, self.data)

-        # Retrieve the appropriate concatenation function
-        if self.pandas_format:
-            # Specifically do not sort unless otherwise specified
-            if 'sort' not in kwargs:
-                kwargs['sort'] = False
-            concat_func = pds.concat
+        if self._concat_data_rtn.__name__.find('_pass_method') == 0:
+            # There is no custom concat function, use the pysat standard method.
+            # Start by retrieving the appropriate concatenation function
+            if self.pandas_format:
+                # Specifically do not sort unless otherwise specified
+                if 'sort' not in kwargs:
+                    kwargs['sort'] = False
+                concat_func = pds.concat
+            else:
+                # Ensure the dimensions are equal
+                equal_dims = True
+                idat = 0
+                while idat < len(new_data) - 1 and equal_dims:
+                    if new_data[idat].dims != new_data[idat + 1].dims:
+                        equal_dims = False
+                    idat += 1
+
+                if not equal_dims:
+                    # Update the dimensions, padding data where necessary
+                    new_data = pysat.utils.coords.expand_xarray_dims(
+                        new_data, self.meta, exclude_dims=[self.index.name])
+
+                # Specify the dimension, if not otherwise specified
+                if 'dim' not in kwargs:
+                    kwargs['dim'] = self.index.name
+
+                # Set the concat function
+                concat_func = xr.concat
+
+            # Assign the concatenated data to the instrument
+            self.data = concat_func(new_data, **kwargs)
        else:
-            # Ensure the dimensions are equal
-            equal_dims = True
-            idat = 0
-            while idat < len(new_data) - 1 and equal_dims:
-                if new_data[idat].dims != new_data[idat + 1].dims:
-                    equal_dims = False
-                idat += 1
-
-            if not equal_dims:
-                # Update the dimensions, padding data where necessary
-                new_data = pysat.utils.coords.expand_xarray_dims(
-                    new_data, self.meta, exclude_dims=['time'])
-
-            # Specify the dimension, if not otherwise specified
-            if 'dim' not in kwargs:
-                kwargs['dim'] = self.index.name
-
-            # Set the concat function
-            concat_func = xr.concat
-
-        # Assign the concatenated data to the instrument
-        self.data = concat_func(new_data, **kwargs)
+            self._concat_data_rtn(new_data, **kwargs)
+
        return

    def custom_attach(self, function, at_pos='end', args=None, kwargs=None):
@@ -2514,6 +2509,48 @@
        self.custom_kwargs = []
        return

+    def drop(self, names):
+        """Drop variables from Instrument.
+
+        Parameters
+        ----------
+        names : str or list-like
+            String or list of strings specifying the variable names to drop
+
+        Raises
+        ------
+        KeyError
+            If none of the variable names provided in `names` are found
+            in the variable list. If a subset is missing, a logger warning is
+            issued instead.
+ + """ + # Ensure the input is list-like + names = pysat.utils.listify(names) + + # Ensure the names are present in the list of variables + good_names = [name for name in names if name in self.variables] + + if len(good_names) > 0: + # Drop the Instrument data using the appropriate methods + if self.pandas_format: + self.data = self.data.drop(columns=good_names) + else: + self.data = self.data.drop_vars(good_names) + + # Drop the meta data associated with this variable + self.meta.drop(good_names) + + if len(good_names) < len(names): + if len(good_names) == 0: + raise KeyError("{:} not found in Instrument variables".format( + names)) + else: + pysat.logger.warning( + "{:} not found in Instrument variables".format( + [name for name in names if name not in good_names])) + return + def today(self): """Get today's date (UTC), with no hour, minute, second, etc. @@ -2737,33 +2774,6 @@ def rename(self, mapper, lowercase_data_labels=False): inst.rename(str.upper) - If using a pandas-type Instrument with higher-order data and a - dictionary mapper, the upper-level data key must contain a dictionary - for renaming the dependent data variables. The upper-level data key - cannot be renamed. Note that this rename will be invoked individually - for all times in the dataset. - :: - - # Applies to higher-order datasets that are loaded into pandas - inst = pysat.Instrument('pysat', 'testing2D') - inst.load(2009, 1) - mapper = {'uts': 'pysat_uts', - 'profiles': {'density': 'pysat_density'}} - inst.rename(mapper) - print(inst[0, 'profiles'].columns) # 'density' will be updated - - # To rename higher-order data at both levels using a dictionary, - # you need two calls - mapper2 = {'profiles': 'pysat_profile'} - inst.rename(mapper2) - print(inst[0, 'pysat_profile'].columns) - - # A function will affect both standard and higher-order data. - # Remember this function also updates the Meta data. - inst.rename(str.capitalize) - print(inst.meta['Pysat_profile']['children']) - - pysat supports differing case for variable labels across the data and metadata objects attached to an Instrument. Since Meta is case-preserving (on assignment) but case-insensitive to access, the @@ -2773,10 +2783,9 @@ def rename(self, mapper, lowercase_data_labels=False): :: # Example with lowercase_data_labels - inst = pysat.Instrument('pysat', 'testing2D') + inst = pysat.Instrument('pysat', 'testing') inst.load(2009, 1) - mapper = {'uts': 'Pysat_UTS', - 'profiles': {'density': 'PYSAT_density'}} + mapper = {'uts': 'Pysat_UTS'} inst.rename(mapper, lowercase_data_labels=True) # Note that 'Pysat_UTS' was applied to data as 'pysat_uts' @@ -2809,82 +2818,18 @@ def rename(self, mapper, lowercase_data_labels=False): # Initialize dict for renaming normal pandas data pdict = {} - # Collect normal variables and rename higher order variables + # Collect and rename variables for vkey in self.variables: map_key = pysat.utils.get_mapped_value(vkey, mapper) if map_key is not None: - # Treat higher-order pandas and normal pandas separately - if vkey in self.meta.keys_nD(): - # Variable name is in higher order list - hdict = {} - if isinstance(map_key, dict): - # Changing a variable name within a higher order - # object using a dictionary. First ensure the - # variable exist. 
- for hkey in map_key.keys(): - if hkey not in self.meta[ - vkey]['children'].keys(): - estr = ' '.join( - ('cannot rename', repr(hkey), - 'because it is not a known ', - 'higher-order variable under', - repr(vkey), '.')) - raise ValueError(estr) - hdict = map_key - else: - # This is either a value or a mapping function - for hkey in self.meta[vkey]['children'].keys(): - hmap = pysat.utils.get_mapped_value(hkey, - mapper) - if hmap is not None: - hdict[hkey] = hmap - - pdict[vkey] = map_key - - # Check for lowercase flag - change = True - if lowercase_data_labels: - gdict = {hkey: hdict[hkey].lower() - for hkey in hdict.keys() - if hkey != hdict[hkey].lower()} - - if len(list(gdict.keys())) == 0: - change = False - else: - gdict = hdict - - # Change the higher-order variable names frame-by-frame - if change: - for i in np.arange(len(self.index)): - if isinstance(self[i, vkey], pds.Series): - if self[i, vkey].name in gdict: - new_name = gdict[self[i, vkey].name] - self[i, vkey].rename(new_name, - inplace=True) - else: - tkey = list(gdict.keys())[0] - if self[i, vkey].name != gdict[tkey]: - estr = ' '.join( - ('cannot rename', hkey, - 'because, it is not a known' - 'known higher-order ', - 'variable under', vkey, 'at', - 'index {:d}.'.format(i))) - raise ValueError(estr) - else: - self[i, vkey].rename(columns=gdict, - inplace=True) - + # Add to the pandas renaming dictionary after accounting + # for the `lowercase_data_labels` flag. + if lowercase_data_labels: + if vkey != map_key.lower(): + pdict[vkey] = map_key.lower() else: - # This is a normal variable. Add it to the pandas - # renaming dictionary after accounting for the - # `lowercase_data_labels` flag. - if lowercase_data_labels: - if vkey != map_key.lower(): - pdict[vkey] = map_key.lower() - else: - pdict[vkey] = map_key + pdict[vkey] = map_key # Change variable names for attached data object self.data.rename(columns=pdict, inplace=True) @@ -2908,46 +2853,9 @@ def rename(self, mapper, lowercase_data_labels=False): return - def generic_meta_translator(self, input_meta): - """Convert the `input_meta` metadata into a dictionary. - - .. deprecated:: 3.0.2 - `generic_meta_translator` will be removed in the 3.2.0+ release. - - Parameters - ---------- - input_meta : pysat.Meta - The metadata object to translate - - Returns - ------- - export_dict : dict - A dictionary of the metadata for each variable of an output file - - Note - ---- - Uses the translation dict, if present, at `self._meta_translation_table` - to map existing metadata labels to a list of labels used in the - returned dict. - - """ - - dstr = ''.join(['This function has been deprecated. Please see ', - '`pysat.utils.io.apply_table_translation_to_file` and ', - '`self.meta.to_dict` to get equivalent functionality.']) - warnings.warn(dstr, DeprecationWarning, stacklevel=2) - - meta_dict = input_meta.to_dict() - trans_table = self._meta_translation_table - exp_dict = pysat.utils.io.apply_table_translation_to_file(self, - meta_dict, - trans_table) - - return exp_dict - def load(self, yr=None, doy=None, end_yr=None, end_doy=None, date=None, end_date=None, fname=None, stop_fname=None, verifyPad=False, - use_header=False, **kwargs): + **kwargs): """Load the instrument data and metadata. Parameters @@ -2984,9 +2892,6 @@ def load(self, yr=None, doy=None, end_yr=None, end_doy=None, date=None, verifyPad : bool If True, padding data not removed for debugging. Padding parameters are provided at Instrument instantiation. 
(default=False)
-        use_header : bool
-            If True, moves custom Meta attributes to MetaHeader instead of
-            Instrument (default=False)
        **kwargs : dict
            Dictionary of keywords that may be options for specific
            instruments.
@@ -3044,6 +2949,39 @@
            inst.load(fname=inst.files[0], stop_fname=inst.files[1])

        """
+        # If the `use_header` kwarg is included, set it here. Otherwise set
+        # it to True.
+        # TODO(#1020): remove this logic once the kwarg is no longer supported.
+        if 'use_header' in kwargs.keys():
+            use_header = kwargs['use_header']
+            warnings.warn(''.join(['Meta now contains a class for global ',
+                                   'metadata (MetaHeader). Allowing attachment',
+                                   ' of global attributes to Instrument ',
+                                   'through `use_header=False` will be ',
+                                   'deprecated in pysat 3.3.0+. Remove ',
+                                   '`use_header` kwarg (now same as ',
+                                   '`use_header=True`) to stop this warning.']),
+                          DeprecationWarning, stacklevel=2)
+        else:
+            use_header = True
+
+        # Provide a user-friendly error if there is no data
+        if len(self.files.files) == 0:
+            # TODO(#1182) - Update with pysat 3.3.0+ per directions below
+            # In pysat 3.3, modify this section to leave function early
+            # to prevent a downstream IndexError. Remove Deprecation portion
+            # of message below and leave as a UserWarning.
+            estr = ''.join(('No files found for Instrument. If files are ',
+                            'expected, please confirm that data is present ',
+                            'on the system and that ',
+                            "pysat.params['data_dirs'] is set correctly."))
+            warnings.warn(estr, UserWarning, stacklevel=2)
+            estr = ''.join(("In pysat version 3.3.0+ the subsequent ",
+                            'IndexError will not be raised.'))
+            warnings.warn(estr, DeprecationWarning, stacklevel=2)
+            # Uncomment line below, pysat 3.3.0+
+            # return
+
        # Add the load kwargs from initialization to those provided on input
        for lkey in self.kwargs['load'].keys():
            # Only use the initialized kwargs if a request hasn't been
@@ -3160,8 +3098,7 @@
        # Check for consistency between loading range and data padding, if any
        if self.pad is not None:
            if self._load_by_date:
-                tdate = dt.datetime(2009, 1, 1)
-                if tdate + self.load_step < tdate + loop_pad:
+                if date + self.load_step < date + loop_pad:
                    estr = ''.join(('Data padding window must be shorter than ',
                                    'data loading window. Load a greater ',
                                    'range of data or shorten the padding.'))
@@ -3183,11 +3120,13 @@
                pysat.logger.debug('Initializing data cache.')

                # Using current date or fid
-                self._prev_data, self._prev_meta = self._load_prev()
+                self._prev_data, self._prev_meta = self._load_prev(
+                    load_kwargs=kwargs)
                self._curr_data, self._curr_meta = self._load_data(
                    date=self.date, fid=self._fid, inc=self.load_step,
                    load_kwargs=kwargs)
-                self._next_data, self._next_meta = self._load_next()
+                self._next_data, self._next_meta = self._load_next(
+                    load_kwargs=kwargs)
            else:
                if self._next_data_track == curr:
                    pysat.logger.debug('Using data cache. Loading next.')
@@ -3197,7 +3136,8 @@
                    self._prev_meta = self._curr_meta
                    self._curr_data = self._next_data
                    self._curr_meta = self._next_meta
-                    self._next_data, self._next_meta = self._load_next()
+                    self._next_data, self._next_meta = self._load_next(
+                        load_kwargs=kwargs)
                elif self._prev_data_track == curr:
                    pysat.logger.debug('Using data cache.
Loading previous.') # Moving backward in time @@ -3206,19 +3146,22 @@ def load(self, yr=None, doy=None, end_yr=None, end_doy=None, date=None, self._next_meta = self._curr_meta self._curr_data = self._prev_data self._curr_meta = self._prev_meta - self._prev_data, self._prev_meta = self._load_prev() + self._prev_data, self._prev_meta = self._load_prev( + load_kwargs=kwargs) else: - # Jumped in time/or switched from filebased to date based + # Jumped in time/or switched from file based to date based # access pysat.logger.debug('Resetting data cache.') del self._prev_data del self._curr_data del self._next_data - self._prev_data, self._prev_meta = self._load_prev() + self._prev_data, self._prev_meta = self._load_prev( + load_kwargs=kwargs) self._curr_data, self._curr_meta = self._load_data( date=self.date, fid=self._fid, inc=self.load_step, load_kwargs=kwargs) - self._next_data, self._next_meta = self._load_next() + self._next_data, self._next_meta = self._load_next( + load_kwargs=kwargs) # Make sure datetime indices for all data is monotonic if self.pandas_format: @@ -3238,14 +3181,16 @@ def load(self, yr=None, doy=None, end_yr=None, end_doy=None, date=None, self._next_data = getattr(self._next_data, sort_method)(*sort_args) - # Make tracking indexes consistent with new loads + # Make tracking indexes consistent with new loads, as date loading + # and file loading have to be treated differently due to change in + # inclusive/exclusive range end treatment. Loading by file is + # inclusive. if self._load_by_date: + # Arithmetic uses datetime or DateOffset objects self._next_data_track = curr + self.load_step self._prev_data_track = curr - self.load_step else: - # File and date loads have to be treated differently - # due to change in inclusive/exclusive range end - # treatment. Loading by file is inclusive. + # Arithmetic uses integers self._next_data_track = curr + self.load_step + 1 self._prev_data_track = curr - self.load_step - 1 @@ -3285,48 +3230,49 @@ def load(self, yr=None, doy=None, end_yr=None, end_doy=None, date=None, "by file."))) # Pad data based upon passed parameter - if (not self._empty(self._prev_data)) & (not self.empty): - stored_data = self.data # .copy() - temp_time = copy.deepcopy(self.index[0]) - - # Pad data using access mechanisms that works for both pandas - # and xarray - self.data = self._prev_data.copy() - - # __getitem__ used below to get data from instrument object. - # Details for handling pandas and xarray are different and - # handled by __getitem__. 
-                self.data = self[first_pad:temp_time]
-                if not self.empty:
-                    if self.index[-1] == temp_time:
-                        self.data = self[:-1]
-                    self.concat_data(stored_data, prepend=False)
-                else:
-                    self.data = stored_data
-
-            if (not self._empty(self._next_data)) & (not self.empty):
-                stored_data = self.data  # .copy()
-                temp_time = copy.deepcopy(self.index[-1])
-
-                # Pad data using access mechanisms that work for both pandas
-                # and xarray
-                self.data = self._next_data.copy()
-                self.data = self[temp_time:last_pad]
-                if len(self.index) > 0:
-                    if (self.index[0] == temp_time):
-                        self.data = self[1:]
-                    self.concat_data(stored_data, prepend=True)
-                else:
-                    self.data = stored_data
+            cdata = list()
+            include = None
+            if not self._empty(self._prev_data) and not self.empty:
+                # __getitem__ is used to handle any pandas/xarray differences in
+                # data slicing
+                pdata = self.__getitem__(slice(first_pad, self.index[0]),
+                                         data=self._prev_data)
+                if not self._empty(pdata):
+                    # Test the data index, slicing if necessary
+                    pindex = self._index(data=pdata)
+                    if len(pindex) > 0:
+                        if pindex[-1] == self.index[0]:
+                            pdata = self.__getitem__(slice(-1), data=pdata)
+                        cdata.append(pdata)
+                        include = 1
+
+            if not self._empty(self._next_data) and not self.empty:
+                # __getitem__ is used to handle any pandas/xarray differences in
+                # data slicing
+                ndata = self.__getitem__(slice(self.index[-1], last_pad),
+                                         data=self._next_data)
+                if not self._empty(ndata):
+                    # Test the data index, slicing if necessary
+                    nindex = self._index(data=ndata)
+                    if len(nindex) > 1:
+                        if nindex[0] == self.index[-1]:
+                            ndata = self.__getitem__(
+                                slice(1, len(nindex)), data=ndata)
+                        cdata.append(ndata)
+                        if include is None:
+                            include = 0
+
+            # Concatenate the current, previous, and next data
+            if len(cdata) > 0:
+                self.concat_data(cdata, include=include)

            if len(self.index) > 0:
                self.data = self[first_pad:last_pad]

            # Want exclusive end slicing behavior from above
            if not self.empty:
-                if (self.index[-1] == last_pad) & (not want_last_pad):
+                if (self.index[-1] == last_pad) and (not want_last_pad):
                    self.data = self[:-1]
-
        else:
            # If self.pad is False, load single day
            self.data, meta = self._load_data(date=self.date, fid=self._fid,
@@ -3386,19 +3332,11 @@

        # Transfer any extra attributes in meta to the Instrument object.
        # Metadata types need to be initialized before preprocess is run.
-        # TODO(#1020): Change the way this kwarg is handled
+        # TODO(#1020): Remove warning and logic when kwarg is removed
        if use_header or ('use_header' in self.kwargs['load']
                          and self.kwargs['load']['use_header']):
            self.meta.transfer_attributes_to_header()
        else:
-            warnings.warn(''.join(['Meta now contains a class for global ',
-                                   'metadata (MetaHeader). Default attachment ',
-                                   'of global attributes to Instrument will ',
-                                   'be Deprecated in pysat 3.2.0+. Set ',
-                                   '`use_header=True` in this load call or ',
-                                   'on Instrument instantiation to remove this',
-                                   ' warning.']), DeprecationWarning,
-                          stacklevel=2)
            self.meta.transfer_attributes_to_instrument(self)

        # Transfer loaded data types to meta.
@@ -3621,10 +3559,6 @@

    def download(self, start=None, stop=None, date_array=None,
                 **kwargs):
        """Download data for given Instrument object from start to stop.

-        .. deprecated:: 3.2.0
-            `freq`, which sets the step size for downloads, will be removed in
-            the 3.2.0+ release.
- Parameters ---------- start : pandas.datetime or NoneType @@ -3659,17 +3593,9 @@ def download(self, start=None, stop=None, date_array=None, pandas.DatetimeIndex """ - # Test for deprecated kwargs - if 'freq' in kwargs.keys(): - warnings.warn("".join(["`pysat.Instrument.download` kwarg `freq` ", - "has been deprecated and will be removed ", - "in pysat 3.2.0+. Use `date_array` for ", - "non-daily frequencies instead."]), - DeprecationWarning, stacklevel=2) - freq = kwargs['freq'] - del kwargs['freq'] - else: - freq = 'D' + + # Set frequency to daily. + freq = 'D' # Make sure directories are there, otherwise create them try: @@ -3733,8 +3659,8 @@ def download(self, start=None, stop=None, date_array=None, # Get current bounds curr_bound = self.bounds if self._iter_type == 'date': - if(curr_bound[0][0] == first_date - and curr_bound[1][0] == last_date): + if all([curr_bound[0][0] == first_date, + curr_bound[1][0] == last_date]): pysat.logger.info(' '.join(('Updating instrument', 'object bounds by date'))) self.bounds = (self.files.start_date, @@ -3749,8 +3675,8 @@ def download(self, start=None, stop=None, date_array=None, dsel2 = slice(last_date, last_date + dt.timedelta(hours=23, minutes=59, seconds=59)) - if(curr_bound[0][0] == self.files[dsel1][0] - and curr_bound[1][0] == self.files[dsel2][-1]): + if all([curr_bound[0][0] == self.files[dsel1][0], + curr_bound[1][0] == self.files[dsel2][-1]]): pysat.logger.info(' '.join(('Updating instrument', 'object bounds by file'))) dsel1 = slice(self.files.start_date, @@ -3768,20 +3694,16 @@ def download(self, start=None, stop=None, date_array=None, return - def to_netcdf4(self, fname=None, base_instrument=None, epoch_name=None, + def to_netcdf4(self, fname, base_instrument=None, epoch_name=None, zlib=False, complevel=4, shuffle=True, preserve_meta_case=False, export_nan=None, export_pysat_info=True, unlimited_time=True, modify=False): """Store loaded data into a netCDF4 file. - .. deprecated:: 3.0.2 - Changed `fname` from a kwarg to an arg of type str in the 3.2.0+ - release. - Parameters ---------- - fname : str or NoneType - Full path to save instrument object to (default=None) + fname : str + Full path to save instrument object to netCDF base_instrument : pysat.Instrument or NoneType Class used as a comparison, only attributes that are present with self and not on base_instrument are written to netCDF. Using None @@ -3835,11 +3757,6 @@ def to_netcdf4(self, fname=None, base_instrument=None, epoch_name=None, pysat.utils.io.to_netcdf """ - if fname is None: - warnings.warn("".join(["`fname` as a kwarg has been deprecated, ", - "must supply a filename 3.2.0+"]), - DeprecationWarning, stacklevel=2) - raise ValueError("Must supply an output filename") # Prepare the instrument object used to create the output file inst = self if modify else self.copy() diff --git a/pysat/_meta.py b/pysat/_meta.py index 85d5c8d54..21694050b 100644 --- a/pysat/_meta.py +++ b/pysat/_meta.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Classes for storing and managing meta data.""" @@ -13,6 +16,7 @@ import pysat import pysat.utils._core as core_utils +from pysat.utils import listify from pysat.utils import testing @@ -67,10 +71,6 @@ class Meta(object): but the original case is preserved. 
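With the `freq` kwarg gone, non-daily download cadences are expressed through `date_array`; a sketch, assuming `pysat.utils.time.create_date_range` to build the index:

```python
import datetime as dt

import pysat

inst = pysat.Instrument('pysat', 'testing')

# An every-other-day index replaces the old `freq='2D'` usage
darray = pysat.utils.time.create_date_range(
    dt.datetime(2009, 1, 1), dt.datetime(2009, 1, 9), freq='2D')
inst.download(date_array=darray)
```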
Case preservation is built in
    to support writing files with a desired case to meet standards.

-    Metadata for higher order data objects, those that have
-    multiple products under a single variable name in a `pysat.Instrument`
-    object, are stored by providing a Meta object under the single name.
-
     Supports any custom metadata values in addition to the expected metadata
     attributes (units, name, notes, desc, value_min, value_max, and fill).
     These base attributes may be used to programmatically access and set types
@@ -121,20 +121,6 @@
         meta2['var_name42'] = {'long_name': 'name2of4', 'units': 'Units2'}
         meta['var_name4'] = {'meta': meta2}

-        # An alternative method to achieve the same result is:
-        meta['var_name4'] = meta2
-        meta['var_name4'].children['name41']
-        meta['var_name4'].children['name42']
-
-        # You may, of course, have a mixture of 1D and nD data
-        meta = pysat.Meta()
-        meta['dm'] = {'units': 'hey', 'long_name': 'boo'}
-        meta['rpa'] = {'units': 'crazy', 'long_name': 'boo_whoo'}
-        meta2 = pysat.Meta()
-        meta2[['higher', 'lower']] = {'meta': [meta, None],
-                                      'units': [None, 'boo'],
-                                      'long_name': [None, 'boohoo']}
-
         # Meta data may be assigned from another Meta object using dict-like
         # assignments
         key1 = 'var_name'
@@ -182,8 +168,8 @@ def __init__(self, metadata=None, header_data=None,
         # Set the NaN export list
         self._export_nan = [] if export_nan is None else export_nan
         for lvals in labels.values():
-            if(lvals[0] not in self._export_nan
-               and float in pysat.utils.listify(lvals[1])):
+            if all([lvals[0] not in self._export_nan,
+                    float in pysat.utils.listify(lvals[1])]):
                 self._export_nan.append(lvals[0])

         # Set the labels
@@ -192,9 +178,6 @@ def __init__(self, metadata=None, header_data=None,
         # Set the data types, if provided
         self._data_types = data_types

-        # Initialize higher order (nD) data structure container, a dict
-        self._ho_data = {}
-
         # Use any user provided data to instantiate object with data.
         # Attributes unit and name labels are called within.
         if metadata is not None:
@@ -249,23 +232,17 @@ def __str__(self, long_str=True):
         """
         # Get the desired variables as lists
         labs = [var for var in self.attrs()]
-        vdim = [var for var in self.keys() if var not in self.keys_nD()]
-        nchild = {var: len([kk for kk in self[var]['children'].keys()])
-                  for var in self.keys_nD()}
-        ndim = ["{:} -> {:d} children".format(var, nchild[var])
-                for var in self.keys_nD()]
+        vdim = [var for var in self.keys()]

         # Get the lengths of each list
         nlabels = len(labs)
         nvdim = len(vdim)
-        nndim = len(ndim)

         # Print the short output
         out_str = "pysat Meta object\n"
         out_str += "-----------------\n"
         out_str += "Tracking {:d} metadata values\n".format(nlabels)
         out_str += "Metadata for {:d} standard variables\n".format(nvdim)
-        out_str += "Metadata for {:d} ND variables\n".format(nndim)

         # Print the global meta data. `max_num` should be divisible by 2 and
         # `ncol`.
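With the nD bookkeeping removed, the `__str__` summary reports only labels, standard variables, and global metadata. For instance, a minimal sketch with the default labels:

```python
import pysat

meta = pysat.Meta()
meta['dummy1'] = {'units': 'km', 'long_name': 'Dummy variable'}

# The summary no longer includes an 'ND Metadata variables' section
print(meta)
```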
@@ -283,10 +260,6 @@ def __str__(self, long_str=True): out_str += "\nStandard Metadata variables:\n" out_str += core_utils.fmt_output_in_cols(vdim, ncols=ncol, max_num=max_num) - if nndim > 0: - out_str += "\nND Metadata variables:\n" - out_str += core_utils.fmt_output_in_cols(ndim, ncols=ncol, - max_num=max_num) return out_str @@ -348,13 +321,17 @@ def __setitem__(self, data_vars, input_dat): Parameters ---------- - data_vars : str, list + data_vars : str, list, tuple Data variable names for the input metadata - input_dat : dict, pds.Series, or Meta + input_dat : dict, pds.Series, Meta, int, float, str, bool, or NoneType Input metadata to be assigned - """ + Raises + ------ + ValueError + For unexpected input type that does not allow metadata to be set. + """ input_data = deepcopy(input_dat) if isinstance(input_data, dict): @@ -414,133 +391,85 @@ def __setitem__(self, data_vars, input_dat): # Time to actually add the metadata for ikey in input_data: - if ikey not in ['children', 'meta']: - for i, var in enumerate(data_vars): - to_be_set = input_data[ikey][i] - good_set = True - - # See if this meta data key has already been defined - # in MetaLabels - if ikey in self.labels.label_attrs.keys(): - iattr = self.labels.label_attrs[ikey] - if not isinstance( - to_be_set, self.labels.label_type[iattr]): - # If this is a disagreement between byte data - # and an expected str, resolve it here - if(isinstance(to_be_set, bytes) - and str in pysat.utils.listify( - self.labels.label_type[iattr])): - to_be_set = core_utils.stringify(to_be_set) - else: - # This type is incorrect, try casting it - wmsg = ''.join(['Metadata with type ', - repr(type(to_be_set)), - ' does not match expected ', - 'type ', - repr(self.labels.label_type[ - iattr])]) - try: - if hasattr(to_be_set, '__iter__'): - if str in pysat.utils.listify( - self.labels.label_type[ - iattr]): - to_be_set = '\n\n'.join( - [str(tval) for tval in - to_be_set]) - else: - raise TypeError("can't recast") + for i, var in enumerate(data_vars): + to_be_set = input_data[ikey][i] + good_set = True + + # See if this meta data key has already been defined + # in MetaLabels + if ikey in self.labels.label_attrs.keys(): + iattr = self.labels.label_attrs[ikey] + if not isinstance( + to_be_set, self.labels.label_type[iattr]): + # If this is a disagreement between byte data + # and an expected str, resolve it here + if all([isinstance(to_be_set, bytes), + str in pysat.utils.listify( + self.labels.label_type[iattr])]): + to_be_set = core_utils.stringify(to_be_set) + else: + # This type is incorrect, try casting it + wmsg = ''.join(['Metadata with type ', + repr(type(to_be_set)), + ' does not match expected ', + 'type ', + repr(self.labels.label_type[ + iattr])]) + try: + if hasattr(to_be_set, '__iter__'): + if str in pysat.utils.listify( + self.labels.label_type[iattr]): + to_be_set = '\n\n'.join( + [str(tval) for tval in + to_be_set]) else: - to_be_set = pysat.utils.listify( - self.labels.label_type[ - iattr])[0](to_be_set) - - # Inform user data was recast - pysat.logger.info(''.join(( - wmsg, '. Recasting input for ', - repr(var), ' with key ', - repr(ikey)))) - except (TypeError, ValueError): - # Warn user data was dropped - warnings.warn(''.join(( - wmsg, '. Dropping input for ', - repr(var), ' with key ', - repr(ikey)))) - good_set = False - else: - # Extend the meta labels. Ensure the attribute - # name has no spaces and that bytes are used instead - # of strings. 
- iattr = ikey.replace(" ", "_") - itype = type(to_be_set) - if itype == bytes: - itype = str - - # Update the MetaLabels object and the existing - # metadata to ensure all data have all labels - self.labels.update(iattr, ikey, itype) - self._label_setter(ikey, ikey, type(to_be_set)) - - # Set the data - if good_set: - self._data.loc[var, ikey] = to_be_set - else: - # Key is 'meta' or 'children', providing higher order - # metadata. Meta inputs could be part of a larger multiple - # parameter assignment, so not all names may actually have - # a 'meta' object to add. - for item, val in zip(data_vars, input_data['meta']): - if val is not None: - # Assign meta data, using a recursive call... - # Heads to if Meta instance call. - self[item] = val - + raise TypeError("can't recast") + else: + to_be_set = pysat.utils.listify( + self.labels.label_type[ + iattr])[0](to_be_set) + + # Inform user data was recast + pysat.logger.info(''.join(( + wmsg, '. Recasting input for ', + repr(var), ' with key ', repr(ikey)))) + except (TypeError, ValueError): + # Warn user data was dropped + warnings.warn(''.join(( + wmsg, '. Dropping input for ', + repr(var), ' with key ', + repr(ikey)))) + good_set = False + else: + # Extend the meta labels. Ensure the attribute + # name has no spaces and that bytes are used instead + # of strings. + iattr = ikey.replace(" ", "_") + itype = type(to_be_set) + if itype == bytes: + itype = str + + # Update the MetaLabels object and the existing + # metadata to ensure all data have all labels + self.labels.update(iattr, ikey, itype) + self._label_setter(ikey, ikey, type(to_be_set)) + + # Set the data + if good_set: + self._data.loc[var, ikey] = to_be_set elif isinstance(input_data, pds.Series): # Outputs from Meta object are a Series. Thus, this takes in input # from a Meta object. Set data using standard assignment via a dict. - in_dict = input_data.to_dict() - if 'children' in in_dict: - child = in_dict.pop('children') - if child is not None: - # If there is data in the child object, assign it here - self._warn_meta_children() - self.ho_data[data_vars] = child - # Remaining items are simply assigned via recursive call - self[data_vars] = in_dict - - elif isinstance(input_data, Meta): - # Dealing with a higher order data set - self._warn_meta_children() - # `data_vars` is only a single name here (by choice for support) - if (data_vars in self._ho_data) and input_data.empty: - # No actual metadata provided and there is already some - # higher order metadata in self - return - - # Get Meta approved variable data names - new_item_name = self.var_case_name(data_vars) - - # Ensure that Meta labels of object to be assigned are - # consistent with self. input_data accepts self's labels. - input_data.accept_default_labels(self) - - # Go through and ensure Meta object to be added has variable and - # attribute names consistent with other variables and attributes - # this covers custom attributes not handled by default routine - # above - attr_names = [item for item in input_data.attrs()] - input_data.data.columns = self.attr_case_name(attr_names) - - # Same thing for variables - input_data.data.index = self.var_case_name(input_data.data.index) - - # Assign Meta object now that things are consistent with Meta - # object settings, but first make sure there are lower dimension - # metadata parameters, passing in an empty dict fills in defaults - # if there is no existing metadata info. 
- self[new_item_name] = {} + self[data_vars] = input_data.to_dict() + else: + # The input data is a value, this only works if `data_vars` is + # a tuple that contains the data variable and the metadata label + if isinstance(data_vars, tuple) and len(data_vars) == 2: + self[data_vars[0]] = {data_vars[1]: input_data} + else: + raise ValueError( + "unexpected input combination, can't set metadata") - # Now add to higher order data - self._ho_data[new_item_name] = input_data return def __getitem__(self, key): @@ -565,7 +494,7 @@ def __getitem__(self, key): :: import pysat - inst = pysat.Instrument('pysat', 'testing2d') + inst = pysat.Instrument('pysat', 'testing') inst.load(date=inst.inst_module._test_dates['']['']) meta = inst.meta @@ -578,14 +507,6 @@ def __getitem__(self, key): meta[:, 'units'] meta[:, ['units', 'long_name']] - # For higher order data, slicing is not supported for multiple - # parents with any children - meta['profiles', 'density', 'units'] - meta['profiles', 'density', ['units', 'long_name']] - meta['profiles', ['density', 'dummy_str'], ['units', 'long_name']] - meta['profiles', ('units', 'long_name')] - meta[['series_profiles', 'profiles'], ('units', 'long_name')] - """ # Define a local convenience function def match_name(func, var_name, index_or_column): @@ -605,43 +526,11 @@ def match_name(func, var_name, index_or_column): # If tuple length is 2, index, column new_index = match_name(self.var_case_name, key[0], self.data.index) - try: - # Assume this is a label name - new_name = match_name(self.attr_case_name, key[1], - self.data.columns) - return self.data.loc[new_index, new_name] - except KeyError as kerr: - # This may instead be a child variable, check for children - if(hasattr(self[new_index], 'children') - and self[new_index].children is None): - raise kerr - - try: - new_child_index = match_name( - self.attr_case_name, key[1], - self[new_index].children.data.index) - return self.ho_data[new_index].data.loc[new_child_index] - except AttributeError: - raise NotImplementedError( - ''.join(['Cannot retrieve child meta data ', - 'from multiple parents'])) - - elif len(key) == 3: - # If tuple length is 3, index, child_index, column - new_index = match_name(self.var_case_name, key[0], - self.data.index) - try: - new_child_index = match_name( - self.attr_case_name, key[1], - self[new_index].children.data.index) - except AttributeError: - raise NotImplementedError( - 'Cannot retrieve child meta data from multiple parents') - - new_name = match_name(self.attr_case_name, key[2], + + # Assume this is a label name + new_name = match_name(self.attr_case_name, key[1], self.data.columns) - return self.ho_data[new_index].data.loc[new_child_index, - new_name] + return self.data.loc[new_index, new_name] elif isinstance(key, list): # If key is a list, selection works as-is @@ -665,15 +554,9 @@ def match_name(func, var_name, index_or_column): # above and commented .copy code below have been kept. Remove # for any subsequent releases if things are still ok. meta_row = self.data.loc[new_key] # .copy() - if new_key in self.keys_nD(): - meta_row.at['children'] = self.ho_data[new_key] # .copy() - else: - # Not higher order meta. Assign value of None. First, we - # assign a string, and then None. Ensures column is not - # a numeric data type. 
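The new `else` branch above adds scalar assignment through a (variable, label) tuple, mirroring the tuple-based retrieval in `__getitem__`; a short sketch:

```python
import pysat

meta = pysat.Meta()
meta['dummy1'] = {'units': 'km'}

# Tuple assignment sets a single label for a single variable ...
meta['dummy1', 'long_name'] = 'Dummy variable'

# ... and tuple access retrieves it
print(meta['dummy1', 'long_name'])
```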
-                meta_row.at['children'] = ''
-                meta_row.at['children'] = None
                 return meta_row
+            elif key in self.header.global_attrs:
+                return getattr(self.header, key)
             else:
                 raise KeyError("Key '{:}' not found in MetaData".format(key))
         else:
@@ -701,12 +584,26 @@ def __contains__(self, data_var):
         if data_var.lower() in [ikey.lower() for ikey in self.keys()]:
             does_contain = True

-        if not does_contain:
-            if data_var.lower() in [ikey.lower() for ikey in self.keys_nD()]:
-                does_contain = True
-
         return does_contain

+    def __delitem__(self, key):
+        """Delete a key by calling the `drop` method.
+
+        Parameters
+        ----------
+        key : str or list-like
+            A meta data variable, label, or MetaHeader attribute, considered
+            in that order.
+
+        Raises
+        ------
+        KeyError
+            If all key values are unavailable
+
+        """
+        self.drop(key)
+        return
+
     def __eq__(self, other_meta):
         """Check equality between Meta instances.
@@ -754,39 +651,6 @@ def __eq__(self, other_meta):
                                          other_meta[key, attr]):
                         return False

-        # Check the higher order products. Recursive call into this function
-        # didn't work, so spell out the details.
-        keys1 = [key for key in self.keys_nD()]
-        keys2 = [key for key in other_meta.keys_nD()]
-        try:
-            testing.assert_lists_equal(keys1, keys2)
-        except AssertionError:
-            return False
-
-        # Check the higher order variables within each nD key are the same.
-        # NaN is treated as equal, though mathematically NaN is not equal
-        # to anything.
-        for key in self.keys_nD():
-            for iter1, iter2 in [(self[key].children.keys(),
-                                  other_meta[key].children.keys()),
-                                 (self[key].children.attrs(),
-                                  other_meta[key].children.attrs())]:
-                list1 = [value for value in iter1]
-                list2 = [value for value in iter2]
-
-                try:
-                    testing.assert_lists_equal(list1, list2)
-                except AssertionError:
-                    return False
-
-            # Check if all elements are individually equal
-            for ckey in self[key].children.keys():
-                for cattr in self[key].children.attrs():
-                    if not testing.nan_equal(
-                            self[key].children[ckey, cattr],
-                            other_meta[key].children[ckey, cattr]):
-                        return False
-
         # If we made it this far, things are good
         return True

@@ -821,8 +685,8 @@ def _insert_default_values(self, data_var, data_type=None):
         for i, lattr in enumerate(self.labels.label_type.keys()):
             labels.append(getattr(self.labels, lattr))
             lattrs.append(lattr)
-            if(isinstance(self.labels.label_type[lattr], tuple)
-               and data_type is not None):
+            if all([isinstance(self.labels.label_type[lattr], tuple),
+                    data_type is not None]):
                 need_data_type[lattr] = True
             else:
                 need_data_type[lattr] = False
@@ -913,12 +777,6 @@ def _label_setter(self, new_label, current_label, default_type,
                                 'Meta instantiation.'))
                 pysat.logger.info(mstr)

-        # Check higher order structures and recursively change labels
-        for key in self.keys_nD():
-            # Update children
-            self.ho_data[key]._label_setter(new_label, current_label,
-                                            default_type, use_names_default)
-
         return

     # -----------------------------------------------------------------------
@@ -940,22 +798,6 @@ def data(self, new_frame):
         self._data = new_frame
         return

-    @property
-    def ho_data(self):
-        """Retrieve higher order data.
-
-        May be set using `ho_data.setter(new_dict)`, where `new_dict` is a
-        dict containing the higher order metadata.
-
-        """
-        return self._ho_data
-
-    @ho_data.setter
-    def ho_data(self, new_dict):
-        # Set the ho_data property. See docstring for property above.
-        self._ho_data = new_dict
-        return
-
     @property
     def empty(self):
         """Return boolean True if there is no metadata.
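`__delitem__` above defers to the reworked `drop`, which now sorts its input into variables, labels, and header attributes (see the next hunk); a sketch of the resulting behaviour:

```python
import pysat

meta = pysat.Meta()
meta['dummy1'] = {'units': 'km'}

# Variables, labels, and header attributes share one deletion syntax
del meta['dummy1']   # Remove a variable
meta.drop('units')   # Remove a metadata label
```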
@@ -967,8 +809,6 @@ def empty(self):

         """

-        # Only need to check on lower data since lower data
-        # is set when higher metadata assigned.
         if self.data.empty:
             return True
         else:
@@ -986,7 +826,6 @@ def merge(self, other):

         for key in other.keys():
             if key not in self:
-                # Copies over both lower and higher dimensional data
                 self[key] = other[key]
         return

@@ -995,18 +834,57 @@ def drop(self, names):

         Parameters
         ----------
-        names : list-like
-            List of strings specifying the variable names to drop
+        names : str or list-like
+            String or list of strings specifying the variable names to drop
+
+        Raises
+        ------
+        KeyError
+            If none of the keys provided in `names` are found in the
+            standard metadata, labels, or header metadata. If only a subset
+            is missing, a logger warning is issued instead.

         """
+        # Ensure the input is list-like
+        names = listify(names)
+
+        # Divide the names by category
+        data_names = []
+        label_names = []
+        header_names = []
+        bad_names = []
+        for name in names:
+            if name in self.keys():
+                data_names.append(name)
+            elif name in self.data.columns:
+                label_names.append(name)
+            elif name in self.header.global_attrs:
+                header_names.append(name)
+            else:
+                bad_names.append(name)

-        # Drop the lower dimension data
-        self.data = self._data.drop(names, axis=0)
+        # Drop the data
+        if len(data_names) > 0:
+            # Drop the lower dimension data
+            self.data = self._data.drop(data_names, axis=0)

-        # Drop the higher dimension data
-        for name in names:
-            if name in self._ho_data:
-                self._ho_data.pop(name)
+        if len(label_names) > 0:
+            # This is a metadata label
+            self.data = self._data.drop(label_names, axis=1)
+
+            # Also drop this from Labels
+            self.labels.drop(label_names)
+
+        if len(header_names) > 0:
+            # There is header metadata to drop
+            self.header.drop(header_names)
+
+        if len(bad_names) > 0:
+            estr = "{:} not found in Meta".format(repr(bad_names))
+            if len(data_names) + len(label_names) + len(header_names) == 0:
+                raise KeyError(estr)
+            else:
+                pysat.logger.warning(estr)
         return

     def keep(self, keep_names):
@@ -1029,6 +907,7 @@ def keep(self, keep_names):

         # Drop names not specified in keep_names list
         self.drop(drop_names)
+
         return

     def apply_meta_labels(self, other_meta):
@@ -1157,11 +1036,6 @@ def keys(self):
         for ikey in self.data.index:
             yield ikey

-    def keys_nD(self):
-        """Yield keys for higher order metadata."""
-        for ndkey in self.ho_data:
-            yield ndkey
-
     def attrs(self):
         """Yield metadata products stored for each variable name."""
         for dcol in self.data.columns:
@@ -1181,10 +1055,6 @@ def hasattr_case_neutral(self, attr_name):
             True if the case-insensitive check for attribute name is
             successful, False if no attribute name is present.

-        Note
-        ----
-        Does not check higher order meta objects
-
         """

         if attr_name.lower() in [dcol.lower() for dcol in self.data.columns]:
@@ -1207,11 +1077,9 @@ def attr_case_name(self, name):

         Note
         ----
-        Checks first within standard attributes. If not found there, checks
-        attributes for higher order data structures. If not found, returns
-        supplied name as it is available for use. Intended to be used to help
-        ensure that the same case is applied to all repetitions of a given
-        variable name.
+        Checks first within standard attributes. If not found, returns supplied
+        name as it is available for use. Intended to be used to help ensure that
+        the same case is applied to all repetitions of a given variable name.
""" @@ -1229,8 +1097,6 @@ def attr_case_name(self, name): # Create a list of all attribute names and lower case attribute names self_keys = [key for key in self.attrs()] - for key in list(self.keys_nD()): - self_keys.extend(self.ho_data[key].data.columns) lower_self_keys = [key.lower() for key in self_keys] case_names = [] @@ -1258,18 +1124,11 @@ def rename(self, mapper): Dictionary with old names as keys and new names as variables or a function to apply to all names - Raises - ------ - ValueError - When normal data is treated like higher-order data in dict mapping. - Note ---- - Checks first within standard attributes. If not found there, checks - attributes for higher order data structures. If not found, returns - supplied name as it is available for use. Intended to be used to help - ensure that the same case is applied to all repetitions of a given - variable name. + Checks first within standard attributes. If not found, returns supplied + name as it is available for use. Intended to be used to help ensure that + the same case is applied to all repetitions of a given variable name. """ @@ -1278,37 +1137,14 @@ def rename(self, mapper): # Update the attribute name map_var = core_utils.get_mapped_value(var, mapper) if map_var is not None: - if isinstance(map_var, dict): - if var in self.keys_nD(): - child_meta = self[var].children.copy() - child_meta.rename(map_var) - self.ho_data[var] = child_meta - else: - raise ValueError('unknown mapped value at {:}'.format( - repr(var))) - else: - # Get and update the meta data - hold_meta = self[var].copy() - hold_meta.name = map_var - - # Remove the metadata under the previous variable name - self.drop(var) - if var in self.ho_data: - del self.ho_data[var] - - # Re-add the meta data with the updated variable name - self[map_var] = hold_meta - - # Determine if the attribute is present in higher order - # structures - if map_var in self.keys_nD(): - # The children attribute is a Meta class object. - # Recursively call the current routine. The only way to - # avoid Meta undoing the renaming process is to assign - # the meta data to `ho_data`. 
-                        child_meta = self[map_var].children.copy()
-                        child_meta.rename(mapper)
-                        self.ho_data[map_var] = child_meta
+                hold_meta = self[var].copy()
+                hold_meta.name = map_var
+
+                # Remove the metadata under the previous variable name
+                self.drop(var)
+
+                # Re-add the meta data with the updated variable name
+                self[map_var] = hold_meta

         return

@@ -1352,14 +1188,10 @@ def concat(self, other_meta, strict=False):
         other_meta_updated = other_meta.copy()
         other_meta_updated.labels = self.labels

-        # Concat 1D metadata in data frames to copy of current metadata
+        # Concat metadata in data frames to copy of current metadata
         for key in other_meta_updated.keys():
             mdata.data.loc[key] = other_meta.data.loc[key]

-        # Combine the higher order meta data
-        for key in other_meta_updated.keys_nD():
-            mdata.ho_data[key] = other_meta.ho_data[key]
-
         return mdata

     def copy(self):
@@ -1389,8 +1221,6 @@ def pop(self, label_name):
             if new_name in self.keys():
                 output = self[new_name]
                 self.data = self.data.drop(new_name, axis=0)
-            else:
-                output = self.ho_data.pop(new_name)
         else:
             raise KeyError('Key not present in metadata variables')

@@ -1568,26 +1398,6 @@ def to_dict(self, preserve_case=False):
             for orig_key in meta_dict:
                 export_dict[case_key][orig_key] = meta_dict[orig_key]

-        # Higher Order Data
-        # TODO(#789): remove in pysat 3.2.0
-        for key in self.ho_data:
-            if preserve_case:
-                case_key = self.var_case_name(key)
-            else:
-                case_key = key.lower()
-
-            if case_key not in export_dict:
-                export_dict[case_key] = {}
-            for ho_key in self.ho_data[key].data.index:
-                if preserve_case:
-                    case_ho_key = self.var_case_name(ho_key)
-                else:
-                    case_ho_key = ho_key.lower()
-
-                new_key = '_'.join((case_key, case_ho_key))
-                export_dict[new_key] = \
-                    self.ho_data[key].data.loc[ho_key].to_dict()
-
         return export_dict

     @classmethod
@@ -1650,15 +1460,6 @@ def from_csv(cls, filename=None, col_names=None, sep=None, **kwargs):
             raise ValueError(''.join(['Unable to retrieve information from ',
                                       filename]))

-    # TODO(#789): remove in pysat 3.2.0
-    def _warn_meta_children(self):
-        """Warn the user that higher order metadata is deprecated."""
-
-        warnings.warn(" ".join(["Support for higher order metadata has been",
-                                "deprecated and will be removed in 3.2.0+."]),
-                      DeprecationWarning, stacklevel=2)
-        return
-

 class MetaLabels(object):
     """Store metadata labels for Instrument instance.
@@ -1728,10 +1529,6 @@ class MetaLabels(object):
     but the original case is preserved. Case preservation is built in to
     support writing files with a desired case to meet standards.

-    Metadata for higher order data objects, those that have
-    multiple products under a single variable name in a `pysat.Instrument`
-    object, are stored by providing a Meta object under the single name.
-
     Supports any custom metadata values in addition to the expected metadata
     attributes (units, name, notes, desc, value_min, value_max, and fill).
     These base attributes may be used to programmatically access and set types
@@ -2029,6 +1826,37 @@ def default_values_from_attr(self, attr_name, data_type=None):

         return default_val

+    def drop(self, names):
+        """Remove data from MetaLabels.
+
+        Parameters
+        ----------
+        names : str or list-like
+            Attribute or MetaData name(s)
+
+        Raises
+        ------
+        AttributeError or KeyError
+            If any part of `names` is missing and cannot be dropped
+
+        """
+        # Ensure the input is list-like
+        names = listify(names)
+
+        # Cycle through each name to drop
+        for name in names:
+            if name in self.label_attrs.keys():
+                lname = self.label_attrs[name]
+                delattr(self, lname)
+                del self.label_type[lname]
+                del self.label_attrs[name]
+            else:
+                lname = getattr(self, name)
+                delattr(self, name)
+                del self.label_type[name]
+                del self.label_attrs[lname]
+        return
+
     def update(self, lattr, lname, ltype):
         """Update MetaLabels with a new label.

@@ -2207,6 +2035,25 @@ def __eq__(self, other):

         return True

+    def drop(self, names):
+        """Drop variables (names) from MetaHeader.
+
+        Parameters
+        ----------
+        names : list-like
+            List of strings specifying the variable names to drop
+
+        """
+        names = listify(names)
+
+        for name in names:
+            # Delete the attribute
+            delattr(self, name)
+
+            # Remove the attribute from the attribute list
+            self.global_attrs.pop(self.global_attrs.index(name))
+        return
+
     def to_dict(self):
         """Convert the header data to a dictionary.

diff --git a/pysat/_orbits.py b/pysat/_orbits.py
index 93effb84d..6a1301e50 100644
--- a/pysat/_orbits.py
+++ b/pysat/_orbits.py
@@ -2,6 +2,9 @@
 # Full license can be found in License.md
 # Full author list can be found in .zenodo.json file
 # DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
 # ----------------------------------------------------------------------------

 import copy
@@ -465,7 +468,8 @@ def _equa_breaks(self, orbit_index_period=24.0):
             # values
             new_ind = []
             for idx in ind:
-                tidx, = np.where(lt_diff[(idx - 5):(idx + 6)]
+                sub_idx = slice((idx - 5), (idx + 6))
+                tidx, = np.where(lt_diff[sub_idx]
                                  > 10 * typical_lt_diff)

                 if len(tidx) != 0:
@@ -473,9 +477,11 @@ def _equa_breaks(self, orbit_index_period=24.0):
                     # Iterate over samples and check.
                     for sub_tidx in tidx:
                         # Look at time change vs local time change
-                        if(ut_diff[idx - 5:idx + 6].iloc[sub_tidx]
-                           < lt_diff[idx - 5:idx + 6].iloc[sub_tidx]
-                           / orbit_index_period * self.orbit_period):
+                        false_alarm = (
+                            ut_diff[sub_idx].iloc[sub_tidx] * orbit_index_period
+                            < lt_diff[sub_idx].iloc[sub_tidx]
+                            * self.orbit_period)
+                        if false_alarm:

                             # The change in UT is small compared to the change
                             # in the orbit index this is flagged as a false
diff --git a/pysat/_params.py b/pysat/_params.py
index 1bc5fc906..4dc572b3d 100644
--- a/pysat/_params.py
+++ b/pysat/_params.py
@@ -2,6 +2,9 @@
 # Full license can be found in License.md
 # Full author list can be found in .zenodo.json file
 # DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
 # ----------------------------------------------------------------------------

 import copy
diff --git a/pysat/citation.txt b/pysat/citation.txt
index 2a546499a..d6f09ccc0 100644
--- a/pysat/citation.txt
+++ b/pysat/citation.txt
@@ -1 +1 @@
-Stoneback, Russell, et al. (2021). pysat/pysat v3.0 (Version v3.0). Zenodo. http://doi.org/10.5281/zenodo.1199703
\ No newline at end of file
+Stoneback, Russell, et al. (2023). pysat/pysat v3.1 (Version v3.1). Zenodo.
http://doi.org/10.5281/zenodo.1199703 diff --git a/pysat/constellations/single_test.py b/pysat/constellations/single_test.py index c080cfedd..4709ceacd 100644 --- a/pysat/constellations/single_test.py +++ b/pysat/constellations/single_test.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Create a constellation with one testing instrument. Attributes diff --git a/pysat/constellations/testing.py b/pysat/constellations/testing.py index 1b7917e80..d4b715d9c 100644 --- a/pysat/constellations/testing.py +++ b/pysat/constellations/testing.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Create a constellation with 5 testing instruments. Attributes @@ -13,8 +21,8 @@ import pysat instruments = [pysat.Instrument('pysat', 'testing', clean_level='clean', - num_samples=10, use_header=True), + num_samples=10), pysat.Instrument('pysat', 'ndtesting', clean_level='clean', - num_samples=16, use_header=True), + num_samples=16), pysat.Instrument('pysat', 'testmodel', clean_level='clean', - num_samples=18, use_header=True)] + num_samples=18)] diff --git a/pysat/constellations/testing_empty.py b/pysat/constellations/testing_empty.py index 8da50b08d..dd9f2d59a 100644 --- a/pysat/constellations/testing_empty.py +++ b/pysat/constellations/testing_empty.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Create an empty constellation for testing. Attributes diff --git a/pysat/constellations/testing_partial.py b/pysat/constellations/testing_partial.py index 800e9850c..351901e07 100644 --- a/pysat/constellations/testing_partial.py +++ b/pysat/constellations/testing_partial.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Create a constellation where not all instruments have loadable data. 
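These constellation modules are consumed through the `Constellation` class; for example, a sketch using the updated 'testing' constellation above:

```python
import pysat

# Builds the three test Instruments defined in the module-level
# `instruments` list
const = pysat.Constellation(const_module=pysat.constellations.testing)
print(const.instruments)
```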
Attributes @@ -9,6 +17,6 @@ import pysat instruments = [pysat.Instrument('pysat', 'testing', clean_level='clean', - num_samples=10, use_header=True), + num_samples=10), pysat.Instrument('pysat', 'testing', tag='no_download', - clean_level='clean', use_header=True)] + clean_level='clean')] diff --git a/pysat/instruments/__init__.py b/pysat/instruments/__init__.py index d1fb406db..882080c8b 100644 --- a/pysat/instruments/__init__.py +++ b/pysat/instruments/__init__.py @@ -5,8 +5,7 @@ """ __all__ = ['pysat_ndtesting', 'pysat_netcdf', 'pysat_testing', - 'pysat_testmodel', 'pysat_testing_xarray', - 'pysat_testing2d', 'pysat_testing2d_xarray'] + 'pysat_testmodel'] for inst in __all__: exec("from pysat.instruments import {x}".format(x=inst)) diff --git a/pysat/instruments/methods/__init__.py b/pysat/instruments/methods/__init__.py index 44278ead2..eca32e606 100644 --- a/pysat/instruments/methods/__init__.py +++ b/pysat/instruments/methods/__init__.py @@ -4,5 +4,5 @@ Each set of methods is contained within a subpackage of this set. """ -from pysat.instruments.methods import general -from pysat.instruments.methods import testing +from pysat.instruments.methods import general # noqa: F401 +from pysat.instruments.methods import testing # noqa: F401 diff --git a/pysat/instruments/methods/general.py b/pysat/instruments/methods/general.py index 732163702..0a7cedfa0 100644 --- a/pysat/instruments/methods/general.py +++ b/pysat/instruments/methods/general.py @@ -1,10 +1,17 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- # -*- coding: utf-8 -*- """Provides generalized routines for integrating instruments into pysat.""" import datetime as dt import numpy as np import pandas as pds -import warnings import pysat @@ -131,8 +138,8 @@ def list_files(tag='', inst_id='', data_path='', format_str=None, new_out = out.asfreq('D') for i, out_month in enumerate(out.index): - if(out_month.month == emonth.month - and out_month.year == emonth.year): + if all([out_month.month == emonth.month, + out_month.year == emonth.year]): out_month = emonth crange = pds.date_range(start=out_month, periods=2, @@ -148,43 +155,6 @@ def list_files(tag='', inst_id='', data_path='', format_str=None, return out -def convert_timestamp_to_datetime(inst, sec_mult=1.0, epoch_name='time'): - """Use datetime instead of timestamp for Epoch. - - .. deprecated:: 3.0.2 - This routine has been deprecated with the addition of the kwargs - `epoch_unit` and `epoch_origin` to `pysat.utils.io.load_netcdf4`. - This routing will be removed in 3.2.0. - - Parameters - ---------- - inst : pysat.Instrument - associated pysat.Instrument object - sec_mult : float - Multiplier needed to convert epoch time to seconds (default=1.0) - epoch_name : str - variable name for instrument index (default='Epoch') - - Note - ---- - If the variable represented by epoch_name is not a float64, data is passed - through unchanged. 
-
-    """
-
-    warnings.warn(" ".join(["New kwargs added to `pysat.utils.io.load_netCDF4`",
-                            "for generalized handling, deprecated",
-                            "function will be removed in pysat 3.2.0+"]),
-                  DeprecationWarning, stacklevel=2)
-
-    if inst.data[epoch_name].dtype == 'float64':
-        inst.data[epoch_name] = pds.to_datetime(
-            [dt.datetime.utcfromtimestamp(int(np.floor(epoch_time * sec_mult)))
-             for epoch_time in inst.data[epoch_name]])
-
-    return
-
-
 def remove_leading_text(inst, target=None):
     """Remove leading text on variable names.

@@ -216,15 +186,6 @@ def remove_leading_text(inst, target=None):
             inst.meta.data = inst.meta.data.rename(
                 index=lambda x: x.split(prepend_str)[-1])

-            orig_keys = [kk for kk in inst.meta.keys_nD()]
-
-            for keynd in orig_keys:
-                if keynd.find(prepend_str) >= 0:
-                    new_key = keynd.split(prepend_str)[-1]
-                    new_meta = inst.meta.pop(keynd)
-                    new_meta.data = new_meta.data.rename(
-                        index=lambda x: x.split(prepend_str)[-1])
-                    inst.meta[new_key] = new_meta

     return

@@ -302,5 +263,12 @@ def load_csv_data(fnames, read_csv_kwargs=None):
         for fname in fnames:
             fdata.append(pds.read_csv(fname, **read_csv_kwargs))

-    data = pds.DataFrame() if len(fdata) == 0 else pds.concat(fdata, axis=0)
+    if len(fdata) == 0:
+        data = pds.DataFrame()
+    else:
+        data = pds.concat(fdata, axis=0)
+
+    if data.index.name is None:
+        data.index.name = "Epoch"
+
     return data
diff --git a/pysat/instruments/methods/testing.py b/pysat/instruments/methods/testing.py
index bf9cbeef1..1c40414ee 100644
--- a/pysat/instruments/methods/testing.py
+++ b/pysat/instruments/methods/testing.py
@@ -1,3 +1,11 @@
+#!/usr/bin/env python
+# Full license can be found in License.md
+# Full author list can be found in .zenodo.json file
+# DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
+# ----------------------------------------------------------------------------
 """Standard functions for the test instruments."""

 import datetime as dt
@@ -7,6 +15,7 @@
 import pandas as pds
 import time
 import warnings
+import xarray as xr

 import pysat
 from pysat.utils import NetworkLock
@@ -16,8 +25,7 @@
                     "https://www.github.com/pysat/pysat"))

 # Load up citation information
-with pysat.utils.NetworkLock(os.path.join(pysat.here, 'citation.txt'), 'r') as \
-        locked_file:
+with pysat.utils.NetworkLock(pysat.citation, 'r') as locked_file:
     refs = locked_file.read()


@@ -56,16 +64,99 @@ def clean(self, test_clean_kwarg=None):

     Parameters
     ----------
     test_clean_kwarg : any
-        Testing keyword (default=None)
+        Testing keyword. If this is a dict containing any of the keys
+        'logger', 'warning', or 'error', the message entered as the value to
+        that key will be emitted as a logging.WARNING, UserWarning, or
+        ValueError, respectively. If the 'change' key is set, the clean level
+        will be changed to the specified value. (default=None)

     """

     self.test_clean_kwarg = test_clean_kwarg

+    if isinstance(test_clean_kwarg, dict):
+        if 'change' in test_clean_kwarg.keys():
+            self.clean_level = test_clean_kwarg['change']
+
+        if 'logger' in test_clean_kwarg.keys():
+            pysat.logger.warning(test_clean_kwarg['logger'])
+
+        if 'warning' in test_clean_kwarg.keys():
+            warnings.warn(test_clean_kwarg['warning'], UserWarning)
+
+        if 'error' in test_clean_kwarg.keys():
+            raise ValueError(test_clean_kwarg['error'])
+
+    return
+
+
+# Optional methods
+def concat_data(self, new_data, **kwargs):
+    """Concatenate data to self.data for extra time dimensions.
+
+    Parameters
+    ----------
+    new_data : xarray.Dataset or list of such objects
+        New data objects to be concatenated
+    **kwargs : dict
+        Optional keyword arguments passed to xr.concat
+
+    Note
+    ----
+    Expects the extra time dimensions to have a variable name that starts
+    with 'time', and no other dimensions to have a name that fits this format.
+
+    """
+    # Establish the time dimensions, ensuring the standard variable is included
+    # whether or not it is treated as a variable
+    time_dims = [self.index.name]
+    time_dims.extend([var for var in self.variables if var.find('time') == 0
+                      and var != self.index.name])
+
+    # Concatenate using the appropriate method for the number of time
+    # dimensions
+    if len(time_dims) == 1:
+        # There is only one time dimension, but other dimensions may
+        # need to be adjusted
+        new_data = pysat.utils.coords.expand_xarray_dims(
+            new_data, self.meta, exclude_dims=time_dims)
+
+        # Specify the dimension, if not otherwise specified
+        if 'dim' not in kwargs:
+            kwargs['dim'] = self.index.name
+
+        self.data = xr.concat(new_data, **kwargs)
+    else:
+        inners = None
+        for ndata in new_data:
+            # Separate into inner datasets
+            inner_keys = {dim: [key for key in ndata.keys()
+                                if dim in ndata[key].dims] for dim in time_dims}
+            inner_dat = {dim: ndata.get(inner_keys[dim]) for dim in time_dims}
+
+            # Add 'single_var's into 'time' dataset to keep track
+            sv_keys = [val.name for val in ndata.values()
+                       if 'single_var' in val.dims]
+            singlevar_set = ndata.get(sv_keys)
+            inner_dat[self.index.name] = xr.merge([inner_dat[self.index.name],
+                                                   singlevar_set])

+            # Concatenate along desired dimension with previous data
+            if inners is None:
+                # No previous data, assign the data separated by dimension
+                inners = dict(inner_dat)
+            else:
+                # Concatenate with existing data
+                inners = {dim: xr.concat([inners[dim], inner_dat[dim]],
+                                         dim=dim) for dim in time_dims}
+
+        # Combine all time dimensions
+        if inners is not None:
+            data_list = [inners[dim] for dim in time_dims]
+            self.data = xr.merge(data_list)
     return


-# Optional method
 def preprocess(self, test_preprocess_kwarg=None):
     """Perform standard preprocessing.

@@ -79,12 +170,12 @@ def preprocess(self, test_preprocess_kwarg=None):
         Testing keyword (default=None)

     """
-
     self.test_preprocess_kwarg = test_preprocess_kwarg

     return


+# Utility functions
 def initialize_test_meta(epoch_name, data_keys):
     """Initialize meta data for test instruments.

@@ -163,41 +254,16 @@ def initialize_test_meta(epoch_name, data_keys):
                        'Note the value_max is largest netCDF4 supports, ',
                        'but is lower than actual 64-bit int limit.'])}

-    # Children metadata required for 2D pandas.
-    # TODO(#789): Delete after removal of Meta children.
-    series_profile_meta = pysat.Meta()
-    series_profile_meta['series_profiles'] = {'desc': 'Testing series data.',
-                                              'value_min': 0,
-                                              'value_max': np.inf,
-                                              'units': 'm/s'}
-    meta['series_profiles'] = {'meta': series_profile_meta,
-                               'value_min': 0., 'value_max': 25., 'units': 'km',
-                               'fill': np.nan,
-                               'desc': ''.join(['Testing series profiles ',
-                                                'indexed by float.'])}
-
-    # Children metadata required for 2D pandas.
-    # TODO(#789): Delete after removal of Meta children.
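The multi-time-dimension branch above is reached through `Instrument.concat_data` when extra `time*` coordinates exist; a sketch, assuming the 'ndtesting' instrument and its `num_extra_time_coords` option from this changeset:

```python
import pysat

# A second time coordinate steers concat_data into the per-dimension merge
inst = pysat.Instrument('pysat', 'ndtesting', num_extra_time_coords=1)
inst.load(2009, 1)

# Append the following day's data along each time dimension
new_day = inst.copy()
new_day.load(2009, 2)
inst.concat_data([new_day.data])
```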
- data_types = {'density': float, 'fraction': float, 'alt_profiles': float, - 'variable_profiles': float, 'profile_height': int, - 'variable_profile_height': int, 'images': int, 'x': int, - 'y': int, 'z': int, 'image_lat': float, 'image_lon': float} - alt_profile_meta = pysat.Meta() - alt_profile_meta['density'] = {'desc': 'Simulated density values.', - 'units': 'Log N/cc', - 'value_min': 0, 'value_max': np.inf} - alt_profile_meta['fraction'] = {'value_min': 0., 'value_max': 1., - 'desc': ''.join(['Simulated fractional O+ ', - 'composition.'])} - meta['alt_profiles'] = {'value_min': 0., 'value_max': 25., 'fill': np.nan, - 'desc': ''.join([ - 'Testing profile multi-dimensional data ', - 'indexed by float.']), - 'units': 'km', - 'meta': alt_profile_meta} + # Optional and standard metadata for xarray + for var in data_keys: + if var.find('variable_profiles') == 0: + meta[var] = {'desc': 'Profiles with variable altitude.'} + + if len(var) > 17: + tvar = 'time{:s}'.format(var[17:]) + meta[tvar] = {'desc': 'Additional time variable.'} # Standard metadata required for xarray. - meta['variable_profiles'] = {'desc': 'Profiles with variable altitude.'} meta['profile_height'] = {'value_min': 0, 'value_max': 14, 'fill': -1, 'desc': 'Altitude of profile data.'} meta['variable_profile_height'] = {'long_name': 'Variable Profile Height'} @@ -207,13 +273,13 @@ def initialize_test_meta(epoch_name, data_keys): 'notes': 'function of image_lat and image_lon'} meta['x'] = {'desc': 'x-value of image pixel', 'notes': 'Dummy Variable', - 'value_min': 0, 'value_max': 17, 'fill': -1} + 'value_min': 0, 'value_max': 7, 'fill': -1} meta['y'] = {'desc': 'y-value of image pixel', 'notes': 'Dummy Variable', - 'value_min': 0, 'value_max': 17, 'fill': -1} + 'value_min': 0, 'value_max': 7, 'fill': -1} meta['z'] = {'desc': 'z-value of profile height', 'notes': 'Dummy Variable', - 'value_min': 0, 'value_max': 15, 'fill': -1} + 'value_min': 0, 'value_max': 5, 'fill': -1} meta['image_lat'] = {'desc': 'Latitude of image pixel', 'notes': 'Dummy Variable', 'value_min': -90., 'value_max': 90.} @@ -225,8 +291,6 @@ def initialize_test_meta(epoch_name, data_keys): for var in meta.keys(): if var not in data_keys: meta.drop(var) - if var in meta.keys_nD(): - meta.ho_data.pop(var) return meta @@ -713,13 +777,3 @@ def non_unique_index(index): new_index = pds.to_datetime(new_index) return new_index - - -def _warn_malformed_kwarg(): - """Warn user that kwarg has been deprecated.""" - - dstr = ' '.join(['The kwarg malformed_index has been deprecated and', - 'will be removed in pysat 3.2.0+. Please use', - 'non_monotonic_index or non_unique_index to specify', - 'desired behaviour.']) - warnings.warn(dstr, DeprecationWarning, stacklevel=2) diff --git a/pysat/instruments/pysat_ndtesting.py b/pysat/instruments/pysat_ndtesting.py index 581f44446..1feae41f1 100644 --- a/pysat/instruments/pysat_ndtesting.py +++ b/pysat/instruments/pysat_ndtesting.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. 
+# ----------------------------------------------------------------------------
 # -*- coding: utf-8 -*-
 """Produces fake instrument data for testing."""

@@ -5,6 +13,7 @@
 import functools

 import numpy as np
+import pandas as pds
 import xarray as xr

 import pysat
@@ -15,8 +24,10 @@

 pandas_format = False
 tags = {'': 'Regular testing data set'}
-inst_ids = {'': ['']}
-_test_dates = {'': {'': dt.datetime(2009, 1, 1)}}
+inst_ids = {'': [tag for tag in tags.keys()]}
+_test_dates = {'': {tag: dt.datetime(2009, 1, 1) for tag in tags.keys()}}
+_test_load_opt = {'': {'': [{'num_extra_time_coords': 0},
+                            {'num_extra_time_coords': 1}]}}

 epoch_name = u'time'

@@ -26,13 +37,16 @@
 # Clean method
 clean = mm_test.clean

-# Optional method, preprocess
+# Optional methods
+concat_data = mm_test.concat_data
 preprocess = mm_test.preprocess


-def load(fnames, tag='', inst_id='', non_monotonic_index=False,
-         non_unique_index=False, malformed_index=False, start_time=None,
-         num_samples=864, test_load_kwarg=None, max_latitude=90.):
+def load(fnames, tag='', inst_id='', sim_multi_file_right=False,
+         sim_multi_file_left=False, root_date=None, non_monotonic_index=False,
+         non_unique_index=False, start_time=None, num_samples=864,
+         sample_rate='100S', test_load_kwarg=None, max_latitude=90.0,
+         num_extra_time_coords=0):
     """Load the test files.

     Parameters
@@ -45,27 +59,36 @@ def load(fnames, tag='', inst_id='', non_monotonic_index=False,
     inst_id : str
         Instrument ID used to identify particular data set to be loaded.
         This input is nominally provided by pysat itself. (default='')
+    sim_multi_file_right : bool
+        Adjusts the date range to be 12 hours in the future, or 12 hours
+        beyond `root_date`. (default=False)
+    sim_multi_file_left : bool
+        Adjusts the date range to be 12 hours in the past, or 12 hours
+        before `root_date`. (default=False)
+    root_date : NoneType
+        Optional central date, uses _test_dates if not specified.
+        (default=None)
     non_monotonic_index : bool
         If True, time index will be non-monotonic (default=False)
     non_unique_index : bool
         If True, time index will be non-unique (default=False)
-    malformed_index : bool
-        If True, the time index will be non-unique and non-monotonic. Deprecated
-        and scheduled for removal in pysat 3.2.0.
-        (default=False)
     start_time : dt.timedelta or NoneType
         Offset time of start time since midnight UT. If None, instrument data
         will begin at midnight. (default=None)
     num_samples : int
         Maximum number of times to generate. Data points will not go beyond the
         current day. (default=864)
+    sample_rate : str
+        Frequency of data points, using pandas conventions. (default='100S')
     test_load_kwarg : any
         Keyword used for pysat unit testing to ensure that functionality for
         custom keywords defined in instrument support functions is working
         correctly. (default=None)
     max_latitude : float
         Latitude simulated as `max_latitude * cos(theta(t))`, where
-        theta is a linear periodic signal bounded by [0, 2 * pi) (default=90.).
+        theta is a linear periodic signal bounded by [0, 2 * pi) (default=90.0)
+    num_extra_time_coords : int
+        Number of extra time coordinates to include.
(default=0) Returns ------- @@ -84,14 +107,18 @@ def load(fnames, tag='', inst_id='', non_monotonic_index=False, drange = mm_test.define_range() # Using 100s frequency for compatibility with seasonal analysis unit tests - uts, index, dates = mm_test.generate_times(fnames, num_samples, freq='100S', + uts, index, dates = mm_test.generate_times(fnames, num_samples, + freq=sample_rate, start_time=start_time) - # TODO(#1094): Remove in pysat 3.2.0 - if malformed_index: - # Warn that kwarg is deprecated and set new kwargs. - mm_test._warn_malformed_kwarg() - non_monotonic_index = True - non_unique_index = True + + # Specify the date tag locally and determine the desired date range + pds_offset = dt.timedelta(hours=12) + if sim_multi_file_right: + root_date = root_date or _test_dates[''][''] + pds_offset + elif sim_multi_file_left: + root_date = root_date or _test_dates[''][''] - pds_offset + else: + root_date = root_date or _test_dates[''][''] if non_monotonic_index: index = mm_test.non_monotonic_index(index) @@ -106,7 +133,7 @@ def load(fnames, tag='', inst_id='', non_monotonic_index=False, # the root start a measurement is and use that info to create a signal # that is continuous from that start. Going to presume there are 5820 # seconds per orbit (97 minute period). - time_delta = dates[0] - dt.datetime(2009, 1, 1) + time_delta = dates[0] - root_date # MLT runs 0-24 each orbit mlt = mm_test.generate_fake_data(time_delta.total_seconds(), uts, @@ -141,6 +168,13 @@ def load(fnames, tag='', inst_id='', non_monotonic_index=False, altitude = alt0 * np.ones(data['latitude'].shape) data['altitude'] = ((epoch_name), altitude) + # Fake orbit number + fake_delta = dates[0] - (_test_dates[''][''] - pds.DateOffset(years=1)) + data['orbit_num'] = ((epoch_name), + mm_test.generate_fake_data(fake_delta.total_seconds(), + uts, period=iperiod['lt'], + cyclic=False)) + # Create some fake data to support testing of averaging routines mlt_int = data['mlt'].astype(int).data long_int = (data['longitude'] / 15.).astype(int).data @@ -165,37 +199,60 @@ def load(fnames, tag='', inst_id='', non_monotonic_index=False, dtype=np.int64)) # Add dummy coords - data.coords['x'] = (('x'), np.arange(17)) - data.coords['y'] = (('y'), np.arange(17)) - data.coords['z'] = (('z'), np.arange(15)) + data.coords['x'] = (('x'), np.arange(7)) + data.coords['y'] = (('y'), np.arange(7)) + data.coords['z'] = (('z'), np.arange(5)) + + # Add extra time coords + for i in range(num_extra_time_coords): + ckey = 'time{:d}'.format(i) + tindex = data.indexes[epoch_name][:-1 * (i + 1)] + data.coords[ckey] = ( + (ckey), [itime + dt.timedelta(microseconds=1 + i) + for i, itime in enumerate(tindex)]) # Create altitude 'profile' at each location to simulate remote data num = len(data['uts']) data['profiles'] = ( (epoch_name, 'profile_height'), - data['dummy3'].values[:, np.newaxis] * np.ones((num, 15))) - data.coords['profile_height'] = ('profile_height', np.arange(15)) + data['dummy3'].values[:, np.newaxis] * np.ones( + (num, data.coords['z'].shape[0]))) + data.coords['profile_height'] = ('profile_height', + np.arange(len(data.coords['z']))) # Profiles that could have different altitude values data['variable_profiles'] = ( (epoch_name, 'z'), data['dummy3'].values[:, np.newaxis] - * np.ones((num, 15))) + * np.ones((num, data.coords['z'].shape[0]))) data.coords['variable_profile_height'] = ( - (epoch_name, 'z'), np.arange(15)[np.newaxis, :] * np.ones((num, 15))) + (epoch_name, 'z'), np.arange(data.coords['z'].shape[0])[np.newaxis, :] + * np.ones((num, 
data.coords['z'].shape[0]))) # Create fake image type data, projected to lat / lon at some location # from satellite. data['images'] = ((epoch_name, 'x', 'y'), data['dummy3'].values[ - :, np.newaxis, np.newaxis] * np.ones((num, 17, 17))) - data.coords['image_lat'] = \ - ((epoch_name, 'x', 'y'), - np.arange(17)[np.newaxis, - np.newaxis, - :] * np.ones((num, 17, 17))) + :, np.newaxis, np.newaxis] + * np.ones((num, data.coords['x'].shape[0], + data.coords['y'].shape[0]))) + data.coords['image_lat'] = ((epoch_name, 'x', 'y'), + np.arange(data.coords['x'].shape[0])[ + np.newaxis, np.newaxis, :] + * np.ones((num, data.coords['x'].shape[0], + data.coords['y'].shape[0]))) data.coords['image_lon'] = ((epoch_name, 'x', 'y'), - np.arange(17)[np.newaxis, np.newaxis, - :] * np.ones((num, 17, 17))) + np.arange(data.coords['x'].shape[0])[ + np.newaxis, np.newaxis, :] + * np.ones((num, data.coords['x'].shape[0], + data.coords['y'].shape[0]))) + + # There may be data that depends on alternate time indices + for i in range(num_extra_time_coords): + alt_epoch = 'time{:d}'.format(i) + data['variable_profiles{:d}'.format(i)] = ( + (alt_epoch, 'z'), np.full(shape=(data.coords[alt_epoch].shape[0], + data.coords['z'].shape[0]), + fill_value=100.0 + i)) meta = mm_test.initialize_test_meta(epoch_name, data.keys()) return data, meta diff --git a/pysat/instruments/pysat_netcdf.py b/pysat/instruments/pysat_netcdf.py index 2f42de4a1..f989acaf9 100644 --- a/pysat/instruments/pysat_netcdf.py +++ b/pysat/instruments/pysat_netcdf.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """General Instrument for loading pysat-written netCDF files. @@ -44,7 +47,6 @@ import datetime as dt import functools -import numpy as np import warnings import pysat @@ -133,7 +135,7 @@ def download(date_array, tag, inst_id, data_path=None): def load(fnames, tag='', inst_id='', strict_meta=False, file_format='NETCDF4', epoch_name=None, epoch_unit='ms', epoch_origin='unix', pandas_format=True, decode_timedelta=False, meta_kwargs=None, - load_labels=None, meta_processor=None, meta_translation=None, + meta_processor=None, meta_translation=None, drop_meta_labels=None, decode_times=None): """Load pysat-created NetCDF data and meta data. @@ -183,10 +185,6 @@ def load(fnames, tag='', inst_id='', strict_meta=False, file_format='NETCDF4', meta_kwargs : dict or NoneType Dict to specify custom Meta initialization or None to use Meta defaults (default=None) - load_labels : dict or NoneType - Dict where keys are the label attribute names and the values are tuples - that have the label values and value types in that order or None to use - Meta defaults. Deprecated, use `meta_kwargs` instead. 
(default=None) meta_processor : function or NoneType If not None, a dict containing all of the loaded metadata will be passed to `meta_processor` which should return a filtered version @@ -228,7 +226,6 @@ def load(fnames, tag='', inst_id='', strict_meta=False, file_format='NETCDF4', pandas_format=pandas_format, decode_timedelta=decode_timedelta, meta_kwargs=meta_kwargs, - labels=load_labels, meta_processor=meta_processor, meta_translation=meta_translation, drop_meta_labels=drop_meta_labels, diff --git a/pysat/instruments/pysat_testing.py b/pysat/instruments/pysat_testing.py index c96dccdfe..a4c6183de 100644 --- a/pysat/instruments/pysat_testing.py +++ b/pysat/instruments/pysat_testing.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- # -*- coding: utf-8 -*- """Produces fake instrument data for testing.""" @@ -40,8 +48,8 @@ def load(fnames, tag='', inst_id='', sim_multi_file_right=False, sim_multi_file_left=False, root_date=None, non_monotonic_index=False, - non_unique_index=False, malformed_index=False, start_time=None, - num_samples=86400, test_load_kwarg=None, max_latitude=90.): + non_unique_index=False, start_time=None, num_samples=86400, + test_load_kwarg=None, max_latitude=90.): """Load the test files. Parameters @@ -67,9 +75,6 @@ def load(fnames, tag='', inst_id='', sim_multi_file_right=False, If True, time index will be non-monotonic (default=False) non_unique_index : bool If True, time index will be non-unique (default=False) - malformed_index : bool - If True, the time index will be non-unique and non-monotonic. Deprecated - and scheduled for removal in pysat 3.2.0. (default=False) start_time : dt.timedelta or NoneType Offset time of start time since midnight UT. If None, instrument data will begin at midnight. (default=None) @@ -96,6 +101,10 @@ def load(fnames, tag='', inst_id='', sim_multi_file_right=False, # Support keyword testing pysat.logger.info(''.join(('test_load_kwarg = ', str(test_load_kwarg)))) + # If no download should be simulated, return empty `data` and `meta` objects + if tag == 'no_download': + return pds.DataFrame(), pysat.Meta() + # Create an artificial satellite data set iperiod = mm_test.define_period() drange = mm_test.define_range() @@ -164,13 +173,6 @@ def load(fnames, tag='', inst_id='', sim_multi_file_right=False, data['int32_dummy'] = np.ones(len(data), dtype=np.int32) data['int64_dummy'] = np.ones(len(data), dtype=np.int64) - # TODO(#1094): Remove in pysat 3.2.0 - if malformed_index: - # Warn that kwarg is deprecated and set new kwargs. - mm_test._warn_malformed_kwarg() - non_monotonic_index = True - non_unique_index = True - # Activate if non-monotonic index is needed. if np.any([non_monotonic_index, (tag == 'non_strict')]): index = mm_test.non_monotonic_index(index) @@ -182,16 +184,14 @@ def load(fnames, tag='', inst_id='', sim_multi_file_right=False, data.index = index data.index.name = 'Epoch' + # If we only want data and not metadata stop now + if tag == 'default_meta': + return data, pysat.Meta() + # Set the meta data meta = mm_test.initialize_test_meta('Epoch', data.keys()) - # TODO(#1120): Move logic up so that empty data is returned first. 
- if tag == 'default_meta': - return data, pysat.Meta() - elif tag == 'no_download': - return pds.DataFrame(), pysat.Meta() - else: - return data, meta + return data, meta list_files = functools.partial(mm_test.list_files, test_dates=_test_dates) diff --git a/pysat/instruments/pysat_testing2d.py b/pysat/instruments/pysat_testing2d.py deleted file mode 100644 index 76819a6a4..000000000 --- a/pysat/instruments/pysat_testing2d.py +++ /dev/null @@ -1,231 +0,0 @@ -# -*- coding: utf-8 -*- -"""Produces fake instrument data for testing. - -.. deprecated:: 3.0.2 - Support for 2D pandas objects will be removed in 3.2.0+. This instrument - module simulates an object that will no longer be supported. - -""" - -import datetime as dt -import functools -import numpy as np -import warnings - -import pandas as pds - -import pysat -from pysat.instruments.methods import testing as mm_test - -platform = 'pysat' -name = 'testing2d' -tags = {'': 'Regular testing data set'} -inst_ids = {'': ['']} -_test_dates = {'': {'': dt.datetime(2009, 1, 1)}} - - -# Init method -def init(self, test_init_kwarg=None): - """Initialize the test instrument. - - Parameters - ---------- - self : pysat.Instrument - This object - test_init_kwarg : any - Testing keyword (default=None) - - """ - - warnings.warn(" ".join(["The instrument module `pysat_testing2d` has been", - "deprecated and will be removed in 3.2.0+. This", - "module simulates an object that will no longer be", - "supported."]), - DeprecationWarning, stacklevel=2) - - mm_test.init(self, test_init_kwarg=test_init_kwarg) - return - - -# Clean method -clean = mm_test.clean - -# Optional method, preprocess -preprocess = mm_test.preprocess - - -def load(fnames, tag='', inst_id='', malformed_index=False, - start_time=None, num_samples=864, test_load_kwarg=None, - max_latitude=90.): - """Load the test files. - - Parameters - ---------- - fnames : list - List of filenames - tag : str - Tag name used to identify particular data set to be loaded. - This input is nominally provided by pysat itself. (default='') - inst_id : str - Instrument ID used to identify particular data set to be loaded. - This input is nominally provided by pysat itself. (default='') - malformed_index : bool - If True, the time index will be non-unique and non-monotonic. - (default=False) - start_time : dt.timedelta or NoneType - Offset time of start time since midnight UT. If None, instrument data - will begin at midnight. - (default=None) - num_samples : int - Maximum number of times to generate. Data points will not go beyond the - current day. (default=864) - test_load_kwarg : any - Keyword used for pysat unit testing to ensure that functionality for - custom keywords defined in instrument support functions is working - correctly. (default=None) - max_latitude : float - Latitude simulated as `max_latitude` * cos(theta(t))`, where - theta is a linear periodic signal bounded by [0, 2 * pi) (default=90.). 
- - Returns - ------- - data : pds.DataFrame - Testing data - meta : pysat.Meta - Testing metadata - - """ - - # Support keyword testing - pysat.logger.info(''.join(('test_load_kwarg = ', str(test_load_kwarg)))) - - # Create an artificial satellite data set - iperiod = mm_test.define_period() - drange = mm_test.define_range() - - # Using 100s frequency for compatibility with seasonal analysis unit tests - uts, index, dates = mm_test.generate_times(fnames, num_samples, freq='100S', - start_time=start_time) - # Seed the DataFrame with a UT array - data = pds.DataFrame(np.mod(uts, 86400.), columns=['uts']) - - # Need to create simple orbits here. Have start of first orbit - # at 2009,1, 0 UT. 14.84 orbits per day. Figure out how far in time from - # the root start a measurement is and use that info to create a signal - # that is continuous from that start. Going to presume there are 5820 - # seconds per orbit (97 minute period). - time_delta = dates[0] - dt.datetime(2009, 1, 1) - - # MLT runs 0-24 each orbit - data['mlt'] = mm_test.generate_fake_data(time_delta.total_seconds(), uts, - period=iperiod['lt'], - data_range=drange['lt']) - - # SLT, 20 second offset from `mlt`. - data['slt'] = mm_test.generate_fake_data(time_delta.total_seconds() + 20, - uts, period=iperiod['lt'], - data_range=drange['lt']) - - # Create a fake longitude, resets every 6240 seconds. Sat moves at - # 360/5820 deg/s, Earth rotates at 360/86400, takes extra time to go - # around full longitude. - data['longitude'] = mm_test.generate_fake_data(time_delta.total_seconds(), - uts, period=iperiod['lon'], - data_range=drange['lon']) - - # Create latitude signal for testing polar orbits - angle = mm_test.generate_fake_data(time_delta.total_seconds(), - uts, period=iperiod['angle'], - data_range=drange['angle']) - data['latitude'] = max_latitude * np.cos(angle) - - # Create constant altitude at 400 km - alt0 = 400.0 - data['altitude'] = alt0 * np.ones(data['latitude'].shape) - - # Dummy variable data for different types - data['string_dummy'] = ['test'] * len(data) - data['unicode_dummy'] = [u'test'] * len(data) - data['int8_dummy'] = np.ones(len(data), dtype=np.int8) - data['int16_dummy'] = np.ones(len(data), dtype=np.int16) - data['int32_dummy'] = np.ones(len(data), dtype=np.int32) - data['int64_dummy'] = np.ones(len(data), dtype=np.int64) - - if malformed_index: - mm_test._warn_malformed_kwarg() - index = mm_test.non_monotonic_index(index) - index = mm_test.non_unique_index(index) - - data.index = index - data.index.name = 'Epoch' - - # Higher rate time signal (for scalar >= 2). This time signal is used - # for 2D profiles associated with each time in main DataFrame. - num_profiles = 50 if num_samples >= 50 else num_samples - end_date = dates[0] + dt.timedelta(seconds=2 * num_profiles - 1) - high_rate_template = pds.date_range(dates[0], end_date, freq='2S') - - # Create a few simulated profiles. This results in a pds.DataFrame at - # each time with mixed variables. 
- profiles = [] - - # DataFrame at each time with numeric variables only - alt_profiles = [] - - # Series at each time, numeric data only - series_profiles = [] - - # Frame indexed by date times - frame = pds.DataFrame({'density': - data.iloc[0:num_profiles]['mlt'].values.copy(), - 'dummy_str': ['test'] * num_profiles, - 'dummy_ustr': [u'test'] * num_profiles}, - index=data.index[0:num_profiles], - columns=['density', 'dummy_str', 'dummy_ustr']) - - # Frame indexed by float - dd = np.arange(num_profiles) * 1.2 - ff = np.arange(num_profiles) / num_profiles - ii = np.arange(num_profiles) * 0.5 - frame_alt = pds.DataFrame({'density': dd, 'fraction': ff}, - index=ii, - columns=['density', 'fraction']) - - # Series version of storage - series_alt = pds.Series(dd, index=ii, name='series_profiles') - - for time in data.index: - frame.index = high_rate_template + (time - data.index[0]) - profiles.append(frame) - alt_profiles.append(frame_alt) - series_profiles.append(series_alt) - - # Store multiple data types into main frame - data['profiles'] = pds.Series(profiles, index=data.index) - data['alt_profiles'] = pds.Series(alt_profiles, index=data.index) - data['series_profiles'] = pds.Series(series_profiles, index=data.index) - - # Set the meta data - meta = mm_test.initialize_test_meta('epoch', data.keys()) - - # Reset profiles as children meta - profile_meta = pysat.Meta() - profile_meta['density'] = {'long_name': 'density', 'units': 'N/cc', - 'desc': 'Fake "density" signal for testing.', - 'value_min': 0., 'value_max': 25., - 'fill': np.nan} - profile_meta['dummy_str'] = {'long_name': 'dummy_str', - 'desc': 'String data for testing.'} - profile_meta['dummy_ustr'] = {'long_name': 'dummy_ustr', - 'desc': 'Unicode string data for testing.'} - - # Update profiles metadata with sub-variable information - meta['profiles'] = {'meta': profile_meta} - - return data, meta - - -list_files = functools.partial(mm_test.list_files, test_dates=_test_dates) -list_remote_files = functools.partial(mm_test.list_remote_files, - test_dates=_test_dates) -download = functools.partial(mm_test.download) diff --git a/pysat/instruments/pysat_testing2d_xarray.py b/pysat/instruments/pysat_testing2d_xarray.py deleted file mode 100644 index 334821385..000000000 --- a/pysat/instruments/pysat_testing2d_xarray.py +++ /dev/null @@ -1,66 +0,0 @@ -# -*- coding: utf-8 -*- -"""Produces fake instrument data for testing. - -.. deprecated:: 3.1.0 - This module has been renamed pysat_ndtesting. A copy inheriting the - routines from the new location is maintained here for backwards- - compatibility. This instrument will be removed in 3.2.0+ to reduce - redundancy. - -""" - -import datetime as dt -import functools -import numpy as np -import warnings - -import xarray as xr - -import pysat -from pysat.instruments.methods import testing as mm_test -from pysat.instruments import pysat_ndtesting - -platform = 'pysat' -name = 'testing2d_xarray' - -tags = pysat_ndtesting.tags -inst_ids = pysat_ndtesting.inst_ids -pandas_format = pysat_ndtesting.pandas_format -_test_dates = pysat_ndtesting._test_dates - - -# Init method -def init(self, test_init_kwarg=None): - """Initialize the test instrument. - - Parameters - ---------- - self : pysat.Instrument - This object - test_init_kwarg : any - Testing keyword (default=None) - - """ - - warnings.warn(" ".join(["The instrument module `pysat_testing2d_xarray`", - "has been deprecated and will be removed in", - "3.2.0+. 
Please use `pysat_ndtesting` instead."]), - DeprecationWarning, stacklevel=2) - - mm_test.init(self, test_init_kwarg=test_init_kwarg) - return - - -# Clean method -clean = pysat_ndtesting.clean - -# Optional method, preprocess -preprocess = pysat_ndtesting.preprocess - -load = pysat_ndtesting.load - -list_files = functools.partial(pysat_ndtesting.list_files, - test_dates=_test_dates) -list_remote_files = functools.partial(pysat_ndtesting.list_remote_files, - test_dates=_test_dates) -download = functools.partial(pysat_ndtesting.download) diff --git a/pysat/instruments/pysat_testing_xarray.py b/pysat/instruments/pysat_testing_xarray.py deleted file mode 100644 index 0edd7365b..000000000 --- a/pysat/instruments/pysat_testing_xarray.py +++ /dev/null @@ -1,223 +0,0 @@ -# -*- coding: utf-8 -*- -"""Produces fake instrument data for testing. - -.. deprecated:: 3.0.2 - All data present in this instrument is duplicated in pysat_ndtesting. - This instrument will be removed in 3.2.0+ to reduce redundancy. - -""" - -import datetime as dt -import functools -import numpy as np -import warnings - -import xarray as xr - -import pysat -from pysat.instruments.methods import testing as mm_test - -# pysat required parameters -platform = 'pysat' -name = 'testing_xarray' - -# Dictionary of data 'tags' and corresponding description -tags = {'': 'Regular testing data set'} - -# Dictionary of satellite IDs, list of corresponding tags -inst_ids = {'': ['']} -_test_dates = {'': {'': dt.datetime(2009, 1, 1)}} -pandas_format = False - -epoch_name = u'time' - - -# Init method -def init(self, test_init_kwarg=None): - """Initialize the test instrument. - - Parameters - ---------- - self : pysat.Instrument - This object - test_init_kwarg : any - Testing keyword (default=None) - - """ - - warnings.warn(" ".join(["The instrument module `pysat_testing_xarray` has", - "been deprecated and will be removed in 3.2.0+."]), - DeprecationWarning, stacklevel=2) - - mm_test.init(self, test_init_kwarg=test_init_kwarg) - return - - -# Clean method -clean = mm_test.clean - -# Optional method, preprocess -preprocess = mm_test.preprocess - - -def load(fnames, tag='', inst_id='', sim_multi_file_right=False, - sim_multi_file_left=False, non_monotonic_index=False, - non_unique_index=False, malformed_index=False, start_time=None, - num_samples=86400, test_load_kwarg=None, max_latitude=90.): - """Load the test files. - - Parameters - ---------- - fnames : list - List of filenames. - tag : str - Tag name used to identify particular data set to be loaded. - This input is nominally provided by pysat itself. (default='') - inst_id : str - Instrument ID used to identify particular data set to be loaded. - This input is nominally provided by pysat itself. (default='') - sim_multi_file_right : bool - Adjusts date range to be 12 hours in the future or twelve hours beyond - `root_date`. (default=False) - sim_multi_file_left : bool - Adjusts date range to be 12 hours in the past or twelve hours before - `root_date`. (default=False) - non_monotonic_index : bool - If True, time index will be non-monotonic (default=False) - non_unique_index : bool - If True, time index will be non-unique (default=False) - malformed_index : bool - If True, the time index will be non-unique and non-monotonic. Deprecated - and scheduled for removal in pysat 3.2.0. (default=False) - start_time : dt.timedelta or NoneType - Offset time of start time since midnight UT. If None, instrument data - will begin at midnight. 
(default=None) - num_samples : int - Maximum number of times to generate. Data points will not go beyond the - current day. (default=86400) - test_load_kwarg : any - Keyword used for pysat unit testing to ensure that functionality for - custom keywords defined in instrument support functions is working - correctly. (default=None) - max_latitude : float - Latitude simulated as `max_latitude` * cos(theta(t))`, where - theta is a linear periodic signal bounded by [0, 2 * pi). (default=90.) - - Returns - ------- - data : xr.Dataset - Testing data - meta : pysat.Meta - Metadata - - """ - - # Support keyword testing - pysat.logger.info(''.join(('test_load_kwarg = ', str(test_load_kwarg)))) - - # Create an artificial satellite data set - iperiod = mm_test.define_period() - drange = mm_test.define_range() - - uts, index, dates = mm_test.generate_times(fnames, num_samples, freq='1S', - start_time=start_time) - - if sim_multi_file_right: - root_date = dt.datetime(2009, 1, 1, 12) - elif sim_multi_file_left: - root_date = dt.datetime(2008, 12, 31, 12) - else: - root_date = dt.datetime(2009, 1, 1) - - # TODO(#1094): Remove in pysat 3.2.0 - if malformed_index: - # Warn that kwarg is deprecated and set new kwargs. - mm_test._warn_malformed_kwarg() - non_monotonic_index = True - non_unique_index = True - - if non_monotonic_index: - index = mm_test.non_monotonic_index(index) - if non_unique_index: - index = mm_test.non_unique_index(index) - - data = xr.Dataset({'uts': ((epoch_name), uts)}, - coords={epoch_name: index}) - - # Need to create simple orbits here. Have start of first orbit - # at 2009,1, 0 UT. 14.84 orbits per day. Figure out how far in time from - # the root start a measurement is and use that info to create a signal - # that is continuous from that start. Going to presume there are 5820 - # seconds per orbit (97 minute period). - time_delta = dates[0] - root_date - mlt = mm_test.generate_fake_data(time_delta.total_seconds(), uts, - period=iperiod['lt'], - data_range=drange['lt']) - data['mlt'] = ((epoch_name), mlt) - - # SLT, 20 second offset from `mlt`. - slt = mm_test.generate_fake_data(time_delta.total_seconds() + 20, uts, - period=iperiod['lt'], - data_range=drange['lt']) - data['slt'] = ((epoch_name), slt) - - # Create a fake longitude, resets every 6240 seconds. Sat moves at - # 360/5820 deg/s, Earth rotates at 360/86400, takes extra time to go - # around full longitude. 
- longitude = mm_test.generate_fake_data(time_delta.total_seconds(), uts, - period=iperiod['lon'], - data_range=drange['lon']) - data['longitude'] = ((epoch_name), longitude) - - # Create latitude area for testing polar orbits - angle = mm_test.generate_fake_data(time_delta.total_seconds(), uts, - period=iperiod['angle'], - data_range=drange['angle']) - latitude = max_latitude * np.cos(angle) - data['latitude'] = ((epoch_name), latitude) - - # Create constant altitude at 400 km - alt0 = 400.0 - altitude = alt0 * np.ones(data['latitude'].shape) - data['altitude'] = ((epoch_name), altitude) - - # Fake orbit number - fake_delta = dates[0] - dt.datetime(2008, 1, 1) - orbit_num = mm_test.generate_fake_data(fake_delta.total_seconds(), - uts, period=iperiod['lt'], - cyclic=False) - - data['orbit_num'] = ((epoch_name), orbit_num) - - # Create some fake data to support testing of averaging routines - mlt_int = data['mlt'].astype(int).data - long_int = (data['longitude'] / 15.).astype(int).data - data['dummy1'] = ((epoch_name), mlt_int) - data['dummy2'] = ((epoch_name), long_int) - data['dummy3'] = ((epoch_name), mlt_int + long_int * 1000.) - data['dummy4'] = ((epoch_name), uts) - data['string_dummy'] = ((epoch_name), - ['test'] * len(data.indexes[epoch_name])) - data['unicode_dummy'] = ((epoch_name), - [u'test'] * len(data.indexes[epoch_name])) - data['int8_dummy'] = ((epoch_name), - np.ones(len(data.indexes[epoch_name]), dtype=np.int8)) - data['int16_dummy'] = ((epoch_name), - np.ones(len(data.indexes[epoch_name]), - dtype=np.int16)) - data['int32_dummy'] = ((epoch_name), - np.ones(len(data.indexes[epoch_name]), - dtype=np.int32)) - data['int64_dummy'] = ((epoch_name), - np.ones(len(data.indexes[epoch_name]), - dtype=np.int64)) - - # Set the meta data - meta = mm_test.initialize_test_meta(epoch_name, data.keys()) - return data, meta - - -list_files = functools.partial(mm_test.list_files, test_dates=_test_dates) -list_remote_files = functools.partial(mm_test.list_remote_files, - test_dates=_test_dates) -download = functools.partial(mm_test.download) diff --git a/pysat/instruments/pysat_testmodel.py b/pysat/instruments/pysat_testmodel.py index f781bdcb1..b0bf28105 100644 --- a/pysat/instruments/pysat_testmodel.py +++ b/pysat/instruments/pysat_testmodel.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- # -*- coding: utf-8 -*- """Produces fake instrument data for testing.""" diff --git a/pysat/instruments/templates/template_instrument.py b/pysat/instruments/templates/template_instrument.py index 4890fdf77..9a2fcc0d9 100644 --- a/pysat/instruments/templates/template_instrument.py +++ b/pysat/instruments/templates/template_instrument.py @@ -2,6 +2,10 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# This work was supported by the Office of Naval Research. # ---------------------------------------------------------------------------- """Template for a pysat.Instrument support file. 
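Before moving on to the test-class diffs below, it may help to summarize what the instrument-module changes above mean for callers. A minimal sketch, not part of the diff, assuming the refactored `pysat_testing` module from this branch: the keyword names come from the hunks above, while the sample values and the routing of custom kwargs through `pysat.Instrument` follow the usual pysat convention of supplying instrument-specific load kwargs at instantiation.

```
import pysat

# Custom load kwargs for the test instruments are supplied at
# instantiation and routed to `load` by pysat. The removed
# `malformed_index` kwarg is replaced by the two explicit kwargs below.
inst = pysat.Instrument('pysat', 'testing', num_samples=100,
                        non_monotonic_index=True, non_unique_index=True)

# `use_header` is no longer part of the load chain.
inst.load(2009, 1)

# The requested index behaviour can be checked directly.
assert not inst.index.is_monotonic_increasing
```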
diff --git a/pysat/tests/classes/cls_ci.py b/pysat/tests/classes/cls_ci.py
index 8fd3285ce..17b6c1c4a 100644
--- a/pysat/tests/classes/cls_ci.py
+++ b/pysat/tests/classes/cls_ci.py
@@ -2,6 +2,9 @@
 # Full license can be found in License.md
 # Full author list can be found in .zenodo.json file
 # DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
 # ----------------------------------------------------------------------------
 """Class setup and teardown for unit tests that are only run in the CI env."""
 
diff --git a/pysat/tests/classes/cls_instrument_access.py b/pysat/tests/classes/cls_instrument_access.py
index edc8b759a..f7c252122 100644
--- a/pysat/tests/classes/cls_instrument_access.py
+++ b/pysat/tests/classes/cls_instrument_access.py
@@ -1,3 +1,11 @@
+#!/usr/bin/env python
+# Full license can be found in License.md
+# Full author list can be found in .zenodo.json file
+# DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
+# ----------------------------------------------------------------------------
 """Tests for data access and related functions in the pysat Instrument object.
 
 Includes:
@@ -47,8 +55,7 @@ def test_instrument_complete_by_init(self):
         """Test Instrument object fully complete by self._init_rtn()."""
 
         # Create a base instrument to compare against
-        inst_copy = pysat.Instrument(inst_module=self.testInst.inst_module,
-                                     use_header=True)
+        inst_copy = pysat.Instrument(inst_module=self.testInst.inst_module)
 
         # Get instrument module and init function
         inst_mod = self.testInst.inst_module
@@ -71,7 +78,7 @@ def temp_init(inst, test_init_kwarg=None):
 
         # Instantiate instrument with test module which invokes needed test
         # code in the background
-        pysat.Instrument(inst_module=inst_mod, use_header=True)
+        pysat.Instrument(inst_module=inst_mod)
 
         # Restore nominal init function
         inst_mod.init = inst_mod_init
@@ -125,13 +132,72 @@ def test_basic_instrument_load(self, kwargs):
         """
 
         # Load data by year and day of year
-        self.testInst.load(self.ref_time.year, self.ref_doy, **kwargs,
-                           use_header=True)
+        self.testInst.load(self.ref_time.year, self.ref_doy, **kwargs)
 
         # Test that the loaded date range is correct
         self.eval_successful_load()
         return
 
+    @pytest.mark.parametrize("method", ["del", "drop"])
+    @pytest.mark.parametrize("del_all", [True, False])
+    def test_basic_instrument_del(self, method, del_all):
+        """Test that data can be deleted from an Instrument.
+
+        Parameters
+        ----------
+        method : str
+            String specifying the deletion method
+        del_all : bool
+            Delete a single variable if False, delete all if True
+
+        """
+
+        # Load data by year and day of year
+        self.testInst.load(self.ref_time.year, self.ref_doy)
+
+        # Get the variable name(s) to delete
+        var = self.testInst.variables if del_all else self.testInst.variables[0]
+
+        # Delete the variable
+        if method == 'del':
+            del self.testInst[var]
+        else:
+            self.testInst.drop(var)
+
+        # Test for the absence of the desired variable(s)
+        if del_all:
+            assert self.testInst.empty
+            assert len(self.testInst.variables) == 0
+        else:
+            assert var not in self.testInst.variables
+        return
+
+    def test_basic_instrument_bad_var_drop(self):
+        """Check for error when deleting absent data variable."""
+        # Load data by year and day of year
+        self.testInst.load(self.ref_time.year, self.ref_doy)
+
+        # Test that the correct error is raised
+        testing.eval_bad_input(self.testInst.drop, KeyError,
+                               "not found in Instrument variables",
+                               input_args=["not_a_data_variable"])
+        return
+
+    def test_basic_instrument_partial_bad_var_drop(self, caplog):
+        """Check for a log warning when deleting present and absent variables."""
+        # Load data by year and day of year
+        self.testInst.load(self.ref_time.year, self.ref_doy)
+
+        dvars = [self.testInst.variables[0], "not_a_data_var"]
+
+        # Test that the correct warning is raised
+        with caplog.at_level(logging.INFO, logger='pysat'):
+            self.testInst.drop(dvars)
+
+        captured = caplog.text
+        assert captured.find("not found in Instrument variables") > 0
+        return
+
     @pytest.mark.parametrize('pad', [None, dt.timedelta(days=1)])
     def test_basic_instrument_load_no_data(self, caplog, pad):
         """Test Instrument load with no data for appropriate log messages.
@@ -152,7 +218,7 @@ def test_basic_instrument_load_no_data(self, caplog, pad):
         # Test doesn't check against loading by filename since that produces
         # an error if there is no file. Loading by yr, doy is no different
         # than date in this case.
-        self.testInst.load(date=no_data_d, pad=pad, use_header=True)
+        self.testInst.load(date=no_data_d, pad=pad)
 
         # Confirm by checking against caplog that metadata was
         # not assigned.
@@ -182,7 +248,7 @@ def test_basic_instrument_load_two_days(self):
         end_date = self.ref_time + dt.timedelta(days=2)
         end_doy = int(end_date.strftime("%j"))
         self.testInst.load(self.ref_time.year, self.ref_doy, end_date.year,
-                           end_doy, use_header=True)
+                           end_doy)
 
         # Test that the loaded date range is correct
         self.eval_successful_load(end_date=end_date)
@@ -195,8 +261,7 @@ def test_basic_instrument_bad_keyword_at_load(self):
         testing.eval_bad_input(self.testInst.load, TypeError,
                                "load() got an unexpected keyword",
                                input_kwargs={'date': self.ref_time,
-                                             'unsupported_keyword': True,
-                                             'use_header': True})
+                                             'unsupported_keyword': True})
         return
 
     def test_basic_instrument_load_yr_no_doy(self):
@@ -205,7 +270,7 @@
         # Check that the correct error is raised
         estr = 'Unknown or incomplete input combination.'
testing.eval_bad_input(self.testInst.load, TypeError, estr, - [self.ref_time.year], {'use_header': True}) + [self.ref_time.year]) return @pytest.mark.parametrize('doy', [0, 367, 1000, -1, -10000]) @@ -221,7 +286,7 @@ def test_basic_instrument_load_yr_bad_doy(self, doy): estr = 'Day of year (doy) is only valid between and ' testing.eval_bad_input(self.testInst.load, ValueError, estr, - [self.ref_time.year, doy], {'use_header': True}) + [self.ref_time.year, doy]) return @pytest.mark.parametrize('end_doy', [0, 367, 1000, -1, -10000]) @@ -239,7 +304,7 @@ def test_basic_instrument_load_yr_bad_end_doy(self, end_doy): testing.eval_bad_input(self.testInst.load, ValueError, estr, [self.ref_time.year, 1], {'end_yr': self.ref_time.year, - 'end_doy': end_doy, 'use_header': True}) + 'end_doy': end_doy}) return def test_basic_instrument_load_yr_no_end_doy(self): @@ -248,7 +313,7 @@ def test_basic_instrument_load_yr_no_end_doy(self): estr = 'Both end_yr and end_doy must be set' testing.eval_bad_input(self.testInst.load, ValueError, estr, [self.ref_time.year, self.ref_doy, - self.ref_time.year], {'use_header': True}) + self.ref_time.year]) return @pytest.mark.parametrize("kwargs", [{'yr': 2009, 'doy': 1, @@ -277,7 +342,6 @@ def test_basic_instrument_load_mixed_inputs(self, kwargs): """ - kwargs['use_header'] = True estr = 'An inconsistent set of inputs have been' testing.eval_bad_input(self.testInst.load, ValueError, estr, input_kwargs=kwargs) @@ -286,7 +350,7 @@ def test_basic_instrument_load_mixed_inputs(self, kwargs): def test_basic_instrument_load_no_input(self): """Test that `.load()` loads all data.""" - self.testInst.load(use_header=True) + self.testInst.load() assert (self.testInst.index[0] == self.testInst.files.start_date) assert (self.testInst.index[-1] >= self.testInst.files.stop_date) assert (self.testInst.index[-1] <= self.testInst.files.stop_date @@ -316,7 +380,6 @@ def test_instrument_load_errors_with_multifile(self, load_in, verr): else: load_kwargs = dict() - load_kwargs['use_header'] = True testing.eval_bad_input(self.testInst.load, ValueError, verr, input_kwargs=load_kwargs) return @@ -324,7 +387,7 @@ def test_instrument_load_errors_with_multifile(self, load_in, verr): def test_basic_instrument_load_by_date(self): """Test loading by date.""" - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) self.eval_successful_load() return @@ -332,8 +395,7 @@ def test_basic_instrument_load_by_dates(self): """Test date range loading, `date` and `end_date`.""" end_date = self.ref_time + dt.timedelta(days=2) - self.testInst.load(date=self.ref_time, end_date=end_date, - use_header=True) + self.testInst.load(date=self.ref_time, end_date=end_date) self.eval_successful_load(end_date=end_date) return @@ -341,15 +403,14 @@ def test_basic_instrument_load_by_date_with_extra_time(self): """Ensure `.load(date=date)` only uses date portion of datetime.""" # Put in a date that has more than year, month, day - self.testInst.load(date=(self.ref_time + dt.timedelta(minutes=71)), - use_header=True) + self.testInst.load(date=(self.ref_time + dt.timedelta(minutes=71))) self.eval_successful_load() return def test_basic_instrument_load_data(self): """Test that correct day loads (checking down to the sec).""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.eval_successful_load() return @@ -361,7 +422,7 @@ def test_basic_instrument_load_leap_year(self): self.ref_time = dt.datetime(2008, 12, 31) 
self.ref_doy = 366 - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.eval_successful_load() return @@ -398,7 +459,7 @@ def test_file_load_bad_start_file(self, operator): """ - self.testInst.load(fname=self.testInst.files[1], use_header=True) + self.testInst.load(fname=self.testInst.files[1]) # Set new bounds that do not include this date. self.testInst.bounds = (self.testInst.files[0], self.testInst.files[2], @@ -418,7 +479,7 @@ def test_file_load_bad_start_date(self, operator): """ - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) # Set new bounds that do not include this date. self.testInst.bounds = (self.ref_time + dt.timedelta(days=1), @@ -435,7 +496,7 @@ def test_basic_fname_instrument_load(self): # If mangle_file_date is true, index will not match exactly. # Find the closest point instead. ind = np.argmin(abs(self.testInst.files.files.index - self.ref_time)) - self.testInst.load(fname=self.testInst.files[ind], use_header=True) + self.testInst.load(fname=self.testInst.files[ind]) self.eval_successful_load() return @@ -455,7 +516,7 @@ def test_fname_load_default(self, operator, direction): # If mangle_file_date is true, index will not match exactly. # Find the closest point. ind = np.argmin(abs(self.testInst.files.files.index - self.ref_time)) - self.testInst.load(fname=self.testInst.files[ind], use_header=True) + self.testInst.load(fname=self.testInst.files[ind]) getattr(self.testInst, operator)() # Modify ref time since iterator changes load date. @@ -468,8 +529,7 @@ def test_fname_load_default(self, operator, direction): def test_filename_load(self): """Test if file is loadable by filename with no path.""" - self.testInst.load(fname=self.ref_time.strftime('%Y-%m-%d.nofile'), - use_header=True) + self.testInst.load(fname=self.ref_time.strftime('%Y-%m-%d.nofile')) self.eval_successful_load() return @@ -481,7 +541,7 @@ def test_filenames_load(self): stop_fname = self.ref_time + foff stop_fname = stop_fname.strftime('%Y-%m-%d.nofile') self.testInst.load(fname=self.ref_time.strftime('%Y-%m-%d.nofile'), - stop_fname=stop_fname, use_header=True) + stop_fname=stop_fname) assert self.testInst.index[0] == self.ref_time assert self.testInst.index[-1] >= self.ref_time + foff assert self.testInst.index[-1] <= self.ref_time + (2 * foff) @@ -499,8 +559,7 @@ def test_filenames_load_out_of_order(self): testing.eval_bad_input(self.testInst.load, ValueError, estr, input_kwargs={'fname': stop_fname, - 'stop_fname': check_fname, - 'use_header': True}) + 'stop_fname': check_fname}) return def test_eq_no_data(self): @@ -513,7 +572,7 @@ def test_eq_no_data(self): def test_eq_both_with_data(self): """Test equality when the same object with loaded data.""" - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) inst_copy = self.testInst.copy() assert inst_copy == self.testInst return @@ -521,7 +580,7 @@ def test_eq_both_with_data(self): def test_eq_one_with_data(self): """Test equality when the same objects but only one with loaded data.""" - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) inst_copy = self.testInst.copy() inst_copy.data = self.testInst._null_data assert inst_copy != self.testInst @@ -530,7 +589,7 @@ def test_eq_one_with_data(self): def test_eq_different_data_type(self): """Test equality different data type.""" - self.testInst.load(date=self.ref_time, use_header=True) + 
self.testInst.load(date=self.ref_time) inst_copy = self.testInst.copy() # Can only change data types if Instrument empty @@ -574,7 +633,8 @@ def test_inequality_reduced_object(self): @pytest.mark.parametrize("prepend, sort_dim_toggle", [(True, True), (True, False), (False, False)]) - def test_concat_data(self, prepend, sort_dim_toggle): + @pytest.mark.parametrize("include", [True, False]) + def test_concat_data(self, prepend, sort_dim_toggle, include): """Test `pysat.Instrument.data` concatenation. Parameters @@ -586,6 +646,8 @@ def test_concat_data(self, prepend, sort_dim_toggle): If True, sort variable names in pandas before concatenation. If False, do not sort for pandas objects. For xarray objects, rename the epoch if True. + include : bool + Use `include` kwarg instead of `prepend` for the same behaviour. """ @@ -593,12 +655,12 @@ def test_concat_data(self, prepend, sort_dim_toggle): ref_time2 = self.ref_time + pds.tseries.frequencies.to_offset( self.testInst.files.files.index.freqstr) doy2 = int(ref_time2.strftime('%j')) - self.testInst.load(ref_time2.year, doy2, use_header=True) + self.testInst.load(ref_time2.year, doy2) data2 = self.testInst.data len2 = len(self.testInst.index) # Load a different data set into the instrument - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) len1 = len(self.testInst.index) # Set the keyword arguments @@ -611,6 +673,9 @@ def test_concat_data(self, prepend, sort_dim_toggle): data2 = data2.rename({self.xarray_epoch_name: 'Epoch2'}) self.testInst.data = self.testInst.data.rename( {self.xarray_epoch_name: 'Epoch2'}) + if include: + # To prepend new data, put existing data at end and vice versa + kwargs['include'] = 1 if prepend else 0 # Concat together self.testInst.concat_data(data2, **kwargs) @@ -649,7 +714,7 @@ def test_empty_flag_data_empty(self): def test_empty_flag_data_not_empty(self): """Test the status of the empty flag for loaded data.""" - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) assert not self.testInst.empty return @@ -660,7 +725,7 @@ def test_index_attribute(self): assert isinstance(self.testInst.index, pds.Index) # Test an index is present with data loaded in an Instrument - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) assert isinstance(self.testInst.index, pds.Index) return @@ -668,7 +733,7 @@ def test_index_return(self): """Test that the index is returned in the proper format.""" # Load data - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) # Ensure we get the index back if self.testInst.pandas_format: @@ -691,7 +756,7 @@ def test_basic_data_access_by_name(self, labels): """ - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) assert np.all((self.testInst[labels] == self.testInst.data[labels]).values) return @@ -710,7 +775,7 @@ def test_data_access_by_indices_and_name(self, index): """ - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) assert np.all(self.testInst[index, 'mlt'] == self.testInst.data['mlt'][index]) return @@ -718,7 +783,7 @@ def test_data_access_by_indices_and_name(self, index): def test_data_access_by_row_slicing(self): """Check that each variable is downsampled.""" - self.testInst.load(self.ref_time.year, 
self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) result = self.testInst[0:10] for variable, array in result.items(): assert len(array) == len(self.testInst.data[variable].values[0:10]) @@ -731,7 +796,7 @@ def test_data_access_by_row_slicing_and_name_slicing(self): if not self.testInst.pandas_format: pytest.skip("name slicing not implemented for xarray") - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) result = self.testInst[0:10, 'uts':'mlt'] for variable, array in result.items(): assert len(array) == len(self.testInst.data[variable].values[0:10]) @@ -741,7 +806,7 @@ def test_data_access_by_row_slicing_and_name_slicing(self): def test_data_access_by_datetime_and_name(self): """Check that datetime can be used to access data.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.out = dt.datetime(2009, 1, 1, 0, 0, 0) assert np.all(self.testInst[self.out, 'uts'] == self.testInst.data['uts'].values[0]) @@ -750,7 +815,7 @@ def test_data_access_by_datetime_and_name(self): def test_data_access_by_datetime_slicing_and_name(self): """Check that a slice of datetimes can be used to access data.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) time_step = (self.testInst.index[1] - self.testInst.index[0]).value / 1.E9 offset = dt.timedelta(seconds=(10 * time_step)) @@ -763,7 +828,7 @@ def test_data_access_by_datetime_slicing_and_name(self): def test_setting_data_by_name(self): """Test setting data by name.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.testInst['doubleMLT'] = 2. * self.testInst['mlt'] assert np.all(self.testInst['doubleMLT'] == 2. * self.testInst['mlt']) return @@ -771,7 +836,7 @@ def test_setting_data_by_name(self): def test_setting_series_data_by_name(self): """Test setting series data by name.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.testInst['doubleMLT'] = 2. * pds.Series( self.testInst['mlt'].values, index=self.testInst.index) assert np.all(self.testInst['doubleMLT'] == 2. * self.testInst['mlt']) @@ -783,7 +848,7 @@ def test_setting_series_data_by_name(self): def test_setting_pandas_dataframe_by_names(self): """Test setting pandas dataframe by name.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.testInst[['doubleMLT', 'tripleMLT']] = pds.DataFrame( {'doubleMLT': 2. * self.testInst['mlt'].values, 'tripleMLT': 3. * self.testInst['mlt'].values}, @@ -795,7 +860,7 @@ def test_setting_pandas_dataframe_by_names(self): def test_setting_data_by_name_single_element(self): """Test setting data by name for a single element.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.testInst['doubleMLT'] = 2. assert np.all(self.testInst['doubleMLT'] == 2.) 
assert len(self.testInst['doubleMLT']) == len(self.testInst.index) @@ -807,7 +872,7 @@ def test_setting_data_by_name_single_element(self): def test_setting_data_by_name_with_meta(self): """Test setting data by name with meta.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.testInst['doubleMLT'] = {'data': 2. * self.testInst['mlt'], 'units': 'hours', 'long_name': 'double trouble'} @@ -819,7 +884,7 @@ def test_setting_data_by_name_with_meta(self): def test_setting_partial_data(self): """Test setting partial data by index.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.out = self.testInst if self.testInst.pandas_format: self.testInst[0:3] = 0 @@ -854,7 +919,7 @@ def test_setting_partial_data_by_inputs(self, changed, fixed): """ - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.testInst['doubleMLT'] = 2. * self.testInst['mlt'] self.testInst[changed, 'doubleMLT'] = 0 assert (self.testInst[fixed, 'doubleMLT'] @@ -865,7 +930,7 @@ def test_setting_partial_data_by_inputs(self, changed, fixed): def test_modifying_data_inplace(self): """Test modification of data inplace.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.testInst['doubleMLT'] = 2. * self.testInst['mlt'] self.testInst['doubleMLT'] += 100 assert (self.testInst['doubleMLT'] @@ -884,7 +949,7 @@ def test_getting_all_data_by_index(self, index): """ - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) inst_subset = self.testInst[index] if self.testInst.pandas_format: assert len(inst_subset) == len(index) @@ -907,7 +972,7 @@ def test_unknown_variable_error_renaming(self, values): """ # Check for error for unknown variable name - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) # Capture the ValueError and message testing.eval_bad_input(self.testInst.rename, ValueError, @@ -938,7 +1003,7 @@ def test_basic_variable_renaming(self, lowercase, mapper): values = {var: mapper(var) for var in self.testInst.variables} # Test single variable - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.testInst.rename(mapper, lowercase_data_labels=lowercase) for key in values: @@ -951,164 +1016,3 @@ def test_basic_variable_renaming(self, lowercase, mapper): assert key not in self.testInst.variables assert key not in self.testInst.meta.keys() return - - @pytest.mark.parametrize("mapper", [ - {'profiles': {'density': 'ionization'}}, - {'profiles': {'density': 'mass'}, - 'alt_profiles': {'density': 'volume'}}, - str.upper]) - def test_ho_pandas_variable_renaming(self, mapper): - """Test rename of higher order pandas variable. - - Parameters - ---------- - mapper : dict or function - A function or dict that maps how the variables will be renamed. - - """ - # TODO(#789): Remove when meta children support is dropped. 
- - # Initialize the testing dict - if isinstance(mapper, dict): - values = mapper - else: - values = {var: mapper(var) for var in self.testInst.variables} - - # Check for pysat_testing2d instrument - if self.testInst.platform == 'pysat': - if self.testInst.name == 'testing2d': - self.testInst.load(self.ref_time.year, self.ref_doy, - use_header=True) - self.testInst.rename(mapper) - for key in values: - for ikey in values[key]: - # Check column name unchanged - assert key in self.testInst.data - assert key in self.testInst.meta - - # Check for new name in HO data - check_var = self.testInst.meta[key]['children'] - - if isinstance(values[key], dict): - map_val = values[key][ikey] - else: - map_val = mapper(ikey) - - assert map_val in self.testInst[0, key] - assert map_val in check_var - - # Ensure old name not present - assert ikey not in self.testInst[0, key] - if map_val.lower() != ikey: - assert ikey not in check_var - return - - @pytest.mark.parametrize("values", [{'profiles': - {'help': 'I need somebody'}}, - {'fake_profi': - {'help': 'Not just anybody'}}, - {'wrong_profile': - {'help': 'You know I need someone'}, - 'fake_profiles': - {'Beatles': 'help!'}, - 'profiles': - {'density': 'valid_change'}}, - {'fake_profile': - {'density': 'valid HO change'}}, - {'Nope_profiles': - {'density': 'valid_HO_change'}}]) - def test_ho_pandas_unknown_variable_error_renaming(self, values): - """Test higher order pandas variable rename raises error if unknown. - - Parameters - ---------- - values : dict - Variables to be renamed. A dict where each key is the current - variable and its value is the new variable name. - - """ - # TODO(#789): Remove when meta children support is dropped. - - # Check for pysat_testing2d instrument - if self.testInst.platform == 'pysat': - if self.testInst.name == 'testing2d': - self.testInst.load(self.ref_time.year, self.ref_doy, - use_header=True) - - # Check for error for unknown column or HO variable name - testing.eval_bad_input(self.testInst.rename, ValueError, - "cannot rename", [values]) - else: - pytest.skip("Not implemented for this instrument") - return - - @pytest.mark.parametrize("values", [{'profiles': {'density': 'Ionization'}}, - {'profiles': {'density': 'MASa'}, - 'alt_profiles': - {'density': 'VoLuMe'}}]) - def test_ho_pandas_variable_renaming_lowercase(self, values): - """Test rename higher order pandas variable uses lowercase. - - Parameters - ---------- - values : dict - Variables to be renamed. A dict where each key is the current - variable and its value is the new variable name. - - """ - # TODO(#789): Remove when meta children support is dropped. 
- - # Check for pysat_testing2d instrument - if self.testInst.platform == 'pysat': - if self.testInst.name == 'testing2d': - self.testInst.load(self.ref_time.year, self.ref_doy, - use_header=True) - self.testInst.rename(values) - for key in values: - for ikey in values[key]: - # Check column name unchanged - assert key in self.testInst.data - assert key in self.testInst.meta - - # Check for new name in HO data - test_val = values[key][ikey] - assert test_val in self.testInst[0, key] - check_var = self.testInst.meta[key]['children'] - - # Case insensitive check - assert values[key][ikey] in check_var - - # Ensure new case in there - check_var = check_var[values[key][ikey]].name - assert values[key][ikey] == check_var - - # Ensure old name not present - assert ikey not in self.testInst[0, key] - check_var = self.testInst.meta[key]['children'] - assert ikey not in check_var - return - - def test_generic_meta_translator(self): - """Test `generic_meta_translator`.""" - - # Get default meta translation table - trans_table = pysat.utils.io.default_to_netcdf_translation_table( - self.testInst) - - # Load data - self.testInst.load(date=self.ref_time) - - # Assign table - self.testInst._meta_translation_table = trans_table - - # Apply translation - trans_meta = self.testInst.generic_meta_translator(self.testInst.meta) - - # Perform equivalent via replacement functions - meta_dict = self.testInst.meta.to_dict() - truth_meta = pysat.utils.io.apply_table_translation_to_file( - self.testInst, meta_dict, trans_table=trans_table) - - assert np.all(truth_meta == trans_meta) - - return diff --git a/pysat/tests/classes/cls_instrument_integration.py b/pysat/tests/classes/cls_instrument_integration.py index e76a0f7e8..caaee66d6 100644 --- a/pysat/tests/classes/cls_instrument_integration.py +++ b/pysat/tests/classes/cls_instrument_integration.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Integration tests for pysat.Instrument. Note @@ -6,18 +14,13 @@ """ -import datetime as dt import logging -import numpy as np import os import tempfile -import pandas as pds import pytest -import xarray as xr import pysat -from pysat.utils import testing class InstIntegrationTests(object): diff --git a/pysat/tests/classes/cls_instrument_iteration.py b/pysat/tests/classes/cls_instrument_iteration.py index 2a960f758..c820ee788 100644 --- a/pysat/tests/classes/cls_instrument_iteration.py +++ b/pysat/tests/classes/cls_instrument_iteration.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Tests for iteration in the pysat Instrument object and methods. 
Note
 
diff --git a/pysat/tests/classes/cls_instrument_library.py b/pysat/tests/classes/cls_instrument_library.py
index bb047209e..42f85a097 100644
--- a/pysat/tests/classes/cls_instrument_library.py
+++ b/pysat/tests/classes/cls_instrument_library.py
@@ -1,3 +1,11 @@
+#!/usr/bin/env python
+# Full license can be found in License.md
+# Full author list can be found in .zenodo.json file
+# DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
+# ----------------------------------------------------------------------------
 """Standardized class and functions to test instruments for pysat libraries.
 
 Note
@@ -32,16 +40,19 @@ class TestInstruments(InstLibTests):
 
 import datetime as dt
 from importlib import import_module
+import logging
+import numpy as np
 import sys
 import tempfile
 import warnings
 
+import pandas as pds
 import pytest
+import xarray as xr
 
 import pysat
 from pysat.utils import generate_instrument_list
-from pysat.utils.testing import assert_hasattr
-from pysat.utils.testing import assert_isinstance
+from pysat.utils import testing
 
 
 def initialize_test_inst_and_date(inst_dict):
@@ -66,14 +77,73 @@ def initialize_test_inst_and_date(inst_dict):
     test_inst = pysat.Instrument(inst_module=inst_dict['inst_module'],
                                  tag=inst_dict['tag'],
                                  inst_id=inst_dict['inst_id'],
-                                 temporary_file_list=True,
-                                 update_files=True, use_header=True,
+                                 temporary_file_list=True, update_files=True,
                                  **kwargs)
     test_dates = inst_dict['inst_module']._test_dates
     date = test_dates[inst_dict['inst_id']][inst_dict['tag']]
     return test_inst, date
 
 
+def load_and_set_strict_time_flag(test_inst, date, raise_error=False,
+                                  clean_off=True, set_end_date=False):
+    """Load data and set the strict time flag if needed for other tests.
+
+    Parameters
+    ----------
+    test_inst : pysat.Instrument
+        Test instrument
+    date : dt.datetime
+        Date for loading data
+    raise_error : bool
+        Raise the load error if it is not the strict time flag error
+        (default=False)
+    clean_off : bool
+        Turn off the clean method when re-loading data and testing the
+        strict time flag (default=True)
+    set_end_date : bool
+        If True, load with setting the end date. If False, load single day.
+        (default=False)
+
+    """
+
+    kwargs = {}
+
+    if set_end_date:
+        kwargs['end_date'] = date + dt.timedelta(days=2)
+
+    try:
+        test_inst.load(date=date, **kwargs)
+    except Exception as err:
+        # Catch all potential input errors, and ensure that only the one
+        # caused by the strict time flag is prevented from occurring on
+        # future load calls.
+        if str(err).find('Loaded data') > 0:
+            # Change the flags that may have caused the error to be raised,
+            # to see if it is the strict time flag error
+            test_inst.strict_time_flag = False
+
+            if clean_off:
+                # Turn the clean method off
+                orig_clean_level = str(test_inst.clean_level)
+                test_inst.clean_level = 'none'
+
+            # Evaluate the warning
+            with warnings.catch_warnings(record=True) as war:
+                test_inst.load(date=date, **kwargs)
+
+            assert len(war) >= 1
+            categories = [war[j].category for j in range(len(war))]
+            assert UserWarning in categories
+
+            if clean_off:
+                # Reset the clean level
+                test_inst.clean_level = orig_clean_level
+        elif raise_error:
+            raise err
+
+    return
+
+
 class InstLibTests(object):
     """Provide standardized tests for pysat instrument libraries.
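The new `load_and_set_strict_time_flag` helper above centralizes the retry logic that the load tests in the following hunks lean on. Roughly, each of those tests composes the two module-level helpers as sketched here; `inst_dict` stands in for one of the parametrized instrument dictionaries, mirroring the pattern in `test_load_multiple_days` below:

```
# Build the test Instrument and its test date from the parametrized dict
test_inst, date = initialize_test_inst_and_date(inst_dict)

# Load a two-day span, disabling the strict time flag only when that is
# what blocked the load, and re-raising any other error
load_and_set_strict_time_flag(test_inst, date, raise_error=True,
                              clean_off=True, set_end_date=True)

# The loaded index can then be checked for multi-day coverage
assert len(np.unique(test_inst.index.day)) > 1
```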
@@ -136,6 +206,21 @@ def teardown_class(self): del self.saved_path, self.tempdir return + def setup_method(self): + """Initialize parameters before each method.""" + self.test_inst = None + self.date = None + self.module = None + + return + + def teardown_method(self): + """Clean up any instruments that were initialized.""" + + del self.test_inst, self.date, self.module + + return + def initialize_test_package(self, inst_loc, user_info=None): """Generate custom instrument lists for each category of tests. @@ -184,6 +269,11 @@ def initialize_test_package(self, inst_loc, user_info=None): mark = pytest.mark.parametrize("inst_name", instruments['names']) getattr(self, method).pytestmark.append(mark) + elif 'new_tests' in mark_names: + # Prioritize new test marks if present + mark = pytest.mark.parametrize("inst_dict", + instruments['new_tests']) + getattr(self, method).pytestmark.append(mark) elif 'load_options' in mark_names: # Prioritize load_options mark if present mark = pytest.mark.parametrize("inst_dict", @@ -213,36 +303,37 @@ def test_modules_standard(self, inst_name): """ # Ensure that each module is at minimum importable - module = import_module(''.join(('.', inst_name)), - package=self.inst_loc.__name__) + self.module = import_module(''.join(('.', inst_name)), + package=self.inst_loc.__name__) # Check for presence of basic instrument module attributes for mattr in self.module_attrs: - assert_hasattr(module, mattr) + testing.assert_hasattr(self.module, mattr) if mattr in self.attr_types.keys(): - assert_isinstance(getattr(module, mattr), - self.attr_types[mattr]) + testing.assert_isinstance(getattr(self.module, mattr), + self.attr_types[mattr]) # Check for presence of required instrument attributes - for inst_id in module.inst_ids.keys(): - for tag in module.inst_ids[inst_id]: - inst = pysat.Instrument(inst_module=module, tag=tag, - inst_id=inst_id, use_header=True) + for inst_id in self.module.inst_ids.keys(): + for tag in self.module.inst_ids[inst_id]: + self.test_inst = pysat.Instrument(inst_module=self.module, + tag=tag, inst_id=inst_id) # Test to see that the class parameters were passed in - assert_isinstance(inst, pysat.Instrument) - assert inst.platform == module.platform - assert inst.name == module.name - assert inst.inst_id == inst_id - assert inst.tag == tag - assert inst.inst_module is not None + testing.assert_isinstance(self.test_inst, pysat.Instrument) + assert self.test_inst.platform == self.module.platform + assert self.test_inst.name == self.module.name + assert self.test_inst.inst_id == inst_id + assert self.test_inst.tag == tag + assert self.test_inst.inst_module is not None # Test the required class attributes for iattr in self.inst_attrs: - assert_hasattr(inst, iattr) + testing.assert_hasattr(self.test_inst, iattr) if iattr in self.attr_types: - assert_isinstance(getattr(inst, iattr), - self.attr_types[iattr]) + testing.assert_isinstance(getattr(self.test_inst, + iattr), + self.attr_types[iattr]) return @pytest.mark.all_inst @@ -257,14 +348,14 @@ def test_standard_function_presence(self, inst_name): """ - module = import_module(''.join(('.', inst_name)), - package=self.inst_loc.__name__) + self.module = import_module(''.join(('.', inst_name)), + package=self.inst_loc.__name__) # Test for presence of all standard module functions for mcall in self.inst_callable: - if hasattr(module, mcall): + if hasattr(self.module, mcall): # If present, must be a callable function - assert callable(getattr(module, mcall)) + assert callable(getattr(self.module, mcall)) else: # If 
absent, must not be a required function assert mcall not in self.module_attrs @@ -282,12 +373,12 @@ def test_instrument_test_dates(self, inst_name): """ - module = import_module(''.join(('.', inst_name)), - package=self.inst_loc.__name__) - info = module._test_dates + self.module = import_module(''.join(('.', inst_name)), + package=self.inst_loc.__name__) + info = self.module._test_dates for inst_id in info.keys(): for tag in info[inst_id].keys(): - assert_isinstance(info[inst_id][tag], dt.datetime) + testing.assert_isinstance(info[inst_id][tag], dt.datetime) return @pytest.mark.first @@ -304,20 +395,23 @@ def test_download(self, inst_dict): """ - test_inst, date = initialize_test_inst_and_date(inst_dict) + self.test_inst, self.date = initialize_test_inst_and_date(inst_dict) # Check for username. - dl_dict = inst_dict['user_info'] if 'user_info' in \ - inst_dict.keys() else {} - test_inst.download(date, date, **dl_dict) - assert len(test_inst.files.files) > 0 + if 'user_info' in inst_dict.keys(): + dl_dict = inst_dict['user_info'] + else: + dl_dict = {} + + # Ask to download two consecutive days + self.test_inst.download(start=self.date, + stop=self.date + dt.timedelta(days=2), + **dl_dict) + assert len(self.test_inst.files.files) > 0 return @pytest.mark.second - # Need to maintain download mark for backwards compatibility. - # Can remove once pysat 3.1.0 is released and libraries are updated. @pytest.mark.load_options - @pytest.mark.download @pytest.mark.parametrize("clean_level", ['none', 'dirty', 'dusty', 'clean']) def test_load(self, clean_level, inst_dict): """Test that instruments load at each cleaning level. @@ -333,34 +427,274 @@ def test_load(self, clean_level, inst_dict): """ - test_inst, date = initialize_test_inst_and_date(inst_dict) - if len(test_inst.files.files) > 0: - # Set Clean Level - test_inst.clean_level = clean_level + self.test_inst, self.date = initialize_test_inst_and_date(inst_dict) + if len(self.test_inst.files.files) > 0: + # Set the clean level + self.test_inst.clean_level = clean_level target = 'Fake Data to be cleared' - test_inst.data = [target] - try: - test_inst.load(date=date, use_header=True) - except ValueError as verr: - # Check if instrument is failing due to strict time flag - if str(verr).find('Loaded data') > 0: - test_inst.strict_time_flag = False - with warnings.catch_warnings(record=True) as war: - test_inst.load(date=date, use_header=True) - assert len(war) >= 1 - categories = [war[j].category for j in range(0, len(war))] - assert UserWarning in categories - else: - # If error message does not match, raise error anyway - raise(verr) + self.test_inst.data = [target] + + # Make sure the strict time flag doesn't interfere with + # the load tests, and re-run with desired clean level + load_and_set_strict_time_flag(self.test_inst, self.date, + raise_error=True, clean_off=False) # Make sure fake data is cleared - assert target not in test_inst.data + assert target not in self.test_inst.data # If cleaning not used, something should be in the file # Not used for clean levels since cleaning may remove all data if clean_level == "none": - assert not test_inst.empty + assert not self.test_inst.empty + else: + pytest.skip("Download data not available") + + return + + @pytest.mark.second + @pytest.mark.load_options + def test_load_empty(self, inst_dict): + """Test that instruments load empty objects if no data is available. + + Parameters + ---------- + inst_dict : dict + Dictionary containing info to instantiate a specific instrument. 
+ Set automatically from instruments['download'] when
+ `initialize_test_package` is run.
+
+ """
+
+ # Get the instrument information and update the date to be in the future
+ self.test_inst, self.date = initialize_test_inst_and_date(inst_dict)
+ self.date = dt.datetime(dt.datetime.utcnow().year + 100, 1, 1)
+
+ # Make sure the strict time flag doesn't interfere with the load test
+ load_and_set_strict_time_flag(self.test_inst, self.date,
+ raise_error=True)
+
+ # Check the empty status
+ assert self.test_inst.empty, "Data was loaded for a far-future time"
+ assert self.test_inst.meta == pysat.Meta(), "Meta data is not empty"
+ if self.test_inst.pandas_format:
+ assert all(self.test_inst.data == pds.DataFrame()), "Data not empty"
+ else:
+ assert self.test_inst.data.dims == xr.Dataset().dims, \
+ "Dims not empty"
+ assert self.test_inst.data.data_vars == xr.Dataset().data_vars, \
+ "Data variables not empty"
+
+ return
+
+ # TODO(#1172): remove mark.new_tests at v3.3.0
+ @pytest.mark.second
+ @pytest.mark.load_options
+ @pytest.mark.new_tests
+ def test_load_multiple_days(self, inst_dict):
+ """Test that instruments load multiple days when requested.
+
+ Parameters
+ ----------
+ inst_dict : dict
+ Dictionary containing info to instantiate a specific instrument.
+ Set automatically from instruments['download'] when
+ `initialize_test_package` is run.
+
+ """
+
+ self.test_inst, self.date = initialize_test_inst_and_date(inst_dict)
+ if len(self.test_inst.files.files) > 0:
+ if self.date < self.test_inst.today():
+ # Turn off cleaning and make sure the strict time flag
+ # doesn't interfere with the multi-day load test
+ self.test_inst.clean_level = 'none'
+ load_and_set_strict_time_flag(self.test_inst, self.date,
+ raise_error=True, clean_off=True,
+ set_end_date=True)
+
+ # Make sure more than one day has been loaded
+ assert hasattr(self.test_inst.index, 'day'), \
+ "No data to load for {:}-{:}".format(
+ self.date, self.date + dt.timedelta(days=2))
+ assert len(np.unique(self.test_inst.index.day)) > 1
+ else:
+ pytest.skip("".join(["Can't download multiple days of real-",
+ "time or forecast data"]))
+ else:
+ pytest.skip("Download data not available")
+
+ return
+
+ @pytest.mark.second
+ @pytest.mark.load_options
+ @pytest.mark.parametrize("clean_level", ['dirty', 'dusty', 'clean'])
+ def test_clean_warn(self, clean_level, inst_dict, caplog):
+ """Test that appropriate warnings and errors are raised when cleaning.
+
+ Parameters
+ ----------
+ clean_level : str
+ Cleanliness level for loaded instrument data.
+ inst_dict : dict
+ Dictionary containing info to instantiate a specific instrument.
+ Set automatically from instruments['download'] when
+ `initialize_test_package` is run. 
+ + """ + # Not all Instruments have warning messages to test, only run tests + # when the desired test attribute is defined + if hasattr(inst_dict['inst_module'], '_clean_warn'): + clean_warn = inst_dict['inst_module']._clean_warn[ + inst_dict['inst_id']][inst_dict['tag']] + + # Cleaning warnings may vary by clean level, test the warning + # messages at the current clean level, specified by `clean_level` + if clean_level in clean_warn.keys(): + # Only need to test if there are clean warnings for this level + self.test_inst, self.date = initialize_test_inst_and_date( + inst_dict) + clean_warnings = clean_warn[clean_level] + + # Make sure the strict time flag doesn't interfere with + # the cleaning tests + load_and_set_strict_time_flag(self.test_inst, self.date) + + # Cycle through each of the potential cleaning messages + # for this Instrument module, inst ID, tag, and clean level + for (clean_method, clean_method_level, clean_method_msg, + final_level) in clean_warnings: + if len(self.test_inst.files.files) > 0: + # Set the clean level + self.test_inst.clean_level = clean_level + target = 'Fake Data to be cleared' + self.test_inst.data = [target] + + if clean_method == 'logger': + # A logging message is expected + with caplog.at_level( + getattr(logging, clean_method_level), + logger='pysat'): + self.test_inst.load(date=self.date) + + # Test the returned message + out_msg = caplog.text + assert out_msg.find(clean_method_msg) >= 0, \ + "{:s} not in output: {:s}".format( + clean_method_msg, out_msg) + elif clean_method == 'warning': + # A warning message is expected + with warnings.catch_warnings(record=True) as war: + self.test_inst.load(date=self.date) + + # Test the warning output + testing.eval_warnings(war, [clean_method_msg], + clean_method_level) + elif clean_method == 'error': + # An error message is expected, evaluate error + # and the error message + testing.eval_bad_input( + self.test_inst.load, clean_method_level, + clean_method_msg, + input_kwargs={'date': self.date}) + else: + raise AttributeError( + 'unknown type of warning: {:}'.format( + clean_method)) + + # Test to see if the clean flag has the expected value + # afterwards + assert self.test_inst.clean_level == final_level, \ + "Clean level should now be {:s}, not {:s}".format( + final_level, self.test_inst.clean_level) + + # Make sure fake data is cleared + assert target not in self.test_inst.data + else: + pytest.skip("".join(["Can't test clean warnings for ", + "Instrument ", + repr(inst_dict['inst_module']), + " level ", clean_level, + " (no downloaded files)"])) + else: + pytest.skip("".join(["No clean warnings for Instrument ", + repr(inst_dict['inst_module']), " level ", + clean_level])) + else: + pytest.skip("No clean warnings for Instrument {:s}".format( + repr(inst_dict['inst_module']))) + + return + + # TODO(#1172): remove mark.new_tests at v3.3.0 + @pytest.mark.second + @pytest.mark.load_options + @pytest.mark.new_tests + @pytest.mark.parametrize('pad', [{'days': 1}, dt.timedelta(days=1)]) + def test_load_w_pad(self, pad, inst_dict): + """Test that instruments load with a pad specified different ways. + + Parameters + ---------- + pad : pds.DateOffset, dict, or NoneType + Valid pad value for initializing an instrument + inst_dict : dict + Dictionary containing info to instantiate a specific instrument. + Set automatically from instruments['download'] when + `initialize_test_package` is run. + + """ + # Skip for Python 3.6, keeping information that will allow adding + # or skipping particular instruments. 
+ # TODO(#1136): Remove skip once memory management is improved
+ if sys.version_info.minor < 7:
+ pytest.skip("skipping 3.6 for {:} ({:} =? {:})".format(
+ inst_dict, inst_dict['inst_module'].__name__.find(
+ 'pysat_testing'), len(inst_dict['inst_module'].__name__)
+ - len('pysat_testing')))
+ return
+
+ # Update the Instrument dict with the desired pad
+ if 'kwargs' in inst_dict.keys():
+ inst_dict['kwargs']['pad'] = pad
+ else:
+ inst_dict['kwargs'] = {'pad': pad}
+
+ # Assign the expected representation
+ if isinstance(pad, dict):
+ pad_repr = repr(pds.DateOffset(days=1))
+ elif isinstance(pad, dt.timedelta):
+ pad_repr = "1 day, 0:00:00"
+ else:
+ pad_repr = repr(pad)
+
+ self.test_inst, self.date = initialize_test_inst_and_date(inst_dict)
+ if len(self.test_inst.files.files) > 0:
+ # Turn off cleaning and make sure the strict time flag doesn't
+ # interfere with the load tests
+ self.test_inst.clean_level = 'none'
+ load_and_set_strict_time_flag(self.test_inst, self.date,
+ raise_error=True, clean_off=True)
+
+ if self.test_inst.empty:
+ # This will be empty if this is a forecast file that doesn't
+ # include the load date
+ self.test_inst.pad = None
+ load_and_set_strict_time_flag(self.test_inst, self.date,
+ raise_error=True, clean_off=True)
+ assert not self.test_inst.empty, \
+ "No data on {:}".format(self.date)
+ assert self.test_inst.index.max() < self.date, \
+ "Data should end before the requested load date"
+ else:
+ # Padding was successful, evaluate the data index length
+ assert (self.test_inst.index[-1]
+ - self.test_inst.index[0]).total_seconds() < 86400.0
+
+ # Evaluate the recorded pad
+ inst_str = self.test_inst.__str__()
+ assert inst_str.find(
+ 'Data Padding: {:s}'.format(pad_repr)) > 0, "".join([
+ "bad pad value: ", pad_repr, " not in ", inst_str])
 else:
 pytest.skip("Download data not available")

@@ -379,11 +713,11 @@ def test_remote_file_list(self, inst_dict):

 """

- test_inst, date = initialize_test_inst_and_date(inst_dict)
- name = '_'.join((test_inst.platform, test_inst.name))
+ self.test_inst, self.date = initialize_test_inst_and_date(inst_dict)
+ name = '_'.join((self.test_inst.platform, self.test_inst.name))

 if hasattr(getattr(self.inst_loc, name), 'list_remote_files'):
- assert callable(test_inst.remote_file_list)
+ assert callable(self.test_inst.remote_file_list)

 # Check for username
 if 'user_info' in inst_dict.keys():
@@ -391,7 +725,8 @@ def test_remote_file_list(self, inst_dict):
 else:
 dl_dict = {}

- files = test_inst.remote_file_list(start=date, stop=date, **dl_dict)
+ files = self.test_inst.remote_file_list(start=self.date,
+ stop=self.date, **dl_dict)

 # If test date is correctly chosen, files should exist
 assert len(files) > 0

@@ -413,10 +748,10 @@ def test_download_warning(self, inst_dict):

 """

- test_inst, date = initialize_test_inst_and_date(inst_dict)
+ self.test_inst, self.date = initialize_test_inst_and_date(inst_dict)

 with warnings.catch_warnings(record=True) as war:
- test_inst.download(date, date)
+ self.test_inst.download(self.date, self.date)

 assert len(war) >= 1
 categories = [war[j].category for j in range(0, len(war))]
diff --git a/pysat/tests/classes/cls_instrument_property.py b/pysat/tests/classes/cls_instrument_property.py
index 554cd7fef..93a020d9e 100644
--- a/pysat/tests/classes/cls_instrument_property.py
+++ b/pysat/tests/classes/cls_instrument_property.py
@@ -1,3 +1,11 @@
+#!/usr/bin/env python
+# Full license can be found in License.md
+# Full author list can be found in .zenodo.json file
+# DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for 
public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Test for instrument properties in the pysat Instrument object and methods. Note @@ -378,7 +386,7 @@ def test_inst_attributes_not_overwritten(self): greeting = '... listen!' self.testInst.hei = greeting - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) assert self.testInst.hei == greeting return @@ -416,15 +424,11 @@ def test_str_w_orbit(self): """Test string output with Orbit data.""" reload(pysat.instruments.pysat_testing) - orbit_info = {'index': 'mlt', - 'kind': 'local time', + orbit_info = {'index': 'mlt', 'kind': 'local time', 'period': np.timedelta64(97, 'm')} testInst = pysat.Instrument(platform='pysat', name='testing', - num_samples=10, - clean_level='clean', - update_files=True, - orbit_info=orbit_info, - use_header=True) + num_samples=10, clean_level='clean', + update_files=True, orbit_info=orbit_info) self.out = testInst.__str__() @@ -434,7 +438,7 @@ def test_str_w_orbit(self): assert self.out.find('Loaded Orbit Number: 0') > 0 # Activate orbits, check that message has changed - testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + testInst.load(self.ref_time.year, self.ref_doy) testInst.orbits.next() self.out = testInst.__str__() assert self.out.find('Loaded Orbit Number: 1') > 0 @@ -462,7 +466,7 @@ def passfunc(self): def test_str_w_load_lots_data(self): """Test string output with loaded data with many variables.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.out = self.testInst.__str__() assert self.out.find('Number of variables:') > 0 assert self.out.find('...') > 0 @@ -472,7 +476,7 @@ def test_str_w_load_less_data(self): """Test string output with loaded data with few (4) variables.""" # Load the test data - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) # Ensure the desired data variable is present and delete all others # 4-6 variables are needed to test all lines; choose the lesser limit @@ -548,7 +552,7 @@ def test_instrument_function_keywords(self, caplog, func, kwarg, val): with caplog.at_level(logging.INFO, logger='pysat'): # Trigger load functions - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) # Refresh files to trigger other functions self.testInst.files.refresh() @@ -607,7 +611,7 @@ def test_instrument_function_keyword_liveness(self, caplog, func, kwarg): with caplog.at_level(logging.INFO, logger='pysat'): # Trigger load functions - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) # Refresh files to trigger other functions self.testInst.files.refresh() @@ -693,7 +697,7 @@ def test_optional_unknown_data_dir(self, caplog): [({'inst_id': 'invalid_inst_id'}, "'invalid_inst_id' is not one of the supported inst_ids."), ({'inst_id': '', 'tag': 'bad_tag'}, - "'bad_tag' is not one of the supported tags.")]) + "'bad_tag' is not one of the supported tags")]) def test_error_bad_instrument_object(self, kwargs, estr): """Ensure instantiation with invalid inst_id or tag errors. 
diff --git a/pysat/tests/classes/cls_registration.py b/pysat/tests/classes/cls_registration.py index 7d21a4a28..ae1da2e64 100644 --- a/pysat/tests/classes/cls_registration.py +++ b/pysat/tests/classes/cls_registration.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Standardized class and functions to test registration for pysat libraries. @@ -12,7 +15,6 @@ """ import importlib -import pytest import sys import pysat diff --git a/pysat/tests/instrument_test_class.py b/pysat/tests/instrument_test_class.py deleted file mode 100644 index c35e0ff3a..000000000 --- a/pysat/tests/instrument_test_class.py +++ /dev/null @@ -1,78 +0,0 @@ -"""Standardized class and functions to test instruments for pysat libraries. - -Note ----- -Not directly called by pytest, but imported as part of test_instruments.py. -Can be imported directly for external instrument libraries of pysat instruments. - -""" - -import warnings - -import pysat.tests.classes.cls_instrument_library as cls_inst_lib - - -def initialize_test_inst_and_date(inst_dict): - """Initialize the instrument object to test and date. - - .. deprecated:: 3.0.2 - `initialize_test_inst_and_date` will be removed in pysat 3.2.0, it is - moved to `pysat.tests.classes.cls_instrument_library`. - - Parameters - ---------- - inst_dict : dict - Dictionary containing specific instrument info, generated by - generate_instrument_list - - Returns - ------- - test_inst : pysat.Instrument - instrument object to be tested - date : dt.datetime - test date from module - - """ - - warnings.warn(" ".join(["`initialize_test_inst_and_date` has been moved to", - "`pysat.tests.classes.cls_instrument_library`.", - "The link here will be removed in 3.2.0+."]), - DeprecationWarning, stacklevel=2) - return cls_inst_lib.initialize_test_inst_and_date(inst_dict) - - -class InstTestClass(cls_inst_lib.InstLibTests): - """Provide standardized tests for pysat instrument libraries. - - .. deprecated:: 3.0.2 - `InstTestClass` will be removed in pysat 3.2.0, it is replaced by - `pysat.tests.classes.cls_instrument_library.InstLibTests`. - - Note - ---- - Uses class level setup and teardown so that all tests use the same - temporary directory. We do not want to geneate a new tempdir for each test, - as the load tests need to be the same as the download tests. - - Not directly run by pytest, but inherited through test_instruments.py - - Users will need to run `apply_marks_to_tests` before setting up the test - class. - - """ - - def __init_subclass__(self): - """Throw a warning if used as a subclass.""" - - warnings.warn(" ".join( - ["`InstTestClass` has been deprecated and will be removed in", - "3.2.0+. Please update code to use the `InstLibTests` class", - "under `pysat.tests.classes.cls_instrument_library`."]), - DeprecationWarning, stacklevel=2) - warnings.warn(" ".join( - ["`test_load` now uses `@pytest.mark.load_options` in place", - "of `@pytest.mark.download`. The old behavior will be removed in", - "3.2.0+. 
Please update code or use the new" - "`InstLibTests.initialize_test_package` function", - "under `pysat.tests.classes.cls_instrument_library`."]), - DeprecationWarning, stacklevel=2) diff --git a/pysat/tests/test_constellation.py b/pysat/tests/test_constellation.py index 5dda00d34..42712da1b 100644 --- a/pysat/tests/test_constellation.py +++ b/pysat/tests/test_constellation.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Unit tests for the Constellation class.""" @@ -43,8 +46,7 @@ def test_construct_constellation(self, ikeys, ivals, ilen): # Initialize the Constellation using the desired kwargs const = pysat.Constellation( - **{ikey: ivals[i] for i, ikey in enumerate(ikeys)}, - use_header=True) + **{ikey: ivals[i] for i, ikey in enumerate(ikeys)}) # Test that the appropriate number of Instruments were loaded. Each # fake Instrument has 5 tags and 1 inst_id. @@ -72,7 +74,7 @@ def test_some_bad_construct_constellation(self, caplog): # Load the Constellation and capture log output with caplog.at_level(logging.WARNING, logger='pysat'): const = pysat.Constellation(platforms=['Executor', 'platname1'], - tags=[''], use_header=True) + tags=['']) # Test the partial Constellation initialization assert len(const.instruments) == 2 @@ -164,7 +166,7 @@ def test_getitem(self): """Test Constellation iteration through instruments attribute.""" self.in_kwargs['const_module'] = None - self.const = pysat.Constellation(**self.in_kwargs, use_header=True) + self.const = pysat.Constellation(**self.in_kwargs) tst_get_inst = self.const[:] pysat.utils.testing.assert_lists_equal(self.instruments, tst_get_inst) return @@ -173,7 +175,7 @@ def test_repr_w_inst(self): """Test Constellation string output with instruments loaded.""" self.in_kwargs['const_module'] = None - self.const = pysat.Constellation(**self.in_kwargs, use_header=True) + self.const = pysat.Constellation(**self.in_kwargs) out_str = self.const.__repr__() assert out_str.find("Constellation(instruments") >= 0 @@ -183,7 +185,7 @@ def test_str_w_inst(self): """Test Constellation string output with instruments loaded.""" self.in_kwargs['const_module'] = None - self.const = pysat.Constellation(**self.in_kwargs, use_header=True) + self.const = pysat.Constellation(**self.in_kwargs) out_str = self.const.__str__() assert out_str.find("pysat Constellation ") >= 0 @@ -216,7 +218,7 @@ def test_str_with_data(self, common_index, cstr): self.in_kwargs["common_index"] = common_index self.const = pysat.Constellation(**self.in_kwargs) - self.const.load(date=self.ref_time, use_header=True) + self.const.load(date=self.ref_time) out_str = self.const.__str__() assert out_str.find("pysat Constellation ") >= 0 @@ -239,7 +241,7 @@ def double_mlt(inst): # Add the custom function self.const.custom_attach(double_mlt, at_pos='end') - self.const.load(date=self.ref_time, use_header=True) + self.const.load(date=self.ref_time) # Test the added value for inst in self.const: @@ -255,7 +257,7 @@ def setup_method(self): """Set up the unit test environment for each method.""" self.inst = list(constellations.testing.instruments) - self.const = pysat.Constellation(instruments=self.inst, use_header=True) + self.const = pysat.Constellation(instruments=self.inst) self.ref_time = pysat.instruments.pysat_testing._test_dates[''][''] 
self.attrs = ["platforms", "names", "tags", "inst_ids", "instruments", "bounds", "empty", "empty_partial", "index_res", @@ -342,7 +344,7 @@ def test_empty_flag_data_empty_partial_load(self): """Test the status of the empty flag for partially loaded data.""" self.const = pysat.Constellation( - const_module=constellations.testing_partial, use_header=True) + const_module=constellations.testing_partial) self.const.load(date=self.ref_time) assert self.const.empty_partial assert not self.const.empty @@ -352,7 +354,7 @@ def test_empty_flag_data_not_empty_partial_load(self): """Test the alt status of the empty flag for partially loaded data.""" self.const = pysat.Constellation( - const_module=constellations.testing_partial, use_header=True) + const_module=constellations.testing_partial) self.const.load(date=self.ref_time) assert not self.const._empty(all_inst=False) return @@ -361,7 +363,7 @@ def test_empty_flag_data_not_empty(self): """Test the status of the empty flag for loaded data.""" # Load data and test the status flag - self.const.load(date=self.ref_time, use_header=True) + self.const.load(date=self.ref_time) assert not self.const.empty return @@ -372,7 +374,7 @@ def test_full_data_index(self, ikwarg): # Test the attribute with loaded data self.const = pysat.Constellation(instruments=self.inst, **ikwarg) - self.const.load(date=self.ref_time, use_header=True) + self.const.load(date=self.ref_time) assert isinstance(self.const.index, pds.Index) assert self.const.index[0] == self.ref_time @@ -394,7 +396,7 @@ def test_full_data_date(self): """Test the date property when no data is loaded.""" # Test the attribute with loaded data - self.const.load(date=self.ref_time, use_header=True) + self.const.load(date=self.ref_time) assert self.const.date == self.ref_time return @@ -403,7 +405,7 @@ def test_full_variables(self): """Test the variables property when no data is loaded.""" # Test the attribute with loaded data - self.const.load(date=self.ref_time, use_header=True) + self.const.load(date=self.ref_time) assert len(self.const.variables) > 0 assert 'uts_pysat_testing' in self.const.variables @@ -478,9 +480,6 @@ def test_to_inst_xarray(self, common_coord, fill_method): testing.assert_lists_equal(self.dims, list(out_inst.data.dims.keys())) testing.assert_list_contains(self.dims, list(out_inst.data.coords.keys())) - testing.assert_list_contains(['variable_profile_height', 'image_lon', - 'image_lat'], - list(out_inst.data.coords.keys())) for cinst in self.const.instruments: for var in cinst.variables: @@ -504,10 +503,9 @@ def test_to_inst_pandas_w_pad(self): """ # Redefine the Instrument and constellation self.inst = pysat.Instrument( - inst_module=pysat.instruments.pysat_testing, use_header=True, - pad=pds.DateOffset(hours=1), num_samples=10) - self.const = pysat.Constellation(instruments=[self.inst], - use_header=True) + inst_module=pysat.instruments.pysat_testing, num_samples=10, + pad=pds.DateOffset(hours=1)) + self.const = pysat.Constellation(instruments=[self.inst]) # Load the data self.inst.load(date=self.ref_time) @@ -542,11 +540,11 @@ def test_to_inst_mult_pad_clean(self): pad = pds.DateOffset(hours=1) self.inst = [ pysat.Instrument(inst_module=pysat.instruments.pysat_testing, - use_header=True, pad=pad, num_samples=10), + pad=pad, num_samples=10), pysat.Instrument(inst_module=pysat.instruments.pysat_testing, - use_header=True, pad=2 * pad, - clean_level=clean_level, num_samples=10)] - self.const = pysat.Constellation(instruments=self.inst, use_header=True) + pad=2 * pad, 
clean_level=clean_level, + num_samples=10)] + self.const = pysat.Constellation(instruments=self.inst) # Load the Instrument and Constellation data self.inst[-1].load(date=self.ref_time) @@ -577,3 +575,91 @@ def test_to_inst_mult_pad_clean(self): list(self.inst[1].index)) return + + @pytest.mark.parametrize("method", ["del", "drop"]) + def test_delitem_all_inst(self, method): + """Test Constellation deletion of data variables from all instruments. + + Parameters + ---------- + method : str + String specifying the deletion method + + """ + # Load the Constellation data + self.const.load(date=self.ref_time) + + # Delete the UTS data from all instruments + dvar = "uts" + if method == "del": + del self.const[dvar] + else: + self.const.drop(dvar) + + # Test that this variable is gone from all Instruments + for inst in self.const.instruments: + assert dvar not in inst.variables + + # Test that the constellation variable list has been updated + for var in self.const.variables: + assert var.find(dvar) != 0 + + return + + @pytest.mark.parametrize("method", ["del", "drop"]) + def test_delitem_one_inst(self, method): + """Test Constellation deletion of data variables from one instrument. + + Parameters + ---------- + method : str + String specifying the deletion method + + """ + # Load the Constellation data + self.const.load(date=self.ref_time) + + # Delete the UTS data from the pysat testing instrument + dvar = "uts_pysat_testing" + if method == "del": + del self.const[dvar] + else: + self.const.drop(dvar) + + # Test that this variable is gone from only the desired Instrument + for inst in self.const.instruments: + if inst.platform == "pysat" and inst.name == "testing": + assert "uts" not in inst.variables + else: + assert "uts" in inst.variables + + # Test that the constellation variable list has been updated + assert dvar not in self.const.variables + + return + + def test_bad_var_drop(self): + """Check for error when deleting absent data variable.""" + # Load the Constellation data + self.const.load(date=self.ref_time) + + # Test that the correct error is raised + testing.eval_bad_input(self.const.drop, KeyError, + "not found in Constellation", + input_args=["not_a_data_variable"]) + return + + def test_partial_bad_var_drop(self, caplog): + """Check for log warning when deleting present and absent variables.""" + # Load the Constellation data + self.const.load(date=self.ref_time) + + dvars = [self.const.variables[0], "not_a_data_var"] + + # Test that the correct warning is raised + with caplog.at_level(logging.INFO, logger='pysat'): + self.const.drop(dvars) + + captured = caplog.text + assert captured.find("not found in Constellation") > 0 + return diff --git a/pysat/tests/test_files.py b/pysat/tests/test_files.py index be2b38daa..31875f797 100644 --- a/pysat/tests/test_files.py +++ b/pysat/tests/test_files.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. 
+# ---------------------------------------------------------------------------- """Test pysat Files object and code.""" import datetime as dt @@ -8,7 +16,6 @@ import os import pandas as pds import tempfile -import time import pytest @@ -148,8 +155,7 @@ def setup_method(self): self.testInst = pysat.Instrument( inst_module=pysat.instruments.pysat_testing, clean_level='clean', - temporary_file_list=self.temporary_file_list, update_files=True, - use_header=True) + temporary_file_list=self.temporary_file_list, update_files=True) # Create instrument directories in tempdir create_dir(self.testInst) @@ -215,7 +221,7 @@ def test_equality_with_copy(self): def test_equality_with_copy_with_data(self): """Test that copy is the same as original, loaded `inst.data`.""" # Load data - self.testInst.load(date=self.start, use_header=True) + self.testInst.load(date=self.start) # Make copy self.out = self.testInst.files.copy() @@ -586,7 +592,8 @@ def test_instrument_has_no_files(self): inst = pysat.Instrument(platform='pysat', name='testing', update_files=True) reload(pysat.instruments.pysat_testing) - assert(inst.files.files.empty) + assert inst.files.files.empty + return def test_instrument_has_files(self): @@ -1086,7 +1093,7 @@ def test_files_non_standard_file_format_template_fail(self, file_format): 'temporary_file_list': self.temporary_file_list} testing.eval_bad_input(pysat.Instrument, ValueError, - 'file format set to default', + 'Supplied format string', input_kwargs=in_kwargs) return @@ -1295,8 +1302,6 @@ def teardown_method(self): temporary_file_list=self.temporary_file_list) pysat.params['data_dirs'] = self.data_paths - # TODO(#871): This needs to be replaced or expanded based on the tests that - # portalocker uses def test_race_condition(self): """Test that multiple instances of pysat instrument creation run.""" processes = 5 diff --git a/pysat/tests/test_instrument.py b/pysat/tests/test_instrument.py index 705d3eaa5..0ed962333 100644 --- a/pysat/tests/test_instrument.py +++ b/pysat/tests/test_instrument.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. 
+# ---------------------------------------------------------------------------- # -*- coding: utf-8 -*- """Tests the pysat Instrument object and methods.""" @@ -12,15 +20,12 @@ import pysat import pysat.instruments.pysat_ndtesting import pysat.instruments.pysat_testing -import pysat.instruments.pysat_testing2d -import pysat.instruments.pysat_testing_xarray from pysat.tests.classes.cls_instrument_access import InstAccessTests from pysat.tests.classes.cls_instrument_integration import InstIntegrationTests from pysat.tests.classes.cls_instrument_iteration import InstIterationTests from pysat.tests.classes.cls_instrument_property import InstPropertyTests from pysat.utils import testing -from pysat.utils.time import filter_datetime_input class TestBasics(InstAccessTests, InstIntegrationTests, InstIterationTests, @@ -51,10 +56,8 @@ def setup_method(self): reload(pysat.instruments.pysat_testing) self.testInst = pysat.Instrument(platform='pysat', name='testing', - num_samples=10, - clean_level='clean', + num_samples=10, clean_level='clean', update_files=True, - use_header=True, **self.testing_kwargs) self.ref_time = pysat.instruments.pysat_testing._test_dates[''][''] self.ref_doy = int(self.ref_time.strftime('%j')) @@ -92,10 +95,8 @@ def setup_method(self): self.ref_time + pds.DateOffset(years=2) - pds.DateOffset(days=1), freq=self.freq) self.testInst = pysat.Instrument(platform='pysat', name='testing', - num_samples=10, - clean_level='clean', + num_samples=10, clean_level='clean', update_files=True, - use_header=True, file_date_range=date_range, **self.testing_kwargs) self.ref_doy = int(self.ref_time.strftime('%j')) @@ -124,10 +125,8 @@ def setup_method(self): + pds.DateOffset(years=2, days=-1), freq=self.freq) self.testInst = pysat.Instrument(platform='pysat', name='testing', - num_samples=10, - clean_level='clean', + num_samples=10, clean_level='clean', update_files=True, - use_header=True, file_date_range=date_range, **self.testing_kwargs) self.ref_doy = int(self.ref_time.strftime('%j')) @@ -157,10 +156,8 @@ def setup_method(self): + pds.DateOffset(years=5, days=-1), freq=self.freq) self.testInst = pysat.Instrument(platform='pysat', name='testing', - num_samples=10, - clean_level='clean', + num_samples=10, clean_level='clean', update_files=True, - use_header=True, file_date_range=date_range, **self.testing_kwargs) self.ref_doy = int(self.ref_time.strftime('%j')) @@ -182,11 +179,8 @@ def setup_method(self): reload(pysat.instruments.pysat_testing) imod = pysat.instruments.pysat_testing - self.testInst = pysat.Instrument(inst_module=imod, - num_samples=10, - clean_level='clean', - update_files=True, - use_header=True, + self.testInst = pysat.Instrument(inst_module=imod, num_samples=10, + clean_level='clean', update_files=True, **self.testing_kwargs) self.ref_time = imod._test_dates[''][''] self.ref_doy = int(self.ref_time.strftime('%j')) @@ -200,60 +194,6 @@ def teardown_method(self): return -# TODO(#908): remove below class when pysat_testing_xarray is removed. 
-class TestBasicsXarray(TestBasics): - """Basic tests for xarray `pysat.Instrument`.""" - - def setup_method(self): - """Set up the unit test environment for each method.""" - - reload(pysat.instruments.pysat_testing_xarray) - self.testInst = pysat.Instrument(platform='pysat', - name='testing_xarray', - num_samples=10, - clean_level='clean', - update_files=True, - use_header=True, - **self.testing_kwargs) - self.ref_time = pysat.instruments.pysat_testing_xarray._test_dates[ - ''][''] - self.ref_doy = int(self.ref_time.strftime('%j')) - self.out = None - return - - def teardown_method(self): - """Clean up the unit test environment after each method.""" - - del self.testInst, self.out, self.ref_time, self.ref_doy - return - - -# TODO(#908): remove below class when pysat_testing2d is removed. -class TestBasics2D(TestBasics): - """Basic tests for 2D pandas `pysat.Instrument`.""" - - def setup_method(self): - """Set up the unit test environment for each method.""" - - reload(pysat.instruments.pysat_testing2d) - self.testInst = pysat.Instrument(platform='pysat', name='testing2d', - num_samples=50, - clean_level='clean', - update_files=True, - use_header=True, - **self.testing_kwargs) - self.ref_time = pysat.instruments.pysat_testing2d._test_dates[''][''] - self.ref_doy = int(self.ref_time.strftime('%j')) - self.out = None - return - - def teardown_method(self): - """Clean up the unit test environment after each method.""" - - del self.testInst, self.out, self.ref_time, self.ref_doy - return - - class TestBasicsNDXarray(TestBasics): """Basic tests for ND xarray `pysat.Instrument`. @@ -267,12 +207,9 @@ def setup_method(self): """Set up the unit test environment for each method.""" reload(pysat.instruments.pysat_ndtesting) - self.testInst = pysat.Instrument(platform='pysat', - name='ndtesting', - num_samples=10, - clean_level='clean', + self.testInst = pysat.Instrument(platform='pysat', name='ndtesting', + num_samples=10, clean_level='clean', update_files=True, - use_header=True, **self.testing_kwargs) self.ref_time = pysat.instruments.pysat_ndtesting._test_dates[''][''] self.ref_doy = int(self.ref_time.strftime('%j')) @@ -288,7 +225,7 @@ def teardown_method(self): def test_setting_data_as_tuple(self): """Test setting data as a tuple.""" - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.testInst['doubleMLT'] = ('time', 2. * self.testInst['mlt'].values) assert np.all(self.testInst['doubleMLT'] == 2. 
* self.testInst['mlt']) return @@ -323,7 +260,7 @@ def test_data_access_by_2d_indices_and_name(self, index): """ - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) assert np.all(self.testInst[index, index, 'profiles'] == self.testInst.data['profiles'][index, index]) return @@ -331,7 +268,7 @@ def test_data_access_by_2d_indices_and_name(self, index): def test_data_access_by_2d_tuple_indices_and_name(self): """Check that variables and be accessed by multi-dim tuple index.""" - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) index = ([0, 1, 2, 3], [0, 1, 2, 3]) assert np.all(self.testInst[index, 'profiles'] == self.testInst.data['profiles'][index[0], index[1]]) @@ -340,7 +277,7 @@ def test_data_access_by_2d_tuple_indices_and_name(self): def test_data_access_bad_dimension_tuple(self): """Test raises ValueError for mismatched tuple index and data dims.""" - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) index = ([0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]) with pytest.raises(ValueError) as verr: @@ -353,7 +290,7 @@ def test_data_access_bad_dimension_tuple(self): def test_data_access_bad_dimension_for_multidim(self): """Test raises ValueError for mismatched index and data dimensions.""" - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) index = [0, 1, 2, 3] with pytest.raises(ValueError) as verr: @@ -380,7 +317,7 @@ def test_setting_partial_data_by_2d_indices_and_name(self, changed, fixed): """ - self.testInst.load(self.ref_time.year, self.ref_doy, use_header=True) + self.testInst.load(self.ref_time.year, self.ref_doy) self.testInst['doubleProfile'] = 2. * self.testInst['profiles'] self.testInst[changed, changed, 'doubleProfile'] = 0 assert np.all(np.all(self.testInst[fixed, fixed, 'doubleProfile'] @@ -429,7 +366,7 @@ def test_set_xarray_single_value_warnings(self, val, warn_msg): warnings.simplefilter("always") - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) with warnings.catch_warnings(record=True) as self.war: self.testInst["new_val"] = val @@ -447,7 +384,7 @@ def test_set_xarray_single_value_errors(self): """ - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) self.testInst.data = self.testInst.data.assign_coords( {'preset_val': np.array([1.0, 2.0])}) @@ -469,7 +406,7 @@ def test_set_xarray_single_value_broadcast(self, new_val): """ - self.testInst.load(date=self.ref_time, use_header=True) + self.testInst.load(date=self.ref_time) self.testInst.data = self.testInst.data.assign_coords( {'preset_val': 1.0}) @@ -489,12 +426,10 @@ def setup_method(self): reload(pysat.instruments.pysat_testing) self.testInst = pysat.Instrument(platform='pysat', name='testing', - num_samples=10, - clean_level='clean', + num_samples=10, clean_level='clean', update_files=True, mangle_file_dates=True, strict_time_flag=True, - use_header=True, **self.testing_kwargs) self.ref_time = pysat.instruments.pysat_testing._test_dates[''][''] self.ref_doy = int(self.ref_time.strftime('%j')) @@ -529,6 +464,19 @@ def test_creating_empty_instrument_object(self): assert isinstance(self.empty_inst, pysat.Instrument) return + def test_load_empty_instrument_no_files_error(self): + """Ensure error loading Instrument with no files.""" + + # Trying to load when there are no files produces multiple warnings + # and one Error. 
+ with warnings.catch_warnings(record=True) as self.war: + testing.eval_bad_input(self.empty_inst.load, IndexError, + 'index 0 is out of bounds') + estr = ['No files found for Instrument', 'IndexError will not be'] + testing.eval_warnings(self.war, estr, warn_type=[UserWarning, + DeprecationWarning]) + return + def test_empty_repr_eval(self): """Test that repr functions on empty `Instrument`.""" @@ -573,11 +521,11 @@ def test_eq_different_object(self): obj1 = pysat.Instrument(platform='pysat', name='testing', num_samples=10, clean_level='clean', - update_files=True, use_header=True) + update_files=True) - obj2 = pysat.Instrument(platform='pysat', name='testing_xarray', + obj2 = pysat.Instrument(platform='pysat', name='ndtesting', num_samples=10, clean_level='clean', - update_files=True, use_header=True) + update_files=True) assert not (obj1 == obj2) return @@ -613,199 +561,20 @@ def eval_warnings(self): testing.eval_warnings(self.war, self.warn_msgs) return - def test_instrument_labels(self): - """Test deprecation of `labels` kwarg in Instrument.""" - self.in_kwargs['labels'] = { - 'units': ('units', str), 'name': ('long_name', str), - 'notes': ('notes', str), 'desc': ('desc', str), - 'min_val': ('value_min', float), 'max_val': ('value_max', float), - 'fill_val': ('fill', float)} - - # Catch the warnings - with warnings.catch_warnings(record=True) as self.war: - tinst = pysat.Instrument(use_header=True, **self.in_kwargs) - - self.warn_msgs = np.array(["`labels` is deprecated, use `meta_kwargs`"]) - - # Evaluate the warning output - self.eval_warnings() - - # Evaluate the performance - assert float in tinst.meta.labels.label_type['fill_val'] - return - - @pytest.mark.parametrize('use_kwargs', [True, False]) - def test_instrument_meta_labels(self, use_kwargs): - """Test deprecation of `meta_labels` attribute in Instrument. - - Parameters - ---------- - use_kwargs : bool - If True, specify labels on input. If False, use defaults. - - """ - if use_kwargs: - self.in_kwargs['meta_kwargs'] = {'labels': { - 'units': ('units', str), 'name': ('long_name', str), - 'notes': ('notes', str), 'desc': ('desc', str), - 'min_val': ('value_min', float), - 'max_val': ('value_max', float), 'fill_val': ('fill', float)}} - - # Catch the warnings - with warnings.catch_warnings(record=True) as self.war: - tinst = pysat.Instrument(use_header=True, **self.in_kwargs) - labels = tinst.meta_labels - - self.warn_msgs = np.array(["Deprecated attribute, returns `meta_kwarg"]) - - # Evaluate the warning output - self.eval_warnings() - - # Evaluate the performance - if not use_kwargs: - self.in_kwargs['meta_kwargs'] = {'labels': { - 'units': ('units', str), 'name': ('long_name', str), - 'notes': ('notes', str), 'desc': ('desc', str), - 'min_val': ('value_min', (float, int)), - 'max_val': ('value_max', (float, int)), - 'fill_val': ('fill', (float, int, str))}} - - assert labels == self.in_kwargs['meta_kwargs']['labels'] - return - - def test_generic_meta_translator(self): - """Test deprecation of `generic_meta_translator`.""" - - # Catch the warnings - with warnings.catch_warnings(record=True) as self.war: - tinst = pysat.Instrument(use_header=True, **self.in_kwargs) - tinst.generic_meta_translator(tinst.meta) - - self.warn_msgs = np.array(["".join(["This function has been deprecated", - ". 
Please see "])]) - - # Evaluate the warning output - self.eval_warnings() - return - - def test_download_freq_kwarg(self): - """Test deprecation of download kwarg `freq`.""" - - # Catch the warnings - with warnings.catch_warnings(record=True) as self.war: - tinst = pysat.Instrument(use_header=True, **self.in_kwargs) - tinst.download(start=self.ref_time, freq='D') - - self.warn_msgs = np.array(["".join(["`pysat.Instrument.download` kwarg", - " `freq` has been deprecated and ", - "will be removed in pysat ", - "3.2.0+"])]) - - # Evaluate the warning output - self.eval_warnings() - return - - def test_download_travis_attr(self): - """Test deprecation of instrument attribute `_test_download_travis`.""" - - inst_module = pysat.instruments.pysat_testing - # Add deprecated attribute. - inst_module._test_download_travis = {'': {'': False}} - - self.warn_msgs = np.array([" ".join(["`_test_download_travis` has been", - "deprecated and will be replaced", - "by `_test_download_ci` in", - "3.2.0+"])]) - - # Catch the warnings. - with warnings.catch_warnings(record=True) as self.war: - tinst = pysat.Instrument(inst_module=inst_module, use_header=True) - - # Ensure attributes set properly. - assert tinst._test_download_ci is False - - # Evaluate the warning output - self.eval_warnings() - return - - def test_filter_netcdf4_metadata(self): - """Test deprecation warning generated by `_filter_netcdf4_metadata`.""" - - # Catch the warnings - with warnings.catch_warnings(record=True) as self.war: - tinst = pysat.Instrument(use_header=True, **self.in_kwargs) - tinst.load(date=self.ref_time, use_header=True) - mdata_dict = tinst.meta._data.to_dict() - tinst._filter_netcdf4_metadata(mdata_dict, - coltype='str') - - self.warn_msgs = np.array(["".join(["`pysat.Instrument.", - "_filter_netcdf4_metadata` ", - "has been deprecated and will be ", - "removed in pysat 3.2.0+. Use ", - "`pysat.utils.io.filter_netcdf4_", - "metadata` instead."])]) - - # Evaluate the warning output - self.eval_warnings() - return - - def test_to_netcdf4(self): - """Test deprecation warning generated by `to_netcdf4`.""" - - # Catch the warnings - with warnings.catch_warnings(record=True) as self.war: - tinst = pysat.Instrument(use_header=True, **self.in_kwargs) - try: - tinst.to_netcdf4() - except ValueError: - pass - - self.warn_msgs = np.array(["".join(["`fname` as a kwarg has been ", - "deprecated, must supply a ", - "filename 3.2.0+"])]) - - # Evaluate the warning output - self.eval_warnings() - return - - @pytest.mark.parametrize("kwargs", [{'inst_id': None}, {'tag': None}]) - def test_inst_init_with_none(self, kwargs): - """Check that instantiation with None raises a DeprecationWarning. - - Parameters - ---------- - kwargs : dict - Dictionary of optional kwargs to pass through for instrument - instantiation. - - """ - - with warnings.catch_warnings(record=True) as self.war: - pysat.Instrument('pysat', 'testing', use_header=True, **kwargs) - - self.warn_msgs = np.array(["".join(["The usage of None in `tag` and ", - "`inst_id` has been deprecated ", - "and will be removed in 3.2.0+. ", - "Please use '' instead of None."])]) - - # Evaluate the warning output - self.eval_warnings() - return - + # TODO(#1020): Remove test when keyword `use_header` is removed. def test_load_use_header(self): """Test that user is informed of MetaHeader on load.""" # Determine the expected warnings self.warn_msgs = np.array(["".join(['Meta now contains a class for ', 'global metadata (MetaHeader). 
', - 'Default attachment of global ', - 'attributes to Instrument will be', - ' Deprecated in pysat 3.2.0+. Set ', - '`use_header=True` in this load ', - 'call or on Instrument ', - 'instantiation to remove this', - ' warning.'])]) + 'Allowing attachment of global ', + 'attributes to Instrument through', + ' `use_header=False` will be ', + 'Deprecated in pysat 3.3.0+. ', + 'Remove `use_header` kwarg (now ', + 'same as `use_header=True`) to ', + 'stop this warning.'])]) # Capture the warnings with warnings.catch_warnings(record=True) as self.war: @@ -815,23 +584,3 @@ def test_load_use_header(self): # Evaluate the warning output self.eval_warnings() return - - def test_set_2d_pandas_data(self): - """Check that setting 2D data for pandas raises a DeprecationWarning.""" - - test_inst = pysat.Instrument('pysat', 'testing2d', use_header=True) - test_date = pysat.instruments.pysat_testing2d._test_dates[''][''] - test_inst.load(date=test_date) - with warnings.catch_warnings(record=True) as war: - test_inst['new_profiles'] = 2 * test_inst['profiles'] - - warn_msgs = [" ".join(["Support for 2D pandas instrument", - "data has been deprecated and will", - "be removed in 3.2.0+."])] - - # Ensure the minimum number of warnings were raised. - assert len(war) >= len(warn_msgs) - - # Test the warning messages, ensuring each attribute is present. - testing.eval_warnings(war, warn_msgs) - return diff --git a/pysat/tests/test_instrument_custom.py b/pysat/tests/test_instrument_custom.py index 2c3f5ba30..574a72edd 100644 --- a/pysat/tests/test_instrument_custom.py +++ b/pysat/tests/test_instrument_custom.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. 
+# ---------------------------------------------------------------------------- """Unit tests for the `custom_attach` methods for `pysat.Instrument`.""" import copy @@ -43,7 +51,7 @@ def setup_method(self): self.testInst = pysat.Instrument('pysat', 'testing', num_samples=10, clean_level='clean', - update_files=False, use_header=True) + update_files=False) self.out = '' return @@ -73,9 +81,7 @@ def setup_method(self): """Set up the unit test environment for each method.""" self.testInst = pysat.Instrument('pysat', 'testing', num_samples=10, - clean_level='clean', - update_files=True, - use_header=True) + clean_level='clean', update_files=True) self.load_date = pysat.instruments.pysat_testing._test_dates[''][''] self.testInst.load(date=self.load_date) self.custom_args = [2] @@ -235,10 +241,9 @@ class TestBasicsXarray(TestBasics): def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing_xarray', - num_samples=10, clean_level='clean', - use_header=True) - self.load_date = pysat.instruments.pysat_testing_xarray._test_dates + self.testInst = pysat.Instrument('pysat', 'ndtesting', + num_samples=10, clean_level='clean') + self.load_date = pysat.instruments.pysat_ndtesting._test_dates self.load_date = self.load_date[''][''] self.testInst.load(date=self.load_date) self.custom_args = [2] @@ -259,8 +264,7 @@ def setup_method(self): self.testConst = pysat.Constellation(instruments=[ pysat.Instrument('pysat', 'testing', num_samples=10, - clean_level='clean', update_files=True, - use_header=True) + clean_level='clean', update_files=True) for i in range(5)]) self.load_date = pysat.instruments.pysat_testing._test_dates[''][''] self.testConst.load(date=self.load_date) @@ -371,8 +375,7 @@ def test_custom_inst_keyword_instantiation(self): {'function': mult_data, 'args': self.custom_args}] testConst2 = pysat.Constellation(instruments=[ pysat.Instrument('pysat', 'testing', num_samples=10, - clean_level='clean', custom=custom, - use_header=True) + clean_level='clean', custom=custom) for i in range(5)]) # Ensure all instruments within both constellations have the same @@ -400,7 +403,7 @@ def test_custom_const_keyword_instantiation(self): 'apply_inst': False}] testConst2 = pysat.Constellation( instruments=[pysat.Instrument('pysat', 'testing', num_samples=10, - clean_level='clean', use_header=True) + clean_level='clean') for i in range(5)], custom=custom) # Ensure both constellations have the same custom_* attributes diff --git a/pysat/tests/test_instrument_index.py b/pysat/tests/test_instrument_index.py index ea41140a5..512b2bb7b 100644 --- a/pysat/tests/test_instrument_index.py +++ b/pysat/tests/test_instrument_index.py @@ -1,9 +1,14 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. 
+# ---------------------------------------------------------------------------- """Unit tests for the `pysat.Instrument.index` attribute.""" -import datetime as dt from importlib import reload -import numpy as np -import warnings import pytest @@ -47,13 +52,9 @@ def test_index_error_messages(self, kwargs, msg): """ - test_inst = pysat.Instrument(platform='pysat', - name=self.name, - num_samples=10, - clean_level='clean', - update_files=True, - strict_time_flag=True, - use_header=True, + test_inst = pysat.Instrument(platform='pysat', name=self.name, + num_samples=10, clean_level='clean', + update_files=True, strict_time_flag=True, **kwargs) year, doy = pysat.utils.time.getyrdoy(self.ref_time) testing.eval_bad_input(test_inst.load, ValueError, msg, @@ -76,66 +77,3 @@ def teardown_method(self): del self.ref_time, self.name return - - -class TestDeprecation(object): - """Unit test for deprecation warnings for index.""" - - def setup_method(self): - """Set up the unit test environment for each method.""" - - warnings.simplefilter("always", DeprecationWarning) - self.ref_time = pysat.instruments.pysat_testing._test_dates[''][''] - self.warn_msgs = [] - self.war = "" - return - - def teardown_method(self): - """Clean up the unit test environment after each method.""" - - del self.ref_time, self.warn_msgs, self.war - return - - def eval_warnings(self): - """Evaluate the number and message of the raised warnings.""" - - # Ensure the minimum number of warnings were raised. - assert len(self.war) >= len(self.warn_msgs) - - # Test the warning messages, ensuring each attribute is present. - testing.eval_warnings(self.war, self.warn_msgs) - return - - # TODO(#1094): Remove in pysat 3.2.0, potentially with class - @pytest.mark.parametrize('name', ['testing', 'ndtesting', 'testing_xarray', - 'testing2d']) - def test_kwarg_malformed_index(self, name): - """Test deprecation of `malformed_index` kwarg. - - Parameters - ---------- - name : str - name of instrument that uses the deprecated `malformed_index` kwarg. - - """ - - test_inst = pysat.Instrument(platform='pysat', - name=name, - strict_time_flag=False, - use_header=True, - malformed_index=True) - - # Catch the warnings - with warnings.catch_warnings(record=True) as self.war: - test_inst.load(date=self.ref_time) - - self.warn_msgs = np.array([" ".join(["The kwarg malformed_index has", - "been deprecated"])]) - - # Evaluate the warning output - self.eval_warnings() - - # Check that resulting index is both non-monotonic and non-unique - assert not test_inst.index.is_monotonic_increasing - assert not test_inst.index.is_unique - return diff --git a/pysat/tests/test_instrument_listgen.py b/pysat/tests/test_instrument_listgen.py index 8414f9fbb..71161fe42 100644 --- a/pysat/tests/test_instrument_listgen.py +++ b/pysat/tests/test_instrument_listgen.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Unit tests for the list generation methods in `pysat.Instrument`.""" from importlib import reload @@ -45,7 +53,7 @@ def test_for_missing_test_date(self): # If an instrument does not have the _test_dates attribute, it should # still be added to the list for other checks to be run. - # This will be caught later by InstTestClass.test_instrument_test_dates. 
+ # This will be caught later by InstLibTests.test_instrument_test_dates. assert not hasattr(self.test_library.pysat_testing, '_test_dates') inst_list = generate_instrument_list(self.test_library) assert 'pysat_testing' in inst_list['names'] diff --git a/pysat/tests/test_instrument_padding.py b/pysat/tests/test_instrument_padding.py index 89610277b..149d1610d 100644 --- a/pysat/tests/test_instrument_padding.py +++ b/pysat/tests/test_instrument_padding.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Unit tests for the padding methods in `pysat.Instrument`.""" import datetime as dt @@ -10,10 +18,7 @@ import pysat import pysat.instruments.pysat_ndtesting import pysat.instruments.pysat_testing -import pysat.instruments.pysat_testing2d -import pysat.instruments.pysat_testing_xarray from pysat.utils import testing -from pysat.utils.time import filter_datetime_input class TestDataPaddingbyFile(object): @@ -25,15 +30,11 @@ def setup_method(self): reload(pysat.instruments.pysat_testing) self.testInst = pysat.Instrument(platform='pysat', name='testing', clean_level='clean', - pad={'minutes': 5}, - update_files=True, - use_header=True) + pad={'minutes': 5}, update_files=True) self.testInst.bounds = ('2008-01-01.nofile', '2010-12-31.nofile') self.rawInst = pysat.Instrument(platform='pysat', name='testing', - clean_level='clean', - update_files=True, - use_header=True) + clean_level='clean', update_files=True) self.rawInst.bounds = self.testInst.bounds self.delta = dt.timedelta(seconds=0) return @@ -144,20 +145,16 @@ class TestDataPaddingbyFileXarray(TestDataPaddingbyFile): def setup_method(self): """Set up the unit test environment for each method.""" - reload(pysat.instruments.pysat_testing_xarray) - self.testInst = pysat.Instrument(platform='pysat', - name='testing_xarray', + reload(pysat.instruments.pysat_ndtesting) + self.testInst = pysat.Instrument(platform='pysat', name='ndtesting', + num_samples=86400, sample_rate='1s', clean_level='clean', - pad={'minutes': 5}, - update_files=True, - use_header=True) + pad={'minutes': 5}, update_files=True) self.testInst.bounds = ('2008-01-01.nofile', '2010-12-31.nofile') - self.rawInst = pysat.Instrument(platform='pysat', - name='testing_xarray', - clean_level='clean', - update_files=True, - use_header=True) + self.rawInst = pysat.Instrument(platform='pysat', name='ndtesting', + num_samples=86400, sample_rate='1s', + clean_level='clean', update_files=True) self.rawInst.bounds = self.testInst.bounds self.delta = dt.timedelta(seconds=0) return @@ -180,14 +177,11 @@ def setup_method(self): clean_level='clean', update_files=True, sim_multi_file_right=True, - pad={'minutes': 5}, - use_header=True) + pad={'minutes': 5}) self.rawInst = pysat.Instrument(platform='pysat', name='testing', - tag='', - clean_level='clean', + tag='', clean_level='clean', update_files=True, - sim_multi_file_right=True, - use_header=True) + sim_multi_file_right=True) self.testInst.bounds = ('2008-01-01.nofile', '2010-12-31.nofile') self.rawInst.bounds = self.testInst.bounds self.delta = dt.timedelta(seconds=0) @@ -206,20 +200,19 @@ class TestOffsetRightFileDataPaddingBasicsXarray(TestDataPaddingbyFile): def setup_method(self): """Set up the unit test environment for each method.""" - 
reload(pysat.instruments.pysat_testing_xarray) + reload(pysat.instruments.pysat_ndtesting) self.testInst = pysat.Instrument(platform='pysat', - name='testing_xarray', + name='ndtesting', + num_samples=86400, + sample_rate='1s', clean_level='clean', update_files=True, sim_multi_file_right=True, - pad={'minutes': 5}, - use_header=True) - self.rawInst = pysat.Instrument(platform='pysat', - name='testing_xarray', - clean_level='clean', - update_files=True, - sim_multi_file_right=True, - use_header=True) + pad={'minutes': 5}) + self.rawInst = pysat.Instrument(platform='pysat', name='ndtesting', + num_samples=86400, sample_rate='1s', + clean_level='clean', update_files=True, + sim_multi_file_right=True) self.testInst.bounds = ('2008-01-01.nofile', '2010-12-31.nofile') self.rawInst.bounds = self.testInst.bounds self.delta = dt.timedelta(seconds=0) @@ -240,16 +233,12 @@ def setup_method(self): reload(pysat.instruments.pysat_testing) self.testInst = pysat.Instrument(platform='pysat', name='testing', - clean_level='clean', - update_files=True, + clean_level='clean', update_files=True, sim_multi_file_left=True, - pad={'minutes': 5}, - use_header=True) + pad={'minutes': 5}) self.rawInst = pysat.Instrument(platform='pysat', name='testing', - clean_level='clean', - update_files=True, - sim_multi_file_left=True, - use_header=True) + clean_level='clean', update_files=True, + sim_multi_file_left=True) self.testInst.bounds = ('2008-01-01.nofile', '2010-12-31.nofile') self.rawInst.bounds = self.testInst.bounds self.delta = dt.timedelta(seconds=0) @@ -272,8 +261,7 @@ def setup_method(self): self.testInst = pysat.Instrument(platform='pysat', name='testing', clean_level='clean', pad={'minutes': 5}, - update_files=True, - use_header=True) + update_files=True) self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 self.delta = dt.timedelta(minutes=5) @@ -313,8 +301,7 @@ def test_data_padding_offset_instantiation(self, pad): self.testInst = pysat.Instrument(platform='pysat', name='testing', clean_level='clean', pad=pad, - update_files=True, - use_header=True) + update_files=True) self.testInst.load(self.ref_time.year, self.ref_doy, verifyPad=True) self.eval_index_start_end() return @@ -348,8 +335,7 @@ def test_padding_exceeds_load_window(self): self.testInst = pysat.Instrument(platform='pysat', name='testing', clean_level='clean', pad={'days': 2}, - update_files=True, - use_header=True) + update_files=True) testing.eval_bad_input(self.testInst.load, ValueError, 'Data padding window must be shorter than ', @@ -480,8 +466,7 @@ def setup_method(self): clean_level='clean', pad={'minutes': 5}, non_monotonic_index=True, - update_files=True, - use_header=True) + update_files=True) self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 self.delta = dt.timedelta(minutes=5) @@ -500,13 +485,14 @@ class TestDataPaddingXArray(TestDataPadding): def setup_method(self): """Set up the unit test environment for each method.""" - reload(pysat.instruments.pysat_testing_xarray) + reload(pysat.instruments.pysat_ndtesting) self.testInst = pysat.Instrument(platform='pysat', - name='testing_xarray', + name='ndtesting', + num_samples=86400, + sample_rate='1s', clean_level='clean', pad={'minutes': 5}, - update_files=True, - use_header=True) + update_files=True) self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 self.delta = dt.timedelta(minutes=5) @@ -525,14 +511,15 @@ class TestDataPaddingXArrayNonMonotonic(TestDataPadding): def setup_method(self): """Set up the unit test environment for each method.""" - 
reload(pysat.instruments.pysat_testing_xarray) + reload(pysat.instruments.pysat_ndtesting) self.testInst = pysat.Instrument(platform='pysat', - name='testing_xarray', + name='ndtesting', + num_samples=86400, + sample_rate='1s', clean_level='clean', pad={'minutes': 5}, non_monotonic_index=True, - update_files=True, - use_header=True) + update_files=True) self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 self.delta = dt.timedelta(minutes=5) @@ -556,8 +543,7 @@ def setup_method(self): clean_level='clean', update_files=True, sim_multi_file_right=True, - pad={'minutes': 5}, - use_header=True) + pad={'minutes': 5}) self.testInst.multi_file_day = True self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 @@ -583,8 +569,7 @@ def setup_method(self): update_files=True, sim_multi_file_right=True, non_monotonic_index=True, - pad={'minutes': 5}, - use_header=True) + pad={'minutes': 5}) self.testInst.multi_file_day = True self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 @@ -604,14 +589,15 @@ class TestMultiFileRightDataPaddingBasicsXarray(TestDataPadding): def setup_method(self): """Set up the unit test environment for each method.""" - reload(pysat.instruments.pysat_testing_xarray) + reload(pysat.instruments.pysat_ndtesting) self.testInst = pysat.Instrument(platform='pysat', - name='testing_xarray', + name='ndtesting', + num_samples=86400, + sample_rate='1s', clean_level='clean', update_files=True, sim_multi_file_right=True, - pad={'minutes': 5}, - use_header=True) + pad={'minutes': 5}) self.testInst.multi_file_day = True self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 @@ -631,15 +617,16 @@ class TestMultiFileRightDataPaddingBasicsXarrayNonMonotonic(TestDataPadding): def setup_method(self): """Set up the unit test environment for each method.""" - reload(pysat.instruments.pysat_testing_xarray) + reload(pysat.instruments.pysat_ndtesting) self.testInst = pysat.Instrument(platform='pysat', - name='testing_xarray', + name='ndtesting', + num_samples=86400, + sample_rate='1s', clean_level='clean', update_files=True, sim_multi_file_right=True, non_monotonic_index=True, - pad={'minutes': 5}, - use_header=True) + pad={'minutes': 5}) self.testInst.multi_file_day = True self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 @@ -665,8 +652,7 @@ def setup_method(self): clean_level='clean', update_files=True, sim_multi_file_left=True, - pad={'minutes': 5}, - use_header=True) + pad={'minutes': 5}) self.testInst.multi_file_day = True self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 @@ -693,8 +679,7 @@ def setup_method(self): update_files=True, sim_multi_file_left=True, non_monotonic_index=True, - pad={'minutes': 5}, - use_header=True) + pad={'minutes': 5}) self.testInst.multi_file_day = True self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 @@ -714,14 +699,15 @@ class TestMultiFileLeftDataPaddingBasicsXarray(TestDataPadding): def setup_method(self): """Set up the unit test environment for each method.""" - reload(pysat.instruments.pysat_testing_xarray) + reload(pysat.instruments.pysat_ndtesting) self.testInst = pysat.Instrument(platform='pysat', - name='testing_xarray', + name='ndtesting', + num_samples=86400, + sample_rate='1s', clean_level='clean', update_files=True, sim_multi_file_left=True, - pad={'minutes': 5}, - use_header=True) + pad={'minutes': 5}) self.testInst.multi_file_day = True self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 @@ -741,15 +727,13 @@ class TestMultiFileLeftDataPaddingBasicsXarrayNonMonotonic(TestDataPadding): def setup_method(self): """Set 
up the unit test environment for each method.""" - reload(pysat.instruments.pysat_testing_xarray) - self.testInst = pysat.Instrument(platform='pysat', - name='testing_xarray', - clean_level='clean', - update_files=True, + reload(pysat.instruments.pysat_ndtesting) + self.testInst = pysat.Instrument(platform='pysat', name='ndtesting', + clean_level='clean', update_files=True, + num_samples=86400, sample_rate='1s', sim_multi_file_left=True, non_monotonic_index=True, - pad={'minutes': 5}, - use_header=True) + pad={'minutes': 5}) self.testInst.multi_file_day = True self.ref_time = dt.datetime(2009, 1, 2) self.ref_doy = 2 diff --git a/pysat/tests/test_instruments.py b/pysat/tests/test_instruments.py index 605270288..70a4edcf6 100644 --- a/pysat/tests/test_instruments.py +++ b/pysat/tests/test_instruments.py @@ -1,3 +1,11 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Unit and Integration Tests for each instrument module. Note @@ -9,15 +17,12 @@ import datetime as dt import numpy as np import pandas as pds -import warnings import pytest import pysat import pysat.tests.classes.cls_instrument_library as cls_inst_lib from pysat.tests.classes.cls_instrument_library import InstLibTests -import pysat.tests.instrument_test_class as itc -from pysat.utils import testing # Optional code to pass through user and password info to test instruments # dict, keyed by pysat instrument, with a list of usernames and passwords @@ -65,11 +70,10 @@ def test_inst_start_time(self, inst_dict, kwarg, output): _, date = cls_inst_lib.initialize_test_inst_and_date(inst_dict) if kwarg: self.test_inst = pysat.Instrument( - inst_module=inst_dict['inst_module'], start_time=kwarg, - use_header=True) + inst_module=inst_dict['inst_module'], start_time=kwarg) else: self.test_inst = pysat.Instrument( - inst_module=inst_dict['inst_module'], use_header=True) + inst_module=inst_dict['inst_module']) self.test_inst.load(date=date) @@ -93,7 +97,7 @@ def test_inst_num_samples(self, inst_dict): num = 10 _, date = cls_inst_lib.initialize_test_inst_and_date(inst_dict) self.test_inst = pysat.Instrument(inst_module=inst_dict['inst_module'], - num_samples=num, use_header=True) + num_samples=num) self.test_inst.load(date=date) assert len(self.test_inst['uts']) == num @@ -116,7 +120,7 @@ def test_inst_file_date_range(self, inst_dict): _, date = cls_inst_lib.initialize_test_inst_and_date(inst_dict) self.test_inst = pysat.Instrument(inst_module=inst_dict['inst_module'], file_date_range=file_date_range, - update_files=True, use_header=True) + update_files=True) file_list = self.test_inst.files.files assert all(file_date_range == file_list.index) @@ -135,8 +139,7 @@ def test_inst_max_latitude(self, inst_dict): """ _, date = cls_inst_lib.initialize_test_inst_and_date(inst_dict) - self.test_inst = pysat.Instrument(inst_module=inst_dict['inst_module'], - use_header=True) + self.test_inst = pysat.Instrument(inst_module=inst_dict['inst_module']) if self.test_inst.name != 'testmodel': self.test_inst.load(date=date, max_latitude=10.) assert np.all(np.abs(self.test_inst['latitude']) <= 10.) 
@@ -146,97 +149,67 @@ def test_inst_max_latitude(self, inst_dict):
 
         return
 
-
-class TestDeprecation(object):
-    """Unit test for deprecation warnings."""
-
-    def setup_method(self):
-        """Set up the unit test environment for each method."""
-
-        warnings.simplefilter("always", DeprecationWarning)
-        return
-
-    def teardown_method(self):
-        """Clean up the unit test environment after each method."""
-
-        return
-
-    def test_subclass_inst_test_class(self):
-        """Check that subclass of old instrument library tests is deprecated."""
-
-        with warnings.catch_warnings(record=True) as war:
-
-            class OldClass(itc.InstTestClass):
-                """Dummy subclass."""
-
-                pass
-
-        self.warn_msgs = ["`InstTestClass` has been deprecated",
-                          "`test_load` now uses `@pytest.mark.load_options`"]
-        self.warn_msgs = np.array(self.warn_msgs)
-
-        # Ensure the minimum number of warnings were raised
-        assert len(war) >= len(self.warn_msgs)
-
-        # Test the warning messages, ensuring each attribute is present
-        testing.eval_warnings(war, self.warn_msgs)
-        return
-
-    def test_old_initialize_inst_and_date(self):
-        """Check that subclass of old instrument library tests is deprecated."""
-
-        with warnings.catch_warnings(record=True) as war:
-            try:
-                itc.initialize_test_inst_and_date({})
-            except KeyError:
-                # empty dict produces KeyError
-                pass
-
-        self.warn_msgs = ["`initialize_test_inst_and_date` has been moved to"]
-        self.warn_msgs = np.array(self.warn_msgs)
-
-        # Ensure the minimum number of warnings were raised
-        assert len(war) >= len(self.warn_msgs)
-
-        # Test the warning messages, ensuring each attribute is present
-        testing.eval_warnings(war, self.warn_msgs)
-        return
-
-    def test_old_pytest_mark_presence(self):
-        """Test that pytest mark is backwards compatible."""
-
-        n_args = len(InstLibTests.test_load.pytestmark)
-        mark_names = [InstLibTests.test_load.pytestmark[j].name
-                      for j in range(0, n_args)]
-
-        assert "download" in mark_names
-
-    @pytest.mark.parametrize("inst_module", ['pysat_testing2d',
-                                             'pysat_testing_xarray',
-                                             'pysat_testing2d_xarray'])
-    def test_deprecated_instruments(self, inst_module):
-        """Check that instantiating old instruments raises a DeprecationWarning.
+    @pytest.mark.second
+    @pytest.mark.parametrize("clean_level", ['clean', 'dusty', 'dirty'])
+    @pytest.mark.parametrize("change", [True, False])
+    @pytest.mark.parametrize('warn_type', ['logger', 'warning', 'error',
+                                           'mult'])
+    @pytest.mark.parametrize("inst_dict", instruments['download'])
+    def test_clean_with_warnings(self, clean_level, change, warn_type,
+                                 inst_dict, caplog):
+        """Run `test_clean_warn` with different warning behaviours.
 
         Parameters
         ----------
-        inst_module : str
-            name of deprecated module.
+        clean_level : str
+            Cleanliness level for loaded instrument data; must be a level
+            that runs the clean routine (i.e., not 'none').
+        change : bool
+            Specify whether or not the clean level should change.
+        warn_type : str
+            Desired type of warning or error to be raised.
+        inst_dict : dict
+            One of the dictionaries returned from
+            `InstLibTests.initialize_test_package` with instruments to test.
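+
+        Note
+        ----
+        Editor's note: the `_clean_warn` attribute assigned in the body below
+        uses the nested form {inst_id: {tag: {clean_level: [(warn_type,
+        warn_level, warn_msg, final_level), ...]}}}, matching the entries
+        consumed by the `test_clean_warn` call at the end of this test.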
""" + # Set the default values + warn_level = {'logger': 'WARN', 'warning': UserWarning, + 'error': ValueError} + warn_msg = 'Default warning message' + if change: + final_level = 'none' if clean_level == 'clean' else 'clean' + else: + final_level = clean_level + + # Construct the expected warnings + if warn_type == 'mult': + # Note that we cannot test errors along with other warnings + # TODO(#1184) test for both warnings and errors + inst_dict['inst_module']._clean_warn = { + inst_dict['inst_id']: {inst_dict['tag']: {clean_level: [ + ('warning', warn_level['warning'], warn_msg, final_level), + ('logger', warn_level['logger'], warn_msg, final_level)]}}} + else: + inst_dict['inst_module']._clean_warn = { + inst_dict['inst_id']: {inst_dict['tag']: {clean_level: [ + (warn_type, warn_level[warn_type], warn_msg, + final_level)]}}} + + # Set the additional Instrument kwargs + if 'kwargs' in inst_dict.keys(): + # Ensure the test instrument cleaning kwarg is reset + inst_dict['kwargs']['test_clean_kwarg'] = {'change': final_level} + else: + inst_dict['kwargs'] = {'test_clean_kwarg': {'change': final_level}} - with warnings.catch_warnings(record=True) as war: - pysat.Instrument(inst_module=getattr(pysat.instruments, - inst_module), - use_header=True) - - warn_msgs = [" ".join(["The instrument module", - "`{:}`".format(inst_module), - "has been deprecated and will be removed", - "in 3.2.0+."])] + if warn_type == 'mult': + inst_dict['kwargs']['test_clean_kwarg']['logger'] = warn_msg + inst_dict['kwargs']['test_clean_kwarg']['warning'] = warn_msg + else: + inst_dict['kwargs']['test_clean_kwarg'][warn_type] = warn_msg - # Ensure the minimum number of warnings were raised. - assert len(war) >= len(warn_msgs) + # Run the test + self.test_clean_warn(clean_level, inst_dict, caplog) - # Test the warning messages, ensuring each attribute is present. - testing.eval_warnings(war, warn_msgs) return diff --git a/pysat/tests/test_meta.py b/pysat/tests/test_meta.py index a4ec9aec8..ec4ff8787 100644 --- a/pysat/tests/test_meta.py +++ b/pysat/tests/test_meta.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Tests the pysat Meta object.""" @@ -67,7 +70,7 @@ class `mutable` attribute. (default=False) """ if inst_kwargs is not None: # Load the test Instrument - self.testInst = pysat.Instrument(**inst_kwargs, use_header=True) + self.testInst = pysat.Instrument(**inst_kwargs) stime = self.testInst.inst_module._test_dates[''][''] self.testInst.load(date=stime) @@ -105,63 +108,6 @@ def eval_meta_settings(self, isfloat=True): lkey.__repr__(), self.meta[self.dval, lkey].__repr__(), self.default_val[lkey].__repr__()) - assert 'children' not in self.meta.data.columns - assert self.dval not in self.meta.keys_nD() - return - - def eval_ho_meta_settings(self, meta_dict): - """Test the Meta settings for higher order data. 
- - Parameters - ---------- - meta_dict : dict - Dict with meta data labels as keys and values to test against - - """ - - # Test the ND metadata results - testing.assert_list_contains(self.frame_list, - list(self.meta.ho_data[self.dval].keys())) - testing.assert_list_contains( - self.frame_list, list(self.meta[self.dval]['children'].keys())) - - # Test the meta settings at the base and nD level - for label in meta_dict.keys(): - if label == 'meta': - testing.assert_lists_equal( - list(self.meta[self.dval]['children'].attrs()), - list(meta_dict[label].attrs())) - testing.assert_lists_equal( - list(self.meta[self.dval]['children'].keys()), - list(meta_dict[label].keys())) - - for lvar in self.meta[self.dval]['children'].attrs(): - for dvar in self.meta[self.dval]['children'].keys(): - if lvar in self.default_nan: - assert np.isnan( - self.meta[self.dval]['children'][dvar, lvar]), \ - "{:s} child {:s} {:s} value {:} != NaN".format( - self.dval.__repr__(), dvar.__repr__(), - lvar.__repr__(), - self.meta[self.dval]['children'][ - dvar, lvar].__repr__()) - else: - assert (self.meta[self.dval]['children'][dvar, lvar] - == meta_dict[label][dvar, lvar]), \ - "{:s} child {:s} {:s} value {:} != {:}".format( - self.dval.__repr__(), dvar.__repr__(), - lvar.__repr__(), - self.meta[self.dval]['children'][ - dvar, lvar].__repr__(), - meta_dict[label][dvar, lvar].__repr__()) - else: - assert self.meta[self.dval]['children'].hasattr_case_neutral( - label) - assert self.meta[self.dval, label] == meta_dict[label], \ - "{:s} label value {:} != {:}".format( - label, self.meta[self.dval, label].__repr__(), - meta_dict[label].__repr__()) - return # ----------------------- @@ -183,6 +129,13 @@ def test_pop_w_bad_key(self): input_args=['not_a_key']) return + def test_drop_w_bad_name(self): + """Test that a bad name will raise a KeyError for `meta.drop`.""" + + testing.eval_bad_input(self.meta.drop, KeyError, 'not found in Meta', + input_args=['not_a_name']) + return + def test_getitem_w_bad_key(self): """Test that a bad key will raise a KeyError in meta access.""" @@ -192,38 +145,23 @@ def test_getitem_w_bad_key(self): assert str(kerr).find('not found in MetaData') >= 0 return - def test_getitem_w_index(self): - """Test raises NotImplementedError with an integer index.""" + def test_setitem_w_bad_input_combo(self): + """Test that bad input calls will raise ValueError when setting data.""" - with pytest.raises(NotImplementedError) as ierr: - self.meta[1] + with pytest.raises(ValueError) as verr: + self.meta[['uts', 'units']] = 'seconds' - assert str(ierr).find('expected tuple, list, or str') >= 0 + assert str(verr).find( + "unexpected input combination, can't set metadata") >= 0 return - # TODO(#913): remove tests for 2D metadata - @pytest.mark.parametrize("parent_child", [ - (['alt_profiles', 'profiles'], 'density'), - (['alt_profiles', 'profiles'], ['density', 'dummy_str']), - (['alt_profiles', 'profiles'], 'density', 'units'), - (['alt_profiles', 'profiles'], 'density', ['units', 'long_name'])]) - def test_getitem_w_ho_child_slicing(self, parent_child): - """Test raises NotImplementedError with parent/child slicing. 
- - Parameters - ---------- - parent_child : list - List of inputs with unsupported parent/child slicing options - - """ - - # Set the meta object - self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing2d'}) + def test_getitem_w_index(self): + """Test raises NotImplementedError with an integer index.""" with pytest.raises(NotImplementedError) as ierr: - self.meta[parent_child] + self.meta[1] - assert str(ierr).find("retrieve child meta data from multiple") >= 0 + assert str(ierr).find('expected tuple, list, or str') >= 0 return def test_concat_strict_w_collision(self): @@ -244,25 +182,6 @@ def test_concat_strict_w_collision(self): return - # TODO(#913): remove tests for 2d metadata - def test_concat_strict_w_ho_collision(self): - """Test raises KeyError when higher-order variable nams overlap.""" - - # Set the meta object - self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing2d'}) - - # Create a second object with the same higher-order data variables - concat_meta = pysat.Meta() - for dvar in self.meta.keys_nD(): - concat_meta[dvar] = self.meta[dvar] - - # Test the error message - testing.eval_bad_input( - self.meta.concat, KeyError, - 'Duplicated keys (variable names) in Meta.keys()', [concat_meta], - {'strict': True}) - return - def test_multiple_meta_assignment_error(self): """Test that assignment of multiple metadata raises a ValueError.""" @@ -318,20 +237,6 @@ def test_meta_csv_load_w_errors(self, bad_key, bad_val, err_msg): input_kwargs=kwargs) return - # TODO(#913): remove tests for 2D metadata - def test_meta_rename_bad_ho_input(self): - """Test raises ValueError when treating normal data like HO data.""" - - # Initialize the meta data - self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing2d'}) - - # Set a bad mapping dictionary - mapper = {'mlt': {'mlt_profile': 'mlt_density_is_not_real'}} - - testing.eval_bad_input(self.meta.rename, ValueError, - "unknown mapped value at 'mlt'", [mapper]) - return - # ------------------------- # Test the Warning messages @@ -345,7 +250,8 @@ def test_init_labels_w_int_default(self): self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing', 'tag': 'default_meta', 'clean_level': 'clean', - 'labels': self.meta_labels}) + 'meta_kwargs': + {'labels': self.meta_labels}}) # Test the warning default_str = ''.join(['Metadata set to defaults, as they were', @@ -395,6 +301,24 @@ def test_set_meta_with_wrong_type_drop(self, bad_val): # ------------------------- # Test the logging messages + def test_drop_with_some_bad_names(self, caplog): + """Test a logger warning is raised if not all names can be dropped.""" + + with caplog.at_level(logging.WARN, logger='pysat'): + self.meta.drop(['uts', 'units', 'fake_var']) + + # Test the warning + captured = caplog.text + estr = "missing expected message in: {:}".format(captured) + assert captured.find('not found in Meta') >= 0, estr + + # Check that correct meta data and labels were dropped + assert 'uts' not in self.meta.keys(), 'Did not drop metadata' + assert not hasattr(self.meta.labels, 'units'), 'Did not drop MetaLabel' + assert 'units' not in self.meta.data.columns, 'Did not drop meta label' + + return + @pytest.mark.parametrize('bad_val', [[1, 2], 1, 2.0, True, None]) def test_set_meta_with_wrong_type_cast(self, bad_val, caplog): """Test that setting meta as recastable type raises appropriate warning. 
@@ -459,11 +383,9 @@ def test_repr(self): assert out.find('Meta(') >= 0 return - # TODO(#913): remove tests for 2d metadata @pytest.mark.parametrize('long_str', [True, False]) @pytest.mark.parametrize('inst_kwargs', - [None, {'platform': 'pysat', 'name': 'testing'}, - {'platform': 'pysat', 'name': 'testing2d'}]) + [None, {'platform': 'pysat', 'name': 'testing'}]) def test_str(self, long_str, inst_kwargs): """Test long string output with custom meta data. @@ -485,7 +407,6 @@ def test_str(self, long_str, inst_kwargs): # Evaluate the common parts of the output string assert out.find('pysat Meta object') >= 0 assert out.find('standard variables') > 0 - assert out.find('ND variables') > 0 assert out.find('global attributes') > 0 # Evaluate the extra parts of the long output string @@ -501,13 +422,8 @@ def test_str(self, long_str, inst_kwargs): else: assert out.find('Standard Metadata variables:') < 0 - if inst_kwargs is not None and inst_kwargs['name'] == 'testing2d': - assert out.find('ND Metadata variables:') > 0 - else: - assert out.find('ND Metadata variables:') < 0 else: assert out.find('Standard Metadata variables:') < 0 - assert out.find('ND Metadata variables:') < 0 return def test_self_equality(self): @@ -535,10 +451,7 @@ def test_equality(self): assert cmeta == self.meta, "identical meta objects differ" return - # TODO(#908): remove tests for deprecated instruments - @pytest.mark.parametrize("inst_name", ["testing", "testing2d", - "ndtesting", "testing_xarray", - "testmodel"]) + @pytest.mark.parametrize("inst_name", ["testing", "ndtesting", "testmodel"]) def test_equality_w_copy(self, inst_name): """Test that meta remains the same when copied. @@ -576,41 +489,6 @@ def test_inequality(self, emeta): assert emeta != self.meta, "meta equality not detectinng differences" return - # TODO(#789): remove tests for higher order meta - @pytest.mark.parametrize('val_dict', [ - {'units': 'U', 'long_name': 'HO Val', 'radn': 'raiden'}, - {'units': 'MetaU', 'long_name': 'HO Val'}]) - def test_inequality_with_higher_order_meta(self, val_dict): - """Test inequality with higher order metadata. - - Parameters - ---------- - val_dict : dict - Information to be added to higher order metadata variable - - """ - - meta_dict = {'units': {'ho_val': 'U', 'ho_prof': 'e-'}, - 'long_name': {'ho_val': 'HO Val', 'ho_prof': 'HO Profile'}} - - # Set the default meta object - self.meta['ho_data'] = pysat.Meta(pds.DataFrame(meta_dict)) - - # Set the altered meta object - cmeta = pysat.Meta() - - for vkey in val_dict.keys(): - if vkey in meta_dict.keys(): - meta_dict[vkey]['ho_val'] = val_dict[vkey] - else: - meta_dict[vkey] = {'ho_val': val_dict[vkey]} - - cmeta['ho_data'] = pysat.Meta(pds.DataFrame(meta_dict)) - - # Evaluate the inequality - assert cmeta != self.meta - return - @pytest.mark.parametrize("label_key", ["units", "name", "notes", "desc", "min_val", "max_val", "fill_val"]) def test_value_inequality(self, label_key): @@ -647,10 +525,7 @@ def test_value_inequality(self, label_key): "differences not detected in label {:s}".format(label_key) return - # TODO(#908): remove tests for deprecated instruments - @pytest.mark.parametrize("inst_name", ["testing", "testing2d", - "ndtesting", "testing_xarray", - "testmodel"]) + @pytest.mark.parametrize("inst_name", ["testing", "ndtesting", "testmodel"]) def test_pop(self, inst_name): """Test meta attributes are retained when extracted using pop. 
@@ -671,7 +546,6 @@ def test_pop(self, inst_name): # Test the popped object labels pop_attrs = list(mpop.keys()) - pop_attrs.pop(pop_attrs.index('children')) testing.assert_lists_equal(pop_attrs, list(self.meta.attrs())) # Test the popped object values @@ -679,9 +553,6 @@ def test_pop(self, inst_name): comp_values = [mcomp[pattr] for pattr in pop_attrs] testing.assert_lists_equal(pop_values, comp_values) - if mpop['children'] is not None: - assert mpop['children'] == mcomp['children'] - # Test that the popped variable is no longer in the main object assert dvar not in self.meta.keys(), "pop did not remove metadata" @@ -721,6 +592,22 @@ def test_accept_default_labels(self): return + def test_meta_assign_single_val(self): + """Test basic assignment of a single metadata value.""" + # Ensure the data has not been set already + data_name = 'special_data' + label_name = self.meta.labels.notes + meta_val = "test me" + assert data_name not in self.meta.keys(), "bad testing set up" + + # Assign notes metadata + self.meta[data_name, label_name] = meta_val + + # Test the assigned metadata + assert data_name in self.meta.keys() + assert self.meta[data_name, label_name] == meta_val + return + @pytest.mark.parametrize("custom_attr", [None, 'custom_meta']) @pytest.mark.parametrize("assign_type", [dict, pds.Series]) def test_meta_assignment(self, custom_attr, assign_type): @@ -793,8 +680,7 @@ def test_multiple_meta_assignment(self, custom_attr, assign_type): self.eval_meta_settings() return - # TODO(#913): remove tests for 2D metadata - @pytest.mark.parametrize('inst_name', ['testing', 'testing2d']) + @pytest.mark.parametrize('inst_name', ['testing']) @pytest.mark.parametrize('num_mvals', [0, 1, 3]) @pytest.mark.parametrize('num_dvals', [0, 1, 3]) def test_selected_meta_retrieval(self, inst_name, num_mvals, num_dvals): @@ -819,20 +705,6 @@ def test_selected_meta_retrieval(self, inst_name, num_mvals, num_dvals): mvals = [getattr(self.meta.labels, mattr) for mattr in list(self.meta_labels.keys())[:num_mvals]] - # If dvals is greater than zero and there is higher order data, - # make sure at least one is included - nd_inds = list() - if len(dvals) > 0: - nd_vals = [key for key in self.meta.keys_nD()] - - if len(nd_vals) > 0: - nd_inds = [dvals.index(self.dval) for self.dval in nd_vals - if self.dval in dvals] - - if len(nd_inds) == 0: - dvals[0] = nd_vals[0] - nd_inds = [0] - # Retrieve meta data for desired values sel_meta = self.meta[dvals, mvals] @@ -841,21 +713,6 @@ def test_selected_meta_retrieval(self, inst_name, num_mvals, num_dvals): testing.assert_lists_equal(dvals, list(sel_meta.index)) testing.assert_lists_equal(mvals, list(sel_meta.columns)) - # If there is higher order data, test the retrieval - for pind in nd_inds: - # Get the desired number of child variables - self.dval = dvals[pind] - cvals = [kk for kk in self.meta[self.dval].children.keys()][ - :num_dvals] - - # Retrieve meta data for desired values - sel_meta = self.meta[self.dval, cvals, mvals] - - # Evaluate retrieved data - assert isinstance(sel_meta, pds.DataFrame) - testing.assert_lists_equal(cvals, list(sel_meta.index)) - testing.assert_lists_equal(mvals, list(sel_meta.columns)) - return def test_replace_meta(self): @@ -962,8 +819,7 @@ def test_meta_immutable_at_instrument_instantiation(self): return - # TODO(#913): remove tests for 2D metadata - @pytest.mark.parametrize('inst_name', ['testing', 'testing2d']) + @pytest.mark.parametrize('inst_name', ['testing']) def test_assign_nonstandard_metalabels(self, inst_name): """Test labels 
do not conform to the standard values if set that way.
 
@@ -976,7 +832,7 @@
 
         # Assign meta data with non-standard labels
         self.set_meta(inst_kwargs={'platform': 'pysat', 'name': inst_name,
-                                   'labels': self.meta_labels})
+                                   'meta_kwargs': {'labels': self.meta_labels}})
 
         # Test that standard attributes are missing and non-standard
         # attributes are present
@@ -1188,14 +1044,48 @@ def test_meta_merge(self):
                     meta_dict[label].__repr__())
         return
 
+    @pytest.mark.parametrize("names", ['uts', ['uts', 'mlt'], 'units',
+                                       ['units', 'uts']])
+    @pytest.mark.parametrize("is_drop", [True, False])
+    def test_meta_drop(self, names, is_drop):
+        """Test successful deletion of meta data for different types of data.
+
+        Parameters
+        ----------
+        names : str or list
+            Data variable or label name(s) to drop in a single go.
+        is_drop : bool
+            Use `drop` if True, use `del` if False.
+
+        """
+        # Set meta data
+        self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing'})
+
+        # Drop the values
+        if is_drop:
+            self.meta.drop(names)
+        else:
+            del self.meta[names]
+
+        # Test the successful deletion
+        for name in pysat.utils.listify(names):
+            if name in self.testInst.variables:
+                assert name not in self.meta.keys(), "didn't drop variable"
+            else:
+                assert name not in self.meta.data.columns, "didn't drop label"
+        return
+
     @pytest.mark.parametrize("num_drop", [0, 1, 3])
-    def test_meta_drop(self, num_drop):
+    @pytest.mark.parametrize("is_drop", [True, False])
+    def test_meta_num_drop(self, num_drop, is_drop):
         """Test successful deletion of meta data for specific values.
 
         Parameters
         ----------
         num_drop : int
             Number of variables to drop in a single go.
+        is_drop : bool
+            Use `drop` if True, use `del` if False.
 
         """
 
@@ -1208,7 +1098,10 @@
                                      [val for val in self.meta.keys()])
 
         # Drop the values
-        self.meta.drop(self.dval)
+        if is_drop:
+            self.meta.drop(self.dval)
+        else:
+            del self.meta[self.dval]
 
         # Test the successful deletion
         meta_vals = [val for val in self.meta.keys()]
@@ -1285,7 +1178,7 @@ def test_set_wrong_case(self, num_dvals):
 
         # Set the meta object
         self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing',
-                                   'labels': self.meta_labels})
+                                   'meta_kwargs': {'labels': self.meta_labels}})
 
         # Set data using lower case labels
         dvals = self.testInst.vars_no_time[:num_dvals]
@@ -1524,449 +1417,6 @@ def test_update_epoch(self):
 
         return
 
-    # -------------------------------
-    # Tests for higher order metadata
-
-    # TODO(#789): remove tests for higher order meta
-    @pytest.mark.parametrize('meta_dict', [
-        None, {'units': 'V', 'long_name': 'test name'},
-        {'units': 'V', 'long_name': 'test name',
-         'meta': pysat.Meta(metadata=pds.DataFrame(
-             {'units': {'dummy_frame1': 'A', 'dummy_frame2': ''},
-              'desc': {'dummy_frame1': '',
-                       'dummy_frame2': 'A filler description'},
-              'long_name': {'dummy_frame1': 'Dummy 1',
-                            'dummy_frame2': 'Dummy 2'}}))},
-        {'units': 'V', 'long_name': 'test name', 'bananas': 0,
-         'meta': pysat.Meta(metadata=pds.DataFrame(
-             {'units': {'dummy_frame1': 'A', 'dummy_frame2': ''},
-              'desc': {'dummy_frame1': '',
-                       'dummy_frame2': 'A filler description'},
-              'long_name': {'dummy_frame1': 'Dummy 1',
-                            'dummy_frame2': 'Dummy 2'},
-              'bananas': {'dummy_frame1': 1, 'dummy_frame2': 2}}))}])
-    def test_inst_ho_data_assignment(self, meta_dict):
-        """Test the assignment of the higher order metadata.
- - Parameters - ---------- - meta_dict : dict or NoneType - Dictionary used to create test metadata - - """ - - # Initialize the Meta data - self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing'}) - - # Alter the Meta data - frame = pds.DataFrame({fkey: np.arange(10) for fkey in self.frame_list}, - columns=self.frame_list) - inst_data = [frame for i in range(self.testInst.index.shape[0])] - self.dval = 'test_val' - - if meta_dict is None: - self.testInst[self.dval] = inst_data - meta_dict = {'units': '', 'long_name': self.dval, 'desc': ''} - else: - meta_dict.update({'data': inst_data}) - self.testInst[self.dval] = meta_dict - - # Remove data key for evaluation - del meta_dict['data'] - - self.meta = self.testInst.meta - - # Test the ND metadata results - self.eval_ho_meta_settings(meta_dict) - return - - # TODO(#789): remove tests for higher order meta - @pytest.mark.parametrize("num_ho, num_lo", [(1, 1), (2, 2)]) - def test_assign_mult_higher_order_meta_from_dict(self, num_ho, num_lo): - """Test assign higher order metadata from dict with multiple types. - - Parameters - ---------- - num_ho : int - Number of higher order data values to initialize - num_lo : int - Number of lower order data valuess to initialize - - """ - - # Initialize the lower order evaluation data - self.default_val['units'] = 'U' - - # Initialize the higher-order meta data - ho_meta = pysat.Meta() - for flist in self.frame_list: - ho_meta[flist] = {'units': 'U', 'long_name': flist} - - # Initialize the meta dict for setting the data values - dvals = ['higher_{:d}'.format(i) for i in range(num_ho)] - dvals.extend(['lower_{:d}'.format(i) for i in range(num_lo)]) - - meta_dict = {'units': ['U' for i in range(len(dvals))], - 'long_name': [val for val in dvals], - 'meta': [ho_meta for i in range(num_ho)]} - meta_dict['meta'].extend([None for i in range(num_lo)]) - - # Assign and test the meta data - self.meta[dvals] = meta_dict - - for i, self.dval in enumerate(dvals): - if i < num_ho: - eval_dict = {label: meta_dict[label][i] - for label in meta_dict.keys()} - self.eval_ho_meta_settings(eval_dict) - else: - self.eval_meta_settings() - return - - # TODO(#789): remove tests for higher order meta - def test_inst_ho_data_assign_meta_then_data(self): - """Test assignment of higher order metadata before assigning data.""" - - # Initialize the Meta data - self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing'}) - - # Alter the Meta data - frame = pds.DataFrame({fkey: np.arange(10) for fkey in self.frame_list}, - columns=self.frame_list) - inst_data = [frame for i in range(self.testInst.index.shape[0])] - meta_dict = {'data': inst_data, 'units': 'V', 'long_name': 'The Doors', - 'meta': pysat.Meta(metadata=pds.DataFrame( - {'units': {dvar: "{:d}".format(i) - for i, dvar in enumerate(self.frame_list)}, - 'desc': {dvar: "{:s} desc".format(dvar) - for dvar in self.frame_list}, - 'long_name': {dvar: dvar - for dvar in self.frame_list}}))} - self.dval = 'test_data' - - # Assign the metadata - self.testInst[self.dval] = meta_dict - - # Alter the data - self.testInst[self.dval] = inst_data - - # Test the ND metadata results - self.meta = self.testInst.meta - del meta_dict['data'] - self.eval_ho_meta_settings(meta_dict) - return - - # TODO(#913): remove tests for 2D metadata - def test_inst_ho_data_assign_meta_different_labels(self): - """Test the higher order assignment of custom metadata labels.""" - - # Initialize the Meta data - self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing2d'}) - - # Alter 
the higher order metadata - ho_meta = pysat.Meta(labels={'units': ('barrels', str), - 'desc': ('Monkeys', str), - 'meta': ('meta', object)}) - self.frame_list = list( - self.testInst.meta['profiles']['children'].keys()) - for dvar in self.frame_list: - if dvar == 'density': - ho_meta[dvar] = {'barrels': 'A'} - else: - ho_meta[dvar] = {'Monkeys': 'are fun', 'bananas': 2} - - # The 'units', 'desc' and other labels used on self.testInst are - # applied to the input metadata to ensure everything remains - # consistent across the object. - self.testInst['profiles'] = {'data': self.testInst.data['profiles'], - 'units': 'V', 'long_name': 'The Doors', - 'meta': ho_meta} - self.meta = self.testInst.meta - - # Test the nD metadata - assert self.testInst.meta['profiles', 'long_name'] == 'The Doors' - testing.assert_list_contains(self.frame_list, - self.meta.ho_data['profiles']) - testing.assert_list_contains(self.frame_list, - self.meta['profiles']['children']) - - for label in ['units', 'desc']: - assert self.meta['profiles']['children'].hasattr_case_neutral(label) - - assert self.meta['profiles']['children']['density', 'units'] == 'A' - assert self.meta['profiles']['children']['density', 'desc'] == '' - - for dvar in ['dummy_str', 'dummy_ustr']: - assert self.meta['profiles']['children'][dvar, 'desc'] == 'are fun' - assert self.meta['profiles']['children'][dvar, 'bananas'] == 2 - return - - # TODO(#789): remove tests for higher order meta - def test_concat_w_ho(self): - """Test `meta.concat` adds new meta objects with higher order data.""" - - # Create meta data to concatenate - meta2 = pysat.Meta() - meta2['new3'] = {'units': 'hey3', 'long_name': 'crew_brew'} - meta2['new4'] = pysat.Meta(pds.DataFrame({ - 'units': {'new41': 'hey4'}, 'long_name': {'new41': 'crew_brew'}, - 'bob_level': {'new41': 'max'}})) - - # Perform and test for successful concatenation - self.meta = self.meta.concat(meta2) - assert self.meta['new3'].units == 'hey3' - assert self.meta['new4'].children['new41'].units == 'hey4' - return - - # TODO(#913): remove tests for 2d metadata - def test_concat_not_strict_w_ho_collision(self): - """Test non-strict concat with overlapping higher-order data.""" - - # Set the meta object - self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing2d'}) - - # Create a second object with the same higher-order data variables - concat_meta = pysat.Meta() - for dvar in self.meta.keys_nD(): - concat_meta[dvar] = self.meta[dvar] - - # Change the units of the HO data variables - for cvar in concat_meta[dvar].children.keys(): - # HERE FIX, DOESN'T WORK - concat_meta[dvar].children[cvar] = { - concat_meta.labels.units: "UpdatedUnits"} - - # Concatenate the data - self.meta = self.meta.concat(concat_meta, strict=False) - - # Test that the Meta data kept the original values - testing.assert_list_contains(list(concat_meta.keys_nD()), - list(self.meta.keys_nD())) - - for dvar in concat_meta.keys_nD(): - testing.assert_lists_equal(list(concat_meta[dvar].children.keys()), - list(self.meta[dvar].children.keys())) - - for cvar in concat_meta[dvar].children.keys(): - assert self.meta[dvar].children[ - cvar, self.meta.labels.units].find('Updated') >= 0 - return - - # TODO(#789): remove tests for higher order meta - @pytest.mark.parametrize("label", ['meta_label', 'META_LABEL', 'Meta_Label', - 'MeTa_lAbEl']) - def test_ho_attribute_name_case(self, label): - """Test that `meta.attribute_case_name` preserves the HO stored case. 
- - Parameters - ---------- - label : str - Label name - - """ - - # Only set `label` in the child data variable - self.dval = 'test_val' - self.meta[self.dval] = self.default_val - cmeta = pysat.Meta() - cval = "_".join([self.dval, "child"]) - cmeta[cval] = {label: 'Test meta data for child meta label'} - self.meta[self.dval] = cmeta - - # Test the meta method using different input variations - assert self.meta.attr_case_name(label.upper()) == label - assert self.meta.attr_case_name(label.lower()) == label - assert self.meta.attr_case_name(label.capitalize()) == label - assert self.meta.attr_case_name(label) == label - return - - # TODO(#789): remove tests for higher order meta - @pytest.mark.parametrize("label", ['meta_label', 'META_LABEL', 'Meta_Label', - 'MeTa_lAbEl']) - def test_ho_hasattr_case_neutral(self, label): - """Test `meta.hasattr_case_neutral` identifies the HO label name. - - Parameters - ---------- - label : str - Label name - - """ - - # Only set `label` in the child data variable - self.dval = 'test_val' - self.meta[self.dval] = self.default_val - cmeta = pysat.Meta() - cval = "_".join([self.dval, "child"]) - cmeta[cval] = {label: 'Test meta data for child meta label'} - self.meta[self.dval] = cmeta - - # Test the meta method using different input variations - assert self.meta[self.dval].children.hasattr_case_neutral(label.upper()) - assert self.meta[self.dval].children.hasattr_case_neutral(label.lower()) - assert self.meta[self.dval].children.hasattr_case_neutral( - label.capitalize()) - assert self.meta[self.dval].children.hasattr_case_neutral(label) - return - - # TODO(#913): remove tests for 2D metadata - def test_ho_meta_rename_function(self): - """Test `meta.rename` method with ho data using a function.""" - - # Set the meta object - self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing2d'}) - - # Rename the meta variables to be all upper case, this will differ - # from the Instrument variables, as pysat defaults to lower case - self.meta.rename(str.upper) - - for dvar in self.testInst.vars_no_time: - mvar = dvar.upper() - - # Test the lower order variables - assert dvar not in self.meta.keys(), \ - "variable not renamed: {:}".format(repr(dvar)) - assert mvar in self.meta.keys(), \ - "renamed variable missing: {:}".format(repr(mvar)) - - if mvar in self.meta.keys_nD(): - # Get the variable names from the children - if hasattr(self.testInst[dvar][0], 'columns'): - columns = getattr(self.testInst[dvar][0], 'columns') - else: - columns = [dvar] - - # Test the higher order variables - for cvar in columns: - cmvar = cvar.upper() - assert cmvar in self.meta[mvar].children, \ - "renamed HO variable missing: {:} ({:})".format( - repr(cmvar), repr(mvar)) - - return - - # TODO(#913): remove tests for 2D metadata - def test_ho_meta_rename_dict(self): - """Test `meta.rename` method with ho data using a dict.""" - - # Set the meta object - self.set_meta(inst_kwargs={'platform': 'pysat', 'name': 'testing2d'}) - - # Create a renaming dictionary, which only changes up to four of the - # variable names - rename_dict = {dvar: dvar.upper() - for i, dvar in enumerate(self.testInst.vars_no_time) - if i < 3 or dvar == 'profiles'} - rename_dict['profiles'] = {'density': 'DeNsItY'} - - # Rename the meta variables to be all upper case, this will differ - # from the Instrument variables, as pysat defaults to lower case - self.meta.rename(rename_dict) - - for dvar in self.testInst.vars_no_time: - # Test the lower order variables - if dvar in rename_dict.keys(): - mvar = 
rename_dict[dvar] - - if isinstance(mvar, dict): - assert dvar in self.meta.keys_nD() - - # Get the variable names from the children - if hasattr(self.testInst[dvar][0], 'columns'): - columns = getattr(self.testInst[dvar][0], 'columns') - else: - columns = [dvar] - - # Test the higher order variables. - for cvar in columns: - if cvar in mvar.keys(): - cmvar = mvar[cvar] - assert cmvar in self.meta[dvar].children.keys(), \ - "renamed HO variable missing: {:} ({:})".format( - repr(cmvar), repr(dvar)) - else: - assert cvar in self.meta[dvar].children.keys(), \ - "unmapped HO var altered: {:} ({:})".format( - repr(cvar), repr(dvar)) - else: - assert dvar not in self.meta.keys(), \ - "variable not renamed: {:}".format(repr(dvar)) - assert mvar in self.meta.keys(), \ - "renamed variable missing: {:}".format(repr(mvar)) - else: - mvar = dvar - assert dvar in self.meta.keys(), \ - "unmapped variable renamed: {:}".format(repr(dvar)) - - return - - # TODO(#789): remove tests for higher order meta - @pytest.mark.parametrize("label", ['meta_label', 'META_LABEL', 'Meta_Label', - 'MeTa_lAbEl']) - def test_get_attribute_name_case_preservation_w_higher_order_list_in(self, - label): - """Test that get attribute names preserves the case with ho metadata. - - Parameters - ---------- - label : str - String label used for testing metadata - - """ - - # Set a meta data variable - self.dval = 'test_val' - self.meta[self.dval] = self.default_val - - # Set an attribute with case in `label` - cval = ''.join([self.dval, '21']) - meta2 = pysat.Meta() - meta2[cval] = {label: 'Test meta data for meta label'} - - # Attach child metadata to root meta - dval2 = ''.join([self.dval, '2']) - self.meta[dval2] = meta2 - - # Attempt to assign to same label at root but potentially different - # case. - self.meta[self.dval] = {label.lower(): 'Test meta data for meta label'} - - # Create inputs and get the attribute case names - ins = [label.upper(), label.lower(), label.capitalize(), - label] - outputs = self.meta.attr_case_name(ins) - - targets = [label] * len(ins) - - # Confirm original input case retained. - assert np.all(outputs == targets) - - return - - # TODO(#789): remove tests for higher order meta - def test_ho_data_retrieval_case_insensitive(self): - """Test that higher order data variables are case insensitive.""" - - # Initalize the meta data - self.dval = "test_val" - self.meta[self.dval] = self.default_val - - cmeta = pysat.Meta() - cval = '_'.join([self.dval, 'child']) - cmeta[cval] = self.default_val - self.meta[self.dval] = cmeta - - # Test that the data value is present using real key and upper-case - # version of that key - assert self.dval in self.meta.keys() - - # Test the child variable, which should only be present through the - # children attribute. Cannot specify keys for case-insensitive look-up. - assert cval not in self.meta.keys() - assert cval in self.meta[self.dval].children.keys() - assert cval.upper() in self.meta[self.dval].children - return - class TestMetaImmutable(TestMeta): """Unit tests for immutable metadata.""" @@ -1999,33 +1449,22 @@ def teardown_method(self): del self.mutable return - # TODO(#789): remove tests for higher order meta - @pytest.mark.parametrize("prop,set_val", [('data', pds.DataFrame()), - ('ho_data', {})]) - def test_meta_mutable_properties(self, prop, set_val): - """Test that @properties are always mutable. 
- - Parameters - ---------- - prop : str - Attribute on `self.meta` to be tested - set_val : any - Value to be assigned to `prop` on `self.meta` - - """ + def test_meta_mutable_properties(self): + """Test that @properties are always mutable.""" # Set anything that can be immutable to be immutable self.meta.mutable = self.mutable # Test that data and label values can be updated - for prop, set_val in [('data', pds.DataFrame()), ('ho_data', {})]: - try: - # Pandas does not support dataframe equality - setattr(self.meta, prop, set_val) - except AttributeError: - raise AssertionError( - "Couldn't update mutable property {:}".format( - prop.__repr__())) + try: + # Pandas does not support dataframe equality + setattr(self.meta, 'data', pds.DataFrame()) + + # Test that data is empty + assert self.meta.empty, "`meta.data` not updated correctly" + except AttributeError: + raise AssertionError("Couldn't update mutable property 'data'") + return @pytest.mark.parametrize("label", ['units', 'name', 'desc', 'notes', 'min_val', 'max_val', 'fill_val']) @@ -2067,8 +1506,7 @@ class TestMetaMutable(object): def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument(platform='pysat', name='testing', - use_header=True) + self.testInst = pysat.Instrument(platform='pysat', name='testing') self.testInst.load(date=self.testInst.inst_module._test_dates['']['']) self.meta = self.testInst.meta self.meta.mutable = True @@ -2197,81 +1635,13 @@ def test_transfer_attributes_to_instrument_leading_underscore(self): return -class TestDeprecation(object): - """Unit tests for DeprecationWarnings in the Meta class.""" - - def setup_method(self): - """Set up the unit test environment for each method.""" - - warnings.simplefilter("always", DeprecationWarning) - self.meta = pysat.Meta() - self.warn_msgs = [] - return - - def teardown_method(self): - """Clean up the unit test environment after each method.""" - - del self.meta, self.warn_msgs - return - - def test_higher_order_meta_deprecation(self): - """Test that setting higher order meta raises DeprecationWarning.""" - - # Initialize higher-order metadata to add to the main meta object - ho_meta = pysat.Meta() - ho_meta['series_profiles'] = {'long_name': 'series'} - - # Raise and catch warnings - with warnings.catch_warnings(record=True) as war: - self.meta['series_profiles'] = {'meta': ho_meta, - 'long_name': 'series'} - - # Evaluate warnings - self.warn_msgs = ["Support for higher order metadata has been"] - self.warn_msgs = np.array(self.warn_msgs) - - # Ensure the minimum number of warnings were raised - assert len(war) >= len(self.warn_msgs) - - # Test the warning messages, ensuring each attribute is present - testing.eval_warnings(war, self.warn_msgs) - - return - - def test_higher_order_meta_rename_deprecation(self): - """Test that renaming higher order meta raises DeprecationWarning.""" - - # Initialize higher-order metadata to add to the main meta object - ho_meta = pysat.Meta() - ho_meta['series_profiles'] = {'long_name': 'series'} - self.meta['series_profiles'] = {'meta': ho_meta, - 'long_name': 'series'} - - # Raise and catch warnings - with warnings.catch_warnings(record=True) as war: - self.meta.rename(str.upper) - - # Evaluate warnings - self.warn_msgs = ["Support for higher order metadata has been"] - self.warn_msgs = np.array(self.warn_msgs) - - # Ensure the minimum number of warnings were raised - assert len(war) >= len(self.warn_msgs) - - # Test the warning messages, ensuring each attribute is present - 
testing.eval_warnings(war, self.warn_msgs) - - return - - class TestToDict(object): """Test `.to_dict` method using pysat test Instruments.""" def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing', num_samples=5, - use_header=True) + self.testInst = pysat.Instrument('pysat', 'testing', num_samples=5) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.testInst.load(date=self.stime) @@ -2302,89 +1672,40 @@ def test_to_dict(self, preserve_case): # Confirm type assert isinstance(self.out, dict) - # Check for higher order products - ho_vars = [] - for var in self.testInst.meta.keys(): - if 'children' in self.testInst.meta[var]: - if self.testInst.meta[var]['children'] is not None: - for subvar in self.testInst.meta[var]['children'].keys(): - ho_vars.append('_'.join([var, subvar])) - # Confirm the contents of the output for variables for var in self.out.keys(): - if var not in ho_vars: - for label in self.out[var]: - assert label in self.testInst.meta.data.columns - assert testing.nan_equal(self.out[var][label], - self.testInst.meta[var][label]), \ - 'Differing values.' + for label in self.out[var]: + assert label in self.testInst.meta.data.columns + assert testing.nan_equal(self.out[var][label], + self.testInst.meta[var][label]), \ + 'Differing values.' # Confirm case if not preserve_case: # Outputs should all be lower case for key in self.out.keys(): assert key == key.lower(), 'Output not lower case.' - for key in ho_vars: - assert key == key.lower(), 'Output not lower case.' - assert key.lower() in self.out, 'Missing output variable.' else: # Case should be preserved for key in self.out.keys(): assert key == self.testInst.meta.var_case_name(key), \ 'Output case different.' - for key in ho_vars: - assert key in self.out, 'Output case different, or missing.' - num_target_vars = len(ho_vars) + len(list(self.testInst.meta.keys())) + num_target_vars = len(list(self.testInst.meta.keys())) assert num_target_vars == len(self.out), \ 'Different number of variables.' 
         return
 
 
-class TestToDictXarray(TestToDict):
-    """Test `.to_dict` methods using pysat test Instruments."""
-
-    def setup_method(self):
-        """Set up the unit test environment for each method."""
-
-        self.testInst = pysat.Instrument('pysat', 'testing_xarray',
-                                         num_samples=5, use_header=True)
-        self.stime = pysat.instruments.pysat_testing_xarray._test_dates['']['']
-        self.testInst.load(date=self.stime)
-
-        # For output
-        self.out = None
-
-        return
-
-
 class TestToDictXarrayND(TestToDict):
     """Test `.to_dict` methods using pysat test Instruments."""
 
     def setup_method(self):
         """Set up the unit test environment for each method."""
 
-        self.testInst = pysat.Instrument('pysat', 'ndtesting',
-                                         num_samples=5, use_header=True)
-        self.stime = pysat.instruments.pysat_testing_xarray._test_dates['']['']
-        self.testInst.load(date=self.stime)
-
-        # For output
-        self.out = None
-
-        return
-
-
-class TestToDictPandas2D(TestToDict):
-    """Test `.to_dict` methods using pysat test Instruments."""
-
-    def setup_method(self):
-        """Set up the unit test environment for each method."""
-
-        self.testInst = pysat.Instrument('pysat', 'testing2d',
-                                         num_samples=5, use_header=True)
-        self.stime = pysat.instruments.pysat_testing2d._test_dates['']['']
+        self.testInst = pysat.Instrument('pysat', 'ndtesting', num_samples=5)
+        self.stime = pysat.instruments.pysat_ndtesting._test_dates['']['']
         self.testInst.load(date=self.stime)
 
         # For output
@@ -2399,8 +1720,7 @@ class TestToDictXarrayModel(TestToDict):
     def setup_method(self):
         """Set up the unit test environment for each method."""
 
-        self.testInst = pysat.Instrument('pysat', 'testmodel',
-                                         num_samples=5, use_header=True)
+        self.testInst = pysat.Instrument('pysat', 'testmodel', num_samples=5)
         self.stime = pysat.instruments.pysat_testmodel._test_dates['']['']
         self.testInst.load(date=self.stime)
 
diff --git a/pysat/tests/test_meta_header.py b/pysat/tests/test_meta_header.py
index 42008523a..b1f3aafde 100644
--- a/pysat/tests/test_meta_header.py
+++ b/pysat/tests/test_meta_header.py
@@ -2,6 +2,9 @@
 # Full license can be found in License.md
 # Full author list can be found in .zenodo.json file
 # DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
 # ----------------------------------------------------------------------------
 """Tests the pysat MetaHeader object."""
 
@@ -121,12 +124,29 @@ def test_to_dict(self, header_data):
         assert header_data == out_dict
         return
 
+    def test_drop(self):
+        """Test the MetaHeader.drop method."""
+        # Get the current output dict
+        out_dict = self.meta_header.to_dict()
+
+        # Add an attribute
+        self.meta_header.PI = "pysat development team"
+        assert "PI" in self.meta_header.global_attrs
+        assert out_dict != self.meta_header.to_dict()
+
+        # Remove the attribute
+        self.meta_header.drop('PI')
+
+        # Ensure the current output matches the original output
+        assert out_dict == self.meta_header.to_dict()
+        return
+
     # ----------------------------------------
     # Test the integration with the Meta class
 
     @pytest.mark.parametrize("header_data", [{}, {"test": "value"}])
     def test_init_metaheader_in_meta(self, header_data):
-        """Test changing case of meta labels after initialization.
+        """Test MetaHeader initialization through the Meta class.
 
         Parameters
         ----------
@@ -142,3 +162,29 @@
         # Ensure both initialization methods work the same
         assert meta.header == self.meta_header
         return
+
+    def test_get_metaheader_in_meta(self):
+        """Test MetaHeader attribute retrieval from Meta."""
+        # Initialize the header data through the meta object
+        test_attr = "PI"
+        test_val = "pysat development team"
+        meta = pysat.Meta(header_data={test_attr: test_val})
+
+        # Test value retrieval from Meta and MetaHeader
+        assert getattr(meta.header, test_attr) == test_val
+        assert meta[test_attr] == test_val
+        return
+
+    def test_drop_metaheader_in_meta(self):
+        """Test MetaHeader attribute deletion from Meta."""
+        # Initialize the header data through the meta object
+        test_attr = "PI"
+        test_val = "pysat development team"
+        meta = pysat.Meta(header_data={test_attr: test_val})
+
+        # Delete MetaHeader data
+        meta.drop(test_attr)
+
+        # Test for empty MetaHeader
+        assert meta.header.to_dict() == {}
+        return
diff --git a/pysat/tests/test_meta_labels.py b/pysat/tests/test_meta_labels.py
index ee4b6658e..ff433245e 100644
--- a/pysat/tests/test_meta_labels.py
+++ b/pysat/tests/test_meta_labels.py
@@ -2,6 +2,9 @@
 # Full license can be found in License.md
 # Full author list can be found in .zenodo.json file
 # DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
 # ----------------------------------------------------------------------------
 """Tests the pysat MetaLabels object."""
 
@@ -11,6 +14,7 @@
 import pytest
 
 import pysat
+from pysat.utils import listify
 from pysat.utils import testing
 
 
@@ -185,6 +189,24 @@ def test_default_value_from_type_int_inputs(self, in_val, comp_val):
 
         return
 
+    @pytest.mark.parametrize("drop_labels", [["units", "fill_val"], "units"])
+    def test_drop(self, drop_labels):
+        """Test successful drop from MetaLabels.
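+
+        Editor's note: dropping should remove the label attribute from the
+        MetaLabels instance entirely, whether a single label or a list of
+        labels is given (this summarizes the parametrization and the
+        `hasattr` checks below).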
+ + Parameters + ---------- + drop_labels : str or list-like + Label or labels to drop + + """ + # Drop the desired label(s) + self.meta_labels.drop(drop_labels) + + # Ensure the labels are missing + for dlabel in listify(drop_labels): + assert not hasattr(self.meta_labels, dlabel) + return + def test_update(self): """Test successful update of MetaLabels.""" self.meta_labels.update('new_label', 'new_name', int) @@ -197,6 +219,17 @@ def test_update(self): # ---------------------------------------- # Test the integration with the Meta class + def test_drop_from_meta(self): + """Test successful deletion of MetaLabels attribute from Meta.""" + # Delete the desired label + del_label = list(self.meta_labels.label_type.keys())[0] + self.meta.drop(del_label) + + # Ensure the label is missing + assert not hasattr(self.meta.labels, del_label) + assert del_label not in self.meta.data.columns + return + def test_change_case_of_meta_labels(self): """Test changing case of meta labels after initialization.""" @@ -211,29 +244,3 @@ def test_change_case_of_meta_labels(self): assert (self.meta['new2'].Units == 'hey2') assert (self.meta['new2'].Long_Name == 'boo2') return - - def test_case_change_of_meta_labels_w_ho(self): - """Test change case of meta labels after initialization with HO data.""" - - # Set the initial labels - self.meta_labels = {'units': ('units', str), 'name': ('long_Name', str)} - self.meta = pysat.Meta(labels=self.meta_labels) - meta2 = pysat.Meta(labels=self.meta_labels) - - # Set meta data values - meta2['new21'] = {'units': 'hey2', 'long_name': 'boo2'} - self.meta['new'] = {'units': 'hey', 'long_name': 'boo'} - self.meta['new2'] = meta2 - - # Change the label name - self.meta.labels.units = 'Units' - self.meta.labels.name = 'Long_Name' - - # Evaluate the results in the main data - assert (self.meta['new'].Units == 'hey') - assert (self.meta['new'].Long_Name == 'boo') - - # Evaluate the results in the higher order data - assert (self.meta['new2'].children['new21'].Units == 'hey2') - assert (self.meta['new2'].children['new21'].Long_Name == 'boo2') - return diff --git a/pysat/tests/test_methods_general.py b/pysat/tests/test_methods_general.py index d08108f72..1b0e403ce 100644 --- a/pysat/tests/test_methods_general.py +++ b/pysat/tests/test_methods_general.py @@ -1,16 +1,32 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. 
+# ---------------------------------------------------------------------------- """Unit tests for the general instrument methods.""" import datetime as dt from os import path import pandas as pds import pytest -import warnings import pysat from pysat.instruments.methods import general as gen from pysat.utils import testing +def test_filename_creator(): + """Test the `filename_creator` placeholder.""" + + testing.eval_bad_input(gen.filename_creator, NotImplementedError, + 'This feature has not been implemented yet', + input_args=[0.0]) + return + + class TestListFiles(object): """Unit tests for `pysat.instrument.methods.general.list_files`.""" @@ -114,7 +130,7 @@ def setup_method(self): # Load a test instrument self.testInst = pysat.Instrument('pysat', 'testing', num_samples=12, - clean_level='clean', use_header=True) + clean_level='clean') self.testInst.load(2009, 1) self.npts = len(self.testInst['uts']) return @@ -199,10 +215,8 @@ def setup_method(self): """Set up the unit test environment for each method.""" # Load a test instrument - self.testInst = pysat.Instrument('pysat', 'ndtesting', - num_samples=12, - clean_level='clean', - use_header=True) + self.testInst = pysat.Instrument('pysat', 'ndtesting', num_samples=12, + clean_level='clean') self.testInst.load(2009, 1) self.npts = len(self.testInst['uts']) return @@ -270,7 +284,7 @@ def eval_data_cols(self): return def test_load_single_file(self): - """Test the CVS data load with a single file.""" + """Test the CSV data load with a single file.""" self.data = gen.load_csv_data(self.csv_file) assert isinstance(self.data.index, pds.RangeIndex) @@ -279,7 +293,7 @@ def test_load_single_file(self): return def test_load_file_list(self): - """Test the CVS data load with multiple files.""" + """Test the CSV data load with multiple files.""" self.data = gen.load_csv_data([self.csv_file, self.csv_file]) assert self.data.index.dtype == 'int64' @@ -288,7 +302,7 @@ def test_load_file_list(self): return def test_load_file_with_kwargs(self): - """Test the CVS data load with kwargs.""" + """Test the CSV data load with kwargs.""" self.data = gen.load_csv_data([self.csv_file], read_csv_kwargs={"parse_dates": True, @@ -298,37 +312,11 @@ def test_load_file_with_kwargs(self): assert len(self.data.columns) == len(self.data_cols) return + def test_load_empty_filelist(self): + """Test the CSV data loading with an empty file list.""" -class TestDeprecation(object): - """Unit tests for deprecated methods.""" - - def setup_method(self): - """Set up the unit test environment for each method.""" - - warnings.simplefilter("always", DeprecationWarning) - return - - def teardown_method(self): - """Clean up the unit test environment after each method.""" - - return - - def test_convert_timestamp_to_datetime(self): - """Test that convert_timestamp_to_datetime is deprecated.""" - - warn_msgs = [" ".join( - ["New kwargs added to `pysat.utils.io.load_netCDF4`", - "for generalized handling, deprecated", - "function will be removed in pysat 3.2.0+"])] - - test = pysat.Instrument('pysat', 'testing', use_header=True) - test.load(2009, 1) - with warnings.catch_warnings(record=True) as war: - gen.convert_timestamp_to_datetime(test, epoch_name='uts') - - # Ensure the minimum number of warnings were raised - assert len(war) >= len(warn_msgs) + self.data = gen.load_csv_data([]) - # Test the warning messages, ensuring each attribute is present - pysat.utils.testing.eval_warnings(war, warn_msgs) + # Evaluate the empty output + assert self.data.empty return diff --git 
a/pysat/tests/test_methods_testing.py b/pysat/tests/test_methods_testing.py index 0885032a7..2af644cc1 100644 --- a/pysat/tests/test_methods_testing.py +++ b/pysat/tests/test_methods_testing.py @@ -1,8 +1,14 @@ +#!/usr/bin/env python +# Full license can be found in License.md +# Full author list can be found in .zenodo.json file +# DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. +# ---------------------------------------------------------------------------- """Tests the `pysat.instruments.methods.testing` methods.""" import datetime as dt -from os import path -import pandas as pds import pytest import pysat @@ -16,7 +22,7 @@ class TestMethodsTesting(object): def setup_method(self): """Set up the unit test environment for each method.""" - self.test_inst = pysat.Instrument('pysat', 'testing', use_header=True) + self.test_inst = pysat.Instrument('pysat', 'testing') # Get list of filenames. self.fnames = [self.test_inst.files.files.values[0]] diff --git a/pysat/tests/test_orbits.py b/pysat/tests/test_orbits.py index 28d3173f6..09ba52adb 100644 --- a/pysat/tests/test_orbits.py +++ b/pysat/tests/test_orbits.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Test the pysat routines for the orbits class.""" @@ -140,7 +143,6 @@ def test_orbit_w_bad_orbit_info(self, info): """ self.in_kwargs['orbit_info'] = info - self.in_kwargs['use_header'] = True self.testInst = pysat.Instrument(*self.in_args, **self.in_kwargs) self.testInst.load(date=self.stime) @@ -168,7 +170,6 @@ def test_orbit_polar_w_missing_orbit_index(self, info): """ self.in_kwargs['orbit_info'] = info - self.in_kwargs['use_header'] = True self.testInst = pysat.Instrument(*self.in_args, **self.in_kwargs) # Force index to None before loading and iterating @@ -182,7 +183,6 @@ def test_orbit_repr(self): """Test the Orbit representation.""" self.in_kwargs['orbit_info'] = {'index': 'mlt'} - self.in_kwargs['use_header'] = True self.testInst = pysat.Instrument(*self.in_args, **self.in_kwargs) out_str = self.testInst.orbits.__repr__() @@ -193,7 +193,6 @@ def test_orbit_str(self): """Test the Orbit string representation with data.""" self.in_kwargs['orbit_info'] = {'index': 'mlt'} - self.in_kwargs['use_header'] = True self.testInst = pysat.Instrument(*self.in_args, **self.in_kwargs) self.testInst.load(date=self.stime) out_str = self.testInst.orbits.__str__() @@ -212,8 +211,7 @@ def setup_method(self): self.testInst = pysat.Instrument('pysat', 'testing', clean_level='clean', orbit_info={'index': 'mlt'}, - update_files=True, - use_header=True) + update_files=True) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.inc_min = 97 self.etime = None @@ -391,8 +389,7 @@ def setup_method(self): self.testInst = pysat.Instrument('pysat', 'testing', clean_level='clean', orbit_info={'index': 'mlt'}, - update_files=True, - use_header=True) + update_files=True) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] return @@ -589,12 +586,11 @@ class TestGeneralOrbitsMLTxarray(TestGeneralOrbitsMLT): def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing_xarray', + self.testInst = pysat.Instrument('pysat', 'ndtesting',
clean_level='clean', orbit_info={'index': 'mlt'}, - update_files=True, - use_header=True) - self.stime = pysat.instruments.pysat_testing_xarray._test_dates[''][''] + update_files=True) + self.stime = pysat.instruments.pysat_ndtesting._test_dates[''][''] return def teardown_method(self): @@ -620,8 +616,7 @@ def setup_method(self): self.testInst = pysat.Instrument('pysat', 'testing', clean_level='clean', orbit_info={'index': 'mlt'}, - update_files=True, - use_header=True) + update_files=True) self.testInst.bounds = (self.testInst.files.files.index[0], self.testInst.files.files.index[11], '2D', dt.timedelta(days=3)) @@ -683,8 +678,7 @@ def setup_method(self): clean_level='clean', orbit_info={'index': 'longitude', 'kind': 'longitude'}, - update_files=True, - use_header=True) + update_files=True) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] return @@ -701,12 +695,11 @@ class TestGeneralOrbitsLongXarray(TestGeneralOrbitsMLT): def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing_xarray', + self.testInst = pysat.Instrument('pysat', 'ndtesting', clean_level='clean', orbit_info={'index': 'longitude', 'kind': 'longitude'}, - update_files=True, - use_header=True) + update_files=True) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] return @@ -727,8 +720,7 @@ def setup_method(self): clean_level='clean', orbit_info={'index': 'orbit_num', 'kind': 'orbit'}, - update_files=True, - use_header=True) + update_files=True) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] return @@ -745,13 +737,12 @@ class TestGeneralOrbitsOrbitNumberXarray(TestGeneralOrbitsMLT): def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing_xarray', + self.testInst = pysat.Instrument('pysat', 'ndtesting', clean_level='clean', orbit_info={'index': 'orbit_num', 'kind': 'orbit'}, - update_files=True, - use_header=True) - self.stime = pysat.instruments.pysat_testing_xarray._test_dates[''][''] + update_files=True) + self.stime = pysat.instruments.pysat_ndtesting._test_dates[''][''] return def teardown_method(self): @@ -771,8 +762,7 @@ def setup_method(self): clean_level='clean', orbit_info={'index': 'latitude', 'kind': 'polar'}, - update_files=True, - use_header=True) + update_files=True) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] return @@ -789,13 +779,12 @@ class TestGeneralOrbitsLatitudeXarray(TestGeneralOrbitsMLT): def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing_xarray', + self.testInst = pysat.Instrument('pysat', 'ndtesting', clean_level='clean', orbit_info={'index': 'latitude', 'kind': 'polar'}, - update_files=True, - use_header=True) - self.stime = pysat.instruments.pysat_testing_xarray._test_dates[''][''] + update_files=True) + self.stime = pysat.instruments.pysat_ndtesting._test_dates[''][''] return def teardown_method(self): @@ -835,8 +824,7 @@ def setup_method(self): self.testInst = pysat.Instrument('pysat', 'testing', clean_level='clean', orbit_info={'index': 'mlt'}, - update_files=True, - use_header=True) + update_files=True) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.gaps = self.stime + self.deltime self.testInst.custom_attach(filter_data, kwargs={'times': self.gaps}) @@ -874,11 +862,10 @@ class TestOrbitsGappyDataXarray(TestOrbitsGappyData): def setup_method(self): """Set up the unit test 
environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing_xarray', + self.testInst = pysat.Instrument('pysat', 'ndtesting', clean_level='clean', orbit_info={'index': 'mlt'}, - update_files=True, - use_header=True) + update_files=True) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.gaps = self.stime + self.deltime self.testInst.custom_attach(filter_data, kwargs={'times': self.gaps}) @@ -919,8 +906,7 @@ def setup_method(self): self.testInst = pysat.Instrument('pysat', 'testing', clean_level='clean', - orbit_info={'index': 'mlt'}, - use_header=True) + orbit_info={'index': 'mlt'}) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.testInst.custom_attach(filter_data, kwargs={'times': self.times}) return @@ -946,10 +932,9 @@ class TestOrbitsGappyData2Xarray(TestOrbitsGappyData2): def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing_xarray', + self.testInst = pysat.Instrument('pysat', 'ndtesting', clean_level='clean', - orbit_info={'index': 'mlt'}, - use_header=True) + orbit_info={'index': 'mlt'}) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.testInst.custom_attach(filter_data, kwargs={'times': self.times}) return @@ -970,8 +955,7 @@ def setup_method(self): self.testInst = pysat.Instrument('pysat', 'testing', clean_level='clean', orbit_info={'index': 'longitude', - 'kind': 'longitude'}, - use_header=True) + 'kind': 'longitude'}) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.gaps = self.stime + self.deltime @@ -991,11 +975,10 @@ class TestOrbitsGappyLongDataXarray(TestOrbitsGappyData): def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing_xarray', + self.testInst = pysat.Instrument('pysat', 'ndtesting', clean_level='clean', orbit_info={'index': 'longitude', - 'kind': 'longitude'}, - use_header=True) + 'kind': 'longitude'}) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.gaps = self.stime + self.deltime self.testInst.custom_attach(filter_data, kwargs={'times': self.gaps}) @@ -1017,8 +1000,7 @@ def setup_method(self): self.testInst = pysat.Instrument('pysat', 'testing', clean_level='clean', orbit_info={'index': 'orbit_num', - 'kind': 'orbit'}, - use_header=True) + 'kind': 'orbit'}) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.gaps = self.stime + self.deltime self.testInst.custom_attach(filter_data, kwargs={'times': self.gaps}) @@ -1037,11 +1019,10 @@ class TestOrbitsGappyOrbitNumDataXarray(TestOrbitsGappyData): def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing_xarray', + self.testInst = pysat.Instrument('pysat', 'ndtesting', clean_level='clean', orbit_info={'index': 'orbit_num', - 'kind': 'orbit'}, - use_header=True) + 'kind': 'orbit'}) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.gaps = self.stime + self.deltime self.testInst.custom_attach(filter_data, kwargs={'times': self.gaps}) @@ -1063,8 +1044,7 @@ def setup_method(self): self.testInst = pysat.Instrument('pysat', 'testing', clean_level='clean', orbit_info={'index': 'latitude', - 'kind': 'polar'}, - use_header=True) + 'kind': 'polar'}) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.gaps = self.stime + self.deltime @@ -1084,11 +1064,10 @@ class TestOrbitsGappyOrbitLatDataXarray(TestOrbitsGappyData): 
def setup_method(self): """Set up the unit test environment for each method.""" - self.testInst = pysat.Instrument('pysat', 'testing_xarray', + self.testInst = pysat.Instrument('pysat', 'ndtesting', clean_level='clean', orbit_info={'index': 'latitude', - 'kind': 'polar'}, - use_header=True) + 'kind': 'polar'}) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.gaps = self.stime + self.deltime diff --git a/pysat/tests/test_params.py b/pysat/tests/test_params.py index 5ee483c24..f51ba24c5 100644 --- a/pysat/tests/test_params.py +++ b/pysat/tests/test_params.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Tests the pysat parameters storage area.""" diff --git a/pysat/tests/test_registry.py b/pysat/tests/test_registry.py index 0501f7b1b..3f042029a 100644 --- a/pysat/tests/test_registry.py +++ b/pysat/tests/test_registry.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Tests the registration of user-defined modules.""" diff --git a/pysat/tests/test_utils.py b/pysat/tests/test_utils.py index a6869bf4c..be475873a 100644 --- a/pysat/tests/test_utils.py +++ b/pysat/tests/test_utils.py @@ -2,10 +2,12 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Tests the pysat utils core functions.""" -import contextlib from importlib import reload import inspect import numpy as np @@ -14,7 +16,6 @@ import pytest import shutil import tempfile -import warnings import pysat from pysat.tests.classes.cls_registration import TestWithRegistration @@ -50,7 +51,7 @@ def test_update_fill_values_numbers(self, name, variables): """ # Initialize the instrument - inst = pysat.Instrument('pysat', name, use_header=True) + inst = pysat.Instrument('pysat', name) inst.load(date=self.ref_time) # Ensure there are fill values to check @@ -82,7 +83,7 @@ def test_update_fill_values_by_type(self, name): """ # Initialize the instrument - inst = pysat.Instrument('pysat', name, use_header=True) + inst = pysat.Instrument('pysat', name) inst.load(date=self.ref_time) # Ensure there are fill values to check @@ -412,9 +413,8 @@ def test_stringify_non_str_types(self, astrlike): """ - target = type(astrlike) output = pysat.utils.stringify(astrlike) - assert type(output) == target + assert type(output) is type(astrlike) return @@ -480,7 +480,7 @@ def test_neg_ncols(self): [("ncols", 0, ZeroDivisionError, "integer division or modulo by zero"), ("max_num", -10, ValueError, - "max() arg is an empty sequence")]) + "empty")]) def test_fmt_raises(self, key, val, raise_type, err_msg): """Test raises appropriate Errors for bad input values. @@ -624,19 +624,22 @@ class TestNetworkLock(object): def setup_method(self): """Set up the unit test environment.""" + # Use a temporary directory so that the user's setup is not altered.
+ self.temp_dir = tempfile.TemporaryDirectory() + # Create and write a temporary file - self.fname = 'temp_lock_file.txt' + self.fname = os.path.join(self.temp_dir.name, 'temp_lock_file.txt') with open(self.fname, 'w') as fh: fh.write('spam and eggs') return def teardown_method(self): """Clean up the unit test environment.""" - # Remove the temporary file - os.remove(self.fname) + # Remove the temporary directory. + self.temp_dir.cleanup() # Delete the test class attributes - del self.fname + del self.fname, self.temp_dir return def test_with_timeout(self): @@ -775,50 +778,6 @@ def test_list_kwargs_passthrough(self): return -class TestDeprecation(object): - """Unit test for deprecation warnings.""" - - @pytest.mark.parametrize("kwargs,msg_inds", - [({'fnames': None}, [0, 1]), - ({'fnames': 'no_file', 'file_format': None}, - [0, 2])]) - def test_load_netcdf4(self, kwargs, msg_inds): - """Test deprecation warnings from load_netcdf4. - - Parameters - ---------- - kwargs : dict - Keyword arguments passed to `load_netcdf4` - msg_inds : list - List of indices indicating which warning message is expected - - """ - with warnings.catch_warnings(record=True) as war: - try: - # Generate relocation warning and file_format warning - utils.load_netcdf4(**kwargs) - except (FileNotFoundError, ValueError): - pass - - warn_msgs = ["".join(["function moved to `pysat.utils.io`, ", - "deprecated wrapper will be removed in ", - "pysat 3.2.0+"]), - "".join(["`fnames` as a kwarg has been deprecated, ", - "must supply a string or list of strings", - " in 3.2.0+"]), - "".join(["`file_format` must be a string value in ", - "3.2.0+, instead of None use 'NETCDF4' ", - "for same behavior."])] - - warn_msgs = [warn_msgs[ind] for ind in msg_inds] - # Ensure the minimum number of warnings were raised - assert len(war) >= len(warn_msgs) - - # Test the warning messages, ensuring each attribute is present - utils.testing.eval_warnings(war, warn_msgs) - return - - class TestMappedValue(object): """Unit tests for utility `get_mapped_value`.""" diff --git a/pysat/tests/test_utils_coords.py b/pysat/tests/test_utils_coords.py index bd08e9917..ff701bb14 100644 --- a/pysat/tests/test_utils_coords.py +++ b/pysat/tests/test_utils_coords.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. 
# ---------------------------------------------------------------------------- """Tests the `pysat.utils.coords` functions.""" @@ -63,13 +66,11 @@ def teardown_method(self): del self.py_inst, self.inst_time return - @pytest.mark.parametrize("name", ["testing", "testing_xarray", - "ndtesting", "testmodel"]) + @pytest.mark.parametrize("name", ["testing", "ndtesting", "testmodel"]) def test_update_longitude(self, name): """Test `update_longitude` successful run.""" - self.py_inst = pysat.Instrument(platform='pysat', name=name, - use_header=True) + self.py_inst = pysat.Instrument(platform='pysat', name=name) self.py_inst.load(date=self.inst_time) # Test instruments initially define longitude between 0-360 deg @@ -86,8 +87,7 @@ def test_update_longitude(self, name): def test_bad_lon_name_update_longitude(self): """Test update_longitude with a bad longitude name.""" - self.py_inst = pysat.Instrument(platform='pysat', name="testing", - use_header=True) + self.py_inst = pysat.Instrument(platform='pysat', name="testing") self.py_inst.load(date=self.inst_time) testing.eval_bad_input(coords.update_longitude, ValueError, @@ -119,13 +119,13 @@ def teardown_method(self): del self.py_inst, self.inst_time return - @pytest.mark.parametrize("name", ["testing", "testing_xarray"]) + @pytest.mark.parametrize("name", ["testing", "ndtesting"]) def test_calc_solar_local_time(self, name): """Test SLT calculation with longitudes from 0-360 deg for 0 UTH.""" # Instantiate instrument and load data self.py_inst = pysat.Instrument(platform='pysat', name=name, - num_samples=1, use_header=True) + num_samples=1) self.py_inst.load(date=self.inst_time) coords.calc_solar_local_time(self.py_inst, lon_name="longitude", @@ -143,13 +143,13 @@ def test_calc_solar_local_time(self, name): assert np.min(np.abs(cos_diff)) > 1.0 - 1.0e-6 return - @pytest.mark.parametrize("name", ["testing", "testing_xarray"]) + @pytest.mark.parametrize("name", ["testing", "ndtesting"]) def test_calc_solar_local_time_inconsistent_keywords(self, name, caplog): """Test that ref_date only works when apply_modulus=False.""" # Instantiate instrument and load data self.py_inst = pysat.Instrument(platform='pysat', name=name, - num_samples=1, use_header=True) + num_samples=1) self.py_inst.load(date=self.inst_time) # Apply solar local time method and capture logging output @@ -168,8 +168,7 @@ def test_calc_solar_local_time_w_neg_longitude(self): """Test calc_solar_local_time with longitudes from -180 to 180 deg.""" # Instantiate instrument and load data - self.py_inst = pysat.Instrument(platform='pysat', name="testing", - use_header=True) + self.py_inst = pysat.Instrument(platform='pysat', name="testing") self.py_inst.load(date=self.inst_time) coords.calc_solar_local_time(self.py_inst, lon_name="longitude", @@ -187,8 +186,7 @@ def test_bad_lon_name_calc_solar_local_time(self): """Test raises ValueError with a bad longitude name.""" # Instantiate instrument and load data - self.py_inst = pysat.Instrument(platform='pysat', name="testing", - use_header=True) + self.py_inst = pysat.Instrument(platform='pysat', name="testing") self.py_inst.load(date=self.inst_time) # Test that the correct Exception and error message are raised @@ -199,14 +197,12 @@ def test_bad_lon_name_calc_solar_local_time(self): return - @pytest.mark.parametrize("name", ["testmodel", "testing2d", - "ndtesting"]) + @pytest.mark.parametrize("name", ["testmodel", "ndtesting"]) def test_lon_broadcasting_calc_solar_local_time(self, name): """Test calc_solar_local_time with longitude coordinates.""" # 
Instantiate instrument and load data - self.py_inst = pysat.Instrument(platform='pysat', name=name, - use_header=True) + self.py_inst = pysat.Instrument(platform='pysat', name=name) self.py_inst.load(date=self.inst_time) coords.calc_solar_local_time(self.py_inst, lon_name="longitude", slt_name='slt') @@ -216,14 +212,12 @@ def test_lon_broadcasting_calc_solar_local_time(self, name): assert self.py_inst['slt'].min() >= 0.0 return - @pytest.mark.parametrize("name", ["testmodel", "testing2d", - "ndtesting"]) + @pytest.mark.parametrize("name", ["testmodel", "ndtesting"]) def test_lon_broadcasting_calc_solar_local_time_no_mod_multiday(self, name): """Test non modulated solar local time output for a 2 day range.""" # Instantiate instrument and load data - self.py_inst = pysat.Instrument(platform='pysat', name=name, - use_header=True) + self.py_inst = pysat.Instrument(platform='pysat', name=name) self.py_inst.load(date=self.inst_time, end_date=self.inst_time + dt.timedelta(days=2)) coords.calc_solar_local_time(self.py_inst, lon_name="longitude", @@ -235,14 +229,12 @@ def test_lon_broadcasting_calc_solar_local_time_no_mod_multiday(self, name): assert self.py_inst['slt'].min() >= 0.0 return - @pytest.mark.parametrize("name", ["testmodel", "testing2d", - "ndtesting"]) + @pytest.mark.parametrize("name", ["testmodel", "ndtesting"]) def test_lon_broadcasting_calc_solar_local_time_no_mod_ref_date(self, name): """Test non modulated SLT output for a 2 day range with a ref date.""" # Instantiate instrument and load data - self.py_inst = pysat.Instrument(platform='pysat', name=name, - use_header=True) + self.py_inst = pysat.Instrument(platform='pysat', name=name) self.py_inst.load(date=self.inst_time, end_date=self.inst_time + dt.timedelta(days=2)) coords.calc_solar_local_time(self.py_inst, lon_name="longitude", @@ -256,14 +248,12 @@ def test_lon_broadcasting_calc_solar_local_time_no_mod_ref_date(self, name): assert self.py_inst['slt'].min() >= 24.0 return - @pytest.mark.parametrize("name", ["testmodel", "testing2d", - "ndtesting"]) + @pytest.mark.parametrize("name", ["testmodel", "ndtesting"]) def test_lon_broadcasting_calc_solar_local_time_no_mod(self, name): """Test SLT calc with longitude coordinates and no modulus.""" # Instantiate instrument and load data - self.py_inst = pysat.Instrument(platform='pysat', name=name, - use_header=True) + self.py_inst = pysat.Instrument(platform='pysat', name=name) self.py_inst.load(date=self.inst_time) coords.calc_solar_local_time(self.py_inst, lon_name="longitude", slt_name='slt', apply_modulus=False) @@ -278,8 +268,7 @@ def test_single_lon_calc_solar_local_time(self): """Test calc_solar_local_time with a single longitude value.""" # Instantiate instrument and load data - self.py_inst = pysat.Instrument(platform='pysat', name="testing_xarray", - use_header=True) + self.py_inst = pysat.Instrument(platform='pysat', name="ndtesting") self.py_inst.load(date=self.inst_time) lon_name = 'lon2' @@ -367,7 +356,7 @@ class TestExpandXarrayDims(object): def setup_method(self): """Set up the unit test environment.""" self.test_inst = pysat.Instrument( - inst_module=pysat.instruments.pysat_ndtesting, use_header=True) + inst_module=pysat.instruments.pysat_ndtesting) self.start_time = pysat.instruments.pysat_ndtesting._test_dates[''][''] self.data_list = [] self.out = None @@ -402,12 +391,12 @@ def set_data_meta(self, dims_equal): # Load a second data set with half the time samples self.test_inst = pysat.Instrument( inst_module=self.test_inst.inst_module, - num_samples=num_samples, 
use_header=True) + num_samples=num_samples) else: # Load a second data set with different dimensions apart from time self.test_inst = pysat.Instrument( inst_module=pysat.instruments.pysat_testmodel, - num_samples=num_samples, use_header=True) + num_samples=num_samples) self.test_inst.load(date=self.start_time + dt.timedelta(days=1)) self.data_list.append(self.test_inst.data) diff --git a/pysat/tests/test_utils_files.py b/pysat/tests/test_utils_files.py index 87ba21683..1ba5d1d78 100644 --- a/pysat/tests/test_utils_files.py +++ b/pysat/tests/test_utils_files.py @@ -2,9 +2,13 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Tests the `pysat.utils.files` functions.""" +from collections import OrderedDict import datetime as dt from importlib import reload import numpy as np @@ -21,8 +25,132 @@ from pysat.utils import testing -class TestParseDelimitedFilenames(object): - """Unit tests for the `parse_delimited_filename` function.""" +class TestConstructSearchstring(object): + """Unit tests for the `construct_searchstring_from_format` function.""" + + def setup_method(self): + """Set up the unit test environment for each method.""" + self.out_dict = {} + self.num_fmt = None + self.str_len = None + self.fill_len = None + return + + def teardown_method(self): + """Clean up the unit test environment after each method.""" + + del self.out_dict, self.num_fmt, self.str_len, self.fill_len + return + + def eval_output(self): + """Evaluate the output dictionary.""" + + testing.assert_lists_equal(['search_string', 'keys', 'type', 'lengths', + 'string_blocks'], + list(self.out_dict.keys())) + + assert len(self.out_dict['keys']) == self.num_fmt + assert len(''.join(self.out_dict['string_blocks'])) == self.str_len + assert sum(self.out_dict['lengths']) == self.fill_len + + if self.out_dict['search_string'].find('*') < 0: + assert len( + self.out_dict['search_string']) == self.fill_len + self.str_len + else: + assert len( + self.out_dict['search_string']) <= self.fill_len + self.str_len + return + + @pytest.mark.parametrize("format_str,nfmt,slen,flen", [ + ("", 0, 0, 0), ("test", 0, 4, 0), ("{year:02d}{month:02d}", 2, 0, 4), + ("test_{year:04d}.ext", 1, 9, 4)]) + def test_searchstring_success(self, format_str, nfmt, slen, flen): + """Test successful construction of a searchable string. + + Parameters + ---------- + format_str : str + The naming pattern of the instrument files and the locations of + date/version/revision/cycle information needed to create an ordered + list. + nfmt : int + Number of formatting options included in the format string + slen : int + Length of non-formatted string segments + flen : int + Length of formatted segments + + """ + # Set the evaluation criteria + self.num_fmt = nfmt + self.str_len = slen + self.fill_len = flen + + # Get the searchstring dictionary + self.out_dict = futils.construct_searchstring_from_format(format_str) + + # Evaluate the output + self.eval_output() + return + + @pytest.mark.parametrize("format_str,nfmt,slen,flen, nwc", [ + ("", 0, 0, 0, 0), ("test", 0, 4, 0, 0), + ("{year:02d}{month:02d}", 2, 0, 4, 2), + ("test_{year:04d}_{month:02d}.ext", 2, 10, 6, 2)]) + def test_searchstring_w_wildcard(self, format_str, nfmt, slen, flen, nwc): + """Test successful construction of a searchable string with wildcards.
+ + Parameters + ---------- + format_str : str + The naming pattern of the instrument files and the locations of + date/version/revision/cycle information needed to create an ordered + list. + nfmt : int + Number of formatting options included in the format string + slen : int + Length of non-formatted string segments + flen : int + Length of formatted segments + nwc : int + Number of wildcard (*) symbols + + """ + # Set the evaluation criteria + self.num_fmt = nfmt + self.str_len = slen + self.fill_len = flen + + # Get the searchstring dictionary + self.out_dict = futils.construct_searchstring_from_format(format_str, + True) + + # Evaluate the output + self.eval_output() + assert len(self.out_dict['search_string'].split('*')) == nwc + 1 + return + + def test_searchstring_noformat(self): + """Test failure if the input argument is NoneType.""" + + testing.eval_bad_input(futils.construct_searchstring_from_format, + ValueError, + 'Must supply a filename template (format_str).', + input_args=[None]) + return + + def test_searchstring_bad_wildcard(self): + """Test failure if unsupported wildcard use is encountered.""" + + testing.eval_bad_input(futils.construct_searchstring_from_format, + ValueError, + "Couldn't determine formatting width, check", + input_args=["test{year:02d}{month:d}.txt"]) + return + + +class TestParseFilenames(object): + """Unit tests for the file parsing functions.""" def setup_method(self): """Set up the unit test environment for each method.""" @@ -46,18 +174,10 @@ def teardown_method(self): del self.fkwargs, self.file_dict, self.kw_format del self.temporary_file_list - def eval_parse_delimited_filename(self): - """Evaluate the output of a `parse_delimited_filename` unit test. - - Returns - ------- - bool - True if there is data to evalute, False if data dict is empty - - """ + def eval_parsed_filenames(self): + """Evaluate the output of a `parse_delimited_filename` unit test.""" # Evaluate the returned data dict - if len(self.file_dict.keys()) < 2: - return False + assert len(self.file_dict.keys()) >= 2, "insufficient keys in file dict" # Extract the test lists if len(self.fkwargs) > 0: @@ -78,14 +198,25 @@ def eval_parse_delimited_filename(self): assert self.file_dict[fkey] is None, \ "unused format key has a value" - return True + return @pytest.mark.parametrize("sep_char,flead,good_kwargs", [ ("_", "test_", ['year', 'month', 'day', 'hour', 'minute', 'version']), ('-', "test", ['year', 'day', 'hour', 'minute', 'second', 'cycle', 'revision']), ('fun', 'test', [])]) def test_parse_delimited_filename(self, sep_char, flead, good_kwargs): - """Check ability to parse list of delimited files.""" + """Check ability to parse list of delimited files. + + Parameters + ---------- + sep_char : str + Separation character to use in joining the filename + flead : str + File prefix + good_kwargs : list + List of kwargs to include in the file format + + """ # Format the test input fname = '{:s}{:s}.cdf'.format(flead, sep_char.join( [self.kw_format[fkey] for fkey in good_kwargs])) @@ -106,7 +237,41 @@ def test_parse_delimited_filename(self, sep_char, flead, good_kwargs): sep_char) # Test each of the return values - assert self.eval_parse_delimited_filename() + self.eval_parsed_filenames() + return + + @pytest.mark.parametrize("is_fixed", [True, False]) + def test_parse_filenames_all_bad(self, is_fixed): + """Test files with a bad format are removed from consideration. + + Parameters + ---------- + is_fixed : bool + True for the fixed-width function, False for delimited.
+ + """ + + # Format the test input + format_str = 'bad_test_{:s}.cdf'.format("_".join( + [self.kw_format[fkey] for fkey in self.fkwargs[0].keys()])) + bad_format = format_str.replace('revision:02d', 'revision:2s') + + # Create the input file list + file_list = [] + for kwargs in self.fkwargs: + kwargs['revision'] = 'aa' + file_list.append(bad_format.format(**kwargs)) + + # Get the test results + if is_fixed: + self.file_dict = futils.parse_fixed_width_filenames(file_list, + format_str) + else: + self.file_dict = futils.parse_delimited_filenames(file_list, + format_str, "_") + + # Test that all files were removed + assert len(self.file_dict['files']) == 0 return def test_parse_delimited_filename_empty(self): @@ -121,7 +286,262 @@ def test_parse_delimited_filename_empty(self): self.file_dict = futils.parse_delimited_filenames([], fname, sep_char) # Test each of the return values - assert self.eval_parse_delimited_filename() + self.eval_parsed_filenames() + return + + @pytest.mark.parametrize("sep_char,flead,good_kwargs", [ + ("_", "*_", ['year', 'month', 'day', 'hour', 'minute', 'version']), + ('?', "test", ['year', 'day', 'hour', 'minute', 'second', 'cycle', + 'revision']), ('fun', '*', [])]) + def test_parse_fixed_filename(self, sep_char, flead, good_kwargs): + """Check ability to parse list of fixed width files. + + Parameters + ---------- + sep_char : str + Separation character to use in joining the filename + flead : str + File prefix + good_kwargs : list + List of kwargs to include in the file format + + """ + # Format the test input + fname = '{:s}{:s}.cdf'.format(flead, sep_char.join( + [self.kw_format[fkey] for fkey in good_kwargs])) + + # Adjust the test input/comparison data for this run + bad_kwargs = [fkey for fkey in self.fkwargs[0] + if fkey not in good_kwargs] + + for kwargs in self.fkwargs: + for fkey in bad_kwargs: + del kwargs[fkey] + + # Create the input file list + file_list = [fname.format(**kwargs) for kwargs in self.fkwargs] + + # Get the test results + self.file_dict = futils.parse_fixed_width_filenames(file_list, fname) + + # Test each of the return values + self.eval_parsed_filenames() + return + + def test_parse_fixed_width_filename_empty(self): + """Check ability to parse list of fixed-width files with no files.""" + # Format the test input + fname = ''.join(('test*', '{year:04d}', '{day:03d}', '{hour:02d}', + '{minute:02d}', '{second:02d}', '{cycle:2s}.txt')) + self.fkwargs = [] + + # Get the test results + self.file_dict = futils.parse_fixed_width_filenames([], fname) + + # Test each of the return values + self.eval_parsed_filenames() + return + + def test_init_parse_filename_empty(self): + """Check the `_init_parse_filenames` output with no files.""" + # Format the test input + fname = ''.join(('test*', '{year:04d}', '{day:03d}', '{hour:02d}', + '{minute:02d}', '{second:02d}', '{cycle:2s}.txt')) + self.fkwargs = [] + + # Get the test results + self.file_dict, sdict = futils._init_parse_filenames([], fname) + + # Test each of the return values + self.eval_parsed_filenames() + assert len(sdict.keys()) == 0, "Search dict was defined unnecessarily" + return + + def test_init_parse_filename_with_files(self): + """Check the `_init_parse_filenames` output with files.""" + # Format the test input + fname = ''.join(('test*', '{year:04d}', '{day:03d}', '{hour:02d}', + '{minute:02d}', '{second:02d}', '{cycle:2s}.txt')) + + # Create the input file list + file_list = [fname.format(**kwargs) for kwargs in self.fkwargs] + + # Get the test results + self.file_dict, sdict = 
futils._init_parse_filenames(file_list, fname) + + # Test the initialized dictionaries + testing.assert_lists_equal(['search_string', 'keys', 'type', 'lengths', + 'string_blocks'], list(sdict.keys())) + + for skey in sdict['keys']: + assert skey in self.file_dict.keys(), "Missing key {:}".format(skey) + + for fkey in self.file_dict.keys(): + assert self.file_dict[fkey] is None, "File dict not initialized" + + assert "files" not in self.file_dict.keys(), "'files' key set early" + assert "format_str" not in self.file_dict.keys(), \ + "'format_str' key set early" + return + + @pytest.mark.parametrize("bad_files", [[], [0]]) + def test_finish_parsed_filenames(self, bad_files): + """Test output restructuring for `_finish_parse_filenames`. + + Parameters + ---------- + bad_files : list + List of bad file indices + + """ + # Format the test input + fname = ''.join(('test*', '{year:04d}', '{day:03d}', '{hour:02d}', + '{minute:02d}', '{second:02d}', '{cycle:2s}.txt')) + + # Create the input file list and dict + file_list = [fname.format(**kwargs) for kwargs in self.fkwargs] + self.file_dict = {'int': [1 for fname in file_list], 'none': None, + 'float': [1.0 for fname in file_list], + 'str': ['hi' for fname in file_list]} + + # Get the test results + self.file_dict = futils._finish_parse_filenames(self.file_dict, + file_list, fname, + bad_files) + + # Adjust the expected file output + if len(bad_files) > 0: + file_list = [fname for i, fname in enumerate(file_list) + if i not in bad_files] + + # Test the output + for fkey in self.file_dict: + if fkey == 'none': + assert self.file_dict[fkey] is None + elif fkey == 'files': + testing.assert_lists_equal(file_list, self.file_dict[fkey]) + elif fkey == 'format_str': + assert fname == self.file_dict[fkey] + else: + testing.assert_isinstance(self.file_dict[fkey], np.ndarray) + assert len(self.file_dict[fkey]) == len(file_list) + return + + +class TestProcessParsedFilenames(object): + """Unit tests for `process_parsed_filenames` function.""" + + def setup_method(self): + """Set up the unit test environment for each method.""" + self.stored = OrderedDict({'year': np.full(shape=3, fill_value=2001), + 'month': np.full(shape=3, fill_value=2), + 'day': np.ones(shape=3, dtype=np.int64), + 'hour': np.zeros(shape=3, dtype=np.int64), + 'minute': np.zeros(shape=3, dtype=np.int64), + 'second': np.arange(0, 3, 1), + 'version': np.arange(0, 3, 1), + 'revision': np.arange(3, 0, -1), + 'cycle': np.array([1, 3, 2])}) + self.format_str = "_".join(["test", "{year:04d}", "{month:02d}", + "{day:02d}", "{hour:02d}", "{minute:02d}", + "{second:02d}", "v{version:02d}", + "r{revision:02d}", "c{cycle:02d}.cdf"]) + + return + + def teardown_method(self): + """Clean up the unit test environment for each method.""" + + del self.stored, self.format_str + return + + def complete_stored(self): + """Add the 'files' and 'format_str' kwargs to the `stored` dict.""" + + file_list = [] + for ind in range(len(self.stored['year'])): + ind_dict = {skey: self.stored[skey][ind] + for skey in self.stored.keys()} + file_list.append(self.format_str.format(**ind_dict)) + + self.stored['files'] = file_list + self.stored['format_str'] = self.format_str + return + + @pytest.mark.parametrize("year_break", [0, 50]) + def test_two_digit_years(self, year_break): + """Test the results of using different year breaks for YY formats.""" + # Complete the ordered dict of file information + self.stored['year'] -= 2000 + self.format_str = self.format_str.replace('year:04', 'year:02') + self.complete_stored() + + # Get
the file series + series = futils.process_parsed_filenames( + self.stored, two_digit_year_break=year_break) + + # Test the series year + test_year = series.index.max().year + century = 1900 if year_break == 0 else 2000 + assert test_year - century < 100, "year break caused wrong century" + + # Test that the series length is correct and all filenames are unique + assert series.shape == self.stored['year'].shape + assert np.unique(series.values).shape == self.stored['year'].shape + return + + def test_version_selection(self): + """Test version selection dominates when time is the same.""" + # Complete the ordered dict of file information + self.stored['second'] = np.zeros(shape=self.stored['year'].shape, + dtype=np.int64) + self.complete_stored() + + # Get the file series + series = futils.process_parsed_filenames(self.stored) + + # Ensure there is only one file and that it has the highest version + ver_num = "v{:02d}".format(self.stored['version'].max()) + assert series.shape == (1, ) + assert series.values[0].find(ver_num) > 0 + return + + def test_revision_selection(self): + """Test revision selection dominates after time and version.""" + # Complete the ordered dict of file information + self.stored['second'] = np.zeros(shape=self.stored['year'].shape, + dtype=np.int64) + self.stored['version'] = np.zeros(shape=self.stored['year'].shape, + dtype=np.int64) + self.complete_stored() + + # Get the file series + series = futils.process_parsed_filenames(self.stored) + + # Ensure there is only one file and that it has the highest revision + rev_num = "r{:02d}".format(self.stored['revision'].max()) + assert series.shape == (1, ) + assert series.values[0].find(rev_num) > 0 + return + + def test_cycle_selection(self): + """Test cycle selection dominates after time, version, and revision.""" + # Complete the ordered dict of file information + self.stored['second'] = np.zeros(shape=self.stored['year'].shape, + dtype=np.int64) + self.stored['version'] = np.zeros(shape=self.stored['year'].shape, + dtype=np.int64) + self.stored['revision'] = np.zeros(shape=self.stored['year'].shape, + dtype=np.int64) + self.complete_stored() + + # Get the file series + series = futils.process_parsed_filenames(self.stored) + + # Ensure there is only one file and that it has the highest cycle + cyc_num = "c{:02d}".format(self.stored['cycle'].max()) + assert series.shape == (1, ) + assert series.values[0].find(cyc_num) > 0 return @@ -152,15 +572,13 @@ def setup_method(self): self.insts_kwargs = [] # Data by day, ACE SIS data - self.insts.append(pysat.Instrument('ace', 'sis', tag='historic', - use_header=True)) + self.insts.append(pysat.Instrument('ace', 'sis', tag='historic')) test_dates = pysatSpaceWeather.instruments.ace_sis._test_dates self.insts_dates.append([test_dates['']['historic']] * 2) self.insts_kwargs.append({}) # Data with date mangling, regular F10.7 data, stored monthly - self.insts.append(pysat.Instrument('sw', 'f107', tag='historic', - use_header=True)) + self.insts.append(pysat.Instrument('sw', 'f107', tag='historic')) test_dates = pysatSpaceWeather.instruments.sw_f107._test_dates self.insts_dates.append([test_dates['']['historic'], test_dates['']['historic'] @@ -265,7 +683,7 @@ def test_updating_directories(self, capsys): # Refresh inst with the old directory template set to get now 'old' # path information.
inst2 = pysat.Instrument(inst.platform, inst.name, tag=inst.tag, - inst_id=inst.inst_id, use_header=True) + inst_id=inst.inst_id) # Check that directories with simpler platform org were NOT removed. assert os.path.isdir(inst2.files.data_path) @@ -326,7 +744,7 @@ def setup_method(self): self.testInst = pysat.Instrument( inst_module=pysat.instruments.pysat_testing, clean_level='clean', - update_files=True, use_header=True) + update_files=True) # Create instrument directories in tempdir pysat.utils.files.check_and_make_path(self.testInst.files.data_path) @@ -339,6 +757,39 @@ def teardown_method(self): del self.testInst, self.out, self.tempdir, self.start, self.stop return + @pytest.mark.skipif(os.environ.get('CI') != 'true', reason="CI test only") + def test_updating_directories_no_registration(self, capsys): + """Test directory structure update method without registered insts.""" + # Convert directories to simpler platform structure, to get output + templ = '{platform}' + futils.update_data_directory_structure(new_template=templ, + full_breakdown=True) + + # Capture printouts and test the results + captured = capsys.readouterr() + captxt = captured.out + assert captxt.find("No registered instruments detected.") >= 0, \ + "Expected output not captured in STDOUT: {:}".format(captxt) + return + + def test_search_local_system_formatted_filename(self): + """Test `search_local_system_formatted_filename` success.""" + # Create a temporary file with a unique, searchable name + prefix = "test_me" + suffix = "tstfile" + searchstr = "*".join([prefix, suffix]) + with tempfile.NamedTemporaryFile(dir=self.testInst.files.data_path, + prefix=prefix, suffix=suffix): + files = futils.search_local_system_formatted_filename( + self.testInst.files.data_path, searchstr) + + assert len(files) == 1, "unexpected number of files in search results" + assert files[0].find( + prefix) >= 0, "unexpected file prefix in search results" + assert files[0].find( + suffix) > 0, "unexpected file extension in search results" + return + def test_get_file_information(self): """Test `utils.files.get_file_information` success with existing files. diff --git a/pysat/tests/test_utils_io.py b/pysat/tests/test_utils_io.py index 7ac9c1e11..34dd526c8 100644 --- a/pysat/tests/test_utils_io.py +++ b/pysat/tests/test_utils_io.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. 
# ---------------------------------------------------------------------------- """Tests the pysat utility io routines.""" import copy @@ -10,7 +13,6 @@ import logging import numpy as np import os -import shutil import sys import tempfile import warnings @@ -61,8 +63,7 @@ def setup_method(self): pysat.params['data_dirs'] = self.tempdir.name self.testInst = pysat.Instrument(platform='pysat', name='testing', - num_samples=100, update_files=True, - use_header=True) + num_samples=100, update_files=True) self.stime = pysat.instruments.pysat_testing._test_dates[''][''] self.epoch_name = 'time' @@ -104,16 +105,7 @@ def eval_loaded_data(self, test_case=True): # Test the data values for each variable for dkey in keys: - lkey = dkey.lower() - if lkey in ['profiles', 'alt_profiles', 'series_profiles']: - # Test the loaded higher-dimension data - for tframe, lframe in zip(self.testInst[dkey], - self.loaded_inst[dkey]): - assert np.all(tframe == lframe), "unequal {:s} data".format( - dkey) - else: - # Test the standard data structures - assert np.all(self.testInst[dkey] == self.loaded_inst[dkey]) + assert np.all(self.testInst[dkey] == self.loaded_inst[dkey]) # Check that names are lower case when written pysat.utils.testing.assert_lists_equal(keys, new_keys, test_case=False) @@ -123,7 +115,7 @@ def test_basic_write_and_read_netcdf_mixed_case_data_format(self): """Test basic netCDF4 read/write with mixed case data variables.""" # Create a bunch of files by year and doy outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) # Modify data names in data if self.testInst.pandas_format: @@ -166,7 +158,7 @@ def test_basic_write_and_read_netcdf_mixed_case_meta_format(self): """Test basic netCDF4 read/write with mixed case metadata variables.""" # Create a bunch of files by year and doy outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) # Modify data and metadata names in data self.testInst.meta.rename(str.upper) @@ -209,7 +201,7 @@ def test_basic_write_and_read_netcdf_export_pysat_info(self, kwargs, """ # Create a bunch of files by year and doy outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) io.inst_to_netcdf(self.testInst, fname=outfile, preserve_meta_case=True, epoch_name=default_epoch_name, **kwargs) @@ -244,7 +236,7 @@ def test_inst_write_and_read_netcdf(self, add_path): 'pysat_test_ncdf_%Y%j.nc')) # Load and write the test instrument data - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) self.testInst.to_netcdf4(fname=outfile, epoch_name=default_epoch_name) # Load the written file directly into an Instrument @@ -253,7 +245,7 @@ def test_inst_write_and_read_netcdf(self, add_path): netcdf_inst = pysat.Instrument( 'pysat', 'netcdf', data_dir=file_path, update_files=True, file_format=file_root, pandas_format=self.testInst.pandas_format, - use_header=True, epoch_name=default_epoch_name, **tkwargs) + epoch_name=default_epoch_name, **tkwargs) # Confirm data path is correct assert os.path.normpath(netcdf_inst.files.data_path) \ @@ -262,7 +254,7 @@ def test_inst_write_and_read_netcdf(self, add_path): # Deleting the test file here via os.remove(...) 
does work # Load data - netcdf_inst.load(date=self.stime, use_header=True) + netcdf_inst.load(date=self.stime) # Test the loaded Instrument data self.loaded_inst = netcdf_inst.data @@ -284,8 +276,7 @@ def test_inst_write_and_read_netcdf(self, add_path): and var not in updated_attrs] tvars = [var for var in self.testInst.meta.keys() - if var not in self.testInst.meta.keys_nD() - and var.lower() not in ["epoch", "time"]] + if var.lower() not in ["epoch", "time"]] fvars = [var for var in netcdf_inst.meta.keys() if var.lower() not in ["epoch", "time"]] @@ -326,7 +317,7 @@ def test_write_netcdf4_duplicate_variable_names(self): # Create a bunch of files by year and doy outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) self.testInst['MLT'] = 1 # Evaluate the expected error and message @@ -358,7 +349,7 @@ def test_read_netcdf4_bad_epoch_name(self, write_epoch, err_msg, err_type): # Load data outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) # Write file io.inst_to_netcdf(self.testInst, fname=outfile, epoch_name=write_epoch) @@ -402,7 +393,7 @@ def test_read_netcdf4_epoch_not_xarray_dimension(self, caplog, write_epoch, # Load data outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) # Write file io.inst_to_netcdf(self.testInst, outfile, epoch_name=write_epoch) @@ -440,7 +431,7 @@ def test_write_and_read_netcdf4_w_kwargs(self, wkwargs, lkwargs): # Create a new file based on loaded test data outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) if 'epoch_name' not in wkwargs.keys(): wkwargs['epoch_name'] = default_epoch_name io.inst_to_netcdf(self.testInst, fname=outfile, **wkwargs) @@ -481,7 +472,7 @@ def test_read_netcdf4_w_epoch_kwargs(self, kwargs, target): # Create a bunch of files by year and doy outfile = os.path.join(self.tempdir.name, 'pysat_{:}_ncdf.nc'.format(self.testInst.name)) - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) io.inst_to_netcdf(self.testInst, fname=outfile, epoch_name=default_epoch_name) @@ -533,7 +524,7 @@ def test_read_netcdf4_w_epoch_kwargs(self, kwargs, target): def test_netcdf_prevent_attribute_override(self): """Test that attributes will not be overridden by default.""" - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) # Test that `bespoke` attribute is initially missing assert not hasattr(self.testInst, 'bespoke') @@ -553,7 +544,7 @@ def test_netcdf_prevent_attribute_override(self): def test_netcdf_attribute_override(self): """Test that attributes in the netCDF file may be overridden.""" - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) self.testInst.meta.mutable = True self.testInst.meta.bespoke = True @@ -593,7 +584,7 @@ def test_decode_times(self, decode_times): # Create a file outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) io.inst_to_netcdf(self.testInst, fname=outfile, epoch_name=default_epoch_name) @@ -643,7 +634,7 @@ def test_drop_labels(self, drop_labels): # Create a file with additional metadata outfile = 
os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) # Add additional metadata self.testInst.meta['mlt'] = {drop_label: 1.} @@ -697,8 +688,7 @@ def setup_method(self): self.testInst = pysat.Instrument(platform='pysat', name='ndtesting', - update_files=True, num_samples=100, - use_header=True) + update_files=True, num_samples=100) self.stime = pysat.instruments.pysat_ndtesting._test_dates[ ''][''] self.epoch_name = 'time' @@ -747,7 +737,7 @@ def test_read_netcdf4_with_time_meta_labels(self, kwargs, target): # Prepare output test data outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) # Modify the variable attributes directly before writing to file self.testInst.meta['uts'] = {'units': 'seconds'} @@ -772,60 +762,23 @@ def test_read_netcdf4_with_time_meta_labels(self, kwargs, target): "Variable {:} not loaded correctly".format(var) return - def test_load_netcdf_pandas_3d_error(self): - """Test load_netcdf error with a pandas 3D file.""" + def test_load_netcdf_pandas_2d_error(self): + """Test load_netcdf error with a pandas 2D file.""" # Create a bunch of files by year and doy outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.testInst.load(date=self.stime, use_header=True) + self.testInst.load(date=self.stime) io.inst_to_netcdf(self.testInst, fname=outfile) # Evaluate the error raised and the expected message testing.eval_bad_input( io.load_netcdf, ValueError, - "only supports 1D and 2D data in pandas", input_args=[outfile], + "only supports 1D data in pandas", input_args=[outfile], input_kwargs={"epoch_name": 'time', "pandas_format": True}) return -class TestLoadNetCDF2DPandas(TestLoadNetCDF): - """Unit tests for `load_netcdf` using 2d pandas data.""" - - def setup_method(self): - """Set up the test environment.""" - - # Create temporary directory - self.tempdir = tempfile.TemporaryDirectory() - self.saved_path = pysat.params['data_dirs'] - pysat.params['data_dirs'] = self.tempdir.name - - self.testInst = pysat.Instrument(platform='pysat', name='testing2d', - update_files=True, num_samples=100, - use_header=True) - self.stime = pysat.instruments.pysat_testing2d._test_dates[''][''] - self.epoch_name = 'time' - - # Initialize the loaded data object - self.loaded_inst = None - return - - def teardown_method(self): - """Clean up the test environment.""" - - pysat.params['data_dirs'] = self.saved_path - - # Clear the attributes with data in them - del self.loaded_inst, self.testInst, self.stime, self.epoch_name - - # Remove the temporary directory - self.tempdir.cleanup() - - # Clear the directory attributes - del self.tempdir, self.saved_path - return - - class TestNetCDF4Integration(object): """Integration tests for the netCDF4 I/O utils.""" @@ -848,10 +801,8 @@ def setup_method(self): # Create an instrument object that has a meta with some # variables allowed to be nan within metadata when exporting. 
- self.testInst = pysat.Instrument('pysat', 'testing', num_samples=5, - use_header=True) - self.testInst.load(date=self.testInst.inst_module._test_dates[''][''], - use_header=True) + self.testInst = pysat.Instrument('pysat', 'testing', num_samples=5) + self.testInst.load(date=self.testInst.inst_module._test_dates['']['']) self.pformat = self.testInst.pandas_format return @@ -991,9 +942,10 @@ def test_filter_netcdf4_metadata(self, remove, check_type, export_nan, assert mkey not in export_nan, \ "{:} should have been exported".format(repr(mkey)) else: - if(mkey in export_nan and not np.issubdtype(data_type, str) - and np.isnan(mdict[mkey])): - assert np.isnan(fdict[mkey]) + if all([mkey in export_nan, + not np.issubdtype(data_type, str)]): + if np.isnan(mdict[mkey]): + assert np.isnan(fdict[mkey]) else: if mkey in check_type and fdict[mkey] != mdict[mkey]: assert fdict[mkey] == data_type(mdict[mkey]), \ @@ -1049,17 +1001,6 @@ def test_add_netcdf4_standards_to_meta(self, missing): assert label not in init_meta[var] assert label in new_meta[var] - if self.testInst.name == 'testing2D': - assert 'Depend_1' not in init_meta[var] - - # Check for higher dimensional data properties - if self.testInst.name == 'testing2D': - for var in self.testInst.vars_no_time: - if self.testInst.meta[var].children is not None: - assert 'Depend_1' in new_meta[var] - else: - assert 'Depend_1' not in new_meta[var] - return @pytest.mark.parametrize('meta_trans', [{'units': ['testingFillVal', @@ -1281,14 +1222,6 @@ def test_meta_processor_to_from_netcdf4(self, assign_flag): def test_missing_metadata(self): """Test writing file with no metadata.""" - # Collect a list of higher order meta - ho_vars = [] - for var in self.testInst.meta.keys(): - if 'children' in self.testInst.meta[var]: - if self.testInst.meta[var]['children'] is not None: - for subvar in self.testInst.meta[var]['children'].keys(): - ho_vars.append((subvar, var)) - # Drop all metadata self.testInst.meta.keep([]) @@ -1306,12 +1239,6 @@ def test_missing_metadata(self): # Test the warning testing.eval_warnings(war, exp_warns, warn_type=UserWarning) - # Test warning for higher order data as well (pandas) - for (svar, var) in ho_vars: - wstr = ''.join(['Unable to find MetaData for ', - svar, ' subvariable of ', var]) - exp_warns.append(wstr) - # Test the warning testing.eval_warnings(war, exp_warns, warn_type=UserWarning) @@ -1319,41 +1246,7 @@ def test_missing_metadata(self): class TestNetCDF4IntegrationXarray(TestNetCDF4Integration): - """Integration tests for the netCDF4 I/O utils using xarray data.""" - - def setup_method(self): - """Create a testing environment.""" - - # Create an instrument object that has a meta with some - # variables allowed to be nan within metadata when exporting. - self.testInst = pysat.Instrument('pysat', 'testing_xarray', - num_samples=5, use_header=True) - self.testInst.load(date=self.testInst.inst_module._test_dates[''][''], - use_header=True) - self.pformat = self.testInst.pandas_format - - return - - -class TestNetCDF4IntegrationPandas2D(TestNetCDF4Integration): - """Integration tests for the netCDF4 I/O utils using pandas2d Instrument.""" - - def setup_method(self): - """Create a testing environment.""" - - # Create an instrument object that has a meta with some - # variables allowed to be nan within metadata when exporting. 
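Editor's note: the `test_filter_netcdf4_metadata` hunk above reworks the `export_nan` check so a NaN metadata value survives filtering only when its label is explicitly listed. A minimal sketch of the behaviour under test, assuming the `filter_netcdf4_metadata` signature used in these tests; `'fill'` is a standard pysat metadata label, and the variable name is illustrative:

```python
import numpy as np

import pysat
from pysat.utils import io

inst = pysat.Instrument('pysat', 'testing', num_samples=5)
inst.load(date=inst.inst_module._test_dates[''][''])

# Metadata dict with a NaN value for the 'fill' label
mdict = {'units': 'km', 'fill': np.nan}

# By default, NaN metadata values are filtered out before writing
filtered = io.filter_netcdf4_metadata(inst, mdict.copy(), np.float64,
                                      varname='mlt')

# Labels listed in `export_nan` keep their NaN values in the output
kept = io.filter_netcdf4_metadata(inst, mdict.copy(), np.float64,
                                  export_nan=['fill'], varname='mlt')
```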
- self.testInst = pysat.Instrument('pysat', 'testing2d', num_samples=5, - use_header=True) - self.testInst.load(date=self.testInst.inst_module._test_dates[''][''], - use_header=True) - self.pformat = self.testInst.pandas_format - - return - - -class TestNetCDF4Integration2DXarray(TestNetCDF4Integration): - """Integration tests for the netCDF4 I/O utils using 2dxarray Instrument.""" + """Integration tests for the netCDF4 I/O utils using xarray Instrument.""" def setup_method(self): """Create a testing environment.""" @@ -1361,9 +1254,8 @@ def setup_method(self): # Create an instrument object that has a meta with some # variables allowed to be nan within metadata when exporting. self.testInst = pysat.Instrument('pysat', 'ndtesting', - num_samples=5, use_header=True) - self.testInst.load(date=self.testInst.inst_module._test_dates[''][''], - use_header=True) + num_samples=5) + self.testInst.load(date=self.testInst.inst_module._test_dates['']['']) self.pformat = self.testInst.pandas_format return @@ -1377,10 +1269,8 @@ def setup_method(self): # Create an instrument object that has a meta with some # variables allowed to be nan within metadata when exporting. - self.testInst = pysat.Instrument('pysat', 'testmodel', num_samples=5, - use_header=True) - self.testInst.load(date=self.testInst.inst_module._test_dates[''][''], - use_header=True) + self.testInst = pysat.Instrument('pysat', 'testmodel', num_samples=5) + self.testInst.load(date=self.testInst.inst_module._test_dates['']['']) self.pformat = self.testInst.pandas_format return @@ -1394,10 +1284,8 @@ def setup_method(self): # Create an instrument object that has a meta with some # variables allowed to be nan within metadata when exporting. - self.testInst = pysat.Instrument('pysat', 'testing_xarray', - num_samples=5, use_header=True) - self.testInst.load(date=self.testInst.inst_module._test_dates[''][''], - use_header=True) + self.testInst = pysat.Instrument('pysat', 'ndtesting', num_samples=5) + self.testInst.load(date=self.testInst.inst_module._test_dates['']['']) self.epoch_name = 'time' return @@ -1516,8 +1404,7 @@ class TestMetaTranslation(object): def setup_method(self): """Create test environment.""" - self.test_inst = pysat.Instrument('pysat', 'testing', num_samples=5, - use_header=True) + self.test_inst = pysat.Instrument('pysat', 'testing', num_samples=5) self.test_date = pysat.instruments.pysat_testing._test_dates[''][''] self.test_inst.load(date=self.test_date) self.meta_dict = self.test_inst.meta.to_dict() @@ -1767,33 +1654,34 @@ def test_remove_netcdf4_standards(self, caplog): # Enforcing netcdf4 standards removes 'fill', min, and max information # for string variables. This is not re-added by the `remove_` function # call since, strictly speaking, we don't know what to add back in. - # Also exepmting a check on long_name for higher order data with a time - # index. When loading files, pysat specifically checks for 'Epoch' as + # When loading files, pysat specifically checks for 'Epoch' as # the long_name. So, ensuring long_name for such variables is written # could break loading for existent files. I could fake it, and assign # the standard name as long_name when loading, and while that would # pass the tests here as written, it would be brittle. Check everything # else. 
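Editor's note: the comment above describes the add/remove round-trip for pysat's netCDF4 metadata standards, including why string variables lose 'fill', 'value_min', and 'value_max'. A minimal sketch of that round-trip, assuming the `add_netcdf4_standards_to_metadict` and `remove_netcdf4_standards_from_meta` signatures shown elsewhere in this diff:

```python
import pysat
from pysat.utils import io

inst = pysat.Instrument('pysat', 'testing', num_samples=5)
inst.load(date=inst.inst_module._test_dates[''][''])
meta_dict = inst.meta.to_dict()

# Enforce the standards pysat writes to file...
std_dict = io.add_netcdf4_standards_to_metadict(inst, meta_dict, 'Epoch')

# ...then strip them again, as done when loading a file. String variables
# lose 'fill', 'value_min', and 'value_max' along the way, as noted above.
filt_dict = io.remove_netcdf4_standards_from_meta(std_dict, 'Epoch',
                                                  inst.meta.labels)
```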
+ + def assert_meta_unchanged(old_meta, filt_meta, var, key): + """Check that filtered meta value is unchanged.""" + + assert old_meta[var][key] == filt_meta[var][key], \ + 'Value changed for {}, {}'.format(var, key) + return + for var in self.meta_dict.keys(): assert var in filt_meta, 'Lost metadata variable {}'.format(var) for key in self.meta_dict[var].keys(): - # Creating exception for time-index of higher order data. The - # long_name comes out differently. - if var == 'profiles' and (key == 'long_name'): - continue - # Test remaining variables accounting for possible exceptions # for string variables. if key not in ['fill', 'value_min', 'value_max']: assert key in filt_meta[var], \ 'Lost metadata label {} for {}'.format(key, var) - assert self.meta_dict[var][key] == filt_meta[var][key],\ - 'Value changed for {}, {}'.format(var, key) + assert_meta_unchanged(self.meta_dict, filt_meta, var, key) else: if key in filt_meta: - assert self.meta_dict[var][key] == filt_meta[var][key],\ - 'Value changed for {}, {}'.format(var, key) + assert_meta_unchanged(self.meta_dict, filt_meta, var, + key) return @@ -1801,60 +1689,12 @@ def test_remove_netcdf4_standards(self, caplog): class TestMetaTranslationXarray(TestMetaTranslation): """Test meta translation when writing/loading files xarray Instrument.""" - def setup_method(self): - """Create test environment.""" - - self.test_inst = pysat.Instrument('pysat', 'testing_xarray', - num_samples=5, use_header=True) - self.test_date = pysat.instruments.pysat_testing_xarray._test_dates - self.test_date = self.test_date[''][''] - self.test_inst.load(date=self.test_date) - self.meta_dict = self.test_inst.meta.to_dict() - self.out = None - - return - - def teardown_method(self): - """Cleanup test environment.""" - - del self.test_inst, self.test_date, self.out, self.meta_dict - - return - - -class TestMetaTranslation2DXarray(TestMetaTranslation): - """Test meta translation when writing/loading files xarray2d Instrument.""" - def setup_method(self): """Create test environment.""" self.test_inst = pysat.Instrument('pysat', 'ndtesting', - num_samples=5, use_header=True) - self.test_date = pysat.instruments.pysat_testing_xarray._test_dates - self.test_date = self.test_date[''][''] - self.test_inst.load(date=self.test_date) - self.meta_dict = self.test_inst.meta.to_dict() - self.out = None - - return - - def teardown_method(self): - """Cleanup test environment.""" - - del self.test_inst, self.test_date, self.out, self.meta_dict - - return - - -class TestMetaTranslation2DPandas(TestMetaTranslation): - """Test meta translation when writing/loading files testing2d Instrument.""" - - def setup_method(self): - """Create test environment.""" - - self.test_inst = pysat.Instrument('pysat', 'testing2d', - num_samples=5, use_header=True) - self.test_date = pysat.instruments.pysat_testing2d._test_dates[''][''] + num_samples=5) + self.test_date = pysat.instruments.pysat_ndtesting._test_dates[''][''] self.test_inst.load(date=self.test_date) self.meta_dict = self.test_inst.meta.to_dict() self.out = None @@ -1875,8 +1715,7 @@ class TestMetaTranslationModel(TestMetaTranslation): def setup_method(self): """Create test environment.""" - self.test_inst = pysat.Instrument('pysat', 'testmodel', - num_samples=5, use_header=True) + self.test_inst = pysat.Instrument('pysat', 'testmodel', num_samples=5) self.test_date = pysat.instruments.pysat_testmodel._test_dates[''][''] self.test_inst.load(date=self.test_date) self.meta_dict = self.test_inst.meta.to_dict() @@ -1890,68 +1729,3 @@ def 
teardown_method(self): del self.test_inst, self.test_date, self.out, self.meta_dict return - - -class TestIODeprecation(object): - """Unit tests for deprecation warnings in `utils.io`.""" - - def setup_method(self): - """Set up the test environment.""" - - # Create temporary directory - self.tempdir = tempfile.TemporaryDirectory() - self.saved_path = pysat.params['data_dirs'] - pysat.params['data_dirs'] = self.tempdir.name - - self.outfile = os.path.join(self.tempdir.name, 'pysat_test_ncdf.nc') - self.in_kwargs = {'labels': { - 'units': ('units', str), 'name': ('long_name', str), - 'notes': ('notes', str), 'desc': ('desc', str), - 'min_val': ('value_min', float), 'max_val': ('value_max', float), - 'fill_val': ('fill', float)}} - - return - - def teardown_method(self): - """Clean up the test environment.""" - - pysat.params['data_dirs'] = self.saved_path - - # Remove the temporary directory - self.tempdir.cleanup() - - # Clear the attributes - del self.tempdir, self.saved_path, self.outfile, self.in_kwargs - return - - @pytest.mark.parametrize("inst_name,load_func", [ - ("testing", io.load_netcdf_pandas), - ("ndtesting", io.load_netcdf_xarray)]) - def test_load_netcdf_labels(self, inst_name, load_func): - """Test deprecation of `labels` kwarg in different load functions. - - Parameters - ---------- - inst_name : str - Instrument name for test Instrument - load_func : function - NetCDF load method with deprecation warning - - """ - - # Create a test file - testInst = pysat.Instrument(platform='pysat', name=inst_name, - num_samples=100, update_files=True, - use_header=True) - testInst.load(date=testInst.inst_module._test_dates['']['']) - io.inst_to_netcdf(testInst, fname=self.outfile) - - # Catch the warnings - with warnings.catch_warnings(record=True) as war: - load_func(self.outfile, **self.in_kwargs) - - # Test the warnings - assert len(war) >= 1 - testing.eval_warnings(war, - ["`labels` is deprecated, use `meta_kwargs`"]) - return diff --git a/pysat/tests/test_utils_testing.py b/pysat/tests/test_utils_testing.py index f5c7fed73..8db5485e9 100644 --- a/pysat/tests/test_utils_testing.py +++ b/pysat/tests/test_utils_testing.py @@ -2,13 +2,14 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Tests the pysat utility testing routines.""" import numpy as np -import os import pytest -import tempfile import warnings from pysat.utils import testing @@ -156,50 +157,130 @@ def test_nan_equal_bad(self, val1, val2): UserWarning, DeprecationWarning, SyntaxWarning, RuntimeWarning, FutureWarning, PendingDeprecationWarning, ImportWarning, UnicodeWarning, BytesWarning, ResourceWarning]) - def test_good_eval_warnings(self, warn_type): - """Test warning evaluation function success. + @pytest.mark.parametrize("second_type", [ + None, UserWarning, DeprecationWarning, SyntaxWarning, RuntimeWarning, + FutureWarning, PendingDeprecationWarning, ImportWarning, + UnicodeWarning, BytesWarning, ResourceWarning]) + def test_good_eval_warnings(self, warn_type, second_type): + """Test warning evaluation function success including multiple types. 
Parameters ---------- warn_type : Warning Warning class to be raised - + second_type : Warning or None + Optional warning class to be raised """ - warn_msg = 'test warning' + if second_type is not None: + warn_msgs = ['test warning1', 'test_warning2'] + warn_types = [warn_type, second_type] + else: + # Only test a single msg/type + warn_msgs = ['test warning'] + warn_types = [warn_type] # Raise the desired warning with warnings.catch_warnings(record=True) as war: - warnings.warn(warn_msg, warn_type) + for warn_msg, loop_warn_type in zip(warn_msgs, warn_types): + warnings.warn(warn_msg, loop_warn_type) # Evaluate the warning output - testing.eval_warnings(war, [warn_msg], warn_type) + testing.eval_warnings(war, warn_msgs, warn_types) return @pytest.mark.parametrize("warn_type", [ UserWarning, DeprecationWarning, SyntaxWarning, RuntimeWarning, FutureWarning, PendingDeprecationWarning, ImportWarning, UnicodeWarning, BytesWarning, ResourceWarning]) - def test_eval_warnings_bad_type(self, warn_type): + @pytest.mark.parametrize("second_type", [ + None, UserWarning, DeprecationWarning, SyntaxWarning, RuntimeWarning, + FutureWarning, PendingDeprecationWarning, ImportWarning, + UnicodeWarning, BytesWarning, ResourceWarning]) + def test_eval_warnings_bad_types(self, warn_type, second_type): """Test warning evaluation function failure for mismatched type. Parameters ---------- warn_type : Warning Warning class to be raised + second_type : Warning or None + Optional warning class to be raised """ warn_msg = 'test warning' bad_type = UserWarning if warn_type != UserWarning else BytesWarning + sbad_type = UserWarning if second_type != UserWarning else BytesWarning + + if second_type is not None: + warn_msgs = [warn_msg, 'test_warning2'] + warn_types = [warn_type, second_type] + bad_types = [bad_type, sbad_type] + else: + # Only test a single msg/type + warn_msgs = [warn_msg] + warn_types = [warn_type] + bad_types = [bad_type] # Raise the desired warning with warnings.catch_warnings(record=True) as war: - warnings.warn(warn_msg, warn_type) + for loop_warn_msg, loop_warn_type in zip(warn_msgs, warn_types): + warnings.warn(loop_warn_msg, loop_warn_type) + + # Catch and evaluate the expected error + with pytest.raises(AssertionError) as aerr: + testing.eval_warnings(war, warn_msgs, bad_types) + + assert str(aerr).find('bad warning type for message:') >= 0 + + return + + @pytest.mark.parametrize("warn_type", [ + UserWarning, DeprecationWarning, SyntaxWarning, RuntimeWarning, + FutureWarning, PendingDeprecationWarning, ImportWarning, + UnicodeWarning, BytesWarning, ResourceWarning]) + @pytest.mark.parametrize("second_type", [ + None, UserWarning, DeprecationWarning, SyntaxWarning, RuntimeWarning, + FutureWarning, PendingDeprecationWarning, ImportWarning, + UnicodeWarning, BytesWarning, ResourceWarning]) + def test_eval_warnings_bad_msgs(self, warn_type, second_type): + """Test warning evaluation function failure for mismatched message. 
+ + Parameters + ---------- + warn_type : Warning + Warning class to be raised + second_type : Warning or None + Optional warning class to be raised + + """ + warn_msg = 'test warning' + bad_msg = 'not correct' + + if second_type is not None: + warn_msgs = [warn_msg, 'test_warning2'] + warn_types = [warn_type, second_type] + bad_msgs = [bad_msg, 'not_correct2'] + else: + # Only test a single msg/type + warn_msgs = [warn_msg] + warn_types = [warn_type] + bad_msgs = [bad_msg] + + # Raise the desired warning + with warnings.catch_warnings(record=True) as war: + for loop_warn_msg, loop_warn_type in zip(warn_msgs, warn_types): + warnings.warn(loop_warn_msg, loop_warn_type) # Catch and evaluate the expected error with pytest.raises(AssertionError) as aerr: - testing.eval_warnings(war, [warn_msg], bad_type) + testing.eval_warnings(war, bad_msgs, warn_types) + + assert str(aerr).find('did not find') >= 0 + + # Check for warning types in message + for loop_warn_type in warn_types: + assert str(aerr).find(repr(loop_warn_type)) >= 0 - assert str(aerr).find('bad warning type for message') >= 0 return @pytest.mark.parametrize("warn_type", [ diff --git a/pysat/tests/test_utils_time.py b/pysat/tests/test_utils_time.py index 8b5c5714f..fc9199e59 100644 --- a/pysat/tests/test_utils_time.py +++ b/pysat/tests/test_utils_time.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Tests for the pysat.utils.time functions.""" diff --git a/pysat/utils/__init__.py b/pysat/utils/__init__.py index 9badbae4e..70d6d7a30 100644 --- a/pysat/utils/__init__.py +++ b/pysat/utils/__init__.py @@ -6,20 +6,19 @@ for the pysat data directory structure. 
""" -from pysat.utils._core import available_instruments -from pysat.utils._core import display_available_instruments -from pysat.utils._core import display_instrument_stats -from pysat.utils._core import generate_instrument_list -from pysat.utils._core import get_mapped_value -from pysat.utils._core import listify -from pysat.utils._core import load_netcdf4 -from pysat.utils._core import NetworkLock -from pysat.utils._core import scale_units -from pysat.utils._core import stringify -from pysat.utils._core import update_fill_values -from pysat.utils import coords -from pysat.utils import files -from pysat.utils import io -from pysat.utils import registry -from pysat.utils import testing -from pysat.utils import time +from pysat.utils._core import available_instruments # noqa: F401 +from pysat.utils._core import display_available_instruments # noqa: F401 +from pysat.utils._core import display_instrument_stats # noqa: F401 +from pysat.utils._core import generate_instrument_list # noqa: F401 +from pysat.utils._core import get_mapped_value # noqa: F401 +from pysat.utils._core import listify # noqa: F401 +from pysat.utils._core import NetworkLock # noqa: F401 +from pysat.utils._core import scale_units # noqa: F401 +from pysat.utils._core import stringify # noqa: F401 +from pysat.utils._core import update_fill_values # noqa: F401 +from pysat.utils import coords # noqa: F401 +from pysat.utils import files # noqa: F401 +from pysat.utils import io # noqa: F401 +from pysat.utils import registry # noqa: F401 +from pysat.utils import testing # noqa: F401 +from pysat.utils import time # noqa: F401 diff --git a/pysat/utils/_core.py b/pysat/utils/_core.py index 1241d34ef..4abdfa499 100644 --- a/pysat/utils/_core.py +++ b/pysat/utils/_core.py @@ -2,17 +2,16 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- import datetime as dt import importlib -import netCDF4 import numpy as np import os -import pandas as pds from portalocker import Lock -import warnings -import xarray as xr import pysat @@ -190,109 +189,6 @@ def stringify(strlike): return strlike -def load_netcdf4(fnames=None, strict_meta=False, file_format='NETCDF4', - epoch_name='Epoch', epoch_unit='ms', epoch_origin='unix', - pandas_format=True, decode_timedelta=False, - labels={'units': ('units', str), 'name': ('long_name', str), - 'notes': ('notes', str), 'desc': ('desc', str), - 'min_val': ('value_min', np.float64), - 'max_val': ('value_max', np.float64), - 'fill_val': ('fill', np.float64)}): - """Load netCDF-3/4 file produced by pysat. - - .. deprecated:: 3.0.2 - Function moved to `pysat.utils.io.load_netcdf`, this wrapper will be - removed in the 3.2.0+ release. - No longer allow non-string file formats in the 3.2.0+ release. - - Parameters - ---------- - fnames : str, array_like, or NoneType - Filename(s) to load, will fail if None (default=None) - strict_meta : bool - Flag that checks if metadata across fnames is the same if True - (default=False) - file_format : str - file_format keyword passed to netCDF4 routine. Expects one of - 'NETCDF3_CLASSIC', 'NETCDF3_64BIT', 'NETCDF4_CLASSIC', or 'NETCDF4'. - (default='NETCDF4') - epoch_name : str - Data key for epoch variable. 
The epoch variable is expected to be an - array of integer or float values denoting time elapsed from an origin - specified by `epoch_origin` with units specified by `epoch_unit`. This - epoch variable will be converted to a `DatetimeIndex` for consistency - across pysat instruments. (default='Epoch') - epoch_unit : str - The pandas-defined unit of the epoch variable ('D', 's', 'ms', 'us', - 'ns'). (default='ms') - epoch_origin : str or timestamp-convertable - Origin of epoch calculation, following convention for - `pandas.to_datetime`. Accepts timestamp-convertable objects, as well as - two specific strings for commonly used calendars. These conversions are - handled by `pandas.to_datetime`. - If ‘unix’ (or POSIX) time; origin is set to 1970-01-01. - If ‘julian’, `epoch_unit` must be ‘D’, and origin is set to beginning of - Julian Calendar. Julian day number 0 is assigned to the day starting at - noon on January 1, 4713 BC. (default='unix') - pandas_format : bool - Flag specifying if data is stored in a pandas DataFrame (True) or - xarray Dataset (False). (default=False) - decode_timedelta : bool - Used for xarray datasets. If True, variables with unit attributes that - are 'timelike' ('hours', 'minutes', etc) are converted to - `np.timedelta64`. (default=False) - labels : dict - Dict where keys are the label attribute names and the values are tuples - that have the label values and value types in that order. - (default={'units': ('units', str), 'name': ('long_name', str), - 'notes': ('notes', str), 'desc': ('desc', str), - 'min_val': ('value_min', np.float64), - 'max_val': ('value_max', np.float64), 'fill_val': ('fill', np.float64)}) - - Returns - ------- - data : pandas.DataFrame or xarray.Dataset - Class holding file data - meta : pysat.Meta - Class holding file meta data - - Raises - ------ - ValueError - If kwargs that should be args are not set on instantiation. - KeyError - If epoch/time dimension could not be identified. - - """ - warnings.warn("".join(["function moved to `pysat.utils.io`, deprecated ", - "wrapper will be removed in pysat 3.2.0+"]), - DeprecationWarning, stacklevel=2) - - if fnames is None: - warnings.warn("".join(["`fnames` as a kwarg has been deprecated, must ", - "supply a string or list of strings in 3.2.0+"]), - DeprecationWarning, stacklevel=2) - raise ValueError("Must supply a filename/list of filenames") - - if file_format is None: - warnings.warn("".join(["`file_format` must be a string value in ", - "3.2.0+, instead of None use 'NETCDF4' for ", - "same behavior."]), - DeprecationWarning, stacklevel=2) - file_format = 'NETCDF4' - - data, meta = pysat.utils.io.load_netcdf(fnames, strict_meta=strict_meta, - file_format=file_format, - epoch_name=epoch_name, - epoch_unit=epoch_unit, - epoch_origin=epoch_origin, - pandas_format=pandas_format, - decode_timedelta=decode_timedelta, - labels=labels) - - return data, meta - - def get_mapped_value(value, mapper): """Adjust value using mapping dict or function. @@ -433,6 +329,7 @@ def generate_instrument_list(inst_loc, user_info=None): instrument_download = [] instrument_optional_load = [] instrument_no_download = [] + instrument_new_tests = [] # Look through list of available instrument modules in the given location for inst_module in instrument_names: @@ -443,7 +340,7 @@ def generate_instrument_list(inst_loc, user_info=None): # If this can't be imported, we can't pull out the info for the # download / no_download tests. Leaving in basic tests for all # instruments, but skipping the rest. 
The import error will be - # caught as part of the pytest.mark.all_inst tests in InstTestClass + # caught as part of the pytest.mark.all_inst tests in InstLibTests pass else: # try to grab basic information about the module so we @@ -453,7 +350,7 @@ def generate_instrument_list(inst_loc, user_info=None): except AttributeError: # If a module does not have a test date, add it anyway for # other tests. This will be caught later by - # InstTestClass.test_instrument_test_dates + # InstLibTests.test_instrument_test_dates info = {} info[''] = {'': dt.datetime(2009, 1, 1)} module._test_dates = info @@ -481,6 +378,8 @@ def generate_instrument_list(inst_loc, user_info=None): # Check if instrument is configured for download tests. if inst._test_download: instrument_download.append(in_dict.copy()) + if inst._new_tests: + instrument_new_tests.append(in_dict.copy()) if hasattr(module, '_test_load_opt'): # Add optional load tests try: @@ -492,6 +391,10 @@ def generate_instrument_list(inst_loc, user_info=None): # Append as copy so kwargs are unique. instrument_optional_load.append( in_dict.copy()) + if inst._new_tests: + instrument_new_tests.append( + in_dict.copy()) + except KeyError: # Option does not exist for tag/inst_id # combo @@ -507,7 +410,8 @@ def generate_instrument_list(inst_loc, user_info=None): output = {'names': instrument_names, 'download': instrument_download, 'load_options': instrument_download + instrument_optional_load, - 'no_download': instrument_no_download} + 'no_download': instrument_no_download, + 'new_tests': instrument_new_tests} return output diff --git a/pysat/utils/coords.py b/pysat/utils/coords.py index 504dc3ff8..1ab471eee 100644 --- a/pysat/utils/coords.py +++ b/pysat/utils/coords.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Coordinate transformation functions for pysat.""" @@ -66,12 +69,7 @@ def update_longitude(inst, lon_name=None, high=180.0, low=-180.0): raise ValueError('unknown longitude variable name') new_lon = adjust_cyclic_data(inst[lon_name], high=high, low=low) - - # TODO(#988): Remove pandas/xarray logic after fixing issue in Instrument - if inst.pandas_format: - inst[lon_name] = new_lon - else: - inst.data = inst.data.update({lon_name: (inst[lon_name].dims, new_lon)}) + inst[lon_name] = new_lon return diff --git a/pysat/utils/files.py b/pysat/utils/files.py index ce0c18727..818a0b070 100644 --- a/pysat/utils/files.py +++ b/pysat/utils/files.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Utilities for file management and parsing file names.""" @@ -20,6 +23,135 @@ from pysat.utils.time import create_datetime_index +# Define hidden support functions + +def _init_parse_filenames(files, format_str): + """Set the initial output for the file parsing functions. + + Parameters + ---------- + files : list + List of files, typically provided by + `pysat.utils.files.search_local_system_formatted_filename`. 
+ format_str : str + Provides the naming pattern of the instrument files and the + locations of date information so an ordered list may be produced. + Supports all provided string formatting codes though only 'year', + 'month', 'day', 'hour', 'minute', 'second', 'version', 'revision', + and 'cycle' will be used for time and sorting information. For example, + `instrument-{year:4d}_{month:02d}-{day:02d}_v{version:02d}.cdf`, or + `*-{year:4d}_{month:02d}hithere{day:02d}_v{version:02d}.cdf` + + Returns + ------- + stored : collections.OrderedDict + Information parsed from filenames that includes: 'year', 'month', 'day', + 'hour', 'minute', 'second', 'version', 'revision', and 'cycle', as + well as any other user provided template variables. Also + includes `files`, an input list of files, and `format_str`. + search_dict : dict + An output dict with the following keys: + - 'search_string' (format_str with data to be parsed replaced with ?) + - 'keys' (keys for data to be parsed) + - 'type' (type of data expected for each key to be parsed) + - 'lengths' (string length for data to be parsed) + - 'string_blocks' (the filenames are broken into fixed width segments). + + See Also + -------- + pysat.utils.files.parse_fixed_width_filenames + pysat.utils.files.parse_delimited_filenames + pysat.utils.files.construct_searchstring_from_format + + """ + # Create storage for data to be parsed from filenames + ordered_keys = ['year', 'month', 'day', 'hour', 'minute', 'second', + 'version', 'revision', 'cycle'] + stored = collections.OrderedDict({kk: None for kk in ordered_keys}) + + # Only define search dictionary if there are files to search + if len(files) == 0: + # Include keys that should only be added at the end, if there are no + # files to process + stored['format_str'] = format_str + stored['files'] = [] + search_dict = dict() + else: + # Parse format string to get information needed to parse filenames + search_dict = construct_searchstring_from_format(format_str, + wildcard=False) + + # Add non-standard keys + for key in search_dict['keys']: + if key not in stored: + stored[key] = None + + return stored, search_dict + + +def _finish_parse_filenames(stored, files, format_str, bad_files): + """Reshape and finalize the output for the file parsing functions. + + Parameters + ---------- + stored : collections.OrderedDict + Information parsed from filenames that includes: 'year', 'month', 'day', + 'hour', 'minute', 'second', 'version', 'revision', and 'cycle', as + well as any other user provided template variables. + files : list + List of files, typically provided by + `pysat.utils.files.search_local_system_formatted_filename`. + format_str : str + Provides the naming pattern of the instrument files and the + locations of date information so an ordered list may be produced. + Supports all provided string formatting codes though only 'year', + 'month', 'day', 'hour', 'minute', 'second', 'version', 'revision', + and 'cycle' will be used for time and sorting information. For example, + `instrument-{year:4d}_{month:02d}-{day:02d}_v{version:02d}.cdf`, or + `*-{year:4d}_{month:02d}hithere{day:02d}_v{version:02d}.cdf` + bad_files : list + List of indices for files with names that do not fit the requested + format, or an empty list if all are good. + + Returns + ------- + stored : collections.OrderedDict + Information parsed from filenames that includes: 'year', 'month', 'day', + 'hour', 'minute', 'second', 'version', 'revision', and 'cycle', as + well as any other user provided template variables. 
Also + includes `files`, an input list of files, and `format_str`. + + See Also + -------- + pysat.utils.files.parse_fixed_width_filenames + pysat.utils.files.parse_delimited_filenames + + """ + # Change the bad file index list to a good file index list + good_files = [i for i in range(len(files)) if i not in bad_files] + + # Convert to numpy arrays + for key in stored.keys(): + if stored[key] is not None: + # Get the data type + dtype = type(stored[key][0]) + + # Cast the good data as an array of the desired type and select + # only the values with good files + stored[key] = np.array(stored[key])[good_files].astype(dtype) + + # Include files and file format in output + stored['format_str'] = format_str + if len(bad_files) == 0: + stored['files'] = files + else: + stored['files'] = list(np.array(files)[good_files]) + + return stored + + +# Define file utility functions + def process_parsed_filenames(stored, two_digit_year_break=None): """Create a Files pandas Series of filenames from a formatted dict. @@ -165,24 +297,11 @@ def parse_fixed_width_filenames(files, format_str): """ # Create storage for data to be parsed from filenames - ordered_keys = ['year', 'month', 'day', 'hour', 'minute', 'second', - 'version', 'revision', 'cycle'] - stored = collections.OrderedDict({kk: list() for kk in ordered_keys}) + stored, search_dict = _init_parse_filenames(files, format_str) if len(files) == 0: - stored['files'] = [] - # Include format string as convenience for later functions - stored['format_str'] = format_str return stored - # Parse format string to get information needed to parse filenames - search_dict = construct_searchstring_from_format(format_str) - - # Add non-standard keys - for key in search_dict['keys']: - if key not in stored: - stored[key] = [] - # Determine the locations the date/version information in a filename is # stored and use these indices to slice out date from filenames. 
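Editor's note: the `_init_parse_filenames` docstring above spells out the supported template keys and the new 'type' entry returned by `construct_searchstring_from_format`. A short sketch of the fixed-width path under an illustrative template and file list, both modeled on the docstring examples rather than taken from the package:

```python
from pysat.utils import files

fnames = ['inst-2009_01-01_v01.cdf', 'inst-2009_01-02_v01.cdf']
fmt = 'inst-{year:4d}_{month:02d}-{day:02d}_v{version:02d}.cdf'

# The search dict now also reports the expected type for each template key
# ('type'), which the parsers use below to cast values and flag bad files
info = files.construct_searchstring_from_format(fmt)

parsed = files.parse_fixed_width_filenames(fnames, fmt)
# parsed['year'] -> array([2009, 2009]); files whose parsed values cannot
# be cast to the expected type are dropped as bad files
```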
idx = 0 @@ -201,6 +320,7 @@ def parse_fixed_width_filenames(files, format_str): np.array(end_key, dtype=np.int64) - max_len] # Need to parse out dates for datetime index + bad_files = [] for i, temp in enumerate(files): for j, key in enumerate(search_dict['keys']): if key_str_idx[1][j] == 0: @@ -208,25 +328,23 @@ def parse_fixed_width_filenames(files, format_str): val = temp[key_str_idx[0][j]:] else: val = temp[key_str_idx[0][j]:key_str_idx[1][j]] - stored[key].append(val) - - # Convert to numpy arrays - for key in stored.keys(): - if len(stored[key]) == 0: - stored[key] = None - else: - try: - # Assume key value is numeric integer - stored[key] = np.array(stored[key]).astype(np.int64) - except ValueError: - # Store key value as string - stored[key] = np.array(stored[key]) - # Include files in output - stored['files'] = files + # Cast the data value, if possible + if search_dict['type'][j] is not None: + try: + val = search_dict['type'][j](val) + except ValueError: + # The type is wrong, exclude this file + bad_files.append(i) + + # Save the parsed variable for this key and file + if stored[key] is None: + stored[key] = [val] + else: + stored[key].append(val) - # Include format string as convenience for later functions - stored['format_str'] = format_str + # Convert to numpy arrays and add additional information to output + stored = _finish_parse_filenames(stored, files, format_str, bad_files) return stored @@ -281,26 +399,11 @@ def parse_delimited_filenames(files, format_str, delimiter): """ # Create storage for data to be parsed from filenames - ordered_keys = ['year', 'month', 'day', 'hour', 'minute', 'second', - 'version', 'revision', 'cycle'] - stored = collections.OrderedDict({kk: None for kk in ordered_keys}) + stored, search_dict = _init_parse_filenames(files, format_str) - # Exit early if there are no files if len(files) == 0: - stored['files'] = [] - - # Include format string as convenience for later functions - stored['format_str'] = format_str return stored - # Parse format string to get information needed to parse filenames - search_dict = construct_searchstring_from_format(format_str, wildcard=False) - - # Add non-standard keys - for key in search_dict['keys']: - if key not in stored: - stored[key] = None - # Going to parse the string on the delimiter. It is possible that other # regions have the delimiter but aren't going to be parsed out. 
# Reconstruct string from `snips` and use `{}` in place of `keys` and @@ -340,7 +443,8 @@ def parse_delimited_filenames(files, format_str, delimiter): if stored[key] is None: stored[key] = [] - for temp in files: + bad_files = list() + for ifile, temp in enumerate(files): split_name = temp.split(delimiter) idx = 0 loop_split_idx = split_idx @@ -352,6 +456,15 @@ def parse_delimited_filenames(files, format_str, delimiter): val = loop_sname[sidx:sidx + search_dict['lengths'][idx]] loop_sname = loop_sname[sidx + search_dict['lengths'][idx]:] + # Cast the value as the desired data type, if not possible + # identify a bad file + if search_dict['type'][idx] is not None: + try: + val = search_dict['type'][idx](val) + except ValueError: + # The type is wrong, exclude this file + bad_files.append(ifile) + # Store parsed info and increment key index stored[search_dict['keys'][idx]].append(val) idx += 1 @@ -361,21 +474,8 @@ def parse_delimited_filenames(files, format_str, delimiter): loop_split_idx = loop_split_idx[j + 1:] break - # Convert to numpy arrays - for key in stored.keys(): - if stored[key] is not None: - try: - # Assume key value is numeric integer - stored[key] = np.array(stored[key]).astype(np.int64) - except ValueError: - # Store key value as string - stored[key] = np.array(stored[key]) - - # Include files in output - stored['files'] = files - - # Include format string as convenience for later functions - stored['format_str'] = format_str + # Convert to numpy arrays and add additional information to output + stored = _finish_parse_filenames(stored, files, format_str, bad_files) return stored @@ -395,7 +495,7 @@ def construct_searchstring_from_format(format_str, wildcard=False): `instrument_{year:04d}{month:02d}{day:02d}_v{version:02d}.cdf` wildcard : bool If True, replaces each '?' sequence that would normally - be returned with a single '*'. + be returned with a single '*'. (default=False) Returns ------- @@ -403,6 +503,7 @@ def construct_searchstring_from_format(format_str, wildcard=False): An output dict with the following keys: - 'search_string' (format_str with data to be parsed replaced with ?) - 'keys' (keys for data to be parsed) + - 'type' (type of data expected for each key to be parsed) - 'lengths' (string length for data to be parsed) - 'string_blocks' (the filenames are broken into fixed width segments). @@ -423,10 +524,18 @@ def construct_searchstring_from_format(format_str, wildcard=False): This is the first function employed by `pysat.Files.from_os`. + If no type is supplied for datetime parameters, int will be used. 
+ """ - out_dict = {'search_string': '', 'keys': [], 'lengths': [], + out_dict = {'search_string': '', 'keys': [], 'type': [], 'lengths': [], 'string_blocks': []} + type_dict = {'s': str, 'b': np.int64, 'c': np.int64, 'd': np.int64, + 'o': np.int64, 'e': np.float64, 'E': np.float64, + 'f': np.float64, 'F': np.float64, 'g': np.float64, + 'G': np.float64} + int_keys = ['year', 'month', 'day', 'hour', 'minute', 'second', + 'microsecond'] if format_str is None: raise ValueError("Must supply a filename template (format_str).") @@ -445,6 +554,17 @@ def construct_searchstring_from_format(format_str, wildcard=False): if snip[1] is not None: out_dict['keys'].append(snip[1]) + if snip[2] is None: + out_dict['type'].append(snip[2]) + else: + snip_type = snip[2][-1] + if snip_type in type_dict.keys(): + out_dict['type'].append(type_dict[snip_type]) + elif snip[1] in int_keys: + out_dict['type'].append(np.int64) + else: + out_dict['type'].append(None) + # Try and determine formatting width fwidths = re.findall(r'\d+', snip[2]) @@ -462,10 +582,10 @@ def construct_searchstring_from_format(format_str, wildcard=False): out_dict['search_string'] += '*' break else: - estr = ''.join(["Couldn't determine formatting width. ", - "This may be due to the use of unsupported ", - "wildcard characters."]) - raise ValueError(estr) + raise ValueError( + ''.join(["Couldn't determine formatting width, check ", + "formatting length specification (e.g., ", + "{day:03d} for day of year)."])) return out_dict diff --git a/pysat/utils/io.py b/pysat/utils/io.py index 571b6a31d..78e419829 100644 --- a/pysat/utils/io.py +++ b/pysat/utils/io.py @@ -2,6 +2,9 @@ # Full license can be found in License.md # Full author list can be found in .zenodo.json file # DOI:10.5281/zenodo.1199703 +# +# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is +# unlimited. # ---------------------------------------------------------------------------- """Input/Output utilities for pysat data.""" import copy @@ -235,142 +238,28 @@ def add_netcdf4_standards_to_metadict(inst, in_meta_dict, epoch_name, meta_dict.update(time_meta) - if inst[var].dtype == np.dtype('O') and coltype != str: - # This is a Series or DataFrame, possibly with more dimensions. - # Series and DataFrame data must be treated differently. - try: - # Assume it is a DataFrame and get a list of subvariables - subvars = inst[0, var].columns - is_frame = True - except AttributeError: - # Data is Series of Series, which doesn't have columns. - subvars = [inst[0, var].name] - is_frame = False - - # Get the dimensions and their names - dims = np.shape(inst[0, var]) - obj_dim_names = [] - if len(dims) == 1: - # Pad the dimensions so that the rest of the code works - # for either a Series or a DataFrame. - dims = (dims[0], 0) - - for dim in dims[:-1]: - # Don't need to go over last dimension value, - # it covers number of columns (if a DataFrame). - obj_dim_names.append(var) - - # Set the base-level meta data. - meta_dict['Depend_1'] = obj_dim_names[-1] - - # Cycle through each of the sub-variable, updating metadata. - for svar in subvars: - # Find the subvariable data within the main variable, - # checking that this is not an empty DataFrame or - # Series. Determine the underlying data types. - good_data_loc = 0 - for idat in np.arange(len(inst.data)): - if len(inst[idat, var]) > 0: - good_data_loc = idat - break + meta_dict['Format'] = inst._get_var_type_code(coltype) - # Get the correct location of the sub-variable based on - # the object type. 
- if is_frame: - good_data = inst[good_data_loc, var][svar] - else: - good_data = inst[good_data_loc, var] - - # Get subvariable information - _, sctype, sdflag = inst._get_data_info(good_data) - - if not sdflag: - # Not a datetime index - smeta_dict = {'Depend_0': epoch_name, - 'Depend_1': obj_dim_names[-1], - 'Display_Type': 'Multidimensional', - 'Format': inst._get_var_type_code(sctype), - 'Var_Type': 'data'} - else: - # Attach datetime index metadata - smeta_dict = return_epoch_metadata(inst, epoch_name) - smeta_dict.pop('MonoTon') - - # Construct name, variable_subvariable, and store. - sname = '_'.join([lower_var, svar.lower()]) - if sname in out_meta_dict: - out_meta_dict[sname].update(smeta_dict) - else: - warnings.warn(''.join(['Unable to find MetaData for ', - svar, ' subvariable of ', var])) - out_meta_dict[sname] = smeta_dict - - # Filter metadata - remove = True if sctype == str else False - out_meta_dict[sname] = filter_netcdf4_metadata( - inst, out_meta_dict[sname], sctype, remove=remove, - check_type=check_type, export_nan=export_nan, varname=sname) - - # Get information on the subvar index. This information - # stored under primary variable name. - sub_index = inst[good_data_loc, var].index - _, coltype, datetime_flag = inst._get_data_info(sub_index) - meta_dict['Format'] = inst._get_var_type_code(coltype) - - # Deal with index information for holding variable - _, index_type, index_flag = inst._get_data_info( - inst[good_data_loc, var].index) - - # Update metadata when a datetime index found - if index_flag: - update_dict = return_epoch_metadata(inst, epoch_name) - update_dict.pop('MonoTon') - update_dict.update(meta_dict) - else: - if inst[good_data_loc, var].index.name is not None: - name = inst[good_data_loc, var].index.name - else: - name = var - update_dict = {inst.meta.labels.name: name} - update_dict.update(meta_dict) - - if lower_var in out_meta_dict: - out_meta_dict[lower_var].update(update_dict) - else: - warnings.warn(''.join(['Unable to find MetaData for ', - var])) - out_meta_dict[lower_var] = update_dict - - # Filter metdata for other netCDF4 requirements - remove = True if index_type == str else False - out_meta_dict[lower_var] = filter_netcdf4_metadata( - inst, out_meta_dict[lower_var], index_type, remove=remove, - check_type=check_type, export_nan=export_nan, varname=lower_var) + if not inst.pandas_format: + for i, dim in enumerate(list(inst[var].dims)): + meta_dict['Depend_{:1d}'.format(i)] = dim + num_dims = len(inst[var].dims) + if num_dims >= 2: + meta_dict['Display_Type'] = 'Multidimensional' + # Update the meta data + if lower_var in out_meta_dict: + out_meta_dict[lower_var].update(meta_dict) else: - # Dealing with 1D data or xarray format - meta_dict['Format'] = inst._get_var_type_code(coltype) - - if not inst.pandas_format: - for i, dim in enumerate(list(inst[var].dims)): - meta_dict['Depend_{:1d}'.format(i)] = dim - num_dims = len(inst[var].dims) - if num_dims >= 2: - meta_dict['Display_Type'] = 'Multidimensional' - - # Update the meta data - if lower_var in out_meta_dict: - out_meta_dict[lower_var].update(meta_dict) - else: - warnings.warn(''.join(['Unable to find MetaData for ', - var])) - out_meta_dict[lower_var] = meta_dict + warnings.warn(''.join(['Unable to find MetaData for ', + var])) + out_meta_dict[lower_var] = meta_dict - # Filter metdata for other netCDF4 requirements - remove = True if coltype == str else False - out_meta_dict[lower_var] = filter_netcdf4_metadata( - inst, out_meta_dict[lower_var], coltype, remove=remove, - 
check_type=check_type, export_nan=export_nan, varname=lower_var) + # Filter metdata for other netCDF4 requirements + remove = True if coltype == str else False + out_meta_dict[lower_var] = filter_netcdf4_metadata( + inst, out_meta_dict[lower_var], coltype, remove=remove, + check_type=check_type, export_nan=export_nan, varname=lower_var) return out_meta_dict @@ -418,11 +307,6 @@ def remove_netcdf4_standards_from_meta(mdict, epoch_name, labels): lower_sub_keys = [ckey.lower() for ckey in mdict[key].keys()] sub_keys = list(mdict[key].keys()) - if 'meta' in lower_sub_keys: - # Higher dimensional data, recursive treatment. - mdict[key]['meta'] = remove_netcdf4_standards_from_meta( - mdict[key]['meta'], '', labels) - # Check for presence of time information for lval in lower_sub_keys: if lval in lower_time_vals: @@ -631,13 +515,6 @@ def apply_table_translation_from_file(trans_table, meta_dict): wstr = 'Translation label "{}" not found for variable "{}".' pysat.logger.debug(wstr.format(trans_key, var_key)) - # Check for higher order metadata - if 'meta' in meta_dict[var_key].keys(): - # Recursive call to process metadata - ldict = meta_dict[var_key]['meta'] - filt_dict[var_key]['meta'] = \ - apply_table_translation_from_file(trans_table, ldict) - return filt_dict @@ -673,12 +550,6 @@ def meta_array_expander(meta_dict): for key in meta_dict.keys(): loop_dict = {} for meta_key in meta_dict[key].keys(): - # Check for higher order data from 2D pandas support - if meta_key == 'meta': - loop_dict[meta_key] = meta_array_expander( - meta_dict[key][meta_key]) - continue - tst_array = np.asarray(meta_dict[key][meta_key]) if tst_array.shape == (): loop_dict[meta_key] = meta_dict[key][meta_key] @@ -697,7 +568,7 @@ def meta_array_expander(meta_dict): def load_netcdf(fnames, strict_meta=False, file_format='NETCDF4', epoch_name=None, epoch_unit='ms', epoch_origin='unix', pandas_format=True, decode_timedelta=False, - combine_by_coords=True, meta_kwargs=None, labels=None, + combine_by_coords=True, meta_kwargs=None, meta_processor=None, meta_translation=None, drop_meta_labels=None, decode_times=None, strict_dim_check=True): @@ -748,10 +619,6 @@ def load_netcdf(fnames, strict_meta=False, file_format='NETCDF4', meta_kwargs : dict or NoneType Dict to specify custom Meta initialization or None to use Meta defaults (default=None) - labels : dict or NoneType - Dict where keys are the label attribute names and the values are tuples - that have the label values and value types in that order. None to use - meta defaults. Deprecated, use `meta_kwargs` instead. 
(default=None) meta_processor : function or NoneType If not None, a dict containing all of the loaded metadata will be passed to `meta_processor` which should return a filtered version @@ -810,7 +677,7 @@ def load_netcdf(fnames, strict_meta=False, file_format='NETCDF4', epoch_name=epoch_name, epoch_unit=epoch_unit, epoch_origin=epoch_origin, - meta_kwargs=meta_kwargs, labels=labels, + meta_kwargs=meta_kwargs, meta_processor=meta_processor, meta_translation=meta_translation, drop_meta_labels=drop_meta_labels) @@ -822,7 +689,7 @@ def load_netcdf(fnames, strict_meta=False, file_format='NETCDF4', epoch_origin=epoch_origin, decode_timedelta=decode_timedelta, combine_by_coords=combine_by_coords, - meta_kwargs=meta_kwargs, labels=labels, + meta_kwargs=meta_kwargs, meta_processor=meta_processor, meta_translation=meta_translation, drop_meta_labels=drop_meta_labels, @@ -834,7 +701,7 @@ def load_netcdf(fnames, strict_meta=False, file_format='NETCDF4', def load_netcdf_pandas(fnames, strict_meta=False, file_format='NETCDF4', epoch_name='Epoch', epoch_unit='ms', epoch_origin='unix', - meta_kwargs=None, labels=None, meta_processor=None, + meta_kwargs=None, meta_processor=None, meta_translation=None, drop_meta_labels=None): """Load netCDF-3/4 file produced by pysat in a pandas format. @@ -870,10 +737,6 @@ def load_netcdf_pandas(fnames, strict_meta=False, file_format='NETCDF4', meta_kwargs : dict or NoneType Dict to specify custom Meta initialization or None to use Meta defaults (default=None) - labels : dict or NoneType - Dict where keys are the label attribute names and the values are tuples - that have the label values and value types in that order or None to use - Meta defaults. Deprecated, use `meta_kwargs` instead. (default=None) meta_processor : function or NoneType If not None, a dict containing all of the loaded metadata will be passed to `meta_processor` which should return a filtered version @@ -928,19 +791,10 @@ def load_netcdf_pandas(fnames, strict_meta=False, file_format='NETCDF4', saved_meta = None running_idx = 0 running_store = [] - two_d_keys = [] - two_d_dims = [] if meta_kwargs is None: meta_kwargs = {} - if labels is not None: - warnings.warn("".join(["`labels` is deprecated, use `meta_kwargs`", - "with the 'labels' key instead. 
Support ", - "for `labels` will be removed in v3.2.0+"]), - DeprecationWarning, stacklevel=2) - meta_kwargs['labels'] = labels - meta = pysat.Meta(**meta_kwargs) # Store all metadata in a dict that may be filtered before @@ -957,12 +811,6 @@ def load_netcdf_pandas(fnames, strict_meta=False, file_format='NETCDF4', else: drop_meta_labels = pysat.utils.listify(drop_meta_labels) - # Need to use name label later to identify variables with long_name 'Epoch' - name_label = meta.labels.name - for key in meta_translation.keys(): - if meta.labels.name in meta_translation[key]: - name_label = key - # Load data for each file for fname in fnames: with netCDF4.Dataset(fname, mode='r', format=file_format) as data: @@ -986,152 +834,11 @@ def load_netcdf_pandas(fnames, strict_meta=False, file_format='NETCDF4', nc_key) full_mdict[key] = meta_dict - # TODO(#913): Remove 2D support - if len(data.variables[key].dimensions) == 2: - # Part of a DataFrame to store within the main DataFrame - two_d_keys.append(key) - two_d_dims.append(data.variables[key].dimensions) - - if len(data.variables[key].dimensions) >= 3: - raise ValueError(' '.join(('pysat only supports 1D and 2D', + if len(data.variables[key].dimensions) >= 2: + raise ValueError(' '.join(('pysat only supports 1D', 'data in pandas. Please use', 'xarray for this file.'))) - # TODO(#913): Remove 2D support - # We now have a list of keys that need to go into a dataframe, - # could be more than one, collect unique dimensions for 2D keys. - for dim in set(two_d_dims): - # First or second dimension could be epoch. Use other - # dimension name as variable name. - if dim[0] == epoch_name: - obj_key = dim[1] - elif dim[1] == epoch_name: - obj_key = dim[0] - else: - estr = ''.join(['Epoch label: "', epoch_name, '"', - ' was not found in loaded dimensions [', - ', '.join(dim), ']']) - raise KeyError(estr) - - # Collect variable names associated with dimension - idx_bool = [dim == i for i in two_d_dims] - idx, = np.where(np.array(idx_bool)) - obj_var_keys = [] - clean_var_keys = [] - for i in idx: - obj_var_keys.append(two_d_keys[i]) - clean_var_keys.append( - two_d_keys[i].split(obj_key + '_')[-1]) - - # Figure out how to index this data, it could provide its - # own index - or we may have to create simple integer based - # DataFrame access. If the dimension is stored as its own - # variable then use that info for index. - if obj_key in obj_var_keys: - # String used to indentify dimension also in - # data.variables will be used as an index. - index_key_name = obj_key - - # If the object index uses UNIX time, process into - # datetime index. - if data.variables[obj_key].getncattr( - name_label) == epoch_name: - # Found the name to be used in DataFrame index - index_name = epoch_name - time_index_flag = True - else: - time_index_flag = False - - # Label to be used in DataFrame index - index_name = data.variables[obj_key].getncattr( - name_label) - else: - # Dimension is not itself a variable - index_key_name = None - - # Iterate over the variables and grab metadata - dim_meta_data = {} - - # Store attributes in metadata, except for the dimension name. 
-                for key, clean_key in zip(obj_var_keys, clean_var_keys):
-                    meta_dict = {}
-                    for nc_key in data.variables[key].ncattrs():
-                        meta_dict[nc_key] = data.variables[key].getncattr(
-                            nc_key)
-
-                    dim_meta_data[clean_key] = meta_dict
-
-                dim_meta_dict = {'meta': dim_meta_data}
-
-                # Add top level meta
-                if index_key_name is not None:
-                    for nc_key in data.variables[obj_key].ncattrs():
-                        dim_meta_dict[nc_key] = data.variables[
-                            obj_key].getncattr(nc_key)
-                full_mdict[obj_key] = dim_meta_dict
-
-                # Iterate over all variables with this dimension
-                # data storage, whole shebang.
-                loop_dict = {}
-
-                # List holds a series of slices, parsed from dict above.
-                loop_list = []
-                for key, clean_key in zip(obj_var_keys, clean_var_keys):
-                    loop_dict[clean_key] = data.variables[
-                        key][:, :].flatten(order='C')
-
-                # Find the number of time values
-                loop_lim = data.variables[obj_var_keys[0]].shape[0]
-
-                # Find the number of values per time
-                step = len(data.variables[obj_var_keys[0]][0, :])
-
-                # Check if there is an index we should use
-                if not (index_key_name is None):
-                    time_var = loop_dict.pop(index_key_name)
-                    if time_index_flag:
-                        # Create datetime index from data
-                        time_var = pds.to_datetime(time_var, unit=epoch_unit,
-                                                   origin=epoch_origin)
-                        new_index = time_var
-                        new_index_name = index_name
-                else:
-                    # Using integer indexing if no index identified
-                    new_index = np.arange((loop_lim * step),
-                                          dtype=np.int64) % step
-                    new_index_name = 'index'
-
-                # Load all data into frame
-                if len(loop_dict.keys()) > 1:
-                    loop_frame = pds.DataFrame(loop_dict,
-                                               columns=clean_var_keys)
-                    if obj_key in loop_frame:
-                        del loop_frame[obj_key]
-
-                    # Break massive frame into bunch of smaller frames
-                    for i in np.arange(loop_lim, dtype=np.int64):
-                        loop_list.append(loop_frame.iloc[(step * i):
-                                                         (step * (i + 1)), :])
-                        loop_list[-1].index = new_index[(step * i):
-                                                        (step * (i + 1))]
-                        loop_list[-1].index.name = new_index_name
-                else:
-                    loop_frame = pds.Series(loop_dict[clean_var_keys[0]],
-                                            name=obj_var_keys[0])
-
-                    # Break massive series into bunch of smaller series
-                    for i in np.arange(loop_lim, dtype=np.int64):
-                        loop_list.append(loop_frame.iloc[(step * i):
-                                                         (step * (i + 1))])
-                        loop_list[-1].index = new_index[(step * i):
-                                                        (step * (i + 1))]
-                        loop_list[-1].index.name = new_index_name
-
-                # Add 2D object data, all based on a unique dimension within
-                # netCDF, to loaded data dictionary.
-                loaded_vars[obj_key] = loop_list
-                del loop_list
-
             # Prepare dataframe index for this netcdf file
             if epoch_name not in loaded_vars.keys():
                 estr = ''.join(['Epoch label: "', epoch_name, '"',
@@ -1163,10 +870,6 @@ def load_netcdf_pandas(fnames, strict_meta=False, file_format='NETCDF4',
         for label in drop_meta_labels:
             if label in full_mdict[var]:
                 full_mdict[var].pop(label)
-            if 'meta' in full_mdict[var]:
-                for var2 in full_mdict[var]['meta'].keys():
-                    if label in full_mdict[var]['meta'][var2]:
-                        full_mdict[var]['meta'][var2].pop(label)
 
     # Second, remove some items pysat added for netcdf compatibility.
     filt_mdict = remove_netcdf4_standards_from_meta(full_mdict, epoch_name,
@@ -1186,23 +889,7 @@ def load_netcdf_pandas(fnames, strict_meta=False, file_format='NETCDF4',
 
     # Assign filtered metadata to pysat.Meta instance
     for key in filt_mdict:
-        if 'meta' in filt_mdict[key].keys():
-            # Higher order metadata
-            dim_meta = pysat.Meta(**meta_kwargs)
-            for skey in filt_mdict[key]['meta'].keys():
-                dim_meta[skey] = filt_mdict[key]['meta'][skey]
-
-            # Remove HO metdata that was just transfered elsewhere
-            filt_mdict[key].pop('meta')
-
-            # Assign standard metdata
-            meta[key] = filt_mdict[key]
-
-            # Assign HO metadata
-            meta[key] = {'meta': dim_meta}
-        else:
-            # Standard metadata
-            meta[key] = filt_mdict[key]
+        meta[key] = filt_mdict[key]
 
     return data, meta
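With higher-order metadata removed, each variable's attributes now sit in one flat dict, so both load-time filtering hooks operate on simple structures. A sketch of using them together (the file and attribute names are hypothetical; the `meta_processor` contract of dict in, filtered dict out follows the docstring above):

```
# Hypothetical load-time metadata filtering with the flat structures kept
# by this changeset: the processor sees {variable: {attribute: value}}.
import pysat.utils.io as io

def drop_private_attrs(mdict):
    """Remove file attributes whose names start with an underscore."""
    return {var: {attr: val for attr, val in attrs.items()
                  if not attr.startswith('_')}
            for var, attrs in mdict.items()}

data, meta = io.load_netcdf(['demo.nc'],
                            meta_processor=drop_private_attrs,
                            drop_meta_labels=['Depend_0'])
```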
@@ -1210,7 +897,7 @@ def load_netcdf_pandas(fnames, strict_meta=False, file_format='NETCDF4',
 
 def load_netcdf_xarray(fnames, strict_meta=False, file_format='NETCDF4',
                        epoch_name='time', epoch_unit='ms', epoch_origin='unix',
                        decode_timedelta=False, combine_by_coords=True,
-                       meta_kwargs=None, labels=None, meta_processor=None,
+                       meta_kwargs=None, meta_processor=None,
                        meta_translation=None, drop_meta_labels=None,
                        decode_times=False, strict_dim_check=True):
     """Load netCDF-3/4 file produced by pysat into an xarray Dataset.
@@ -1254,10 +941,6 @@ def load_netcdf_xarray(fnames, strict_meta=False, file_format='NETCDF4',
     meta_kwargs : dict or NoneType
         Dict to specify custom Meta initialization or None to use Meta defaults
         (default=None)
-    labels : dict or NoneType
-        Dict where keys are the label attribute names and the values are tuples
-        that have the label values and value types in that order or None to use
-        Meta defaults. Deprecated, use `meta_kwargs` instead. (default=None)
     meta_processor : function or NoneType
         If not None, a dict containing all of the loaded metadata will be
         passed to `meta_processor` which should return a filtered version
@@ -1316,13 +999,6 @@ def load_netcdf_xarray(fnames, strict_meta=False, file_format='NETCDF4',
     if meta_kwargs is None:
         meta_kwargs = {}
 
-    if labels is not None:
-        warnings.warn("".join(["`labels` is deprecated, use `meta_kwargs`",
-                               "with the 'labels' key instead. Support ",
-                               "for `labels` will be removed in v3.2.0+"]),
-                      DeprecationWarning, stacklevel=2)
-        meta_kwargs['labels'] = labels
-
     meta = pysat.Meta(**meta_kwargs)
 
     # Store all metadata in a dict that may be filtered before
@@ -1638,8 +1314,6 @@ def inst_to_netcdf(inst, fname, base_instrument=None, epoch_name=None,
 
     Stores 1-D data along dimension 'Epoch' - the date time index.
 
-    Stores higher order data (e.g. dataframes within series) separately
-
     - The name of the main variable column is used to prepend subvariable
       names within netCDF, var_subvar_sub
     - A netCDF4 dimension is created for each main variable column
@@ -1789,7 +1463,6 @@ def inst_to_netcdf(inst, fname, base_instrument=None, epoch_name=None,
         meta_translation = copy.deepcopy(meta_translation)
 
     # Ensure `meta_translation` has default values for items not assigned.
-    # This is needed for the higher order pandas support and may be removed.
     def_meta_trans = default_to_netcdf_translation_table(inst)
     for key in def_meta_trans.keys():
         if key not in meta_translation:
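Because of the default-filling loop above, a caller only needs to list the labels they want to override. A sketch, under the assumption that the write-side table maps each pysat metadata label to a list of file attribute names (the instrument, output file, and attribute names are illustrative only):

```
# Illustrative partial translation table; any label missing here is filled
# in from default_to_netcdf_translation_table(inst) by the loop above.
import pysat
import pysat.utils.io as io

inst = pysat.Instrument('pysat', 'testing')  # built-in test instrument
inst.load(2009, 1)

# Write the pysat 'fill' label out under two file attribute names.
io.inst_to_netcdf(inst, fname='demo.nc',
                  meta_translation={'fill': ['_FillValue', 'FillVal']})
```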
@@ -1892,180 +1565,24 @@ def inst_to_netcdf(inst, fname, base_instrument=None, epoch_name=None,
                 # Not datetime data, just store as is.
                 cdfkey[:] = data.values.astype(coltype)
             else:
-                # It is a Series of objects. First, figure out what the
-                # individual object types are. Then, act as needed.
-
-                # Use info in coltype to get real datatype of object
-                if coltype == str:
-                    if '_FillValue' in export_meta[lower_key].keys():
-                        str_fill = export_meta[lower_key]['_FillValue']
-                        del export_meta[lower_key]['_FillValue']
-                    else:
-                        str_fill = ''
+                if '_FillValue' in export_meta[lower_key].keys():
+                    str_fill = export_meta[lower_key]['_FillValue']
+                    del export_meta[lower_key]['_FillValue']
+                else:
+                    str_fill = ''
 
-                    cdfkey = out_data.createVariable(case_key, coltype,
-                                                     dimensions=epoch_name,
-                                                     complevel=complevel,
-                                                     shuffle=shuffle,
-                                                     fill_value=str_fill)
+                cdfkey = out_data.createVariable(case_key, coltype,
+                                                 dimensions=epoch_name,
+                                                 complevel=complevel,
+                                                 shuffle=shuffle,
+                                                 fill_value=str_fill)
 
-                    # Set metadata
-                    cdfkey.setncatts(export_meta[lower_key])
+                # Set metadata
+                cdfkey.setncatts(export_meta[lower_key])
 
-                    # Time to actually write the data now
-                    cdfkey[:] = data.values
+                # Time to actually write the data now
+                cdfkey[:] = data.values
 
-                else:
-                    # Still dealing with an object, not just a Series of
-                    # strings. Maps to `if` check on coltypes, being
-                    # string-based. Presuming a Series with a DataFrame or
-                    # Series in each location. Start by collecting some
-                    # basic info on dimensions sizes, names, then create
-                    # corresponding netCDF4 dimensions total dimensions
-                    # stored for object are epoch plus ones created below
-                    dims = np.shape(inst[key].iloc[0])
-                    obj_dim_names = []
-
-                    # Pad dimensions so that the rest of the code works
-                    # for either a Series or a DataFrame.
-                    if len(dims) == 1:
-                        dims = (dims[0], 0)
-
-                    # Don't need to go over last dimension value,
-                    # it covers number of columns (if a frame).
-                    for i, dim in enumerate(dims[:-1]):
-                        obj_dim_names.append(case_key)
-                        out_data.createDimension(obj_dim_names[-1], dim)
-
-                    # Create simple tuple with information needed to create
-                    # the right dimensions for variables that will
-                    # be written to file.
-                    var_dim = tuple([epoch_name] + obj_dim_names)
-
-                    # Determine whether data is in a DataFrame or Series
-                    try:
-                        # Start by assuming it is a DataFrame
-                        iterable = inst[key].iloc[0].columns
-                        is_frame = True
-                    except AttributeError:
-                        # Otherwise get sub-variables for a Series
-                        iterable = [inst[key].iloc[0].name]
-                        is_frame = False
-
-                    # Find the subvariable data within the main variable,
-                    # checking that this is not an empty DataFrame or
-                    # Series. Determine the underlying data types.
-                    good_data_loc = 0
-                    for idat in np.arange(len(inst.data)):
-                        if len(inst.data[key].iloc[0]) > 0:
-                            good_data_loc = idat
-                            break
-
-                    # Found a place with data, if there is one
-                    # now iterate over the subvariables, get data info
-                    # create netCDF4 variables and store the data
-                    # stored name is variable_subvariable.
-                    for col in iterable:
-                        if is_frame:
-                            # We are working with a DataFrame, so
-                            # multiple subvariables stored under a single
-                            # main variable heading.
-                            idx = inst[key].iloc[good_data_loc][col]
-                            data, coltype, _ = inst._get_data_info(idx)
-
-                            # netCDF4 doesn't support string compression
-                            if coltype == str:
-                                lzlib = False
-                            else:
-                                lzlib = zlib
-
-                            cdfkey = out_data.createVariable(
-                                '_'.join((case_key, col)), coltype,
-                                dimensions=var_dim, zlib=lzlib,
-                                complevel=complevel, shuffle=shuffle)
-
-                            # Set metadata
-                            lkey = '_'.join((lower_key, col.lower()))
-                            cdfkey.setncatts(export_meta[lkey])
-
-                            # Attach data. It may be slow to repeatedly
-                            # call the store method as well astype method
-                            # below collect data into a numpy array, then
-                            # write the full array in one go.
-                            temp_cdf_data = np.zeros(
-                                (num, dims[0])).astype(coltype)
-                            for i in range(num):
-                                temp_cdf_data[i, :] = inst[
-                                    key].iloc[i][col].values
-
-                            # Write data
-                            cdfkey[:, :] = temp_cdf_data
-                        else:
-                            # We are dealing with a Series. Get
-                            # information from within the series.
-                            idx = inst[key].iloc[good_data_loc]
-                            data, coltype, _ = inst._get_data_info(idx)
-
-                            # netCDF4 doesn't support string compression
-                            if coltype == str:
-                                lzlib = False
-                            else:
-                                lzlib = zlib
-
-                            cdfkey = out_data.createVariable(
-                                case_key + '_data', coltype,
-                                dimensions=var_dim, zlib=lzlib,
-                                complevel=complevel, shuffle=shuffle)
-
-                            # Set metadata
-                            tempk = '_'.join([lower_key, lower_key])
-                            cdfkey.setncatts(export_meta[tempk])
-
-                            # Attach data
-                            temp_cdf_data = np.zeros(
-                                (num, dims[0]), dtype=coltype)
-                            for i in range(num):
-                                temp_cdf_data[i, :] = inst[i, key].values
-
-                            # Write data
-                            cdfkey[:, :] = temp_cdf_data
-
-                    # We are done storing the actual data for the given
-                    # higher order variable. Now we need to store the index
-                    # for all of that fancy data.
-
-                    # Get index information
-                    data, coltype, datetime_flag = inst._get_data_info(
-                        inst[key].iloc[good_data_loc].index)
-
-                    # Create dimension variable to store index in netCDF4
-                    cdfkey = out_data.createVariable(case_key, coltype,
-                                                     dimensions=var_dim,
-                                                     zlib=zlib,
-                                                     complevel=complevel,
-                                                     shuffle=shuffle)
-                    # Set meta data
-                    cdfkey.setncatts(export_meta[lower_key])
-
-                    # Treat time and non-time data differently
-                    if datetime_flag:
-                        # Set data
-                        temp_cdf_data = np.zeros((num, dims[0]),
-                                                 dtype=coltype)
-                        for i in range(num):
-                            temp_cdf_data[i, :] = inst[i, key].index.values
-                        cdfkey[:, :] = (temp_cdf_data * 1.0E-6).astype(
-                            coltype)
-
-                    else:
-                        # Set data
-                        temp_cdf_data = np.zeros((num, dims[0]),
-                                                 dtype=coltype)
-
-                        for i in range(num):
-                            temp_cdf_data[i, :] = inst[
-                                key].iloc[i].index.astype(coltype)
-                        cdfkey[:, :] = temp_cdf_data
     else:
         # Attach the metadata to a separate xarray.Dataset object, ensuring
         # the Instrument data object is unchanged.
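The surviving branch above hands `_FillValue` to `createVariable` rather than to `setncatts`. That ordering is a netCDF4 requirement, which this standalone sketch demonstrates (the file, variable, and attribute names are arbitrary):

```
# Minimal demonstration that netCDF4 fill values must be declared when the
# variable is created; assigning _FillValue afterwards raises an error,
# which is why the code above pops it from the metadata dict before
# calling setncatts.
import netCDF4
import numpy as np

with netCDF4.Dataset('fill_demo.nc', 'w', format='NETCDF4') as out:
    out.createDimension('Epoch', 3)
    var = out.createVariable('dummy', np.float64, dimensions=('Epoch',),
                             fill_value=-999.0)
    var.setncatts({'long_name': 'demonstration variable'})
    var[:] = [1.0, 2.0, 3.0]
```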
diff --git a/pysat/utils/registry.py b/pysat/utils/registry.py
index 97193ccbc..b0a1f6d9e 100644
--- a/pysat/utils/registry.py
+++ b/pysat/utils/registry.py
@@ -2,6 +2,9 @@
 # Full license can be found in License.md
 # Full author list can be found in .zenodo.json file
 # DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
 # ----------------------------------------------------------------------------
 """pysat user module registry utilities.
 
@@ -49,10 +52,9 @@
 """
 
 import importlib
-import logging
 
 import pysat
-import pysat.tests.instrument_test_class as itc
+import pysat.tests.classes.cls_instrument_library as itc
 
 
 def load_saved_modules():
@@ -92,7 +94,7 @@ def register(module_names, overwrite=False):
         specify package name and instrument modules
     overwrite : bool
         If True, an existing registration will be updated
-        with the new module information.
+        with the new module information. (default=False)
 
     Raises
     ------
@@ -147,7 +149,7 @@ def register(module_names, overwrite=False):
             raise
 
     # Second, check that module is itself pysat compatible
-    validate = itc.InstTestClass()
+    validate = itc.InstLibTests()
 
     # Work with test code, create dummy structure to make things work
     class Foo(object):
@@ -208,7 +210,7 @@ class Foo(object):
     return
 
 
-def register_by_module(module):
+def register_by_module(module, overwrite=False):
     """Register all sub-modules attached to input module.
 
     Enables instantiation of a third-party Instrument module using
@@ -221,6 +223,9 @@
     module : Python module
         Module with one or more pysat.Instrument support modules
         attached as sub-modules to the input `module`
+    overwrite : bool
+        If True, an existing registration will be updated
+        with the new module information. (default=False)
 
     Raises
     ------
@@ -249,7 +254,7 @@
         module_names = [module.__name__ + '.' + mod for mod in module_names]
 
     # Register all of the sub-modules
-    register(module_names)
+    register(module_names, overwrite=overwrite)
 
     return
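With `overwrite` now threaded through to `register`, refreshing an already-registered package is a single call. A sketch (`pysatMissions` stands in for any third-party package exposing Instrument sub-modules and is not a dependency of this changeset):

```
# Hypothetical refresh of an already-registered instrument package; the new
# overwrite flag updates the stored registry entry in place.
import pysat.utils.registry as registry

import pysatMissions  # stand-in third-party package with instrument modules

registry.register_by_module(pysatMissions.instruments, overwrite=True)
```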
diff --git a/pysat/utils/testing.py b/pysat/utils/testing.py
index e654ab944..9367bc891 100644
--- a/pysat/utils/testing.py
+++ b/pysat/utils/testing.py
@@ -2,10 +2,14 @@
 # Full license can be found in License.md
 # Full author list can be found in .zenodo.json file
 # DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
 # ----------------------------------------------------------------------------
 """Utilities to perform common evaluations."""
 
 import numpy as np
+import pysat.utils
 
 
 def assert_list_contains(small_list, big_list, test_nan=False, test_case=True):
@@ -171,7 +175,7 @@ def eval_warnings(warns, check_msgs, warn_type=DeprecationWarning):
     check_msgs : list
         List of strings containing the expected warning messages
     warn_type : type
-        Type for the warning messages (default=DeprecationWarning)
+        Type or list-like of types for the warning messages
+        (default=DeprecationWarning)
 
     Raises
     ------
@@ -180,19 +184,38 @@
     """
+    # Ensure inputs are list-like
+    warn_types = pysat.utils.listify(warn_type)
+    check_msgs = pysat.utils.listify(check_msgs)
+
     # Initialize the output
     found_msgs = [False for msg in check_msgs]
 
+    # If only one warning type is provided, expand it to match the
+    # number of messages
+    simple_out = False
+    if len(warn_types) == 1:
+        warn_types = warn_types * len(check_msgs)
+        simple_out = True
+
     # Test the warning messages, ensuring each attribute is present
     for iwar in warns:
-        for i, msg in enumerate(check_msgs):
+        for i, (msg, iwartype) in enumerate(zip(check_msgs, warn_types)):
             if str(iwar.message).find(msg) >= 0:
-                assert iwar.category == warn_type, \
+                assert iwar.category == iwartype, \
                     "bad warning type for message: {:}".format(msg)
                 found_msgs[i] = True
 
+    # If all warnings are of the same kind, we don't need to repeat the
+    # same type in the output string.
+    if simple_out:
+        warn_repr_str = repr(warn_type)
+    else:
+        not_found_msgs = [not msg for msg in found_msgs]
+        warn_repr_str = repr(np.array(warn_types)[not_found_msgs])
+
     assert np.all(found_msgs), "did not find {:d} expected {:}".format(
-        len(found_msgs) - np.sum(found_msgs), repr(warn_type))
+        len(found_msgs) - np.sum(found_msgs), warn_repr_str)
 
     return
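The updated `eval_warnings` accepts either a single warning category or one category per message. A sketch of the mixed-type case, mirroring the listified `warn_type` handling added above (the warning messages themselves are arbitrary):

```
# Sketch: one call checking two expected warnings of different categories.
import warnings

from pysat.utils import testing

with warnings.catch_warnings(record=True) as war:
    warnings.simplefilter('always')
    warnings.warn('old keyword', DeprecationWarning)
    warnings.warn('odd value', UserWarning)

testing.eval_warnings(war, ['old keyword', 'odd value'],
                      warn_type=[DeprecationWarning, UserWarning])
```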
diff --git a/pysat/utils/time.py b/pysat/utils/time.py
index 4523aaf85..f4e2a8386 100644
--- a/pysat/utils/time.py
+++ b/pysat/utils/time.py
@@ -2,6 +2,9 @@
 # Full license can be found in License.md
 # Full author list can be found in .zenodo.json file
 # DOI:10.5281/zenodo.1199703
+#
+# DISTRIBUTION STATEMENT A: Approved for public release. Distribution is
+# unlimited.
 # ----------------------------------------------------------------------------
 """Date and time handling utilities."""
 
@@ -218,13 +221,9 @@ def freq_to_res(freq):
     --------
     pds.offsets.DateOffset
 
-    References
-    ----------
-    Separating alpha and numeric portions of strings, as described in:
-    https://stackoverflow.com/a/12409995
-
     """
-    # Separate the alpha and numeric portions of the string
+    # Separate the alpha and numeric portions of the string as described in:
+    # https://stackoverflow.com/a/12409995
     regex = re.compile(r'(\d+|\s+)')
     out_str = [sval for sval in regex.split(freq) if len(sval) > 0]
 
@@ -364,8 +363,8 @@ def filter_datetime_input(date):
     if hasattr(date, '__iter__'):
         out_date = []
         for in_date in date:
-            if(in_date.tzinfo is not None
-               and in_date.utcoffset() is not None):
+            if all([in_date.tzinfo is not None,
+                    in_date.utcoffset() is not None]):
                 in_date = in_date.astimezone(tz=dt.timezone.utc)
 
             out_date.append(dt.datetime(in_date.year, in_date.month,
diff --git a/pysat/version.txt b/pysat/version.txt
deleted file mode 100644
index fd2a01863..000000000
--- a/pysat/version.txt
+++ /dev/null
@@ -1 +0,0 @@
-3.1.0
diff --git a/setup.cfg b/setup.cfg
index 579ecfbc8..39623ca78 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -1,60 +1,6 @@
 [metadata]
 name = pysat
-version = file: pysat/version.txt
-url = https://github.com/pysat/pysat
-author = Russell Stoneback, et al.
-author_email = pysat.developers@gmail.com
-description = 'Supports science analysis across disparate data platforms'
-keywords =
-    pysat
-    ionosphere
-    atmosphere
-    thermosphere
-    magnetosphere
-    heliosphere
-    observations
-    models
-    space
-    satellites
-    analysis
-classifiers =
-    Development Status :: 5 - Production/Stable
-    Intended Audience :: Science/Research
-    Topic :: Scientific/Engineering :: Astronomy
-    Topic :: Scientific/Engineering :: Physics
-    Topic :: Scientific/Engineering :: Atmospheric Science
-    License :: OSI Approved :: BSD License
-    Natural Language :: English
-    Programming Language :: Python :: 3.6
-    Programming Language :: Python :: 3.8
-    Programming Language :: Python :: 3.9
-    Programming Language :: Python :: 3.10
-    Operating System :: MacOS :: MacOS X
-    Operating System :: POSIX :: Linux
-    Operating System :: Microsoft :: Windows
-license_file = LICENSE
-long_description = file: README.md
-long_description_content_type = text/markdown
-
-[options]
-python_requires = >= 3.6
-setup_requires = setuptools >= 38.6; pip >= 10
-include_package_data = True
-zip_safe = False
-packages = find:
-install_requires = dask
-    netCDF4
-    numpy
-    pandas
-    portalocker
-    pytest
-    scipy
-    toolz
-    xarray
-
-[coverage:report]
-omit =
-    */instruments/templates/*
+version = 3.2.0
 
 [flake8]
 max-line-length = 80
 ignore = D200
     D202
     W503
-    pysat/__init__.py E402 F401
-    pysat/instruments/methods/__init__.py F401
-    pysat/utils/__init__.py F401
-
-[tool:pytest]
-markers =
-    all_inst: tests all instruments
-    download: tests for downloadable instruments
-    no_download: tests for instruments without download support
-    load_options: tests for instruments including optional load kwargs
-    first: first tests to run
-    second: second tests to run
diff --git a/setup.py b/setup.py
deleted file mode 100644
index dfba3800e..000000000
--- a/setup.py
+++ /dev/null
@@ -1,16 +0,0 @@
-#!/usr/bin/env python
-# -*- coding: utf-8 -*-
-# Copyright (C) 2020, Authors
-# Full license can be found in License.md
-# -----------------------------------------------------------------------------
-"""Setup routine for pysat.
-
-Note
-----
-package metadata stored in setup.cfg
-
-"""
-
-from setuptools import setup
-
-setup()
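With `setup.py` removed and the version pinned statically in `setup.cfg`, `pysat/version.txt` no longer exists at run time. A sketch of querying the version after installation (assumes Python 3.8+ for `importlib.metadata`; both calls should report 3.2.0 once this release is installed):

```
# Runtime version lookup now that the packaging metadata is static.
from importlib import metadata

import pysat

print(pysat.__version__)          # attribute set by the package at import
print(metadata.version('pysat'))  # read from installed distribution metadata
```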
diff --git a/test_requirements.txt b/test_requirements.txt
index 823ea6406..366154df8 100644
--- a/test_requirements.txt
+++ b/test_requirements.txt
@@ -1,12 +1,13 @@
 coveralls<3.3
 flake8
 flake8-docstrings
-hacking>=1.0,<6.0
+hacking>=1.0
 ipython
 m2r2
 numpydoc
-pysatSpaceWeather
+pysatSpaceWeather<0.1.0
 pytest-cov
 pytest-ordering
-sphinx<7.0
-sphinx_rtd_theme
+readthedocs-sphinx-search==0.3.2
+sphinx
+sphinx_rtd_theme>=1.2.2,<2.0.0