Declare required and optional dependency #714

DanielYang59 · 2024-10-20T04:44:34Z

Summary

Add rualmel.yaml and numpy to required dependencies (as they are used by more than one module), and remove import guard for ruamel.yaml and numpy
Declare the rest as optional dependencies, group by module, should make it easier for users to install
lzma is built in after python 3.3

It would then be much easier to install extra dependencies (pip install monty[json] or pip install monty[optional] to install all extras) without need to dig into the code to find out what is required

codecov · 2024-10-20T04:45:40Z

Codecov Report

Attention: Patch coverage is 68.42105% with 6 lines in your changes missing coverage. Please review.

Project coverage is 82.42%. Comparing base (1270c7b) to head (8a4d33f).
Report is 21 commits behind head on master.

Files with missing lines	Patch %	Lines
src/monty/dev.py	0.00%	2 Missing ⚠️
src/monty/json.py	81.81%	1 Missing and 1 partial ⚠️
src/monty/multiprocessing.py	0.00%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #714      +/-   ##
==========================================
+ Coverage   81.02%   82.42%   +1.40%     
==========================================
  Files          27       27              
  Lines        1586     1582       -4     
  Branches      361      285      -76     
==========================================
+ Hits         1285     1304      +19     
+ Misses        236      215      -21     
+ Partials       65       63       -2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

coderabbitai · 2024-10-20T06:26:08Z

Walkthrough

The pull request introduces significant changes to the pyproject.toml file, enhancing dependency management by adding new dependencies and restructuring optional dependencies. Additionally, various modules in the src/monty package have been updated to streamline import statements and improve error handling related to dependencies. New unit tests have been added across multiple test files to ensure robust coverage for the modified functionalities. Overall, the changes focus on improving organization, clarity, and functionality within the project.

Changes

File Path	Change Summary
`pyproject.toml`	- Added dependency: `"ruamel.yaml"` - Added version constraint for `"numpy"`: `<2.0.0` - Restructured optional dependencies, including new groups: `dev`, `json`, `multiprocessing`, `serialization`, and `optional`.
`src/monty/dev.py`	- Updated warning message in `install_excepthook` function for missing `IPython`.
`src/monty/io.py`	- Removed try-except block for unconditional import of `lzma`. - Enhanced `zopen` function to consistently support `lzma` files.
`src/monty/itertools.py`	- Removed try-except block for unconditional import of `numpy`.
`src/monty/json.py`	- Updated import statements for unconditional import of `numpy` and `ruamel.yaml`. - Simplified handling of `numpy` arrays in `MontyEncoder` and `MontyDecoder` classes.
`src/monty/multiprocessing.py`	- Improved error handling for `tqdm` import to raise a descriptive `ImportError`.
`src/monty/serialization.py`	- Removed try-except block for unconditional import of `ruamel.yaml`. - Clarified error handling for YAML loading.
`tasks.py`	- Added new task function `set_ver` to update version in `pyproject.toml`. - Modified `release` function to call `set_ver` before proceeding with the release process.
`tests/test_collections.py`	- Added unit tests for collection types: `TestFrozenDict` and `TestTree`.
`tests/test_design_patterns.py`	- Added tests for `cached_class` and `singleton` decorators, enhancing coverage for their functionalities.
`tests/test_dev.py`	- Added tests for `deprecated` and `requires` decorators, and `install_excepthook` function.
`tests/test_fractions.py`	- Added tests for `gcd`, `lcm`, and `gcd_float` functions.
`tests/test_functools.py`	- Added tests for `lazy_property`, `return_if_raise`, `return_none_if_raise`, and `timeout` decorators.
`tests/test_io.py`	- Enhanced tests for reading files in reverse order and handling various formats, including empty files.
`tests/test_json.py`	- Adjusted import statements and variable names for clarity.
`tests/test_os.py`	- Introduced tests for `monty.os` module functions: `cd`, `makedirs_p`, `find_exts`, and `zpath`.
`tests/test_serialization.py`	- Organized imports and updated `teardown_class` for improved cleanup in tests.
`tests/test_shutil.py`	- Added comprehensive tests for file and directory manipulation functions.
`tests/test_tempfile.py`	- Added tests for `ScratchDir` class focusing on various functionalities related to temporary directories.

Possibly related PRs

Add missing functools.wraps decorator to deprecated decorator and handle dataclass properly #699: Enhances the deprecated decorator, relevant due to changes in pyproject.toml that may affect decorator usage.
Fix line ending in reverse_readfile/readline in Windows #700: Addresses newline handling in file reading functions, indirectly related to the main PR's focus on dependency management.
Fix test_with_copy_gzip which alters test file in place #703: Modifies the gzip_dir function to prevent in-place alterations of test files, relating to the main PR's focus on improving dependency management.

"In the garden of code, where dependencies bloom,
A rabbit hops lightly, dispelling the gloom.
With ruamel.yaml and numpy in tow,
Our project now flourishes, ready to grow!
Tests leap with joy, ensuring all's right,
In this world of Python, we dance through the night!" 🐇✨

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (25)

tests/test_fractions.py (4)

Line range hint 8-9: Consider adding more test cases for test_gcd

While the current test case is valid, it only checks one scenario. To ensure robust testing of the gcd function, consider adding more test cases, such as:

GCD of two numbers

GCD of negative numbers

GCD with zero as one of the inputs

GCD of large numbers

Line range hint 12-13: Consider expanding test cases for test_lcm

The current test case is valid but limited. To improve the test coverage for the lcm function, consider adding more test cases, such as:

LCM of two numbers

LCM of negative numbers

LCM with zero as one of the inputs

LCM of large numbers

LCM of numbers with common factors

Line range hint 16-18: LGTM: Good test for gcd_float, consider expanding

The test_gcd_float function is well-implemented:

It tests floating-point GCD calculation.

Uses pytest.approx for appropriate floating-point comparison.

Includes a case with a small perturbation to test numerical stability.

To further improve the test coverage, consider adding:

A test case with negative floating-point numbers.

A test case with very large and very small floating-point numbers to check for potential overflow or underflow issues.

A test case where the GCD should be a whole number to ensure correct rounding behavior.

Line range hint 1-18: Consider implementing a more comprehensive test suite

While the current tests cover basic functionality for gcd, lcm, and gcd_float, a more comprehensive test suite would ensure robustness across various scenarios. Consider implementing:

Parameterized tests to cover multiple input combinations efficiently.

Edge case tests (e.g., very large numbers, zero, negative numbers).

Property-based tests to verify mathematical properties (e.g., commutativity, associativity).

Tests for error handling (e.g., invalid inputs).

This would significantly increase confidence in the correctness of these mathematical functions across a wide range of inputs.

Would you like assistance in generating a more comprehensive test suite?
src/monty/multiprocessing.py (2)
12-13: Improved error handling for tqdm import.

The new error handling is more descriptive and aligns well with the optional dependency structure. It provides clearer feedback to users about the requirement for tqdm.

Consider adding a comment above the import statement to indicate that tqdm is an optional dependency. This can help users understand why the import might fail. For example:
# tqdm is an optional dependency required for the imap_tqdm function
try:
    from tqdm.autonotebook import tqdm
except ImportError as exc:
    raise ImportError("tqdm must be installed for this function.") from exc
Line range hint 16-41: Enhance type hinting for better static analysis.

While the function implementation is solid, we can improve the type hinting to provide better static analysis support.

Consider the following improvements:

Use TypeVar for the iterable and return type to preserve type information:
from typing import Callable, Iterable, TypeVar, List

T = TypeVar('T')
R = TypeVar('R')

def imap_tqdm(nprocs: int, func: Callable[[T], R], iterable: Iterable[T], *args, **kwargs) -> List[R]:
    ...
Specify the types for *args and **kwargs:
from typing import Any

def imap_tqdm(nprocs: int, func: Callable[[T], R], iterable: Iterable[T], *args: Any, **kwargs: Any) -> List[R]:
    ...
These changes will provide more precise type information, especially when using static type checkers like mypy.
tests/test_collections.py (4)
Line range hint 26-30: LGTM with suggestion: test_attr_dict covers main functionality.

The test method effectively demonstrates the dual nature of AttrDict, showing both attribute and dictionary-style access. It also verifies that changes are reflected correctly.

Consider adding a test for dictionary-style assignment to ensure complete coverage:
d['baz'] = 3
assert d.baz == 3
Line range hint 32-43: LGTM with suggestion: test_frozen_attrdict covers key functionality.

This test method effectively demonstrates the behavior of FrozenAttrDict, showing both attribute-style and dictionary-style access, as well as the immutability of the dictionary. The use of pytest.raises for checking various immutability aspects is appropriate.

Consider separating the different exception cases for improved clarity and easier debugging:
with pytest.raises(KeyError):
    d["foo"] = "bar"

with pytest.raises(KeyError):
    d.foo = "bar"

with pytest.raises(KeyError):
    d.hello = "new"
Line range hint 46-53: LGTM with suggestion: test_tree demonstrates nested structure creation and access.

The test method effectively shows how the tree function creates a nested dictionary-like structure and allows for deep key access. It's good that it checks for both the presence and absence of keys at different levels.

Consider adding type checks to ensure each level is of the expected type:
assert isinstance(x, dict)
assert isinstance(x['a'], dict)
assert isinstance(x['a']['b'], dict)
assert isinstance(x['a']['b']['c'], dict)
This would provide additional confidence in the tree function's behavior.

Line range hint 1-53: Overall: Well-structured and comprehensive test suite for collections module.

This test file provides good coverage for the various collection types in the monty.collections module. The tests are organized logically and cover both positive and negative cases for each collection type. The use of pytest features like pytest.raises is appropriate and consistent throughout the file.

A few minor suggestions were made to improve individual tests, but overall, this is a solid foundation for ensuring the reliability of the collections module.

Consider adding parameterized tests for some of the collection types to test with a wider range of inputs and edge cases. This could help catch any potential issues with different data types or structures.
tests/test_os.py (1)
Line range hint 38-47: Remove duplicate test method in TestCd class.

The TestCd class contains two identical test methods: test_cd and test_cd_exception. Having duplicate tests doesn't provide any additional coverage and may lead to confusion. Consider removing the test_cd_exception method to keep the test suite clean and efficient.

Suggested change:
class TestCd:
    def test_cd(self):
        with cd(TEST_DIR):
            assert os.path.exists("empty_file.txt")
        assert not os.path.exists("empty_file.txt")

-    def test_cd_exception(self):
-        with cd(TEST_DIR):
-            assert os.path.exists("empty_file.txt")
-        assert not os.path.exists("empty_file.txt")
pyproject.toml (2)
22-23: LGTM! Consider specifying a minimum version for ruamel.yaml.

The addition of "ruamel.yaml" and the version constraint for numpy are good improvements. They align with the PR objectives and ensure better dependency management.

Consider specifying a minimum version for ruamel.yaml to ensure compatibility:
- "ruamel.yaml",
+ "ruamel.yaml>=0.17.21",
36-53: LGTM! Well-organized optional dependencies. Consider adding a comment for the 'json' group.

The new optional dependency groups are well-organized and align perfectly with the PR objectives. This structure will make it easier for users to install only the dependencies they need.

Consider adding a brief comment for the 'json' group, similar to the one for 'dev', to clarify its purpose:
+ # json is for extended JSON functionality
json = [
  "bson",
  "orjson",
  "pandas",
  "pydantic",
  "pint",
  "torch",
]
tests/test_serialization.py (1)
Line range hint 23-27: LGTM: Improved test cleanup process.

The teardown_class method has been updated to use glob for finding and removing test files. This change enhances the cleanup process by catching files with various extensions, reducing the risk of leftover files affecting subsequent test runs.

Consider adding a print statement or logging to indicate when files are being cleaned up, which could be helpful for debugging:
for fn in files_to_clean_up:
    print(f"Cleaning up test file: {fn}")
    os.remove(fn)
tasks.py (3)
Line range hint 137-143: LGTM with a suggestion for improvement.

The new set_ver function is well-implemented, using context managers for file operations and explicitly setting the encoding. However, the regex pattern for version matching could be more specific to avoid potential false matches.

Consider using a more specific regex pattern to ensure only the correct version line is matched:
contents = re.sub(r'^\s*version\s*=\s*["\']([\d.]+)["\']', f'version = "{version}"', contents, flags=re.MULTILINE)
This pattern will only match lines that start with optional whitespace, followed by "version", optional whitespace, an equals sign, optional whitespace, and then the version number in quotes.

Line range hint 146-156: LGTM with a suggestion for improvement.

The modifications to the release function, including the new version parameter and the call to set_ver, improve the flexibility and consistency of the release process. The order of operations is logical and comprehensive.

Consider adding version string validation to prevent potential issues with invalid version strings:
import re

def is_valid_version(version: str) -> bool:
    return bool(re.match(r'^\d+\.\d+\.\d+$', version))

@task
def release(ctx: Context, notest: bool = False, version: str = NEW_VER) -> None:
    if not is_valid_version(version):
        raise ValueError(f"Invalid version string: {version}")
    set_ver(ctx, version)
    # ... rest of the function
This ensures that only valid semantic version strings (e.g., "1.2.3") are accepted.

Line range hint 102-105: Consider deprecating or updating setver for consistency.

With the introduction of set_ver which updates pyproject.toml, the setver function might be redundant or lead to inconsistencies.

Consider one of the following options:

Deprecate setver if setup.py is no longer needed for version management.

Update setver to use the same approach as set_ver:
@task
def setver(ctx: Context, version: str = NEW_VER) -> None:
    with open("setup.py", "r", encoding="utf-8") as f:
        contents = f.read()
    contents = re.sub(r'version=["\']([\d.]+)["\']', f'version="{version}"', contents)
    with open("setup.py", "w", encoding="utf-8") as f:
        f.write(contents)
This ensures consistency in version management across files and uses a more robust Python-based approach.
src/monty/serialization.py (3)
12-12: Approve the direct import of YAML and suggest removing redundant checks

The direct import of YAML from ruamel.yaml aligns with the PR objective of declaring it as a required dependency. This simplifies the import logic and makes the code more straightforward.

Consider removing the redundant checks for YAML is None in the loadfn and dumpfn functions, as these checks are no longer necessary. The import will raise an error if ruamel.yaml is not installed. Here's a suggested change for the loadfn function (apply similar changes to dumpfn):
 if fmt == "yaml":
-    if YAML is None:
-        raise RuntimeError("Loading of YAML files requires ruamel.yaml.")
     yaml = YAML()
     return yaml.load(fp, *args, **kwargs)
This change will make the code more concise and align it with the new dependency management approach.

Line range hint 29-85: Consider removing redundant YAML error handling in loadfn

The loadfn function's logic is correct and aligns with the file's purpose. However, with the direct import of YAML from ruamel.yaml, the error handling for missing YAML in the function body is now redundant.

Consider removing the redundant check and error raising for YAML is None:
 if fmt == "yaml":
-    if YAML is None:
-        raise RuntimeError("Loading of YAML files requires ruamel.yaml.")
     yaml = YAML()
     return yaml.load(fp, *args, **kwargs)
This change will make the code more concise and align it with the new dependency management approach, while maintaining the function's core functionality.

Line range hint 88-144: Consider removing redundant YAML error handling in dumpfn

The dumpfn function's logic is correct and aligns with the file's purpose. However, similar to the loadfn function, the error handling for missing YAML in the function body is now redundant due to the direct import of YAML from ruamel.yaml.

Consider removing the redundant check and error raising for YAML is None:
 if fmt == "yaml":
-    if YAML is None:
-        raise RuntimeError("Loading of YAML files requires ruamel.yaml.")
     yaml = YAML()
     yaml.dump(obj, fp, *args, **kwargs)
This change will make the code more concise and align it with the new dependency management approach, while maintaining the function's core functionality.
tests/test_io.py (1)
Line range hint 199-213: Great addition: TestFileLock class

The new TestFileLock class is a valuable addition to the test suite. It covers the FileLock functionality, which wasn't tested before. The use of pytest fixtures for setup and teardown is a good practice.

One suggestion for improvement:
Consider using a with statement in the test_raise method to ensure the new lock is always released, even if an exception other than FileLockException is raised:
def test_raise(self):
    with pytest.raises(FileLockException):
        with FileLock(self.file_name, timeout=1) as new_lock:
            new_lock.acquire()
This change would make the test more robust and prevent potential issues with unreleased locks in case of unexpected errors.
src/monty/io.py (2)
Line range hint 63-89: Consider using contextlib.closing for file handling in reverse_readfile.

To ensure proper resource management, consider using contextlib.closing for file handling in the reverse_readfile function. This can help prevent resource leaks in case of exceptions.

Import closing from contextlib at the beginning of the file:
from contextlib import closing
Then, modify the reverse_readfile function as follows:
def reverse_readfile(filename: Union[str, Path]) -> Generator[str, str, None]:
    try:
        with closing(zopen(filename, "rb")) as file:
            if isinstance(file, (gzip.GzipFile, bz2.BZ2File)):
                for line in reversed(file.readlines()):
                    yield line.decode("utf-8").rstrip(os.linesep)
            else:
                with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as filemap:
                    n = len(filemap)
                    while n > 0:
                        i = filemap.rfind(os.linesep.encode(), 0, n)
                        yield filemap[i + 1 : n].decode("utf-8").rstrip(os.linesep)
                        n = i
    except ValueError:
        return
This change ensures that the file and mmap objects are properly closed, even if an exception occurs.

Line range hint 220-307: Consider using contextlib.contextmanager for FileLock.

The FileLock class can be simplified by using the @contextlib.contextmanager decorator. This can make the code more concise and easier to maintain.

Here's a suggested refactoring of the FileLock class:
from contextlib import contextmanager

@contextmanager
def file_lock(file_name: str, timeout: float = 10, delay: float = 0.05):
    lock = FileLock(file_name, timeout, delay)
    try:
        lock.acquire()
        yield lock
    finally:
        lock.release()

class FileLock:
    def __init__(self, file_name: str, timeout: float = 10, delay: float = 0.05) -> None:
        self.file_name = os.path.abspath(file_name)
        self.lockfile = f"{self.file_name}.lock"
        self.timeout = timeout
        self.delay = delay
        self.is_locked = False
        self.fd = None

        if self.delay > self.timeout or self.delay <= 0 or self.timeout <= 0:
            raise ValueError("delay and timeout must be positive with delay <= timeout")

    def acquire(self) -> None:
        start_time = time.time()
        while True:
            try:
                self.fd = os.open(self.lockfile, os.O_CREAT | os.O_EXCL | os.O_RDWR)
                self.is_locked = True
                break
            except OSError as exc:
                if exc.errno != errno.EEXIST:
                    raise
                if (time.time() - start_time) >= self.timeout:
                    raise FileLockException(f"{self.lockfile}: Timeout occurred.")
                time.sleep(self.delay)

    def release(self) -> None:
        if self.is_locked:
            os.close(self.fd)
            os.unlink(self.lockfile)
            self.is_locked = False
            self.fd = None
This refactoring simplifies the FileLock class and provides a more Pythonic way to use it with the context manager.
src/monty/json.py (1)
Line range hint 931-948: Added support for numpy arrays and generic types in jsanitize

The addition of numpy array and generic type handling in the jsanitize function is a valuable improvement. It ensures that these types are properly sanitized for JSON serialization.

However, there's a potential optimization for the numpy array handling:

Consider using numpy's tolist() method directly for non-object dtypes to improve performance. Here's a suggested modification:
 if isinstance(obj, np.ndarray):
+    if obj.dtype.kind in 'biufc':
+        return obj.tolist()
     try:
         return [
             jsanitize(
                 i,
                 strict=strict,
                 allow_bson=allow_bson,
                 enum_values=enum_values,
                 recursive_msonable=recursive_msonable,
             )
             for i in obj.tolist()
         ]
     except TypeError:
         return obj.tolist()
This change would avoid the list comprehension for numeric dtypes, potentially improving performance for large arrays.
tests/test_dev.py (1)
Line range hint 96-97: Remove @pytest.fixture() decorator from test method

The @pytest.fixture() decorator is intended for fixture functions that provide setup for tests, not for actual test methods. Decorating test_deprecated_deadline_no_warn with @pytest.fixture() may prevent it from being recognized and executed as a test by pytest. Additionally, including self as a parameter is not appropriate for fixtures.

Apply this diff to correct the test method:
-    @pytest.fixture()
     def test_deprecated_deadline_no_warn(self, monkeypatch):
         """Test cases where no warning should be raised."""

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 1270c7b and 20e81bb.

📒 Files selected for processing (19)

pyproject.toml (2 hunks)
src/monty/dev.py (1 hunks)
src/monty/io.py (1 hunks)
src/monty/itertools.py (1 hunks)
src/monty/json.py (6 hunks)
src/monty/multiprocessing.py (1 hunks)
src/monty/serialization.py (1 hunks)
tasks.py (1 hunks)
tests/test_collections.py (1 hunks)
tests/test_design_patterns.py (1 hunks)
tests/test_dev.py (1 hunks)
tests/test_fractions.py (1 hunks)
tests/test_functools.py (1 hunks)
tests/test_io.py (1 hunks)
tests/test_json.py (3 hunks)
tests/test_os.py (1 hunks)
tests/test_serialization.py (1 hunks)
tests/test_shutil.py (1 hunks)
tests/test_tempfile.py (1 hunks)

✅ Files skipped from review due to trivial changes (1)

src/monty/dev.py

🧰 Additional context used

🔇 Additional comments (37)

tests/test_fractions.py (1)

4-4: LGTM: Improved code formatting

The addition of a blank line between import statements and function imports enhances code readability and adheres to PEP 8 style guidelines.

src/monty/multiprocessing.py (1)

Line range hint 1-41: Overall assessment: Good changes with room for minor improvements.

The changes to error handling for the tqdm import are beneficial, providing clearer feedback to users. The imap_tqdm function remains well-implemented and documented. Consider the suggested improvements to type hinting and adding a comment about the optional dependency to further enhance the code quality and maintainability.

tests/test_collections.py (3)

Line range hint 1-8: LGTM: Import statements are appropriate and complete.

The import statements are well-organized and include all necessary modules and classes for the tests. The use of __future__ import is a good practice for ensuring compatibility.

Line range hint 13-17: LGTM: test_frozen_dict covers basic functionality.

The test method effectively checks both the immutability of frozendict (by attempting to add a new key) and the correct retrieval of existing values. Good use of pytest.raises for exception testing.

Line range hint 19-24: LGTM: test_namespace_dict covers key functionality.

This test method effectively demonstrates the behavior of the Namespace class, showing that new keys can be added but existing ones cannot be updated. The use of pytest.raises for checking the immutability of existing keys is appropriate.

tests/test_os.py (4)

Line range hint 1-9: LGTM: Import statements are well-organized.

The import statements are correctly structured, with built-in modules first, followed by third-party imports (pytest), and then local imports from the monty package. The added empty line (line 7) provides good separation between import groups.

Line range hint 15-35: LGTM: TestPath class provides good test coverage.

The TestPath class contains well-structured tests for the zpath and find_exts functions:

test_zpath correctly verifies the behavior for both gzipped and non-gzipped files.

test_find_exts provides comprehensive coverage, including tests for:

Counting Python files

Counting bz2 files

Excluding specific directories

Including specific directories

These tests align well with the PR objectives and ensure the robustness of the os-related functions.

Line range hint 50-63: LGTM: TestMakedirs_p class is well-structured and comprehensive.

The TestMakedirs_p class provides a thorough test suite for the makedirs_p function:

The setup_method correctly initializes the test directory path.

test_makedirs_p covers multiple scenarios:

Successful directory creation

Re-creation of an existing directory without errors

Attempting to create a file instead of a directory (expecting an OSError)

The teardown_method properly cleans up the created directory after tests.

The use of pytest.raises for checking the OSError is appropriate and aligns with best practices for exception testing.

Line range hint 1-63: Overall, the test file is well-structured and comprehensive.

The tests/test_os.py file provides excellent coverage for the os-related functions in the monty package. The tests are well-organized into separate classes, each focusing on specific functionalities. They align well with the PR objectives of enhancing dependency management and streamlining imports.

Key strengths:

Comprehensive testing of zpath, find_exts, cd, and makedirs_p functions.

Good use of pytest fixtures and assertions.

Proper setup and teardown methods where necessary.

Areas for improvement:

Remove the duplicate test method in the TestCd class, as mentioned in a previous comment.

Consider adding more edge cases or error scenarios if applicable, but overall, this test file provides a solid foundation for ensuring the reliability of the os-related functions.

pyproject.toml (3)

29-34: LGTM! Good updates to CI dependencies.

The updates to pytest and pytest-cov versions, along with the addition of type stubs for orjson and requests, are excellent improvements. These changes will enhance the development and testing processes.

111-111: LGTM! Good addition to isort configuration.

Adding "monty" to the known-first-party list for isort is a good practice. This will help maintain consistent and clean import statements throughout the project.

Line range hint 1-111: Excellent improvements to dependency management!

The changes in this file significantly enhance the project's dependency management:

Required dependencies are clearly defined.

Optional dependencies are well-organized by module.

CI dependencies are updated with specific versions.

The new structure provides flexibility for users to install only needed dependencies.

These improvements align perfectly with the PR objectives and will make the project easier to maintain and use.

tests/test_serialization.py (2)

10-11: LGTM: Import statements reorganized.

The import statements for dumpfn, loadfn, and ScratchDir have been moved to the top of the file. This change improves code organization and adheres to PEP 8 style guidelines.

Line range hint 1-94: Overall assessment: Good improvements to test organization and cleanup.

The changes made to this file enhance code organization and test cleanup without altering the core functionality of the tests. The import statements have been properly reorganized, and the teardown process has been improved. These changes align well with the PR objectives of enhancing dependency management and improving code structure.
src/monty/itertools.py (3)
Line range hint 15-114: LGTM: Functions remain unchanged and compatible with the new import structure.

The functions chunks, iterator_from_slice, iuptri, and ilotri have not been modified. They continue to use numpy correctly with the new import structure. This maintains the existing functionality while simplifying the overall code structure.

Line range hint 1-114: Summary: Import simplification aligns with PR objectives

The changes in this file successfully simplify the import structure by directly importing numpy. This aligns well with the PR objective of making numpy a required dependency. The existing functions remain unchanged, maintaining their functionality while benefiting from the simplified import structure.

Overall, these changes improve code clarity without introducing new risks. Ensure that the verification step for the numpy dependency in pyproject.toml is completed to maintain consistency between the code and the project configuration.

10-10: LGTM: Direct import of numpy simplifies the code.

The removal of the conditional import for numpy and its direct import at the top of the file aligns with the PR objective of making numpy a required dependency. This change simplifies the code and makes the dependency more explicit.

To ensure that this change is properly reflected in the project configuration, please run the following script:

✅ Verification successful

Verified: numpy is correctly listed as a dependency in pyproject.toml and the import changes are appropriate.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify numpy is listed as a required dependency in pyproject.toml

# Test: Check if numpy is listed in pyproject.toml
if grep -q 'numpy.*"' pyproject.toml; then
    echo "numpy is correctly listed as a dependency in pyproject.toml"
else
    echo "Warning: numpy is not listed as a dependency in pyproject.toml"
fi
Length of output: 164
src/monty/serialization.py (1)

Line range hint 1-144: Summary of changes and suggestions

The main change in this file is the direct import of YAML from ruamel.yaml, which aligns with the PR objectives of declaring it as a required dependency. This change simplifies the import logic and makes the code more straightforward.

To further improve the code:

Remove the redundant checks for YAML is None in both loadfn and dumpfn functions.

Consider updating the error messages to reflect that ruamel.yaml is now a required dependency.

These changes will make the code more concise and align it with the new dependency management approach while maintaining the core functionality of the serialization module.

tests/test_tempfile.py (2)

7-7: Improved import organization

The addition of a blank line between standard library imports and project-specific imports enhances code readability and adheres to PEP 8 style guidelines. This small change contributes to better code organization.

Line range hint 1-164: Verify test coverage for potential ScratchDir changes

While this test file remains largely unchanged, it's important to ensure that any modifications to the ScratchDir class resulting from the dependency management changes in this PR are adequately covered by these tests. Please verify if any new functionality or edge cases need to be added to maintain comprehensive test coverage.

To help verify the test coverage, you can run the following command to check for any recent changes to the ScratchDir class:

This script will help identify if there have been any recent changes to the ScratchDir class that might require additional test coverage.

tests/test_io.py (5)

8-8: LGTM: Improved import organization

The addition of an empty line between import groups improves readability and adheres to PEP 8 guidelines for import organization.

Line range hint 67-73: Great addition: Test for empty file handling

The new test_empty_file method is a valuable addition to the test suite. It ensures that the reverse_readline function handles empty files correctly, which is an important edge case. This test improves the overall robustness of the code and aligns well with the PR objectives.

Line range hint 141-147: Excellent: Consistent empty file test for reverse_readfile

The addition of test_empty_file for reverse_readfile is a great improvement. It ensures consistent behavior with reverse_readline when handling empty files. This test enhances the robustness of the codebase and maintains parity between related functions.

Line range hint 191-196: Excellent addition: Testing Path object compatibility

The new test_Path_objects method is a valuable addition to the TestZopen class. It ensures that zopen can handle Path objects from the pathlib module, which is the recommended way of working with file paths in modern Python. This test improves the robustness of the zopen function and aligns well with current best practices.

Line range hint 1-213: Overall excellent improvements to the test suite

The changes made to tests/test_io.py significantly enhance the test coverage and robustness of the codebase. Key improvements include:

Testing empty file handling for both reverse_readline and reverse_readfile.

Ensuring compatibility with pathlib.Path objects in zopen.

Adding comprehensive tests for the FileLock functionality.

These additions align well with the PR objectives of improving dependency management and enhancing functionality. They cover important edge cases and modern Python features, which will contribute to the overall stability and maintainability of the project.

Great work on improving the test suite!
src/monty/io.py (1)
Line range hint 309-328: Consider platform-independent implementation for get_open_fds.

The get_open_fds function currently only works on UNIX-like systems. Consider implementing a platform-independent version or providing alternatives for other operating systems.

Here's a suggestion for a more platform-independent implementation:
import os
import psutil

def get_open_fds() -> int:
    """
    Get the number of open file descriptors for the current process.

    Returns:
        int: The number of open file descriptors for the current process.
    """
    try:
        return psutil.Process().num_fds()
    except AttributeError:
        # Fallback for systems where num_fds() is not available
        return len(psutil.Process().open_files())
This implementation uses the psutil library, which provides cross-platform system and process utilities. You'll need to add psutil to your project dependencies.

Let's check if psutil is already a dependency:
#!/bin/bash
# Check if psutil is already a dependency

grep -q "psutil" pyproject.toml && echo "psutil is already a dependency" || echo "psutil needs to be added as a dependency"
tests/test_json.py (4)

11-12: LGTM: Import changes improve code clarity and align with project goals.

The direct imports of numpy and pytest, along with the renaming of __version__ to TESTS_VERSION, simplify the code and follow better naming conventions. These changes are in line with the PR objective of making numpy a required dependency.

Also applies to: 23-23

320-320: LGTM: Consistent use of renamed constant.

The use of TESTS_VERSION instead of tests_version is consistent with the import change and follows the convention of using uppercase for constants.

871-871: LGTM: Minor adjustment in test assertion.

The change in the assertion statement doesn't significantly alter the test's functionality and appears to be a minor refinement.

Line range hint 1-1134: Overall, changes improve code quality and align with PR objectives.

The modifications in this file, including import changes, variable renaming, and minor adjustments, contribute to better code clarity and consistency. These changes align well with the PR's objective of enhancing dependency management. No significant issues or concerns were identified during the review.

src/monty/json.py (3)

594-607: Improved numpy array handling in MontyEncoder

The changes to the numpy array handling in the default method of MontyEncoder are well-implemented. The new code:

Removes the unnecessary check for np being None, as numpy is now a required dependency.

Handles both real and complex numpy arrays more consistently.

Simplifies the overall structure of the array encoding.

These improvements make the code more robust and easier to maintain.

608-609: Simplified numpy generic type handling

The handling of numpy generic types has been streamlined by removing the unnecessary check for np being None. This change is consistent with the numpy array handling improvements and makes the code more concise.

Line range hint 807-815: Improved numpy array decoding in MontyDecoder

The changes to numpy array handling in the MontyDecoder class are well-implemented and consistent with the improvements made in the MontyEncoder class. The code now handles both real and complex numpy arrays efficiently, without unnecessary checks.

tests/test_design_patterns.py (1)

9-9: Importing pytest for test assertions

The addition of import pytest is necessary for utilizing pytest functionalities such as pytest.raises in your test cases.

tests/test_shutil.py (1)

12-12: Appropriate addition of import pytest

The inclusion of import pytest is necessary for using pytest features such as pytest.raises and pytest.warns in the test cases.

tests/test_dev.py (1)

10-10: Added import for pytest module

The pytest module is now imported to support the use of pytest.raises, pytest.warns, and other pytest functionalities in the test cases.

tests/test_functools.py (1)

8-8: Appropriate addition of import pytest

The addition of import pytest is necessary for using pytest features such as pytest.raises in the test cases below. This ensures that the tests correctly handle and assert expected exceptions.

src/monty/io.py

DanielYang59 · 2024-10-20T07:00:06Z

pyproject.toml

@@ -30,7 +30,6 @@ ci = [
  "monty[optional]",
  "pytest>=8",
  "pytest-cov>=4",
-  "types-orjson",


This should be unnecessary now:

Note: The orjson package includes type annotations or type stubs since version 3.6.1. Please uninstall the types-orjson package if you use this or a newer version.

orjson 3.6.1 is released Aug 4, 2021

Update: pinned lower bound in 3edd4d5 just in case (it's recommended to pin a lower bound to avoid outdated version for any dependency anyway)

DanielYang59 · 2024-10-20T07:07:14Z

@shyuep Please review this, thank you!

This reverts commit b89d7a9.

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)

.github/workflows/test.yml (1)

8-8: Approved: Addition of fail-fast: false enhances test coverage.

The addition of fail-fast: false to the strategy is a good change. It allows all jobs in the matrix to continue running even if one fails, which can help identify issues across different environments in a single run.

However, be aware that this might lead to longer overall run times and potentially higher costs, especially if there are persistent failures in specific environments.

Consider monitoring the workflow run times and costs after this change. If they increase significantly, you might want to implement conditional logic to abort the workflow early in case of multiple failures.
pyproject.toml (2)
35-36: LGTM: New dev dependency group added

The addition of a specific dev group for the "dev" module is a good organizational practice. The comment helps clarify its purpose.

Consider updating the comment to be more specific:
-# dev is for "dev" module, not for development
+# 'dev' group is for the "dev" module, not for development dependencies
This minor change could further reduce any potential confusion.

41-52: LGTM: Well-organized optional dependency groups

The addition of specific optional dependency groups (json, multiprocessing, serialization, task) and the reorganization of the 'optional' group are excellent improvements. This structure aligns perfectly with the PR objectives and will simplify the installation process for users.

For consistency, consider capitalizing the first letter of each group name in the comments, similar to how you've done for the 'dev' group:
-# dev is for "dev" module, not for development
+# Dev is for "dev" module, not for development
+# Json is for json-related optional dependencies
+# Multiprocessing is for multiprocessing-related optional dependencies
+# Serialization is for serialization-related optional dependencies
+# Task is for task-related optional dependencies
This minor change would improve readability and maintain consistency across the file.
src/monty/dev.py (1)

234-235: Add test coverage for new error handling.

The static analysis tool indicates that these new lines are not covered by tests. To ensure the robustness of this change and prevent potential regressions, it's important to add a test case that covers this scenario.

Consider adding a test case that attempts to call install_excepthook in an environment where IPython is not available, and verify that it raises the expected ImportError with the correct message.

🧰 Tools

🪛 GitHub Check: codecov/patch

[warning] 234-235: src/monty/dev.py#L234-L235
Added lines #L234 - L235 were not covered by tests

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 20e81bb and 8a4d33f.

📒 Files selected for processing (4)

.github/workflows/test.yml (1 hunks)
pyproject.toml (2 hunks)
src/monty/dev.py (1 hunks)
src/monty/io.py (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/monty/io.py

🧰 Additional context used

🪛 GitHub Check: codecov/patch

src/monty/dev.py

[warning] 234-235: src/monty/dev.py#L234-L235
Added lines #L234 - L235 were not covered by tests

🔇 Additional comments (5)

pyproject.toml (4)

22-23: LGTM: Dependencies updated as per PR objectives

The addition of "ruamel.yaml" as a required dependency and the version constraint on numpy (<2.0.0) are in line with the PR objectives and represent good dependency management practices.

29-33: LGTM: CI dependencies updated appropriately

The updates to the ci optional dependencies, including the addition of coverage tools and newer versions of pytest, are appropriate for maintaining robust testing and code quality practices.

110-110: LGTM: Ruff configuration updated for better import sorting

The addition of lint.isort.known-first-party = ["monty"] to the ruff configuration is a good practice. This will help maintain clean and consistent import statements throughout the project by correctly categorizing imports from the monty package as first-party imports.

Line range hint 1-110: Summary: Excellent improvements to dependency management

Overall, the changes to the pyproject.toml file are well-structured and align perfectly with the PR objectives. The updates improve dependency management by:

Adding required dependencies (ruamel.yaml and numpy) with appropriate version constraints.

Reorganizing optional dependencies into logical groups.

Updating CI dependencies for better testing and coverage.

Improving import sorting configuration.

These changes will significantly enhance the project's dependency management and make it easier for users to install only the dependencies they need. The minor suggestions provided in the review comments are for further refinement and consistency, but the core changes are solid and beneficial to the project.

src/monty/dev.py (1)

234-235: Improved error handling for missing IPython dependency.

The change from issuing a warning to raising an ImportError is a good improvement. It provides a clearer and more actionable message when the IPython dependency is missing. This aligns well with the PR's objective of enhancing dependency management.

However, there are two points to consider:

The function's docstring should be updated to reflect that it now may raise an ImportError.

A test should be added to cover this new error handling.

To ensure the docstring is updated, please run:

Would you like assistance in updating the docstring and creating a test for this new error handling?

🧰 Tools

🪛 GitHub Check: codecov/patch

[warning] 234-235: src/monty/dev.py#L234-L235
Added lines #L234 - L235 were not covered by tests

DanielYang59 added 2 commits October 20, 2024 12:36

fix typo and tweak warn msg

49cfde8

declare optional dependencies

9efb43c

DanielYang59 mentioned this pull request Oct 20, 2024

Lazily import torch/pydantic in json module, speedup from monty.json import by 10x #713

Merged

DanielYang59 added 2 commits October 20, 2024 12:47

remove seeming unused pymongo

cf8a260

sort dependencies by alphabet order

473dbc4

DanielYang59 force-pushed the declare-opt-dependency branch from a5fc5c4 to 473dbc4 Compare October 20, 2024 04:55

DanielYang59 added 4 commits October 20, 2024 12:57

add missing io

3449b4e

remove lzma as its builtin after python 3.3

6f7fdbd

tweak tqdm import guard

0c9d844

drop guard for from ruamel.yaml import YAML

78f16e5

DanielYang59 force-pushed the declare-opt-dependency branch from 5b9a8c8 to 172f796 Compare October 20, 2024 05:59

drop import guard for numpy

fe58998

DanielYang59 force-pushed the declare-opt-dependency branch from 172f796 to fe58998 Compare October 20, 2024 06:02

try to fix linting error, materialsvirtuallab#702

200da7e

DanielYang59 force-pushed the declare-opt-dependency branch from 91b5dc4 to 200da7e Compare October 20, 2024 06:08

pre-commit auto-fixes

f27b134

DanielYang59 mentioned this pull request Oct 20, 2024

fix TestClass name, make module level variable all capital and other code cleanup #702

Open

DanielYang59 added 2 commits October 20, 2024 14:14

also add opt in task.py

d18a134

suppress mypy error

20e81bb

DanielYang59 marked this pull request as ready for review October 20, 2024 06:23

DanielYang59 changed the title ~~Declare optional dependency~~ Declare required and optional dependency Oct 20, 2024

coderabbitai bot reviewed Oct 20, 2024

View reviewed changes

src/monty/io.py Show resolved Hide resolved

DanielYang59 added 3 commits October 20, 2024 14:40

tweak error msg when ipython not installed

d3cf365

remove lzma check

b558f21

remove type stub for orjson

b8e0f51

DanielYang59 commented Oct 20, 2024

View reviewed changes

DanielYang59 added 2 commits October 20, 2024 15:01

continue test upon failure

689de31

perhaps set orjson lower bound just in case

3edd4d5

remove necessary version as could be retrieved from install log

b89d7a9

Revert "remove necessary version as could be retrieved from install log"

8a4d33f

This reverts commit b89d7a9.

coderabbitai bot reviewed Oct 20, 2024

View reviewed changes

shyuep merged commit 189b6e6 into materialsvirtuallab:master Oct 21, 2024
9 of 10 checks passed

DanielYang59 deleted the declare-opt-dependency branch October 21, 2024 03:31

This was referenced Oct 21, 2024

Fix line ending handling in reverse_readfile/readline across OS, and not skipping empty lines #712

Merged

Drop Python 3.9 support #709

Merged

This was referenced Dec 8, 2024

zopen changes: forbid implicit binary/text mode, signature change, default UTF-8 encoding in text mode, drop .z support after one-year #730

Merged

Fix collections.ControlledDict handling of Iterators via update method #732

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Declare required and optional dependency #714

Declare required and optional dependency #714

DanielYang59 commented Oct 20, 2024 •

edited

Loading

codecov bot commented Oct 20, 2024 •

edited

Loading

coderabbitai bot commented Oct 20, 2024 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

DanielYang59 Oct 20, 2024 •

edited

Loading

DanielYang59 commented Oct 20, 2024 •

edited

Loading

coderabbitai bot left a comment

Declare required and optional dependency #714

Declare required and optional dependency #714

Conversation

DanielYang59 commented Oct 20, 2024 • edited Loading

Summary

codecov bot commented Oct 20, 2024 • edited Loading

Codecov Report

coderabbitai bot commented Oct 20, 2024 • edited Loading

Walkthrough

Changes

Possibly related PRs

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

DanielYang59 Oct 20, 2024 • edited Loading

Choose a reason for hiding this comment

DanielYang59 commented Oct 20, 2024 • edited Loading

coderabbitai bot left a comment

Choose a reason for hiding this comment

DanielYang59 commented Oct 20, 2024 •

edited

Loading

codecov bot commented Oct 20, 2024 •

edited

Loading

coderabbitai bot commented Oct 20, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

DanielYang59 Oct 20, 2024 •

edited

Loading

DanielYang59 commented Oct 20, 2024 •

edited

Loading