Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Numpy >= 2.0.0 and Pandas >= 2.0.0 #65

Merged

Conversation

mkopec87
Copy link
Contributor

@mkopec87 mkopec87 commented Dec 6, 2024

  • Support numpy >= 2.0.0 (np.string_ -> np.bytes_).
  • Fix test utility corner case error (test_numpy.twosigfigs function).
  • Fix typo in build pipeline Python versions config list.
  • aaand other stuff I'm discovering along the way ...

@mkopec87 mkopec87 mentioned this pull request Dec 6, 2024
@mkopec87
Copy link
Contributor Author

mkopec87 commented Dec 6, 2024

I struggle with 4 tests failing already on master (using numpy 1.26.4 and pandas 1.5.3):

FAILED tests/test_numpy.py::TestNumpy::testSparselyBin - AssertionError:
FAILED tests/test_numpy.py::TestNumpy::testSparselyBinAverage - AssertionError:
FAILED tests/test_numpy.py::TestNumpy::testSparselyBinDeviate - AssertionError:
FAILED tests/test_numpy.py::TestNumpy::testSparselyBinTrans - AssertionError:

I've tried debugging one of those and it seems there is a difference between py and np versions in the bin "0":

        hnpj = json.dumps(hnp.toJson(), sort_keys=True)
        hpyj = json.dumps(hpy.toJson(), sort_keys=True)
    
        if Factory.fromJson(hnp.toJson()) != Factory.fromJson(hpy.toJson()):
>           raise AssertionError("\n numpy: {0}\npython: {1}".format(hnpj, hpyj))

hnpj has "0": 481.0, while hpyj has "0" : 383.0.

Maybe someone from the project could help with those?

@mbaak
Copy link
Contributor

mbaak commented Dec 7, 2024

Hi Mateusz, I made this PR that fixes the histogrammar unit tests for numpy<2.0.0 and pandas<2.0.0:
#66
They were minor issues only. I asked Simon to do a short review, or perhaps you can have a quick look.

For your PR, once this one is merged can you please rebase? Once your PR is merged in I will switch to pyproject.toml, then release the new patch version, and then switch to upgrading popmon.

Btw, thanks for the work you already put in, I appreciate it.

@mkopec87
Copy link
Contributor Author

mkopec87 commented Dec 8, 2024

Thanks @mbaak !

Shall I include version bump / changelog / README updates in my PR or this will be part of pyproject.toml/release version PR ?

@mbaak
Copy link
Contributor

mbaak commented Dec 8, 2024

I thought of adding my (small) fixes to the changelog & README after your fixes are in, so for the new release.
(But you can add it too if you want.)

@mkopec87 mkopec87 changed the title Support Numpy >= 2.0.0 and fix tests Support Numpy >= 2.0.0 and Pandas >= 2.0.0 Dec 9, 2024
Mateusz Kopeć added 6 commits December 9, 2024 11:09
- Fix test_numpy.twosigfigs method corner-case
- Sort json keys for assertion error message
- Refactor tests with Numpy() and Pandas() context managers
- Remove pd.util.testing.makeMixedDataFrame not available
- Drop pandas and numpy version constraints
- Remove unused twosigfigs method in test_gpu.py
- Add test output files from historgrammar/notebooks to .gitignore
@mkopec87 mkopec87 force-pushed the fix/failing-tests-and-numpy-update branch from b63ca26 to d7a90c1 Compare December 9, 2024 10:28
@mkopec87
Copy link
Contributor Author

mkopec87 commented Dec 9, 2024

Still getting one error in the tests:

self = <tests.test_numpy.TestPandas testMethod=test_n_bins>

    def test_n_bins(self):
        """ Test getting the number of allocated bins
        """
        with Pandas() as pd, Numpy() as np:  # noqa
            if pd is None or np is None:
                return
            sys.stderr.write("\n")
    
            df, hist1, hist2, hist3 = get_test_histograms1()
    
            assert hist1.n_bins == 5
            assert hist2.n_bins == 5
>           assert hist3.n_bins == 7
E           assert 5 == 7
E            +  where 5 = <SparselyBin binWidth=86400000000000.0 bins=Bin nanflow=Count>.n_bins

test_numpy.py:714: AssertionError

This test was not really tested in master because of error in Pandas() context manager.
With numpy < 2 and pandas < 2 it gives the same error:
FAILED tests/test_numpy.py::TestPandas::test_n_bins - assert 5 == 7

@mbaak
Copy link
Contributor

mbaak commented Dec 9, 2024

Hi Mateusz, thanks, good catch!
Indeed the correct number is 5. Tbh don't know why it was ever set differently, the histogram only has 5 entries.
(I confirm that fixing the Pandas() class catches and confirms the error.)
Can you fix it in your PR?

@mkopec87 mkopec87 marked this pull request as ready for review December 10, 2024 05:55
@mkopec87
Copy link
Contributor Author

Fixed the test as you suggested, ready for review :)

histogrammar/util.py Outdated Show resolved Hide resolved
@mbaak mbaak merged commit f0b9f72 into histogrammar:master Dec 10, 2024
4 checks passed
@mkopec87 mkopec87 deleted the fix/failing-tests-and-numpy-update branch December 11, 2024 04:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants