CI Fixups #997

TomAugspurger · 2024-07-20T22:34:57Z

numpy 2 compat
changed error message

TomAugspurger · 2024-07-20T22:37:53Z

There's still one error I haven't been able to fix:

================================================================================ test session starts ================================================================================
platform darwin -- Python 3.11.0, pytest-8.3.1, pluggy-1.5.0
rootdir: /Users/tom/gh/dask/dask-ml
configfile: pyproject.toml
plugins: cov-5.0.0, mock-3.14.0
collected 1 item

tests/test_incremental_pca.py F                                                                                                                                               [100%]

===================================================================================== FAILURES ======================================================================================
_______________________________________________________________________________ test_whitening[auto] ________________________________________________________________________________

svd_solver = 'auto'

    @pytest.mark.parametrize("svd_solver", ["full", "auto", "randomized"])
    @pytest.mark.filterwarnings("ignore:invalid value:RuntimeWarning")
    def test_whitening(svd_solver):
        # Test that PCA and IncrementalPCA transforms match to sign flip.
        X = datasets.make_low_rank_matrix(
            1000, 10, tail_strength=0.0, effective_rank=2, random_state=1999
        )
        X = da.from_array(X, chunks=[200, -1])
        prec = 3
        n_samples, n_features = X.shape
        for nc in [None, 9]:
            pca = PCA(whiten=True, n_components=nc, svd_solver=svd_solver).fit(X.compute())
            ipca = IncrementalPCA(
                whiten=True, n_components=nc, batch_size=250, svd_solver=svd_solver
            ).fit(X)

            Xt_pca = pca.transform(X)
            Xt_ipca = ipca.transform(X)
>           assert_almost_equal(np.abs(Xt_pca), np.abs(Xt_ipca), decimal=prec)

tests/test_incremental_pca.py:454:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../mambaforge/envs/python=3.11/lib/python3.11/contextlib.py:81: in inner
    return func(*args, **kwds)
../../../mambaforge/envs/python=3.11/lib/python3.11/contextlib.py:81: in inner
    return func(*args, **kwds)
.direnv/python-3.11/lib/python3.11/site-packages/numpy/_utils/__init__.py:85: in wrapper
    return fun(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<function assert_array_almost_equal.<locals>.compare at 0x123d02160>, array([[3.46374689e-01, 6.42854227e-01, 1.28803...2.04242514e+05]]), dask.array<absolute, shape=(1000, 10), dtype=float64, chunksize=(200, 10), chunktype=numpy.ndarray>)
kwds = {'err_msg': '', 'header': 'Arrays are not almost equal to 3 decimals', 'precision': 3, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError:
E           Arrays are not almost equal to 3 decimals
E
E           Mismatched elements: 1430 / 10000 (14.3%)
E           Max absolute difference among violations: 874440.31622524
E           Max relative difference among violations: 14845029.47333545
E            ACTUAL: array([[3.464e-01, 6.429e-01, 1.288e+00, ..., 8.527e-01, 4.654e-01,
E                   2.602e+05],
E                  [9.195e-02, 6.557e-01, 1.029e+00, ..., 8.861e-01, 3.697e-01,...
E            DESIRED: array([[0.346, 0.643, 1.288, ..., 0.853, 0.464, 1.238],
E                  [0.092, 0.656, 1.029, ..., 0.886, 0.369, 0.19 ],
E                  [0.092, 1.329, 1.784, ..., 0.104, 0.395, 0.606],...

../../../mambaforge/envs/python=3.11/lib/python3.11/contextlib.py:81: AssertionError
============================================================================== short test summary info ==============================================================================
FAILED tests/test_incremental_pca.py::test_whitening[auto] - AssertionError:
================================================================================= 1 failed in 0.42s =================================================================================

The only thing I've found so far is that the components_ differ when whiten=True.
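One way to narrow this down is to compare the fitted components row by row, ignoring sign flips. This is a minimal sketch that assumes only NumPy; `differing_components` is a hypothetical helper, not part of dask-ml or scikit-learn.

```python
import numpy as np


def differing_components(a, b, atol=1e-3):
    """Return indices of components (rows) that differ beyond a sign flip.

    `a` and `b` are 2-D arrays shaped like `components_`
    (n_components, n_features). Hypothetical helper for debugging only.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Compare absolute values so that a flipped sign does not count as a mismatch.
    worst_per_row = np.abs(np.abs(a) - np.abs(b)).max(axis=1)
    return np.flatnonzero(worst_per_row > atol)


# Toy check: the first two rows differ only by sign, the third is a different vector.
reference = np.eye(3)
candidate = -np.eye(3)
candidate[2] = [0.6, 0.8, 0.0]
mismatched = differing_components(reference, candidate)
```

With the fitted estimators from the test, the call would be `differing_components(pca.components_, ipca.components_)`; comparing `explained_variance_` the same way may show whether the mismatch is confined to the near-zero-variance directions that whitening rescales.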

TomAugspurger · 2024-07-21T15:08:16Z

cc @fujiisoup in case you have a chance to look (no worries if not)

fujiisoup · 2024-07-21T17:50:39Z

Hi @TomAugspurger

Do you know when the test started failing?
This PR does not seem to be related.

fujiisoup · 2024-07-22T00:44:02Z

I investigated, and it seems like an upstream issue. I raised an issue [there](https://github.com/scikit-learn/scikit-learn/issues/29534).

With numpy==2.0, it seems that sklearn.decomposition.PCA is unstable, sometimes giving strange values.
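A rough sketch of how this could be checked without dask, assuming scikit-learn is installed. It reuses the matrix from the failing test and compares `svd_solver="full"` against `"auto"`; which solver `"auto"` selects depends on the scikit-learn version, and no particular output is implied here.

```python
import numpy as np
from sklearn import datasets
from sklearn.decomposition import PCA

# Same low-rank matrix as in test_whitening.
X = datasets.make_low_rank_matrix(
    1000, 10, tail_strength=0.0, effective_rank=2, random_state=1999
)

transformed = {}
for solver in ["full", "auto"]:
    pca = PCA(whiten=True, n_components=None, svd_solver=solver).fit(X)
    # Whitening divides by sqrt(explained_variance_), so near-zero variances
    # can produce very large or non-finite values; silence those warnings.
    with np.errstate(all="ignore"):
        transformed[solver] = np.abs(pca.transform(X))

# Largest absolute gap per output column between the two solvers.
with np.errstate(all="ignore"):
    per_column_gap = np.nanmax(
        np.abs(transformed["full"] - transformed["auto"]), axis=0
    )
print(per_column_gap)
```

A large gap confined to the last columns would point at the smallest-variance components rather than at dask-ml's IncrementalPCA.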

TomAugspurger · 2024-07-22T12:06:07Z

Thanks for looking into it. I've subscribed to the upstream issue in scikit-learn and will skip or adjust this test as needed with NumPy 2.0.
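For the "skip or adjust" part, one option is to mark only the affected parametrization as an expected failure under NumPy 2. This is a sketch under assumptions: the names `NUMPY_GE_2` and `xfail_numpy2` are made up for illustration, and the test body is elided.

```python
import numpy as np
import pytest

# True when the installed NumPy is 2.0 or newer.
NUMPY_GE_2 = np.lib.NumpyVersion(np.__version__) >= "2.0.0"

# Non-strict xfail: the test may pass again once the upstream issue is resolved.
xfail_numpy2 = pytest.mark.xfail(
    NUMPY_GE_2,
    reason="sklearn PCA is less stable with numpy 2 (scikit-learn#29534)",
    strict=False,
)


@pytest.mark.parametrize(
    "svd_solver",
    ["full", pytest.param("auto", marks=xfail_numpy2), "randomized"],
)
@pytest.mark.filterwarnings("ignore:invalid value:RuntimeWarning")
def test_whitening(svd_solver):
    # The existing test body would go here unchanged.
    assert svd_solver in {"full", "auto", "randomized"}
```

Using `xfail` rather than `skip` keeps the test running, so a fix upstream shows up as an unexpected pass instead of going unnoticed.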

numpy 2 compat

2931bec

This was referenced Jul 21, 2024

Fixing incremental_pca #998

Closed

decomposition.PCA(svd_solver='covariance_eigh') is less stable with numpy==2.0 scikit-learn/scikit-learn#29534

Open
