KMeans is slow on gpu #1444

fcharras · 2023-09-06T17:29:21Z

The following snippet

import numpy as np
import sklearn

device = "cpu"
# device = "gpu:0"

# Patch scikit-learn before importing KMeans so the accelerated class is picked up.
from sklearnex import patch_sklearn
patch_sklearn()
sklearn.set_config(target_offload=device)
from sklearn.cluster import KMeans

seed = 123
rng = np.random.default_rng(seed)

n_samples = 50_000_000
dim = 14
n_clusters = 127

data = rng.random((n_samples, dim), dtype=np.float32)
init = rng.random((n_clusters, dim), dtype=np.float32)

kmeans = KMeans(n_clusters=n_clusters, algorithm="lloyd", init=init, max_iter=100, tol=0, n_init=1)
%time kmeans.fit(data)

shows for device="cpu":

Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
CPU times: user 8min 24s, sys: 4.31 s, total: 8min 28s
Wall time: 8.76 s

(CPU with 224 cores)

and when device="gpu:0" (running on a Max Series GPU) it's very slow (it has been running for several minutes now and has not finished yet). On 100x less data it completes in about 4.5 s; extrapolating from that, the wall time would be almost an hour.

We show with the implementation provided in the sklearn-numba-dpex project that this amount of data can be processed in less than 10 s on a Max Series GPU too.
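
As a rough reading of the CPU timing above: dividing the total CPU time by the wall time gives the average number of cores kept busy during the fit. The sketch below only redoes that arithmetic with the figures quoted in the output; the variable names are illustrative and nothing here is measured anew.

```python
# Figures copied from the %time output quoted above.
cpu_total_s = 8 * 60 + 28   # "total: 8min 28s" of CPU time, in seconds
wall_s = 8.76               # "Wall time: 8.76 s"
n_cores = 224               # core count reported for the machine

# Average number of cores kept busy over the run.
avg_busy_cores = cpu_total_s / wall_s
# Fraction of the machine that this corresponds to.
utilization = avg_busy_cores / n_cores

print(f"average busy cores: {avg_busy_cores:.0f}")
print(f"utilization: {utilization:.0%}")
```

That works out to roughly 58 busy cores on average, about a quarter of the 224 available, so the 8.76 s CPU figure is itself not a fully saturated run.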

Environment:

Linux 5.15 kernel
conda installation of scikit-learn-intelex and dpcpp-cpp-rt with -c conda channel


ogrisel · 2023-09-06T17:34:40Z

For reference, see the benchmark in the https://github.com/soda-inria/sklearn-numba-dpex repo.

Also note that we already notified @samir-nasibli about this problem but we decided to open a dedicated issue to track its resolution transparently.

samir-nasibli · 2023-09-07T09:29:14Z

Hi @ogrisel, thank you for this report as well!
Could you please confirm that you have the SKLEARNEX_PREVIEW env variable enabled?
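
A quick way to check this from inside the Python session, as a sketch: it only reads the process environment and does not call into sklearnex. The helper name `preview_flag` is hypothetical, and the sketch assumes the variable has to be present before `patch_sklearn()` is called, which this thread does not confirm.

```python
import os

def preview_flag(environ=os.environ):
    """Return the value of SKLEARNEX_PREVIEW, or None if it is not set."""
    return environ.get("SKLEARNEX_PREVIEW")

value = preview_flag()
if value is None:
    print("SKLEARNEX_PREVIEW is not set")
else:
    print(f"SKLEARNEX_PREVIEW={value!r}")
```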

ogrisel · 2023-09-07T10:30:10Z

Let me try with:

export SKLEARNEX_PREVIEW=YES

If I run the above reproducer with the cpu device I get:

AttributeError                            Traceback (most recent call last)
File <timed eval>:1

File ~/mambaforge/envs/intel/lib/python3.10/site-packages/sklearnex/preview/cluster/k_means.py:192, in KMeans.fit(self, X, y, sample_weight)
    189 if sklearn_check_version("1.2"):
    190     self._validate_params()
--> 192 dispatch(self, 'fit', {
    193     'onedal': self.__class__._onedal_fit,
    194     'sklearn': sklearn_KMeans.fit,
    195 }, X, y, sample_weight)
    197 return self

File ~/mambaforge/envs/intel/lib/python3.10/site-packages/sklearnex/_device_offload.py:161, in dispatch(obj, method_name, branches, *args, **kwargs)
    158 backend, q, cpu_fallback = _get_backend(obj, q, method_name, *hostargs)
    160 if backend == 'onedal':
--> 161     return branches[backend](obj, *hostargs, **hostkwargs, queue=q)
    162 if backend == 'sklearn':
    163     return branches[backend](obj, *hostargs, **hostkwargs)

File ~/mambaforge/envs/intel/lib/python3.10/site-packages/sklearnex/preview/cluster/k_means.py:216, in KMeans._onedal_fit(self, X, _, sample_weight, queue)
    213 self._initialize_onedal_estimator()
    214 self._onedal_estimator.fit(X, queue=queue)
--> 216 self._save_attributes()

File ~/mambaforge/envs/intel/lib/python3.10/site-packages/sklearnex/preview/cluster/_common.py:70, in BaseKMeans._save_attributes(self)
     68 self._labels_ = self._onedal_estimator.labels_
     69 self._inertia_ = self._onedal_estimator.inertia_
---> 70 self._algorithm = self._onedal_estimator._algorithm
     71 self._cluster_centers_ = self._onedal_estimator.cluster_centers_
     72 self._sparse = False

AttributeError: 'KMeans' object has no attribute '_algorithm'

Then with the gpu:0 device, the code still breaks as follows, but only after waiting for approximately one minute (or more):

AttributeError                            Traceback (most recent call last)
File <timed eval>:1

File ~/mambaforge/envs/intel/lib/python3.10/site-packages/sklearnex/preview/cluster/k_means.py:192, in KMeans.fit(self, X, y, sample_weight)
    189 if sklearn_check_version("1.2"):
    190     self._validate_params()
--> 192 dispatch(self, 'fit', {
    193     'onedal': self.__class__._onedal_fit,
    194     'sklearn': sklearn_KMeans.fit,
    195 }, X, y, sample_weight)
    197 return self

File ~/mambaforge/envs/intel/lib/python3.10/site-packages/sklearnex/_device_offload.py:161, in dispatch(obj, method_name, branches, *args, **kwargs)
    158 backend, q, cpu_fallback = _get_backend(obj, q, method_name, *hostargs)
    160 if backend == 'onedal':
--> 161     return branches[backend](obj, *hostargs, **hostkwargs, queue=q)
    162 if backend == 'sklearn':
    163     return branches[backend](obj, *hostargs, **hostkwargs)

File ~/mambaforge/envs/intel/lib/python3.10/site-packages/sklearnex/preview/cluster/k_means.py:216, in KMeans._onedal_fit(self, X, _, sample_weight, queue)
    213 self._initialize_onedal_estimator()
    214 self._onedal_estimator.fit(X, queue=queue)
--> 216 self._save_attributes()

File ~/mambaforge/envs/intel/lib/python3.10/site-packages/sklearnex/preview/cluster/_common.py:70, in BaseKMeans._save_attributes(self)
     68 self._labels_ = self._onedal_estimator.labels_
     69 self._inertia_ = self._onedal_estimator.inertia_
---> 70 self._algorithm = self._onedal_estimator._algorithm
     71 self._cluster_centers_ = self._onedal_estimator.cluster_centers_
     72 self._sparse = False

AttributeError: 'KMeans' object has no attribute '_algorithm'

Versions:

daal4py                   2023.2.1         py310_intel_32    intel
dal                       2023.2.1               intel_32    intel
scikit-learn-intelex      2023.2.1         py310_intel_32    intel

samir-nasibli · 2023-09-07T14:54:48Z

AttributeError: 'KMeans' object has no attribute '_algorithm'

Already fixed and available with 2024.0.
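
The thread does not show what the 2024.0 fix looks like. Purely as an illustration of the failure pattern in the traceback (a wrapper copying an attribute that the backend object never set), here is a sketch with hypothetical stand-in classes. None of this is sklearnex code, and the `getattr` fallback is just one common way to tolerate a missing attribute, not necessarily what the maintainers did.

```python
class BackendEstimator:
    """Hypothetical stand-in for the backend estimator; it never sets `_algorithm`."""

    def __init__(self):
        self.labels_ = [0, 1]
        self.inertia_ = 0.0


class Wrapper:
    """Hypothetical stand-in for the wrapper that copies fitted attributes."""

    def __init__(self):
        self._backend = BackendEstimator()

    def save_attributes_strict(self):
        # Same shape as the failing line in the traceback: unconditional access.
        self._algorithm = self._backend._algorithm

    def save_attributes_tolerant(self, default="lloyd"):
        # Falls back to a default when the backend lacks the attribute.
        self._algorithm = getattr(self._backend, "_algorithm", default)


w = Wrapper()
try:
    w.save_attributes_strict()
    strict_failed = False
except AttributeError as exc:
    strict_failed = True
    print(exc)

w.save_attributes_tolerant()
print(w._algorithm)
```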

napetrov · 2023-09-07T14:58:12Z

But it will not be available for quite a while, and we are currently going through the integration and a bunch of changes, so even building from source would currently be painful. We can share an internal build in 2-3 weeks.

samir-nasibli · 2023-09-07T19:34:29Z

@napetrov Many thanks for the clarification!
@ogrisel I have already reproduced the issues and am trying to investigate them. I will update as soon as possible.

fcharras · 2023-10-12T09:58:04Z

I am a bit confused by performance in the latest releases: should we look at scikit-learn-intelex==2023.2.1 or scikit-learn-intelex==20230725.122141? The latter seems more recent, but its versioning scheme differs from the usual official oneAPI versioning, and I suspect there is a performance regression for KMeans on CPU. See this benchmark table: I recorded 50 s wall time on a big dataset for 100 Lloyd iterations with random initialization for scikit-learn-intelex==20230725.122141, but only 18 s for an identical benchmark with scikit-learn-intelex==2023.2.1, except that the latter should be slower since its wall time now includes k-means++ initialization.

fcharras · 2023-10-12T12:48:54Z

Re-ran the scikit-learn-intelex kmeans benchmark while carefully installing scikit-learn-intelex==2023.2.1 from pip, it looks much better now 🤔 , the 20230725 comes from conda apparently. (the sheet will be synchronized in a few minutes)
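
When pip and conda builds can end up in the same environment, it can help to print which version Python actually resolves before benchmarking. A minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and it assumes the packages expose standard distribution metadata under the names shown.

```python
from importlib import metadata

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

# Distribution names taken from the version listing earlier in the thread.
for name in ("scikit-learn-intelex", "daal4py"):
    print(name, installed_version(name))
```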

ethanglaser · 2024-01-25T00:30:26Z

> Re-ran the scikit-learn-intelex kmeans benchmark while carefully installing scikit-learn-intelex==2023.2.1 from pip, it looks much better now 🤔 , the 20230725 comes from conda apparently. (the sheet will be synchronized in a few minutes)

@fcharras Based on this result, can the issue be closed now? Also FYI that once #1634 is merged, KMeans will be out of preview and the SKLEARNEX_PREVIEW env variable will no longer be necessary.

ogrisel · 2024-02-01T13:10:32Z

> Re-ran the scikit-learn-intelex kmeans benchmark while carefully installing scikit-learn-intelex==2023.2.1 from pip, it looks much better now 🤔 , the 20230725 comes from conda apparently. (the sheet will be synchronized in a few minutes)

Was this about the perf regression observed on CPU or the new run on GPU? It does not seem that you ran it on GPU based on the results.

fcharras added the bug Something isn't working label Sep 6, 2023

fcharras changed the title ~~KMeans slow on gpu~~ KMeans is slow on gpu Sep 6, 2023

napetrov assigned KulikovNikita Sep 6, 2023

samir-nasibli assigned ethanglaser and unassigned KulikovNikita Feb 1, 2024

ethanglaser added enhancement New feature or request and removed bug Something isn't working labels Feb 8, 2024
