Long delay of suggestion list update with scikit-learn #4016

wangtz0607 · 2023-02-28T02:51:15Z

Environment data

Language Server version: v2023.2.40
OS and version: Windows 11 22H2 64-bit
Python version (& distribution if applicable, e.g. Anaconda): 3.11.1

Code Snippet

from sklearn import linear_model

model = linear_model.LinearRegression()
model.fit

scikit-learn==1.2.1

Expected behavior

The suggestion list updates immediately as I type.

Actual behavior

Logs

https://paste.ubuntu.com/p/wf9HZJdckR/

The text was updated successfully, but these errors were encountered:

wangtz0607 · 2023-02-28T03:01:40Z

Also reproduced on:

Language Server version: v2023.2.40
OS and version: Ubuntu 20.04 focal (x86_64 Linux 5.15.0-60-generic)
Python version (& distribution if applicable, e.g. Anaconda): 3.8.10

heejaechang · 2023-02-28T03:19:37Z

I looked the log. the delay it shows is expected. most of work it was doing was parsing/binding. that are one time cost per a file when they are used the first time. your particular code touched a lot of files transitively (a lot of nested imports inside of the sklearn) so all those files are needed to be read/parsed and bound.

but once that are done, things are fast. you can confirm that, by typing this

from sklearn import linear_model

model = linear_model.LinearRegression()

model.f

and wait until completion shows up like you did in the repro steps.

and then delete everything and retype again. the second time should be fast since all necessary files are already brought in and cached.

wangtz0607 · 2023-02-28T03:40:52Z

Unfortunately that's not the case:

heejaechang · 2023-02-28T06:20:50Z

if it still happens when you retype, please provide us logs for the retype case as well so we can take a look what is taking the time.

heejaechang · 2023-02-28T06:36:52Z

I reproed. it wasn't actually completion that is slow. it was getting tooltip for some particular items in the completion taking long time when it is selected by default when completion was shown. such as coef_ for this particular case. or fit and etc.

if the item selected by default happen to be a cheap one to calculate tooltip such as copy_X then it would have been fast.

anyway, now taking a look to see why those are taking long time (and see why cancellation didn't kick in)

wangtz0607 · 2023-02-28T07:14:51Z

In case someone still need the logs: https://paste.ubuntu.com/p/DpGqMQHW5j/

heejaechang · 2023-02-28T07:40:12Z

thank you @wangtz0607

@erictraut can you take a look? this is a similar issue with #3978

from sklearn import linear_model

model = linear_model.LinearRegression()
model. <= completion here

repro step is opening a completion at model. and select item such as coef_ or fit so we get resolveCompletionItem request for the symbol.

I attached the log below that shows code flow when we try getting the tooltip.

(12832) [FG] completion at c:\pyTemp\junk\test1.py:4:6 ...
(12832) [FG]   runWithCancellationToken ...
(12832) [FG]     parsing: c:\pyTemp\junk\test1.py (0ms)
(12832) [FG]     binding: c:\pyTemp\junk\test1.py (0ms)
(12832) [FG]   runWithCancellationToken (19ms)
(12832) [FG] completion at c:\pyTemp\junk\test1.py:4:6 [found 52 items] (19ms)
(12832) [FG] runWithCancellationToken ...
(12832) [FG]   getEffectiveTypeOfSymbol ...
(12832) [FG]     isAfterNodeReachable ...
(12832) [FG]       isAfterNodeReachable ...
(12832) [FG]         getTypeOfExpression ...
(12832) [FG]           _getFunctionInferredReturnType [Function 'getdtype' (scipy.sparse._sputils)] (57ms)
(12832) [FG]           assignType [OverloadedFunction ["Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)"]] (61ms)
(12832) [FG]           getTypeOfExpression ...
(12832) [FG]             _getFunctionInferredReturnType ...
(12832) [FG]               isAfterNodeReachable ...
(12832) [FG]                 getTypeOfExpression ...
(12832) [FG]                   assignType [OverloadedFunction ["Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)"]] (52ms)
(12832) [FG]                   getTypeOfExpression ...
(12832) [FG]                     evaluateTypeForSubnode ...
(12832) [FG]                       evaluateTypesForStatement ["arr" (scipy.sparse._sputils) [183:9]] (359ms)
(12832) [FG]                     evaluateTypeForSubnode ["arr" (scipy.sparse._sputils) [183:9]] (359ms)
(12832) [FG]                   getTypeOfExpression ["arr.max" (scipy.sparse._sputils) [190:30]] (362ms)
(12832) [FG]                 getTypeOfExpression ["arr.min" (scipy.sparse._sputils) [191:30]] (550ms)
(12832) [FG]               isAfterNodeReachable [function '"get_index_dtype"  [147:5]' (scipy.sparse._sputils) [147:1]] (550ms)
(12832) [FG]               getTypeOfExpression ...
(12832) [FG]                 evaluateTypeForSubnode ...
(12832) [FG]                   evaluateTypesForStatement ...
(12832) [FG]                     evaluateTypeForSubnode ...
(12832) [FG]                       evaluateTypesForStatement ...
(12832) [FG]                         getTypeOfExpression ...
(12832) [FG]                           _getFunctionInferredReturnType [Function 'to_native' (scipy.sparse._sputils)] (979ms)
(12832) [FG]                         getTypeOfExpression ["self.data.astype" (scipy.sparse._coo) [195:25]] (2092ms)
[Info  - 11:16:10 PM] (12832) [FG] Long operation: getTypeOfExpression (2092ms)
(12832) [FG]                       evaluateTypesForStatement ["mask" (scipy.sparse._coo) [554:9]] (2096ms)
[Info  - 11:16:10 PM] (12832) [FG] Long operation: evaluateTypesForStatement (2096ms)
(12832) [FG]                     evaluateTypeForSubnode ["mask" (scipy.sparse._coo) [554:9]] (2096ms)
[Info  - 11:16:10 PM] (12832) [FG] Long operation: evaluateTypeForSubnode (2096ms)
(12832) [FG]                   evaluateTypesForStatement ["summed" (scipy.sparse._coo) [528:9]] (2142ms)
[Info  - 11:16:10 PM] (12832) [FG] Long operation: evaluateTypesForStatement (2142ms)
(12832) [FG]                 evaluateTypeForSubnode ["summed" (scipy.sparse._coo) [528:9]] (2142ms)
[Info  - 11:16:10 PM] (12832) [FG] Long operation: evaluateTypeForSubnode (2142ms)
(12832) [FG]               getTypeOfExpression ["mask.all" (scipy.sparse._coo) [311:20]] (2579ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: getTypeOfExpression (2579ms)
(12832) [FG]             _getFunctionInferredReturnType [Function 'todia' (scipy.sparse._coo)] (3383ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: _getFunctionInferredReturnType (3383ms)
(12832) [FG]           getTypeOfExpression ["self._coo_container(arg1, dtyp <shortened> " (scipy.sparse._dia) [135:17]] (3385ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: getTypeOfExpression (3385ms)
(12832) [FG]         getTypeOfExpression ["self.data.astype" (scipy.sparse._dia) [141:25]] (3803ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: getTypeOfExpression (3803ms)
(12832) [FG]       isAfterNodeReachable [function '"__init__"  [89:9]' (scipy.sparse._dia) [89:5]] (3818ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: isAfterNodeReachable (3818ms)
(12832) [FG]     isAfterNodeReachable [function '"_rescale_data"  [294:5]' (sklearn.linear_model._base) [294:1]] (3828ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: isAfterNodeReachable (3828ms)
(12832) [FG]     isAfterNodeReachable ...
(12832) [FG]       getTypeOfExpression ...
(12832) [FG]         _getFunctionInferredReturnType ...
(12832) [FG]           getTypeOfExpression ...
(12832) [FG]             getTypeOfExpression ...
(12832) [FG]               getTypeOfExpression ...
(12832) [FG]                 getTypeOfExpression ...
(12832) [FG]                   getTypeOfExpression ...
(12832) [FG]                     getTypeOfExpression ...
(12832) [FG]                       getTypeOfExpression ...
(12832) [FG]                         getTypeOfExpression ...
(12832) [FG]                           getTypeOfExpression ...
(12832) [FG]                             _getFunctionInferredReturnType ...
(12832) [FG]                               getTypeOfExpression ...
(12832) [FG]                                 _getFunctionInferredReturnType [Function 'configure' (joblib._parallel_backends)] (54ms)
(12832) [FG]                               getTypeOfExpression ["self._backend.configure" (joblib.parallel) [775:22]] (61ms)
(12832) [FG]                             _getFunctionInferredReturnType [Function '_initialize_backend' (joblib.parallel)] (75ms)
(12832) [FG]                           getTypeOfExpression ["self._initialize_backend" (joblib.parallel) [1008:22]] (75ms)
(12832) [FG]                         getTypeOfExpression ["self._print" (joblib.parallel) [1041:9]] (76ms)
(12832) [FG]                       getTypeOfExpression ["self._backend.start_call" (joblib.parallel) [1044:13]] (77ms)
(12832) [FG]                     getTypeOfExpression ["self.dispatch_one_batch" (joblib.parallel) [1085:16]] (117ms)
(12832) [FG]                   getTypeOfExpression ["self.dispatch_one_batch" (joblib.parallel) [1088:19]] (117ms)
(12832) [FG]                 getTypeOfExpression ["self._backend.retrieval_contex <shortened> " (joblib.parallel) [1097:18]] (118ms)
(12832) [FG]               getTypeOfExpression ["self._print" (joblib.parallel) [1101:13]] (118ms)
(12832) [FG]             getTypeOfExpression ["self._backend.stop_call" (joblib.parallel) [1106:17]] (118ms)
(12832) [FG]           getTypeOfExpression ["self._terminate_backend" (joblib.parallel) [1108:17]] (119ms)
(12832) [FG]         _getFunctionInferredReturnType [Function '__call__' (joblib.parallel)] (120ms)
(12832) [FG]       getTypeOfExpression ["super().__call__" (sklearn.utils.parallel) [63:16]] (120ms)
(12832) [FG]     isAfterNodeReachable [function '"__call__"  [40:9]' (sklearn.utils.parallel) [40:5]] (123ms)
(12832) [FG]     isAfterNodeReachable ...
(12832) [FG]       getTypeOfExpression ...
(12832) [FG]         _getFunctionInferredReturnType [Function 'aslinearoperator' (scipy.sparse.linalg._interface)] (1469ms)
(12832) [FG]         _getFunctionInferredReturnType ...
(12832) [FG]           getTypeOfExpression ...
(12832) [FG]             _getFunctionInferredReturnType ...
(12832) [FG]               getTypeOfExpression ["np.asarray(data, dtype=dtype). <shortened> " (scipy.sparse._sputils) [383:12]] (125ms)
(12832) [FG]             _getFunctionInferredReturnType [Function 'asmatrix' (scipy.sparse._sputils)] (228ms)
(12832) [FG]           getTypeOfExpression ["y.reshape" (scipy.sparse.linalg._interface) [287:17]] (232ms)
(12832) [FG]         _getFunctionInferredReturnType [Function 'rmatvec' (scipy.sparse.linalg._interface)] (233ms)
(12832) [FG]         _getFunctionInferredReturnType ...
(12832) [FG]           getTypeOfExpression ...
(12832) [FG]             _getFunctionInferredReturnType ...
(12832) [FG]               getTypeOfExpression ...
(12832) [FG]                 _getFunctionInferredReturnType ...
(12832) [FG]                   _getFunctionInferredReturnType [Function 'asmatrix' (scipy.sparse._sputils)] (91ms)
(12832) [FG]                 _getFunctionInferredReturnType [Function 'matmat' (scipy.sparse.linalg._interface)] (94ms)
(12832) [FG]               getTypeOfExpression ["self.matmat" (scipy.sparse.linalg._interface) [199:16]] (95ms)
(12832) [FG]             _getFunctionInferredReturnType [Function '_matvec' (scipy.sparse.linalg._interface)] (95ms)
(12832) [FG]             _getFunctionInferredReturnType [Function 'asmatrix' (scipy.sparse._sputils)] (142ms)
(12832) [FG]           getTypeOfExpression ["y.reshape" (scipy.sparse.linalg._interface) [240:17]] (577ms)
(12832) [FG]         _getFunctionInferredReturnType [Function 'matvec' (scipy.sparse.linalg._interface)] (577ms)
(12832) [FG]         _getFunctionInferredReturnType ...
(12832) [FG]           evaluateTypeForSubnode ...
(12832) [FG]             evaluateTypesForStatement ...
(12832) [FG]               evaluateTypeForSubnode ...
(12832) [FG]                 evaluateTypesForStatement ["y" (scipy.sparse.linalg._interface) [237:13]] (266ms)
(12832) [FG]               evaluateTypeForSubnode ["y" (scipy.sparse.linalg._interface) [237:13]] (266ms)
(12832) [FG]             evaluateTypesForStatement ["y" (scipy.sparse.linalg._interface) [240:13]] (268ms)
(12832) [FG]           evaluateTypeForSubnode ["y" (scipy.sparse.linalg._interface) [240:13]] (268ms)
(12832) [FG]         _getFunctionInferredReturnType [Function 'matvec' (scipy.sparse.linalg._interface)] (269ms)
(12832) [FG]       getTypeOfExpression ["v.copy" (scipy.sparse.linalg._isolve.lsqr) [392:9]] (2565ms)
[Info  - 11:16:14 PM] (12832) [FG] Long operation: getTypeOfExpression (2565ms)
(12832) [FG]     isAfterNodeReachable [function '"lsqr"  [96:5]' (scipy.sparse.linalg._isolve.lsqr) [96:1]] (2565ms)
[Info  - 11:16:14 PM] (12832) [FG] Long operation: isAfterNodeReachable (2565ms)
(12832) [FG]   getEffectiveTypeOfSymbol [symbol Variable, "coef_"  [669:22] (sklearn.linear_model._base)] (6829ms)
[Info  - 11:16:14 PM] (12832) [FG] Long operation: getEffectiveTypeOfSymbol (6829ms)
(12832) [FG] runWithCancellationToken (6830ms)
[Info  - 11:16:14 PM] (12832) [FG] Long operation: runWithCancellationToken (6830ms)

the log is with this option

    "python.analysis.logLevel": "Trace",
    "python.analysis.logTypeEvaluationTime": true,
    "python.analysis.minimumLoggingThreshold": 500,

and now why resolveCompletionItem didn't seem to be cancelled when user typed more chars. it turns out vscode (hence LSP) only cancel resolveCompletionItem when completion itself is either cancelled, disposed or different item is manually selected by user. just typing more won't cancel on-going resolve item request. furthermore, filtering by requesting new completion (which could change selected item) could be blocked by the pending resolve item request. so it is kind of a dead lock situation for us.

anyway, that's why the cancellation seems not working for resolveCompletionItem and unless we move to async server, it seems there is not much we can do.

heejaechang · 2023-02-28T21:40:19Z

merging all issues about scikit-learn to #3978

github-actions bot assigned judej Feb 28, 2023

github-actions bot added the triage-needed label Feb 28, 2023

heejaechang mentioned this issue Feb 28, 2023

Intellisense suggestions is very slow with scikit learn package #3908

Closed

heejaechang assigned heejaechang and unassigned judej Feb 28, 2023

heejaechang closed this as completed Feb 28, 2023

heejaechang added the duplicate This issue or pull request already exists label Feb 28, 2023

github-actions bot removed the triage-needed label Feb 28, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Long delay of suggestion list update with scikit-learn #4016

Long delay of suggestion list update with scikit-learn #4016

wangtz0607 commented Feb 28, 2023

wangtz0607 commented Feb 28, 2023

heejaechang commented Feb 28, 2023

wangtz0607 commented Feb 28, 2023

heejaechang commented Feb 28, 2023

heejaechang commented Feb 28, 2023

wangtz0607 commented Feb 28, 2023

heejaechang commented Feb 28, 2023

heejaechang commented Feb 28, 2023

Long delay of suggestion list update with scikit-learn #4016

Long delay of suggestion list update with scikit-learn #4016

Comments

wangtz0607 commented Feb 28, 2023

Environment data

Code Snippet

Expected behavior

Actual behavior

Logs

wangtz0607 commented Feb 28, 2023

heejaechang commented Feb 28, 2023

wangtz0607 commented Feb 28, 2023

heejaechang commented Feb 28, 2023

heejaechang commented Feb 28, 2023

wangtz0607 commented Feb 28, 2023

heejaechang commented Feb 28, 2023

heejaechang commented Feb 28, 2023