Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long delay of suggestion list update with scikit-learn #4016

Closed
wangtz0607 opened this issue Feb 28, 2023 · 8 comments
Closed

Long delay of suggestion list update with scikit-learn #4016

wangtz0607 opened this issue Feb 28, 2023 · 8 comments
Assignees
Labels
duplicate This issue or pull request already exists

Comments

@wangtz0607
Copy link

Environment data

  • Language Server version: v2023.2.40
  • OS and version: Windows 11 22H2 64-bit
  • Python version (& distribution if applicable, e.g. Anaconda): 3.11.1

Code Snippet

from sklearn import linear_model

model = linear_model.LinearRegression()
model.fit
scikit-learn==1.2.1

Expected behavior

The suggestion list updates immediately as I type.

Actual behavior

Recording

Logs

https://paste.ubuntu.com/p/wf9HZJdckR/

@wangtz0607
Copy link
Author

Also reproduced on:

  • Language Server version: v2023.2.40
  • OS and version: Ubuntu 20.04 focal (x86_64 Linux 5.15.0-60-generic)
  • Python version (& distribution if applicable, e.g. Anaconda): 3.8.10

@heejaechang
Copy link
Contributor

I looked the log. the delay it shows is expected. most of work it was doing was parsing/binding. that are one time cost per a file when they are used the first time. your particular code touched a lot of files transitively (a lot of nested imports inside of the sklearn) so all those files are needed to be read/parsed and bound.

but once that are done, things are fast. you can confirm that, by typing this

from sklearn import linear_model

model = linear_model.LinearRegression()

model.f

and wait until completion shows up like you did in the repro steps.

and then delete everything and retype again. the second time should be fast since all necessary files are already brought in and cached.

@wangtz0607
Copy link
Author

Unfortunately that's not the case:

Recording

@heejaechang
Copy link
Contributor

if it still happens when you retype, please provide us logs for the retype case as well so we can take a look what is taking the time.

@heejaechang
Copy link
Contributor

I reproed. it wasn't actually completion that is slow. it was getting tooltip for some particular items in the completion taking long time when it is selected by default when completion was shown. such as coef_ for this particular case. or fit and etc.

if the item selected by default happen to be a cheap one to calculate tooltip such as copy_X then it would have been fast.

anyway, now taking a look to see why those are taking long time (and see why cancellation didn't kick in)

@wangtz0607
Copy link
Author

In case someone still need the logs: https://paste.ubuntu.com/p/DpGqMQHW5j/

@heejaechang
Copy link
Contributor

thank you @wangtz0607

@erictraut can you take a look? this is a similar issue with #3978

from sklearn import linear_model

model = linear_model.LinearRegression()
model. <= completion here

repro step is opening a completion at model. and select item such as coef_ or fit so we get resolveCompletionItem request for the symbol.

I attached the log below that shows code flow when we try getting the tooltip.

(12832) [FG] completion at c:\pyTemp\junk\test1.py:4:6 ...
(12832) [FG]   runWithCancellationToken ...
(12832) [FG]     parsing: c:\pyTemp\junk\test1.py (0ms)
(12832) [FG]     binding: c:\pyTemp\junk\test1.py (0ms)
(12832) [FG]   runWithCancellationToken (19ms)
(12832) [FG] completion at c:\pyTemp\junk\test1.py:4:6 [found 52 items] (19ms)
(12832) [FG] runWithCancellationToken ...
(12832) [FG]   getEffectiveTypeOfSymbol ...
(12832) [FG]     isAfterNodeReachable ...
(12832) [FG]       isAfterNodeReachable ...
(12832) [FG]         getTypeOfExpression ...
(12832) [FG]           _getFunctionInferredReturnType [Function 'getdtype' (scipy.sparse._sputils)] (57ms)
(12832) [FG]           assignType [OverloadedFunction ["Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)"]] (61ms)
(12832) [FG]           getTypeOfExpression ...
(12832) [FG]             _getFunctionInferredReturnType ...
(12832) [FG]               isAfterNodeReachable ...
(12832) [FG]                 getTypeOfExpression ...
(12832) [FG]                   assignType [OverloadedFunction ["Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)","Function '__getitem__' (numpy)"]] (52ms)
(12832) [FG]                   getTypeOfExpression ...
(12832) [FG]                     evaluateTypeForSubnode ...
(12832) [FG]                       evaluateTypesForStatement ["arr" (scipy.sparse._sputils) [183:9]] (359ms)
(12832) [FG]                     evaluateTypeForSubnode ["arr" (scipy.sparse._sputils) [183:9]] (359ms)
(12832) [FG]                   getTypeOfExpression ["arr.max" (scipy.sparse._sputils) [190:30]] (362ms)
(12832) [FG]                 getTypeOfExpression ["arr.min" (scipy.sparse._sputils) [191:30]] (550ms)
(12832) [FG]               isAfterNodeReachable [function '"get_index_dtype"  [147:5]' (scipy.sparse._sputils) [147:1]] (550ms)
(12832) [FG]               getTypeOfExpression ...
(12832) [FG]                 evaluateTypeForSubnode ...
(12832) [FG]                   evaluateTypesForStatement ...
(12832) [FG]                     evaluateTypeForSubnode ...
(12832) [FG]                       evaluateTypesForStatement ...
(12832) [FG]                         getTypeOfExpression ...
(12832) [FG]                           _getFunctionInferredReturnType [Function 'to_native' (scipy.sparse._sputils)] (979ms)
(12832) [FG]                         getTypeOfExpression ["self.data.astype" (scipy.sparse._coo) [195:25]] (2092ms)
[Info  - 11:16:10 PM] (12832) [FG] Long operation: getTypeOfExpression (2092ms)
(12832) [FG]                       evaluateTypesForStatement ["mask" (scipy.sparse._coo) [554:9]] (2096ms)
[Info  - 11:16:10 PM] (12832) [FG] Long operation: evaluateTypesForStatement (2096ms)
(12832) [FG]                     evaluateTypeForSubnode ["mask" (scipy.sparse._coo) [554:9]] (2096ms)
[Info  - 11:16:10 PM] (12832) [FG] Long operation: evaluateTypeForSubnode (2096ms)
(12832) [FG]                   evaluateTypesForStatement ["summed" (scipy.sparse._coo) [528:9]] (2142ms)
[Info  - 11:16:10 PM] (12832) [FG] Long operation: evaluateTypesForStatement (2142ms)
(12832) [FG]                 evaluateTypeForSubnode ["summed" (scipy.sparse._coo) [528:9]] (2142ms)
[Info  - 11:16:10 PM] (12832) [FG] Long operation: evaluateTypeForSubnode (2142ms)
(12832) [FG]               getTypeOfExpression ["mask.all" (scipy.sparse._coo) [311:20]] (2579ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: getTypeOfExpression (2579ms)
(12832) [FG]             _getFunctionInferredReturnType [Function 'todia' (scipy.sparse._coo)] (3383ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: _getFunctionInferredReturnType (3383ms)
(12832) [FG]           getTypeOfExpression ["self._coo_container(arg1, dtyp <shortened> " (scipy.sparse._dia) [135:17]] (3385ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: getTypeOfExpression (3385ms)
(12832) [FG]         getTypeOfExpression ["self.data.astype" (scipy.sparse._dia) [141:25]] (3803ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: getTypeOfExpression (3803ms)
(12832) [FG]       isAfterNodeReachable [function '"__init__"  [89:9]' (scipy.sparse._dia) [89:5]] (3818ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: isAfterNodeReachable (3818ms)
(12832) [FG]     isAfterNodeReachable [function '"_rescale_data"  [294:5]' (sklearn.linear_model._base) [294:1]] (3828ms)
[Info  - 11:16:11 PM] (12832) [FG] Long operation: isAfterNodeReachable (3828ms)
(12832) [FG]     isAfterNodeReachable ...
(12832) [FG]       getTypeOfExpression ...
(12832) [FG]         _getFunctionInferredReturnType ...
(12832) [FG]           getTypeOfExpression ...
(12832) [FG]             getTypeOfExpression ...
(12832) [FG]               getTypeOfExpression ...
(12832) [FG]                 getTypeOfExpression ...
(12832) [FG]                   getTypeOfExpression ...
(12832) [FG]                     getTypeOfExpression ...
(12832) [FG]                       getTypeOfExpression ...
(12832) [FG]                         getTypeOfExpression ...
(12832) [FG]                           getTypeOfExpression ...
(12832) [FG]                             _getFunctionInferredReturnType ...
(12832) [FG]                               getTypeOfExpression ...
(12832) [FG]                                 _getFunctionInferredReturnType [Function 'configure' (joblib._parallel_backends)] (54ms)
(12832) [FG]                               getTypeOfExpression ["self._backend.configure" (joblib.parallel) [775:22]] (61ms)
(12832) [FG]                             _getFunctionInferredReturnType [Function '_initialize_backend' (joblib.parallel)] (75ms)
(12832) [FG]                           getTypeOfExpression ["self._initialize_backend" (joblib.parallel) [1008:22]] (75ms)
(12832) [FG]                         getTypeOfExpression ["self._print" (joblib.parallel) [1041:9]] (76ms)
(12832) [FG]                       getTypeOfExpression ["self._backend.start_call" (joblib.parallel) [1044:13]] (77ms)
(12832) [FG]                     getTypeOfExpression ["self.dispatch_one_batch" (joblib.parallel) [1085:16]] (117ms)
(12832) [FG]                   getTypeOfExpression ["self.dispatch_one_batch" (joblib.parallel) [1088:19]] (117ms)
(12832) [FG]                 getTypeOfExpression ["self._backend.retrieval_contex <shortened> " (joblib.parallel) [1097:18]] (118ms)
(12832) [FG]               getTypeOfExpression ["self._print" (joblib.parallel) [1101:13]] (118ms)
(12832) [FG]             getTypeOfExpression ["self._backend.stop_call" (joblib.parallel) [1106:17]] (118ms)
(12832) [FG]           getTypeOfExpression ["self._terminate_backend" (joblib.parallel) [1108:17]] (119ms)
(12832) [FG]         _getFunctionInferredReturnType [Function '__call__' (joblib.parallel)] (120ms)
(12832) [FG]       getTypeOfExpression ["super().__call__" (sklearn.utils.parallel) [63:16]] (120ms)
(12832) [FG]     isAfterNodeReachable [function '"__call__"  [40:9]' (sklearn.utils.parallel) [40:5]] (123ms)
(12832) [FG]     isAfterNodeReachable ...
(12832) [FG]       getTypeOfExpression ...
(12832) [FG]         _getFunctionInferredReturnType [Function 'aslinearoperator' (scipy.sparse.linalg._interface)] (1469ms)
(12832) [FG]         _getFunctionInferredReturnType ...
(12832) [FG]           getTypeOfExpression ...
(12832) [FG]             _getFunctionInferredReturnType ...
(12832) [FG]               getTypeOfExpression ["np.asarray(data, dtype=dtype). <shortened> " (scipy.sparse._sputils) [383:12]] (125ms)
(12832) [FG]             _getFunctionInferredReturnType [Function 'asmatrix' (scipy.sparse._sputils)] (228ms)
(12832) [FG]           getTypeOfExpression ["y.reshape" (scipy.sparse.linalg._interface) [287:17]] (232ms)
(12832) [FG]         _getFunctionInferredReturnType [Function 'rmatvec' (scipy.sparse.linalg._interface)] (233ms)
(12832) [FG]         _getFunctionInferredReturnType ...
(12832) [FG]           getTypeOfExpression ...
(12832) [FG]             _getFunctionInferredReturnType ...
(12832) [FG]               getTypeOfExpression ...
(12832) [FG]                 _getFunctionInferredReturnType ...
(12832) [FG]                   _getFunctionInferredReturnType [Function 'asmatrix' (scipy.sparse._sputils)] (91ms)
(12832) [FG]                 _getFunctionInferredReturnType [Function 'matmat' (scipy.sparse.linalg._interface)] (94ms)
(12832) [FG]               getTypeOfExpression ["self.matmat" (scipy.sparse.linalg._interface) [199:16]] (95ms)
(12832) [FG]             _getFunctionInferredReturnType [Function '_matvec' (scipy.sparse.linalg._interface)] (95ms)
(12832) [FG]             _getFunctionInferredReturnType [Function 'asmatrix' (scipy.sparse._sputils)] (142ms)
(12832) [FG]           getTypeOfExpression ["y.reshape" (scipy.sparse.linalg._interface) [240:17]] (577ms)
(12832) [FG]         _getFunctionInferredReturnType [Function 'matvec' (scipy.sparse.linalg._interface)] (577ms)
(12832) [FG]         _getFunctionInferredReturnType ...
(12832) [FG]           evaluateTypeForSubnode ...
(12832) [FG]             evaluateTypesForStatement ...
(12832) [FG]               evaluateTypeForSubnode ...
(12832) [FG]                 evaluateTypesForStatement ["y" (scipy.sparse.linalg._interface) [237:13]] (266ms)
(12832) [FG]               evaluateTypeForSubnode ["y" (scipy.sparse.linalg._interface) [237:13]] (266ms)
(12832) [FG]             evaluateTypesForStatement ["y" (scipy.sparse.linalg._interface) [240:13]] (268ms)
(12832) [FG]           evaluateTypeForSubnode ["y" (scipy.sparse.linalg._interface) [240:13]] (268ms)
(12832) [FG]         _getFunctionInferredReturnType [Function 'matvec' (scipy.sparse.linalg._interface)] (269ms)
(12832) [FG]       getTypeOfExpression ["v.copy" (scipy.sparse.linalg._isolve.lsqr) [392:9]] (2565ms)
[Info  - 11:16:14 PM] (12832) [FG] Long operation: getTypeOfExpression (2565ms)
(12832) [FG]     isAfterNodeReachable [function '"lsqr"  [96:5]' (scipy.sparse.linalg._isolve.lsqr) [96:1]] (2565ms)
[Info  - 11:16:14 PM] (12832) [FG] Long operation: isAfterNodeReachable (2565ms)
(12832) [FG]   getEffectiveTypeOfSymbol [symbol Variable, "coef_"  [669:22] (sklearn.linear_model._base)] (6829ms)
[Info  - 11:16:14 PM] (12832) [FG] Long operation: getEffectiveTypeOfSymbol (6829ms)
(12832) [FG] runWithCancellationToken (6830ms)
[Info  - 11:16:14 PM] (12832) [FG] Long operation: runWithCancellationToken (6830ms)

the log is with this option

    "python.analysis.logLevel": "Trace",
    "python.analysis.logTypeEvaluationTime": true,
    "python.analysis.minimumLoggingThreshold": 500,

and now why resolveCompletionItem didn't seem to be cancelled when user typed more chars. it turns out vscode (hence LSP) only cancel resolveCompletionItem when completion itself is either cancelled, disposed or different item is manually selected by user. just typing more won't cancel on-going resolve item request. furthermore, filtering by requesting new completion (which could change selected item) could be blocked by the pending resolve item request. so it is kind of a dead lock situation for us.

anyway, that's why the cancellation seems not working for resolveCompletionItem and unless we move to async server, it seems there is not much we can do.

@heejaechang
Copy link
Contributor

merging all issues about scikit-learn to #3978

@heejaechang heejaechang added the duplicate This issue or pull request already exists label Feb 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate This issue or pull request already exists
Projects
None yet
Development

No branches or pull requests

3 participants