[Feature] Improve startup time of typing dbt #9814

Closed
3 tasks done
b-per opened this issue Mar 25, 2024 · 11 comments
Labels
enhancement, performance, stale

Comments

Contributor

b-per commented Mar 25, 2024

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

On my Mac M1, running dbt without any subcommand or flag takes between 1.2 and 1.5 secs to run and show me the subcommands list (measured with time).

This is not a big problem when using dbt on a day-to-day basis, but it prevents us from being able to leverage the out-of-the-box shell completion from Click, as implemented here. Each call to complete the command or params takes between 1.2 and 1.5 secs, making the completion not really usable.

I'd expect that running dbt without any parameter or subcommand would be instantaneous and not take more than 1 sec.
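
For anyone who wants to reproduce the measurement without the shell's time builtin, here is a minimal sketch that times a command's startup over a few runs and reports the median. The helper name time_startup is made up for illustration, and the example times a bare Python interpreter as a stand-in; passing ["dbt"] instead would measure dbt itself.

```python
import statistics
import subprocess
import sys
import time


def time_startup(argv, runs=5):
    """Return the median wall-clock seconds needed to run argv to completion."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in command: a bare interpreter start. Replace with ["dbt"] to time dbt.
    baseline = time_startup([sys.executable, "-c", "pass"])
    print(f"median startup: {baseline:.3f}s")
```

Comparing the dbt number against the bare-interpreter baseline gives a rough idea of how much of the delay comes from dbt's own imports rather than from Python starting up.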

Describe alternatives you've considered

Not improving the startup speed and not being able to leverage the free completion script from Click.

Who will this benefit?

  • the world: Imagine 1 second multiplied by millions of dbt runs 😄
  • dbt Labs: getting a completion script we never need to manage manually again - we could technically do the same for the dbt Cloud CLI as well
  • dbt developers: a fast, always up to date completion script (and also faster dbt start in general)

Are you interested in contributing this feature?

Yes, but am I the best person

Anything else?

No response

@b-per b-per added the enhancement and triage labels Mar 25, 2024
Contributor Author

b-per commented Mar 25, 2024

I am wondering if some of the imports from that file are causing the delay and if we could move them inside specific functions/classes

import functools
from copy import copy
from dataclasses import dataclass
from typing import Callable, List, Optional, Union

import click
from click.exceptions import (
    Exit as ClickExit,
    BadOptionUsage,
    NoSuchOption,
    UsageError,
)

from dbt.cli import requires, params as p
from dbt.cli.exceptions import (
    DbtInternalException,
    DbtUsageException,
)
from dbt.contracts.graph.manifest import Manifest
from dbt.artifacts.schemas.catalog import CatalogArtifact
from dbt.artifacts.schemas.run import RunExecutionResult
from dbt_common.events.base_types import EventMsg
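
To make the "move them inside specific functions" idea concrete, here is a minimal sketch of deferring an import until a command is actually called. It is not how dbt is structured: the lazy_command helper and the COMMANDS table are hypothetical, and stdlib modules stand in for the heavy dbt modules listed above.

```python
import importlib


def lazy_command(module_name, attr):
    """Wrap a callable so its module is imported on first call, not at startup."""

    def wrapper(*args, **kwargs):
        # import_module is cheap after the first call: the module is cached.
        module = importlib.import_module(module_name)
        return getattr(module, attr)(*args, **kwargs)

    return wrapper


# Hypothetical command table: stdlib modules stand in for heavy dbt modules.
COMMANDS = {
    "dumps": lazy_command("json", "dumps"),
    "mean": lazy_command("statistics", "mean"),
}

if __name__ == "__main__":
    print(COMMANDS["dumps"]({"a": 1}))
    print(COMMANDS["mean"]([1, 2, 3]))
```

The trade-off is that a broken import only surfaces when the command runs, not when the CLI starts.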

Contributor Author

b-per commented Mar 25, 2024

Here is a screenshot from "tuna", generated by running python -X importtime core/dbt/cli/main.py 2> tuna.log followed by tuna tuna.log. It shows how long each of the imports takes.

image

Most of those imports shouldn't be required when running a simple dbt command but I don't know the effort behind not loading those in that case.
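
For those who prefer text over a tuna chart, here is a rough sketch of ranking the entries in an importtime log by cumulative time. The function names are made up, and the sample log uses invented numbers purely to show the line format that python -X importtime writes to stderr; it is not dbt's actual profile.

```python
def parse_importtime(log_text):
    """Parse `python -X importtime` output into (module, self_us, cumulative_us)."""
    rows = []
    prefix = "import time:"
    for line in log_text.splitlines():
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split("|")
        if len(fields) != 3:
            continue
        self_us, cumulative_us, name = (field.strip() for field in fields)
        if not self_us.isdigit():
            continue  # skips the header row ("self [us]")
        rows.append((name, int(self_us), int(cumulative_us)))
    return rows


def slowest(rows, top=3):
    """Return the `top` entries with the largest cumulative import time."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]


# Invented sample data, only to illustrate the format.
SAMPLE_LOG = """\
import time: self [us] | cumulative | imported package
import time:       120 |        120 |     json.decoder
import time:        80 |        200 |   json
import time:      1500 |       1700 | heavy_module
"""

if __name__ == "__main__":
    for name, self_us, cumulative_us in slowest(parse_importtime(SAMPLE_LOG)):
        print(f"{cumulative_us:>8} us  {name}")
```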

Contributor Author

b-per commented Mar 25, 2024

I just tried to strip quite a bit of code to find if there is a piece to focus on, but even after removing a lot of imports (making dbt work only to show its commands and args), it still takes more than 1 sec.

Instead of moving imports it might be better to have a "click only" flow where we don't call any code from core.dbt and just provide the commands/subcommands/arguments straight away.

image
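
One way to picture that "click only" flow, as a sketch under assumptions rather than a proposal for the actual entry point: answer the no-argument case from a hard-coded table and only load the full CLI when a real command is given. STATIC_COMMANDS is an illustrative subset of subcommands, and load_full_cli is a hypothetical callable that would do the expensive imports.

```python
# Illustrative subset of subcommands, hard-coded so listing them needs no imports.
STATIC_COMMANDS = ("build", "compile", "run", "seed", "test")


def main(argv, load_full_cli):
    """Answer the no-argument case from a static table; defer everything else.

    load_full_cli is a hypothetical zero-argument callable that performs the
    heavy imports and returns a function taking the argument list.
    """
    if not argv:
        lines = ["Commands:"] + [f"  {name}" for name in STATIC_COMMANDS]
        return "\n".join(lines)
    # Slow path: only now pay for the heavy imports.
    return load_full_cli()(argv)


if __name__ == "__main__":
    def fake_loader():
        # Stand-in for importing the real CLI.
        return lambda argv: f"would run: {' '.join(argv)}"

    print(main([], fake_loader))
    print(main(["run", "--select", "my_model"], fake_loader))
```

The obvious downside is that a static table can drift from the real list of commands unless it is generated or checked in tests.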

@peterallenwebb
Contributor

@b-per Your timing could not be more perfect. I think I can get you a lot of the remaining time back. Check out dbt-labs/dbt-common#98. This should go out with the next release of dbt-common, which I have been told is scheduled for tomorrow, March 26th.

Contributor Author

b-per commented Mar 25, 2024

Great news!!

And while 0.7s is already much better than 1.5s, I still can't help thinking that it feels a bit long for something that just shows a list of commands and parameters.

Contributor

peterallenwebb commented Mar 25, 2024

Agreed. If your changes combined with mine don't get us down to "almost instant" then I can work with you to get us the rest of the way. I'm confident we can do it.

Contributor Author

b-per commented Mar 25, 2024

My changes can't be merged because they break the normal dbt flow, but your change plus this heavy-handed removal of imports makes dbt run in ~0.45 secs. So, a slight improvement over 0.7 but nothing major either.

@dbeatty10
Contributor

It makes sense to be able to add sub-command and parameter completion as proposed in dbt-labs/dbt-completion.bash#21

@b-per Two questions for you:

  1. It sounds like the poor responsiveness you observed is a known issue with Click. Did you already try this?
  2. If the above doesn't help, would this be useful, by any chance?

Contributor Author

b-per commented Apr 1, 2024

For 1., the docs say

To speed it up, write the generated script to a file, then source that

This is what is done in the implementation in dbt-completion.bash ; the delay is not from Click in that case but from dbt itself.

I think 2. could help as well. But in that case, we would need a specific subcommand in dbt to get the matrix of all commands and all parameters allowed. Then, we'll need to process it in bash/zsh and make it work with their completion framework (possible but not trivial and not fun :-) )
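
To give an idea of what such a matrix could look like and how completion would be driven from it, here is a small sketch. Everything in it is an assumption: MATRIX is a hand-written, partial example (a real one would be emitted by dbt itself), and complete is a hypothetical helper, written in Python only to show the lookup logic that the bash/zsh side would need to replicate.

```python
# Hypothetical, partial matrix; a real one would be emitted by dbt itself.
MATRIX = {
    "run": ["--select", "--exclude", "--full-refresh", "--target"],
    "test": ["--select", "--exclude", "--target"],
    "seed": ["--select", "--full-refresh", "--target"],
}


def complete(words, incomplete):
    """Return completion candidates for the token being typed.

    words: tokens already typed after `dbt`.
    incomplete: the partial token under the cursor (may be empty).
    """
    if not words:
        candidates = sorted(MATRIX)
    else:
        # Offer the subcommand's flags, minus the ones already on the line.
        candidates = [c for c in MATRIX.get(words[0], []) if c not in words]
    return [c for c in candidates if c.startswith(incomplete)]


if __name__ == "__main__":
    print(complete([], "s"))
    print(complete(["run"], "--s"))
```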

I feel like if we reach a start up time of dbt of ~100-200 ms, then we wouldn't need #6840 and could just use the out of the box completion without any delay.

For the moment, I am thinking of adding the "auto" completion in dbt-completion.bash, but only active when a given env var is set, in case people want to use it despite the perf hit.

Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the stale label Sep 29, 2024
Contributor

github-actions bot commented Oct 7, 2024

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Oct 7, 2024