[Bug] Impersonating Service Account - Defaults GCP quota_project to project where SA is defined #1344

Closed
2 tasks done
jcarpenter12 opened this issue Sep 11, 2024 · 2 comments · Fixed by #1345
Labels
enhancement New feature or request triage

@jcarpenter12
Contributor

Is this a new bug in dbt-bigquery?

  • I believe this is a new bug in dbt-bigquery
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When impersonate_service_account is set in the profiles.yml for BigQuery, the quota_project in the client_options defaults to None. The client then falls back to the project in which the service account is defined and uses that as the quota project.

As a result, if the service account is defined in a project that does not have the BigQuery API enabled, the connection fails no matter what execution_project or project are set to in the profiles.yml, because those settings are separate from the quota project.

Expected Behavior

A user should be able to override and set the quota project in the configuration of the dbt profile rather than it defaulting to the project the service account is defined in. A user should not have to enable the BigQuery API for a project that a service account lives in if BigQuery is not required in that project.
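The requested precedence can be sketched in plain Python. This is only a model of the behavior described in this issue, not the adapter's code: `resolve_quota_project`, the `quota_project` profile key, and the `credentials_default_project` argument are hypothetical names chosen for illustration.

```python
from typing import Optional


def resolve_quota_project(
    profile: dict, credentials_default_project: Optional[str]
) -> Optional[str]:
    """Return the project that would be used as the quota project.

    Hypothetical model: an explicit `quota_project` in the profile wins;
    otherwise fall back to the project carried by the credentials (for an
    impersonated service account, the project it is defined in).
    """
    override = profile.get("quota_project")
    if override:
        return override
    return credentials_default_project


# Behavior described under "Current Behavior": there is no override,
# so the service account's home project ends up as the quota project.
current = resolve_quota_project(
    {"project": "MY_DATA_PROJECT", "execution_project": "MY_EXECUTION_PROJECT"},
    "MY_SERVICE_ACCOUNT_PROJECT",
)

# Requested behavior: the profile names the quota project explicitly.
expected = resolve_quota_project(
    {
        "project": "MY_DATA_PROJECT",
        "execution_project": "MY_EXECUTION_PROJECT",
        "quota_project": "MY_EXECUTION_PROJECT",  # assumed key name
    },
    "MY_SERVICE_ACCOUNT_PROJECT",
)
```

In this model neither project nor execution_project influences the result, which matches the observation that they are separate from the quota project.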

Steps To Reproduce

  1. Set up a Python venv and pip install the following versions:
  • pip install dbt-core==1.8.6 dbt-bigquery==1.8.2
  2. Set up three separate GCP projects (the names will have to be unique to you):

MY_SERVICE_ACCOUNT_PROJECT:

  • Create a service account called dbt-sa
  • Grant your user the Service Account Token Creator role (roles/iam.serviceAccountTokenCreator) on the service account above so that you can impersonate it

MY_DATA_PROJECT:

  • Enable BQ API and create a dataset called foo within it
  • Grant the service account above roles/bigquery.dataViewer on the dataset

MY_EXECUTION_PROJECT:

  • Enable BQ API
  • Grant the service account above roles/bigquery.jobUser on the project
  3. Create a profiles.yml configuration that contains the following, substituting the names of the GCP projects you have created:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      schema: foo
      location: europe-west2
      priority: interactive
      project: $MY_DATA_PROJECT
      execution_project: $MY_EXECUTION_PROJECT
      impersonate_service_account: "dbt-sa@${MY_SERVICE_ACCOUNT_PROJECT}.iam.gserviceaccount.com"
  4. Run the following dbt command, pointing it at wherever you have stored the profiles.yml file defined above:
dbt debug --connection --profiles-dir .

The connection check fails with an error saying that the BigQuery API is not enabled in a project. The link in the error points to the project in which the service account is defined and, crucially, not to the project or execution_project set in the profile.
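The project-name substitution in the profile can be scripted. The following is a minimal sketch using Python's standard `string.Template`; the top-level profile name `bigquery_repro` and the `example-*` project IDs are placeholders invented for illustration and are not part of the original report.

```python
from string import Template

# Same settings as the profile in the steps above. The top-level key is a
# hypothetical profile name; it must match `profile:` in dbt_project.yml.
PROFILE_TEMPLATE = Template("""\
bigquery_repro:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      schema: foo
      location: europe-west2
      priority: interactive
      project: $MY_DATA_PROJECT
      execution_project: $MY_EXECUTION_PROJECT
      impersonate_service_account: "dbt-sa@${MY_SERVICE_ACCOUNT_PROJECT}.iam.gserviceaccount.com"
""")


def render_profile(project_ids: dict) -> str:
    """Fill in the three project IDs; raises KeyError if one is missing."""
    return PROFILE_TEMPLATE.substitute(project_ids)


# Placeholder project IDs; replace them with your own.
rendered = render_profile({
    "MY_DATA_PROJECT": "example-data",
    "MY_EXECUTION_PROJECT": "example-exec",
    "MY_SERVICE_ACCOUNT_PROJECT": "example-sa",
})
print(rendered)
```

Writing `rendered` to ./profiles.yml gives a file that the dbt debug command above can read with `--profiles-dir .`.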

Relevant log output

❯ dbt debug --connection --profiles-dir .
22:22:11  Running with dbt=1.8.6
22:22:12  dbt version: 1.8.6
22:22:12  python version: 3.12.1
22:22:12  python path: /Users/jackcarpenter/.pyenv/versions/3.12.1/envs/dbt-bigquery-bug/bin/python
22:22:12  os info: macOS-13.2.1-x86_64-i386-64bit
22:22:13  Using profiles dir at .
22:22:13  Using profiles.yml file at ./profiles.yml
22:22:13  Using dbt_project.yml file at /Users/jackcarpenter/dbt-bigquery-bug/dbt_project.yml
22:22:13  adapter type: bigquery
22:22:13  adapter version: 1.8.2
22:22:13  Skipping steps before connection verification
22:22:13  Connection:
22:22:13    method: oauth
22:22:13    database: MY_DATA_PROJECT
22:22:13    execution_project: MY_EXECUTION_PROJECT
22:22:13    schema: foo
22:22:13    location: europe-west2
22:22:13    priority: interactive
22:22:13    maximum_bytes_billed: None
22:22:13    impersonate_service_account: dbt-sa@MY_SERVICE_ACCOUNT_PROJECT.iam.gserviceaccount.com
22:22:13    job_retry_deadline_seconds: None
22:22:13    job_retries: 1
22:22:13    job_creation_timeout_seconds: None
22:22:13    job_execution_timeout_seconds: 5000
22:22:13    timeout_seconds: 5000
22:22:13    client_id: None
22:22:13    token_uri: None
22:22:13    dataproc_region: None
22:22:13    dataproc_cluster_name: None
22:22:13    gcs_bucket: None
22:22:13    dataproc_batch: None
22:22:13  Registered adapter: bigquery=1.8.2
22:22:15  BigQuery adapter: https://console.cloud.google.com/bigquery?project=MY_EXECUTION_PROJECT&j=bq:europe-west2:7388df4b-c857-4aef-8af5-5f97d9a5ff74&page=queryresults
22:22:15    Connection test: [ERROR]

22:22:15  2 checks failed:
22:22:15  Project loading failed for the following reason:
 project path </Users/jackcarpenter/dbt-bigquery-bug/dbt_project.yml> not found

22:22:15  dbt was unable to connect to the specified database.
The database returned the following error:

  >Database Error
  BigQuery API has not been used in project xxxxxxxxxxxxx before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/bigquery.googleapis.com/overview?project=xxxxxxxxxxxxx then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.

Check your database credentials and try again. For more information, visit:
https://docs.getdbt.com/docs/configure-your-profile
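
To see which project an error like the one above is blaming, the `project` query parameter of the API-enablement link can be pulled out and compared with the projects in the profile. This is a small diagnostic sketch using only the standard library; `project_in_error` is a hypothetical helper, and the message text is copied from the log with its redacted project value left as-is.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse


def project_in_error(message: str) -> Optional[str]:
    """Return the `project` parameter of the first API-enablement URL, if any."""
    for url in re.findall(r"https://\S+", message):
        parsed = urlparse(url)
        # Only consider links of the form .../apis/api/<service>/overview?project=...
        if "/apis/api/" not in parsed.path:
            continue
        values = parse_qs(parsed.query).get("project")
        if values:
            return values[0]
    return None


error = (
    "BigQuery API has not been used in project xxxxxxxxxxxxx before or it is "
    "disabled. Enable it by visiting https://console.developers.google.com/apis/"
    "api/bigquery.googleapis.com/overview?project=xxxxxxxxxxxxx then retry."
)
blamed = project_in_error(error)
```

If the returned value identifies neither project nor execution_project from the profile, the request was attributed to a different quota project.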


Environment

  • OS: macOS Ventura 13.2.1
  • Python: 3.12.1
  • dbt-core: 1.8.6
  • dbt-bigquery: 1.8.2

Additional Context

This relates to feature request #1343, which I have also raised and which outlines a solution/improvement that I think may work.

@jcarpenter12 jcarpenter12 added bug Something isn't working triage labels Sep 11, 2024
@jcarpenter12 jcarpenter12 changed the title [Bug] Impersonating Service Account - Defaults GCOquota_project to project where SA is defined [Bug] Impersonating Service Account - Defaults GCP quota_project to project where SA is defined Sep 11, 2024
@amychen1776

@jcarpenter12 Thank you for opening this issue! I appreciate the level of detail you provided here and in the related PR. Would you be able to explain to me in what instances a user would need to impersonate a service account and also not use the related quota account? Is it due to access?

@amychen1776 amychen1776 added enhancement New feature or request awaiting_response and removed bug Something isn't working triage labels Sep 13, 2024
@jcarpenter12
Contributor Author

Hi @amychen1776, thanks for getting back to me. This issue is actually a bit misleading. The problem isn't related to impersonating a service account at all; as far as I can tell, it applies when using any auth method. I've raised issue #1347 to cover that and will close this one, as it's not the core problem.

matthewshaver added a commit to dbt-labs/docs.getdbt.com that referenced this issue Dec 18, 2024
## What are you changing in this pull request and why?

References dbt-labs/dbt-bigquery#1343 dbt-labs/dbt-bigquery#1344

Adding docs to detail changes in PR dbt-labs/dbt-bigquery#1345

This adds detail about how to override the `quota_project` through the
dbt profile. It also updates the information regarding the execution
project. I believe that the execution project just sets where the BQ job
is created and doesn't actually impact where it's billed from (I could
be wrong on this, but from my testing I can see that the quota project
is set from the environment not from the execution project override
currently detailed in the docs).

## Checklist
- [x] I have reviewed the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
so my content adheres to these guidelines.
- [ ] The topic I'm writing about is for specific dbt version(s) and I
have versioned it according to the [version a whole
page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
and/or [version a block of
content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content)
guidelines.
- [ ] I have added checklist item(s) to this list for anything
that needs to happen before this PR is merged, such as "needs technical
review" or "change base branch."

---------

Co-authored-by: Leona B. Campbell <[email protected]>
Co-authored-by: Matt Shaver <[email protected]>