[Bug]: BigQuery profile target 'dev' invalid: 'schema' is a required property #1031

Closed
pankajastro opened this issue Jun 7, 2024 · 2 comments · Fixed by #1033
Assignees
Labels
area:profile Related to ProfileConfig, like Athena, BigQuery, Clickhouse, Spark, Trino, etc bug Something isn't working dbt:list Primarily related to dbt list command or functionality execution:local Related to Local execution environment parsing:dbt_ls Issues, questions, or features related to dbt_ls parsing profile:bigquery Related to BigQuery ProfileConfig triage-needed Items need to be reviewed / assigned to milestone
Milestone

Comments

@pankajastro
Contributor

pankajastro commented Jun 7, 2024

Astronomer Cosmos Version

Other Astronomer Cosmos version (please specify below)

If "Other Astronomer Cosmos version" selected, which one?

1.4.2

dbt-core version

1.5.0

Versions of dbt adapters

No response

LoadMode

AUTOMATIC

ExecutionMode

LOCAL

InvocationMode

None

airflow version

2.9.0

Operating System

Linux

If you think it's a UI issue, what browsers are you seeing the problem on?

No response

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

What happened?

My unit tests detected failures in v1.4.2 that did not occur in v1.4.1 or v1.4.0.
The failing test, from CI:

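import os

from airflow.models import DagBag

# Config, MockConnection, get_dag_directories and AIRFLOW_BUSINESS_UNIT are
# project-specific helpers that are not shown in this report.


def test_validate_dags(mocker, config_instance: Config):
    mocker.patch("warnings.simplefilter")

    assert os.environ["AIRFLOW__DATAHUB__ENVIRONMENT"] == "production"
    assert os.environ["AIRFLOW_BUSINESS_UNIT"] == AIRFLOW_BUSINESS_UNIT
    assert (
        config_instance.config.get("bigquery").get("dataset").get("raw")
        == f"raw_{AIRFLOW_BUSINESS_UNIT}"
    )

    # Avoid hitting a real Airflow metadata DB when DAGs look up connections.
    mock_get_connection = mocker.patch("airflow.hooks.base.BaseHook.get_connection")
    mock_get_connection.return_value = MockConnection()

    # Every DAG folder must parse without import errors.
    directories_to_scan = get_dag_directories()
    for dag_dir in directories_to_scan:
        dag_bag = DagBag(dag_dir)
        assert len(dag_bag.import_errors) == 0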

Relevant log output

Error Log:

    raise CosmosLoadDbtException(f"Unable to run {command} due to the error:\n{details}")
cosmos.dbt.graph.CosmosLoadDbtException: Unable to run ['/opt/hostedtoolcache/Python/3.10.14/x64/bin/dbt', 'ls', '--output', 'json', '--select', 'tag:refresh@12', '--vars', "today: '{{ tomorrow_ds }}'\nyesterday: '{{ ds }}'\n", '--project-dir', '/tmp/tmp8_ka9hno', '--profiles-dir', '/tmp/tmpyyfkdhiu', '--profile', 'bigquery_profile', '--target', 'dev'] due to the error:
04:09:09  Running with dbt=1.5.0
04:09:10  Encountered an error:
Runtime Error
  Credentials in profile "bigquery_profile", target "dev" invalid: 'schema' is a required property

From local container

  File "/home/airflow/.local/lib/python3.10/site-packages/cosmos/dbt/graph.py", line 216, in run_dbt_ls
    stdout = run_command(ls_command, tmp_dir, env_vars)
  File "/home/airflow/.local/lib/python3.10/site-packages/cosmos/dbt/graph.py", line 96, in run_command
    raise CosmosLoadDbtException(f"Unable to run {command} due to the error:\n{details}")
cosmos.dbt.graph.CosmosLoadDbtException: Unable to run ['/home/airflow/.local/bin/dbt', 'ls', '--output', 'json', '--select', 'tag:refresh@6', '--vars', "today: '{{ tomorrow_ds }}'\nyesterday: '{{ ds }}'\n", '--project-dir', '/tmp/tmpa9cdtc8x', '--profiles-dir', '/tmp/tmp5xp_iwzf', '--profile', 'bigquery_profile', '--target', 'prod'] due to the error:


### How to reproduce

The test above fails.
In case you're wondering, the error message did not print in full. I'm not sure whether the package truncates it or it was cut off for some other reason.






### Anything else :)?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Contact Details

_No response_
@pankajastro pankajastro added bug Something isn't working triage-needed Items need to be reviewed / assigned to milestone labels Jun 7, 2024
@dosubot dosubot bot added area:profile Related to ProfileConfig, like Athena, BigQuery, Clickhouse, Spark, Trino, etc dbt:list Primarily related to dbt list command or functionality execution:local Related to Local execution environment parsing:dbt_ls Issues, questions, or features related to dbt_ls parsing profile:bigquery Related to BigQuery ProfileConfig labels Jun 7, 2024
@pankajastro pankajastro changed the title from "BigQuery profile target 'dev' invalid: 'schema' is a required property[Bug]:" to "[Bug]: BigQuery profile target 'dev' invalid: 'schema' is a required property" Jun 7, 2024
@tatiana tatiana added this to the Cosmos 1.5.0 milestone Jun 7, 2024
@tatiana
Collaborator

tatiana commented Jun 7, 2024

@pankajastro I think this bug relates to this change: #1017

It seems we are not adding schema to the profiles.yml file created with the ProfileMapping, but the dbt-bigquery adapter expects it to be there:

Runtime Error
  Credentials in profile "bigquery_profile", target "dev" invalid: 'schema' is a required property
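
To make the failure mode concrete, here is a minimal sketch (not Cosmos code) of a check that flags targets in a rendered profile that define neither `schema` nor `dataset`. It assumes dbt-bigquery treats `dataset` as an alias for `schema`, which would explain why the error names `schema`. The helper names, the project ID and the dataset name are hypothetical.

```python
def has_schema(target: dict) -> bool:
    """Return True if the target defines a non-empty 'schema' or 'dataset'."""
    # Assumption: dbt-bigquery accepts 'dataset' as an alias for 'schema'.
    return any(target.get(key) for key in ("schema", "dataset"))


def targets_missing_schema(profile: dict) -> list[str]:
    """List the names of targets that would trip the 'schema' validation."""
    outputs = profile.get("outputs", {})
    return sorted(name for name, target in outputs.items() if not has_schema(target))


# Hypothetical profile, shaped like one entry of a profiles.yml file.
profile = {
    "target": "dev",
    "outputs": {
        "dev": {"type": "bigquery", "method": "oauth", "project": "my-project"},
        "prod": {
            "type": "bigquery",
            "method": "oauth",
            "project": "my-project",
            "dataset": "analytics",
        },
    },
}

print(targets_missing_schema(profile))
```

A check like this could run against the generated profile before `dbt ls` is invoked, so the missing field surfaces with a clearer message than the adapter's validation error.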

@tatiana
Collaborator

tatiana commented Jun 7, 2024

I wonder if it is a specific version of dbt-bigquery that requires schema to be in profiles. Could we confirm if this happens for newer versions of dbt-core and dbt-bigquery?

It seems that the issue with the 'schema' in profiles was introduced by the changes made in PR #1017, which was a continuation of the work done in PR #839.

These PRs assumed that the BQ profile should not have to define the dataset and that this value could come from other dbt project files.

@oliverrmaa @pankajastro , could you help us confirm the original use case and scenario so we can try to accommodate both cases?

The dbt-bigquery docs don't seem to be very explicit about which properties should be required and which should be optional: https://docs.getdbt.com/docs/core/connect-data-platform/bigquery-setup

Until we release the fix, BQ users who want to use 1.4.2 will have to add the following to their ProfileMapping:

        profile_args={
            "dataset": "<your-dataset>",

@pankajkoti pankajkoti self-assigned this Jun 7, 2024
pankajkoti added a commit that referenced this issue Jun 7, 2024
In PR #1017, we attempted to remove `dataset` from the required
fields list for the BigQuery profile. However, we realised that
this breaks BigQuery dbt operations, as it is indeed a required
field. Hence, bring it back as a required field. This is
also necessary for building the mock profile, which we construct
by taking into consideration only the required fields.

closes: #1031
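
The point about the mock profile can be illustrated with a small sketch. This is a hypothetical stand-in, not the Cosmos implementation: it assumes only that the mock profile is built by iterating over the required fields, as the commit message describes. The class name, the `project` field and the placeholder value are illustrative.

```python
class SketchBigQueryMapping:
    """Hypothetical stand-in for a profile mapping; not the Cosmos class."""

    dbt_profile_type = "bigquery"

    def __init__(self, required_fields):
        self.required_fields = list(required_fields)

    def mock_profile(self) -> dict:
        # Only required fields get a placeholder, so any field dropped from
        # the list is absent from the mock profile as well.
        profile = {"type": self.dbt_profile_type}
        for field in self.required_fields:
            profile[field] = "mock_value"
        return profile


# With `dataset` dropped from the required fields, the mock profile lacks it.
without_dataset = SketchBigQueryMapping(["project"]).mock_profile()

# With `dataset` restored, the mock profile carries a placeholder for it.
with_dataset = SketchBigQueryMapping(["project", "dataset"]).mock_profile()

print("dataset" in without_dataset, "dataset" in with_dataset)
```

Under that assumption, a profile generated from the shortened list would have no dataset for dbt to validate, which matches the error reported above.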
tatiana pushed a commit that referenced this issue Jun 7, 2024
Closes: #1031
pankajkoti added a commit that referenced this issue Jun 7, 2024
Closes: #1031
(cherry picked from commit 803776a)
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this issue Jul 14, 2024
…omer#1033)

Closes: astronomer#1031
3 participants