Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Supports non half width alphanumeric for generic test #8203

Conversation

d-kaneshiro
Copy link
Contributor

@d-kaneshiro d-kaneshiro commented Jul 25, 2023

resolves #8204

Problem

Double-byte characters in GenericTest were displayed as "_".

Example

Double-byte character columns are displayed as "_" when dbt test is executed.
image

Solution

Fixed an issue where double-byte characters were not displayed in GenericTest.

Example

All double-byte character columns are displayed when 'dbt test' is executed.
image

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

@d-kaneshiro d-kaneshiro requested a review from a team as a code owner July 25, 2023 03:05
@d-kaneshiro d-kaneshiro requested a review from QMalcolm July 25, 2023 03:05
@cla-bot cla-bot bot added the cla:yes label Jul 25, 2023
@codecov
Copy link

codecov bot commented Jul 25, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (bb35b3e) 86.54% compared to head (b0a988a) 86.49%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8203      +/-   ##
==========================================
- Coverage   86.54%   86.49%   -0.05%     
==========================================
  Files         179      179              
  Lines       26538    26538              
==========================================
- Hits        22967    22955      -12     
- Misses       3571     3583      +12     
Flag Coverage Δ
integration 83.34% <100.00%> (-0.11%) ⬇️
unit 64.79% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
core/dbt/parser/generic_test_builders.py 94.00% <100.00%> (ø)

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@d-kaneshiro d-kaneshiro requested a review from a team as a code owner August 15, 2023 06:09
@d-kaneshiro d-kaneshiro requested a review from Fleid August 15, 2023 06:09
@d-kaneshiro
Copy link
Contributor Author

Added tests that include full-width hiragana, katakana, kanji, and half-width katakana columns.
All of these are working as intended.
CI test with GitHubActions has also been successful.
On the other hand, I have not verified the operation in environments other than Japan.

@MichelleArk MichelleArk added the ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering label Sep 18, 2023
@dbeatty10
Copy link
Contributor

Thanks for raising this PR and adding these test cases @d-kaneshiro !

@gshank and I looked at this, and we'd like to fix it further upstream.

@gshank is going to take a deeper look to identify the best place to address this and get back to you.

@gshank
Copy link
Contributor

gshank commented Sep 19, 2023

d-kaneshiro: Please remove the changes in core/dbt/task/runnable.py and change a line in core/dbt/parser/generic_test_builders.py from "clean_flat_args = [re.sub("[^0-9a-zA-Z_]+", "", arg) for arg in flat_args]" to "clean_flat_args = [re.sub("\W+", "", arg) for arg in flat_args]".

@gshank
Copy link
Contributor

gshank commented Sep 19, 2023

In addition, the tests/adapter set of tests is for tests that might be different in different adapters, and this functionality should not be different for different adapters, so please move the test to tests/functional/schema_tests.

@dbeatty10
Copy link
Contributor

@d-kaneshiro there's three main things that we'd need to do for this PR:

  1. Move the test to the tests/functional/schema_tests directory

  2. Change a line in core/dbt/parser/generic_test_builders.py from this:

    clean_flat_args = [re.sub("[^0-9a-zA-Z_]+", "", arg) for arg in flat_args]

    to this:

    clean_flat_args = [re.sub("\W+", "", arg) for arg in flat_args]
  3. Remove all of your changes in core/dbt/task/runnable.py (since it should be covered by the above change)

Question

Would you be willing to make these changes? Or would you prefer us to make those changes within your branch?

@d-kaneshiro
Copy link
Contributor Author

@dbeatty10 and @gshank
Thank you for your advice!

@dbeatty10
Feel free to implement these changes in my branch. (Because I can't think of a best practice for incorporating tests without breaking the existing configuration)
Please let me know after you make any changes. I'll check to see if it behaves as intended.

Copy link
Contributor

@gshank gshank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think that putting the functional test in the adapter zone isn't necessary, since this is core dbt functionality and not related to adapters. But it's not that big of an issue. I think we might be rearranging some of this soon anyway.

@dbeatty10 dbeatty10 merged commit f45b013 into dbt-labs:main Nov 9, 2023
48 checks passed
colin-rogers-dbt added a commit that referenced this pull request Nov 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla:yes ready_for_review Externally contributed PR has functional approval, ready for code review from Core engineering
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-2867] [Feature] Supports non half width alphanumeric for generic test
4 participants