Handle NULL values in hash calculation for accurate data comparison #3

Open
wants to merge 1 commit into
base: master

bienvenuushindi

Context:

While DBDiff was being used to compare data and schema, an issue was identified in how NULL values are handled during data comparison. Specifically, when a column containing NULL was included in the MD5 hash computation, the results were inconsistent. The cause is the behavior of CONCAT, which in MySQL returns NULL if any of its arguments is NULL.

Root Cause:

When constructing MD5 hashes for row comparisons, the query builder used CONCAT without handling NULL values. Any row with a NULL in a hashed column therefore produced a NULL hash (the MD5 of NULL is itself NULL), so such rows could be incorrectly identified as identical even when they clearly differed.

For example:

SELECT CONCAT('value', NULL);  -- Result is NULL as shown below
+-----------------------+
| CONCAT('value', NULL) |
+-----------------------+
| NULL                  |
+-----------------------+
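
To make the effect on row hashes concrete, here is a minimal Python sketch that models the MySQL behavior described above. It is not DBDiff's code and not necessarily the fix in this PR: the helper names (mysql_concat, mysql_md5, row_hash_naive, row_hash_null_safe) and the "<NULL>" marker are hypothetical, chosen only for illustration. The null-safe variant stands in for wrapping each column in something like IFNULL or COALESCE before concatenating.

```python
import hashlib


def mysql_concat(*values):
    """Model MySQL CONCAT: the result is NULL (None) if any argument is NULL."""
    if any(v is None for v in values):
        return None
    return "".join(str(v) for v in values)


def mysql_md5(value):
    """Model MySQL MD5: a NULL input gives a NULL result."""
    if value is None:
        return None
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def row_hash_naive(row):
    """Hash a row the way the root cause describes: CONCAT with no NULL handling."""
    return mysql_md5(mysql_concat(*row))


def row_hash_null_safe(row, null_marker="<NULL>"):
    """Hypothetical fix: substitute a marker for NULL before concatenating.

    The marker is an assumption; an empty string would also avoid NULL hashes,
    but would make a NULL column hash the same as an empty-string column.
    """
    substituted = [null_marker if v is None else v for v in row]
    return mysql_md5(mysql_concat(*substituted))


# Two clearly different rows that both contain a NULL column.
row_a = ("alice", None, 1)
row_b = ("bob", None, 2)

naive_a = row_hash_naive(row_a)      # None: the NULL propagates through CONCAT and MD5
naive_b = row_hash_naive(row_b)      # None as well, so the two hashes cannot tell the rows apart
safe_a = row_hash_null_safe(row_a)   # a real MD5 hex digest
safe_b = row_hash_null_safe(row_b)   # a different MD5 hex digest
```

With the naive version both rows collapse to a NULL hash, which is the failure mode described in the root cause. With the substitution in place, each row gets its own digest, so the difference between the rows is visible again.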

kroky commented Dec 17, 2024
