Handle NULL values in hash calculation for accurate data comparison #3

bienvenuushindi · 2024-12-16T21:47:30Z

Context:

While using DBDiff to compare data and schemas, we found a problem in how NULL values are handled during data comparison. When any column included in the MD5 hash computation contains NULL, the hash for that row is unreliable. The cause is the behavior of MySQL's CONCAT, which returns NULL if any of its arguments is NULL.

Root Cause:

When constructing the MD5 hashes used for row comparison, the query builder passed column values to CONCAT without handling NULLs. Any row with a NULL column therefore produced a NULL hash, so rows could be incorrectly identified as identical even when their other columns clearly differed.

For example:

SELECT CONCAT('value', NULL);  -- Result is NULL as shown below
+-----------------------+
| CONCAT('value', NULL) |
+-----------------------+
| NULL                  |
+-----------------------+
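
The effect on row hashes can be sketched outside the database. The snippet below is only an illustration in Python, not the code changed in this PR: `mysql_concat`, `naive_row_hash`, `null_safe_row_hash`, and the `<NULL>` marker are hypothetical names that mimic the CONCAT behavior described above. It assumes the remedy is to substitute a fixed marker for NULL before concatenating, roughly what wrapping each column in IFNULL(col, marker) does in MySQL.

```python
import hashlib

# Hypothetical sentinel standing in for NULL; any fixed string that does not
# occur in real data would do.
NULL_MARKER = "<NULL>"


def mysql_concat(*values):
    """Mimic MySQL CONCAT: the result is None (NULL) if any argument is None."""
    if any(v is None for v in values):
        return None
    return "".join(str(v) for v in values)


def naive_row_hash(row):
    """Hash a row the way MD5(CONCAT(col1, col2, ...)) would: NULL in, NULL out."""
    joined = mysql_concat(*row)
    if joined is None:
        return None
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def null_safe_row_hash(row, null_marker=NULL_MARKER):
    """Replace None with a marker first, so the hash still reflects the other columns."""
    substituted = [null_marker if v is None else v for v in row]
    joined = mysql_concat(*substituted)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


row_a = ("alice", None, 1)
row_b = ("bob", None, 2)

# Both naive hashes collapse to None, so the rows cannot be told apart.
print(naive_row_hash(row_a), naive_row_hash(row_b))
# The NULL-safe hashes differ because the non-NULL columns differ.
print(null_safe_row_hash(row_a) != null_safe_row_hash(row_b))
```

One caveat with this approach: a marker that can also appear as a real column value makes a NULL indistinguishable from that value, so the marker should be chosen with the data in mind.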

kroky · 2024-12-17T13:13:58Z

Original issue found here: https://gitlab.com/tikiwiki/tiki/-/merge_requests/6532#note_2241090253

[FIX] handle NULL values in hash calculation for accurate data comparison

b2cb0db