-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AVRO-3912: Fix Big decimal deser #2599
Conversation
I thought the error was on serialization, not deserialization |
@Ten0 : I made serialization the same way i made it on Java, to keep capabilities to serialize in Java or Rust and deserialize in Java or Rust. So, i found better to fix deserialization in Rust than to review Java & rust module. |
If you skip serializing the bytes length and you still have the scale after the first buffer then that means the BigDecimal logical type cannot be deserialized as bytes (because you'd have an extra int after), which looks like it's incorrect because it's only a logical type and not a full new type. (One would expect from a logical type that if it's unknown by their library they can still deserialize it as the main type and either ignore it or interprete it manually.) |
lang/rust/avro/src/decimal.rs
Outdated
@@ -214,4 +218,63 @@ mod tests { | |||
|
|||
Ok(()) | |||
} | |||
|
|||
#[test] | |||
fn other() -> TestResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please give the method a better name, e.g. test_avro_3912_...
lang/rust/avro/src/decimal.rs
Outdated
let val = BigDecimal::new(BigInt::from(12), 2); | ||
record.put("field_name", val.clone()); | ||
|
||
// save record in writer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// save record in writer | |
// write a record |
lang/rust/avro/src/decimal.rs
Outdated
.build(); | ||
|
||
let result = writer.append(record.clone()); | ||
assert!(result.is_ok(), "Append KO : {}", result.unwrap_err()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert!(result.is_ok(), "Append KO : {}", result.unwrap_err()); | |
assert!(result.is_ok(), "Append NOK : {result}"); |
lang/rust/avro/src/decimal.rs
Outdated
|
||
let result = writer.append(record.clone()); | ||
assert!(result.is_ok(), "Append KO : {}", result.unwrap_err()); | ||
assert!(writer.flush().is_ok(), "Flush KO"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually there is no need to assert!()
these above. You could just use ?
, e.g. writer.append(record.clone())?;
and it will propagate the error because the method returns TestResult
.
I try to avoid using .unwrap()
in general. Especially in non-test code its usage should be forbidden. I am not sure whether there is a Clippy check that we could enforce.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add an interop test (or extend the current one) for BigDecimal in Rust and Java! |
Get it, i didn't realize that a logical type has to be deserializable in its main type; i do the fix. |
By the way if the scale was written at the beginning of the Bytes instead of the end then you wouldn't need to serialize an extra length, because the scale encoded like other ints is self-delimited so then the rest of the Bytes would be the big endian encoded part. That would be a more efficient storage, more in the spirit of the rest of the format which is to store as little amount of metadata as possible. Maybe it isn't too late to change that since it's not released? |
I add test with file generated with Java module and fix code comments. To with scale to avoid serializing an extra length is a good idea, but, Java decoder interface miss a function like "readRemainder()" or something similar; because we don't know the number of byte took by scale value with readInt() (can be one or several byte, depends on the value itself). |
Isn't avro/lang/java/avro/src/main/java/org/apache/avro/Conversions.java Lines 90 to 93 in b4968f7
|
.remaining() method apply on ByteBuffer value variable, while decoder class use its own "pos" field to "iterate" on data. So, it would need as "byte[] decoder.readRemaining()" method. |
Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
lang/rust/avro/src/decimal.rs
Outdated
@@ -113,13 +113,17 @@ impl<T: AsRef<[u8]>> From<T> for Decimal { | |||
} | |||
|
|||
pub(crate) fn serialize_big_decimal(decimal: &BigDecimal) -> Vec<u8> { | |||
// encode big decimal, without global size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
global size
sounds too ... global.
How about total size
?
…bigdecimal.rs) Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
Alternately you could make
and then parse value[decoder.getPos()..] .Either way that looks simple enough. |
make getPos public => Well, pos is an implementattion detail of BinaryDecorder, that may should not be public. |
I first merge this and report to 1.11 branch to align Rust & Java BigDecimal logical type. |
* AVRO-3912: Fix Big decimal deser Co-authored-by: Martin Tzvetanov Grigorov <[email protected]>
* AVRO-3912: Fix Big decimal deser (#2599) * AVRO-3912: Fix Big decimal deser Co-authored-by: Martin Tzvetanov Grigorov <[email protected]> * AVRO-3912: error comment
* AVRO-3912: Fix Big decimal deser Co-authored-by: Martin Tzvetanov Grigorov <[email protected]>
What is the purpose of the change
AVRO-3912 : fix big decimal deserialisation error (thanks @Ten0 for comment on this PR).
Verifying this change
Unit test from writer to reader in decimal.rs source file
Documentation