Convert BigInt/Decimal to expected types for packed numbers #1391
Conversation
- add checks for signed/unsigned in BinaryIntegerBaseParser and BinaryIntegerBaseUnparser
- add tests for parse and unparse
- update failing tests
DAFFODIL-2297
- currently we don't check the minimum lengths for signed/unsigned binary ints, and this causes issues when you try to set a binary int to 0 bits (RSDE), so per the DFDL workgroup we add in these checks
- add width check to unparse code and associated test
- fix bug where we weren't returning after UnparseError
- add SDEs for checks for non-expression fixed lengths
- add tunable allowSignedIntegerLength1Bit and update tests to use it
- add WarnID SignedBinaryIntegerLength1Bit - SDW by default, SDE or UE/PE if tunable is false
DAFFODIL-2297
- currently we pass in an isSigned boolean from ElementBaseGrammarMixin, but we can use the new PrimType.PrimNumeric.isSigned member instead for simplicity. We also added the minWidth member to simplify checking the minimum width of PrimNumeric types. DAFFODIL-2339
- move vals into conditional blocks that use them
- fix diagnostic message to be grammatically correct for unsigned
- make binaryIntegerValue a lazy val
- add tests for coverage
- make isSigned a def instead of a val
- set minWidth for decimal to Nope
- fix minWidth bug for Byte type
- add comment for non-PrimNumeric usage of PackedBinaryIntegerBaseParser
- make it a PE if nBits == 0 for PackedBinaryIntegerBaseParser
DAFFODIL-2297
- added test for parsing from -1 to 1 bit
- added checks for signed/unsigned minWidth to decimal with tests
DAFFODIL-2297
- added more decimal tests
- updated PE to be dynamic for signed and unsigned packed integers
- added PE for packed decimal with length 0
DAFFODIL-2297
… Integers
- move checkMinWidth and checkMaxWidth into BinaryNumberCheckWidth trait
- no SDW on parsing/unparsing 1 bit length signed numbers (just during compilation for bit length known at compile time)
- update tests
- add test for packed integer IVC
DAFFODIL-2297
</tdml:defineSchema>

<!-- Daffodil 2961 -->
<tdml:parserTestCase name="bcdBigDecimalError" root="bcdBigDec" model="bcdErr"
Can you remove "error"/"err" from the schema and test name. I think in general that implies it is a negative test case. Maybe use "expr" instead to make it clear that this is about using packed values in expressions?
...dil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/PackedBinaryTraits.scala
case pn: PrimType.PrimNumeric => pn.fromNumber(toBigDecimal(num, scale))
// Non-numeric types such as Time can still use these functions and
// expect BigDecimal as the output of the conversion
case _ => toBigDecimal(num, scale)
It looks like we always use the packed integer parsers for dealing with date/times, so should this be an Assert.impossible() case?
I believe that is correct. Will change.
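A minimal sketch of what that change might look like, reusing the match from the snippets in this thread; the assertion message is illustrative, not the actual wording:

context.optPrimType.get match {
  case pn: PrimType.PrimNumeric => pn.fromNumber(toBigDecimal(num, scale))
  // Date/times always go through the packed integer parsers, so a non-numeric
  // type reaching this decimal conversion would indicate a coding error.
  case _ => Assert.impossible("packed decimal conversion used for a non-numeric type")
}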
@@ -38,6 +40,24 @@ import passera.unsigned.ULong
trait PackedBinaryConversion {
  def toBigInteger(num: Array[Byte]): JBigInteger
  def toBigDecimal(num: Array[Byte], scale: Int): JBigDecimal
I wonder if it's worth doing a little more refactoring. I say this because something that implements PackedBinaryConversion only ever uses one of toBigInteger/toInteger or toBigDecimal/toDecimal. So it feels a bit odd to require both be implemented. And potentially unsafe--an implementation could accidentally use the wrong function.
What if instead we remove PackedBinaryConversion altogether, PackedBinaryDecimalBaseParser has the abstract definition of toBigDecimal, and PackedBinaryIntegerBaseParser has the abstract definition of toBigInteger? Then these new toInteger and toDecimal functions go away and the logic just moves into the appropriate parser trait.
The issue here is that we would still need to duplicate everything in the PackedBinaryDecimal/IntegerDelimitedBaseParsers as well. I think that was the reason for the conversion trait to begin with. The DelimitedBaseParsers do not share anything with the standard BaseParsers.
PackedIntegerDelimitedParser only uses toBigInteger and PackedDecimalDelimitedParser only uses toBigDecimal, so these two classes do need an abstract toBigInteger/toBigDecimal respectively, but they shouldn't need both, as the PackedBinaryConversion trait currently requires.
Wouldn't the delimited parsers need the same conversion code to avoid the error this bug was fixing where a value gets used in an expression?
Ah right, maybe the right refactoring is more like this?

// returns either BigDecimal or BigInteger depending on the implementation
def toBigNumber(num: Array[Byte]): Number

def toPrimType(...): DataValueNumber = {
  val bigNumber = toBigNumber(...)
  context.optPrimType.get match {
    case pn: PrimType.PrimNumeric => pn.fromNumber(bigNumber)
    case _ => bigNumber
  }
}
Then all the individual parsers just have to implement toBigNumber correctly, and the abstract base classes just call toPrimType. It might even be worth renaming it and changing it to also handle exceptions and call setDataValue to avoid more duplication.
Just making sure that I'm on the same page. Is that not what I did below (although they are called toInteger/toDecimal instead of toPrimType)? I recognize that I need to add some exception handling for potential InvalidPrimitiveDataExceptions. I think it also makes more sense to me to keep the calls that actually set the value inside the parse functions instead of moving them inside the toPrimType trait function.
I think the main difference I'm suggesting is making the trait more generic so implementations don't have to implement both toBigInteger and toBigDecimal; they just implement a single toBigNumber that returns a BigDecimal or BigInteger depending on whether it's a decimal or integer type of parser. To ensure correct types, we could define it something like this:
trait PackedBinaryConversion[A <: Number] {
  def toNumber(num: Array[Byte]): A
  def toPrimType(...) = { ... toNumber(...) }
}
And then usages would provide the right type parameter, for example:

abstract class PackedBinaryIntegerDelimitedBaseParser(...)
  ...
  with PackedBinaryConversion[JBigInteger]

abstract class PackedBinaryDecimalDelimitedBaseParser(...)
  ...
  with PackedBinaryConversion[JBigDecimal]
Avoids duplicating the decimal/integer logic in the PackedBinaryConversion trait.
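As a standalone illustration of why the type parameter helps (this shows only the pattern, not Daffodil's actual API or class names): each implementation can only produce the single number type it declares, so mixing up the integer and decimal conversions no longer type-checks.

trait PackedBinaryConversion[A <: java.lang.Number] {
  // Concrete implementations decode raw bytes into exactly one number type.
  def toNumber(num: Array[Byte]): A
}

class ExampleIntegerConversion extends PackedBinaryConversion[java.math.BigInteger] {
  // Stand-in for the real BCD/packed decoding; just interprets the bytes directly.
  override def toNumber(num: Array[Byte]): java.math.BigInteger =
    new java.math.BigInteger(num)
}

class ExampleDecimalConversion(scale: Int) extends PackedBinaryConversion[java.math.BigDecimal] {
  // The scale stays private to the decimal implementation.
  override def toNumber(num: Array[Byte]): java.math.BigDecimal =
    new java.math.BigDecimal(new java.math.BigInteger(num), scale)
}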
I think it also makes more sense to me to keep the calls to actually set the value inside the parse functions instead of moving them inside the toPrimType trait function.
I have no strong preference either way. You just have to duplicate error handling logic if it's not done in the shared trait.
Is there a clean way to have a single toNumber function while also dealing with the scale parameter that is needed for Decimal? I thought about defaulting scale to 0 and treating scale=0 as a BigInt, but scale=0 is valid for BigDecimal as well.
You just have to duplicate error handling logic if it's not done in the shared trait.
If I have it in the trait I would then need to pass the current state in as well to create the Parse Error. Since the PackedBinaryConversion trait is shared, I wasn't sure if there was a way to pass in either a Parse state or an Unparse state that could create errors appropriately.
I'm pushing up some changes now that I think cover everything else pretty well.
Could you drop the scale parameter and get it from the class member, e.g.

class BCDDecimalKnownLengthParser(..., binaryDecimalVirtualPoint: Int)
  extends ... {
  override def toNumber(num: Array[Byte]): JBigDecimal =
    DecimalUtils.bcdToBigDecimal(num, binaryDecimalVirtualPoint)
}
I think this is actually an improvement over what we have now, since the abstract class shouldn't need to know anything about how the implementation converts a byte array to a Number: all the information like scale, sign codes, etc. is private to the specific packed binary number implementation. And if we ever need some new parameter on how a byte array is converted to a Number, it can be added specifically in the implementation and the base class doesn't have to change at all.
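For comparison, the integer side would follow the same pattern. This is only a sketch in the same style as above and assumes a helper like DecimalUtils.bcdToBigInteger; the class and parameter names are illustrative:

class BCDIntegerKnownLengthParser(...)
  extends ... with PackedBinaryConversion[JBigInteger] {
  // No virtual decimal point is needed here; the byte-decoding details
  // stay private to this implementation.
  override def toNumber(num: Array[Byte]): JBigInteger =
    DecimalUtils.bcdToBigInteger(num)
}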
I've implemented it as you described now. It is much cleaner now I think. Unfortunately rebasing onto Lola's branch has really muddied up the diff when it comes to review.
...dil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/PackedBinaryTraits.scala
@@ -26,6 +26,8 @@ import org.apache.daffodil.lib.equality.TypeEqual
import org.apache.daffodil.lib.exceptions.Assert
We should probably hold off until #1337 is merged, since that touches some of this code and is the bigger commit; it will be easier to deal with conflicts in the smaller commit.
@@ -100,4 +100,7 @@ class TestPacked {
  @Test def testDelimitedIBM4690DecSeqUnparser(): Unit = {
    runner.runOneTest("DelimitedIBM4690DecSeqUnparser")
  }

  // Daffodil-2961
  @Test def testBCDBigDecimalError(): Unit = { runner.runOneTest("bcdBigDecimalError") }
Can you add some out-of-range tests as well? That's a new limitation this change enforces that I don't think was enforced previously.
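The driver side of such a test could follow the pattern already used in TestPacked above; the test and TDML case names here are hypothetical:

// Hypothetical test: parse a packed value too large for the element's
// primitive type and expect an out-of-range parse error.
@Test def testBCDBigDecimalOutOfRange(): Unit = { runner.runOneTest("bcdBigDecimalOutOfRange") }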
… Binary Integers
- add tests for 0 bit length and multiple of 4 check
- add runtime multiple of 4 length check

Deprecation/Compatibility: Although still supported via the tunable allowSignedIntegerLength1Bit, it is recommended that schemas be updated to not depend on 1-bit signed binary integers.
DAFFODIL-2297
- add test for verification DAFFODIL-1842
Bumps fedora from 40 to 41.
---
updated-dependencies:
- dependency-name: fedora
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Also removes the "dnf update nodejs". That was needed because the pinned version of nodejs in Fedora 40 was broken. Fedora 41 ships with a newer version of nodejs that includes a fix, so this update is no longer needed.

Signed-off-by: dependabot[bot] <[email protected]>
Tests show that parse of negative representations (from packed decimal reps) does not detect the error when decimalSigned="no". Tests also show that unparse of negative values into binary, packed, and bcd are all broken. This doesn't fix the issue; it just adds tests to illustrate the failing cases. DAFFODIL-2957
Moved TestUtils and FuzzData in core to src/main so FuzzData can be used outside of daffodil proper. Removed unused and trivial methods from TestUtils. Made all methods private where possible, so we're left with a minimum footprint of public methods. Removed the dependency on JUnit from TestUtils and FuzzData so they don't have to be in src/test. Tested for and removed a couple of TODOs that are no longer needed. Reducing the number of TODOs and FIXMEs is DAFFODIL-1569. DAFFODIL-2959, DAFFODIL-1569
+1
I don't have much to add beyond finding one typo.
<!-- Daffodil 2961 -->
<tdml:parserTestCase name="bcdBigDecimalError" root="bcdBigDec" model="bcdErr"
  description="Verify that usign packed values in an expression work correctly">
usign => unsigned
Rebased on PR apache#1337
26912c9 to 416bf4f
Still have an edge case where packed xs:date/Time values require an actual BigInt, but this fixes DAFFODIL-2961.