Convert BigInt/Decimal to expected types for packed numbers #1391

jadams-tresys · 2024-12-13T15:01:36Z

Still have an edge case for packed xs:date/Time values require an actual BigInt, but this fixes DAFFODIL-2961

- add checks for signed/unsigned in BinaryIntegerBaseParser and BinaryIntegerBaseUnparser - add tests for parse and unparse - update failing tests DAFFODIL-2297

- currently we don't check the minimum lengths for signed/unsigned binary ints and this causes issues when you try to set a binary int to 0 bits (RSDE), so per the DFDL workgroup we add in these checks. - add width check to unparse code and associated test - fix bug where we weren't returning after UnparseError - add SDEs for checks for non-expression fixed lengths - add tunable allowSignedIntegerLength1Bit and update tests to use - add WarnID SignedBinaryIntegerLength1Bit - SDW by default, SDE or UE/PE if tunable is false DAFFODIL-2297

- currently we pass in isSigned boolean from ElementBaseGrammarMixin, but we can use the new PrimType.PrimNumeric.isSigned member instead for simplicity. We also added the minWidth member to simplify checking the minimum width of PrimNumeric types DAFFODIL-2339

- move vals into conditional blocks that use them - fix diagnostic message to be grammatically correct for unsigned - make binaryIntegerValue lazy val - add tests for coverage - make isSigned a def instead of a val - set minWidth for decimal to Nope - fix minWidth bug for Byte type - add comment for non PrimNumeric usage of PackedBinaryIntegerBaseParser - make it a PE if nBits == 0 for PackedBinaryIntegerBaseParser DAFFODIL-2297

- added test for parsing from -1 to 1 bit - added checks for signed/unsigned minWidth to decimal with tests DAFFODIL-2297

- added more decimal tests - updated PE to be dynamic for signed and unsigned packed integers - added PE for packed decimal with length 0 DAFFODIL-2297

… Integers - move checkMinWidth and checkMaxWidth into BinaryNumberCheckWidth trait - no SDW on parsing/unparsing 1 bit length signed numbers (just during compilation for bit length known at compile time) - update tests - add test for packed integer IVC DAFFODIL-2297

stevedlawrence · 2024-12-13T15:38:18Z

daffodil-test/src/test/resources/org/apache/daffodil/section13/packed/packed.tdml

+  </tdml:defineSchema>
+
+  <!-- Daffodil 2961 -->
+  <tdml:parserTestCase name="bcdBigDecimalError" root="bcdBigDec" model="bcdErr"


Can you remove "error"/"err" from the schema and test name. I think in general that implies it is a negative test case. Maybe use "expr" instead to make it clear that this is about using packed values in expressions?

...dil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/PackedBinaryTraits.scala

stevedlawrence · 2024-12-13T16:01:46Z

...dil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/PackedBinaryTraits.scala

+      case pn: PrimType.PrimNumeric => pn.fromNumber(toBigDecimal(num, scale))
+      // Non-numeric types such as Time can still use these funcitons and
+      // expect BigDecimal as the output of the conversion
+      case _ => toBigDecimal(num, scale)


It looks like when we always use the packed integer parsers for dealing with date/times, so should this be an Assert.impossible() case?

I believe that is correct. Will change.

stevedlawrence · 2024-12-13T16:22:12Z

...dil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/PackedBinaryTraits.scala

@@ -38,6 +40,24 @@ import passera.unsigned.ULong
 trait PackedBinaryConversion {
  def toBigInteger(num: Array[Byte]): JBigInteger
  def toBigDecimal(num: Array[Byte], scale: Int): JBigDecimal
+


I wonder if it's worth doing a little more refactoring. I say this because something that implements PackedBinaryConversion only ever uses one of toBigInteger/toInteger or toBigDecimal/toDecimal. So it feels a bit odd to require both be implemented. And potentially unsafe--an implementation could accidentally use the wrong function.

What if instead we remove PackedBinaryConversion altogether, and PackedBinaryDecimalBaseParser has the abstract definition of toBigDecimal, and PackedBinaryIntegerBaseParser has the abstract definition for toBigInteger. Then these new toInteger and toDecimal functions go and the logic just moved into the the appropriate parser trait.

The issue here is that we would still need to duplicate everything in the PackedBinaryDecimal/IntegerDelimitedBaseParsers as well. I think that was the reason for the conversion trait to begin with. The DelimitedBaseParsers do not share anything with the standard BaseParsers.

PackedIntegerDelimitedParser only uses toBigInteger and PackedDecimalDelimitedParser only uses toBigDecimal, so these two classes do need an abstract toBigInteger/toBigDecimal respectively, but they shouldn't need both that the PackedBinaryConversion trait requires.

Wouldn't the delimited parsers need the same conversion code to avoid the error this bug was fixing where a value gets used in an expression?

Ah right, maybe right refactoring is more like this?

// returns either BigDecimal or BigInteger depending on the implementation def toBigNumber(num: Array[Byte]): Number def toPrimType(...) : DataValueNumber = { val bigNumber = toBigNumber context.optPrimType.get match { case pn: PrimType.PrimNumeric => pn.fromNumber(bigNmber) case => bigNumber } }

Then all the individual parses just have to implement toBigNumber correctly, and the abstract base classes just call toPrimType. Might even be renaming it and changing it to also handle exceptions and call setDataValue to avoid more duplication.

Just making sure that I'm on the same page. Is that not what I did below (although they are called toInter/Decimal instead of toPrimType)? I recognize that I need to add some exception handling for potential InvalidPrimitiveDataExceptions. I think it also makes more sense to me to keep the calls to actually set the value inside the parse functions instead of moving them inside the toPrimType trait function.

I think the main difference I'm suggesting is making the trait more generic so implementations don't have to implement both toBigInteger and toBigDecimal, they just implement a single toBigNumber and that will return the BigDecimal or BigInteger depending on if it's an decimal or integer type of parser. To ensure correct types, we could define it something like this:

trait PackedBinaryConversion[A <: Number] { def toNumber(num: Array[Byte]): A def toPrimType(...) = { ... toNumber(...) } }

And then usages would provide the right type parameter, for example:

abstract class PackedBinaryIntegerDelimitedBaseParser(...) ... with PackedBinaryConversion[JBigInteger] ... abstract class PackedBinaryIntegerDelimitedDecimalParser(...) ... with PackedBinaryConversion[JBigDecimal]

Avoids duplicating the decimal/integer logic in the PackedBinaryConversion trait.

I think it also makes more sense to me to keep the calls to actually set the value inside the parse functions instead of moving them inside the toPrimType trait function.

I have no strong preference either way. You just have to duplicate error handling logic if it's not done in the shared trait.

Is there a clean way to have a single toNumber function while also dealing with the scale parameter that is needed for Decimal? I thought about defaulting scale to 0 and treating scale=0 as a BigInt, but scale=0 is valid for BigDecimal as well.

You just have to duplicate error handling logic if it's not done in the shared trait.
If I have it in the trait I would then need to pass that current state in as well to create the Parse Error. Since the PackedBinaryConversion trait I wasn't sure if there was a way to pass in either a Parse state or an Unparse state that could create errors appropriately.

I'm pushing up some changes now that I think cover everything else pretty well.

Could you drop the scale parameter and get it from the class member, e.g.

class BCDDecimalKnownLengthParser(..., binaryDecimalVirtualPoint: Int) extends ... { override def toNumber(num: Array[Byte]): JBigDecimal = DecimalUtils.bcdToBigDecimal(num, binaryDecimalVirtualPoint) }

I think this actually makes is actually an improvement over what we have now, since the abstract class shouldn't need to know anything about how the implementations converts a byte array to a Number, all the information like scale, sign codes, etc. is private to the specific packed binary number implementation. And if we ever need some new parameter on how a byte array is converted to a Number it can be done specifically in the implementation and the base class doesn't have to change at all.

I've implemented it as you described now. It is much cleaner now I think. Unfortunately rebasing onto Lola's branch has really muddied up the diff when it comes to review.

...dil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/PackedBinaryTraits.scala

stevedlawrence · 2024-12-13T16:49:57Z

...dil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/PackedBinaryTraits.scala

@@ -26,6 +26,8 @@ import org.apache.daffodil.lib.equality.TypeEqual
 import org.apache.daffodil.lib.exceptions.Assert


We should probably hold off until #1337 is merged, since that touches some of this code and is the bigger commit, easier to deal with conflicts with the smaller commit.

stevedlawrence · 2024-12-13T16:50:49Z

daffodil-test/src/test/scala/org/apache/daffodil/section13/packed/TestPacked.scala

@@ -100,4 +100,7 @@ class TestPacked {
  @Test def testDelimitedIBM4690DecSeqUnparser(): Unit = {
    runner.runOneTest("DelimitedIBM4690DecSeqUnparser")
  }
+
+  // Daffodil-2961
+  @Test def testBCDBigDecimalError(): Unit = { runner.runOneTest("bcdBigDecimalError") }


Can you add some out of range tests as well? That's a new limitation that this will enforce that I think wasn't previously enforced.

… Binary Integers - add tests for 0 bit length and multiple of 4 check - add runtime multiple of 4 length check Deprecation/Compatibility Although still supported via the tunable allowSignedIntegerLength1Bit, it is recommended that schemas be updated to not depend on 1-bit signed binary integers DAFFODIL-2297

- add test for verification DAFFODIL-1842

Bumps fedora from 40 to 41. --- updated-dependencies: - dependency-name: fedora dependency-type: direct:production update-type: version-update:semver-major ... Also removes the "dnf update nodejs". That was needed because the pinned version of nodejs in Fedora 40 was broken. Fedora 41 ships with a newer version of nodejs that includes a fix so this update is no longer needed. Signed-off-by: dependabot[bot] <[email protected]>

Tests show that parse of negative representations (from packed decimal reps) does not detect the error when decimalSigned="no". Test aalso show that unparse of negative values into binary, packed, and bcd all are broken. This doesn't fix the issue. Just adds tests to illustrate the failing cases. DAFFODIL-2957

DAFFODIL-2797

Moved TestUtils and FuzzData in core to src/main so FuzzData can be used outside of daffodil proper. Removed unused and trivial methods from TestUtils. Made private all methods possible. We're left with minimum footprint of public methods. Removed dependency on JUnit from TestUtils and FuzzData so they don't have to be in src/test. Tested for and removed a couple of TODOs that are no longer needed. Reducing the number of TODOs and FIXMEs is DAFFODIL-1569. DAFFODIL-2959, DAFFODIL-1569

mbeckerle

+1
I don't have much to add beyond finding one typo.

mbeckerle · 2024-12-18T14:21:45Z

daffodil-test/src/test/resources/org/apache/daffodil/section13/packed/packed.tdml

+
+  <!-- Daffodil 2961 -->
+  <tdml:parserTestCase name="bcdBigDecimalError" root="bcdBigDec" model="bcdErr"
+    description="Verify that usign packed values in an expression work correctly">


usign => unsigned

Rebased on PR apache#1337

olabusayoT added 7 commits November 13, 2024 14:44

Parse Error on Out of Range Binary Integers

e88324f

- add checks for signed/unsigned in BinaryIntegerBaseParser and BinaryIntegerBaseUnparser - add tests for parse and unparse - update failing tests DAFFODIL-2297

fixup! fixup! fixup! fixup! Parse Error on Out of Range Binary Integers

ecd1f45

- added test for parsing from -1 to 1 bit - added checks for signed/unsigned minWidth to decimal with tests DAFFODIL-2297

fixup! fixup! fixup! fixup! Parse Error on Out of Range Binary Integers

a67fe6d

- added more decimal tests - updated PE to be dynamic for signed and unsigned packed integers - added PE for packed decimal with length 0 DAFFODIL-2297

stevedlawrence requested changes Dec 13, 2024

View reviewed changes

stevedlawrence mentioned this pull request Dec 13, 2024

Parse Error on Out of Range Binary Integers #1337

Open

olabusayoT and others added 8 commits December 13, 2024 14:24

Verify overlap check doesn't occur with default format

50af13f

- add test for verification DAFFODIL-1842

Add final

ea91e2b

DAFFODIL-2797

Convert BigInt/Decimal to expected types for packed numbers

95f1129

fixup!

9cde7f5

mbeckerle approved these changes Dec 18, 2024

View reviewed changes

Address some comments from review.

416bf4f

Rebased on PR apache#1337

jadams-tresys force-pushed the 2961-BCD_Long branch from 26912c9 to 416bf4f Compare December 18, 2024 20:51

Refactored PackedBinaryConversion trait

8b0b392

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convert BigInt/Decimal to expected types for packed numbers #1391

Convert BigInt/Decimal to expected types for packed numbers #1391

jadams-tresys commented Dec 13, 2024

stevedlawrence Dec 13, 2024

stevedlawrence Dec 13, 2024

jadams-tresys Dec 13, 2024

stevedlawrence Dec 13, 2024

jadams-tresys Dec 18, 2024

stevedlawrence Dec 18, 2024

jadams-tresys Dec 18, 2024

stevedlawrence Dec 18, 2024

jadams-tresys Dec 18, 2024

stevedlawrence Dec 18, 2024

jadams-tresys Dec 18, 2024

stevedlawrence Dec 18, 2024

jadams-tresys Dec 19, 2024

stevedlawrence Dec 13, 2024

stevedlawrence Dec 13, 2024

mbeckerle left a comment

mbeckerle Dec 18, 2024

		@@ -26,6 +26,8 @@ import org.apache.daffodil.lib.equality.TypeEqual
		import org.apache.daffodil.lib.exceptions.Assert

Convert BigInt/Decimal to expected types for packed numbers #1391

Are you sure you want to change the base?

Convert BigInt/Decimal to expected types for packed numbers #1391

Conversation

jadams-tresys commented Dec 13, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mbeckerle left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment