Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Evaluate assert/discriminator expressions after groupContent: WIP #987

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

jadams-tresys
Copy link
Contributor

According to 9.5 of the DFDL spec assert and discriminators with expressions should be processed after the content of their enclosing sequence, group, or choice. Before these expressions were always being processed before the content.

This commit also moves the setVariable expression evaluaiton to the correct place, which is before the enclosing group.

DAFFODIL-1971, DAFFODIL-1590

@jadams-tresys
Copy link
Contributor Author

jadams-tresys commented Mar 13, 2023

This is primarily WIP as it needs more test coverage and I have been informed that I need to hold off on working on this for a little while.

It is fully functional, just needs more tests for edge cases surrounding evaluating expressions after the content of sequences, groups, and choices.

Also need to add tests for the setVariable change.

@stevedlawrence stevedlawrence marked this pull request as draft September 18, 2024 11:27
Josh Adams added 3 commits January 3, 2025 10:28
According to 9.5 of the DFDL spec assert and discriminators with
expressions should be processed after the content of their enclosing
sequence, group, or choice. Before these expressions were always being
processed before the content.

This commit also moves the setVariable expression evaluaiton to the
correct place, which is before the enclosing group.

DAFFODIL-1971, DAFFODIL-1590
Tests for the following:
 - Sequence body succeeds but discriminator fails
 - Sequence body fails and discriminator fails
 - Sequence body fails but discriminator succeeds and references an
   element in the partial sequence body infoset
 - Sequence body fails and discriminator fails and references an
   element in the partial sequence body infoset
Copy link
Member

@stevedlawrence stevedlawrence left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the same assert could be evaluated multiple times?

Also, have you run this against the regression suite? I wonder if there are any schemas that rely on the current behavior and could mess up how they use discriminators?

@@ -217,7 +217,7 @@ trait ProvidesDFDLStatementMixin extends ThrowsSDE with HasTermCheck {
final lazy val patternStatements: Seq[DFDLStatement] = patternAsserts ++ patternDiscrims

final lazy val lowPriorityStatements: Seq[DFDLStatement] =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea where "low priority" comes from? I'm not familiar with that as a DFDL term. Wondering if something like "expressionStatements" would be more clear?

new SeqCompParser(
context.runtimeData,
parserChildren.toVector,
assertExpressionChildren.toVector,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the paserChildren vector contains the assertExpressionChildren parsers, which I think means the assert parsers will be evaluated twice? Once when all the parserChildren are evaluated and again after the parserChildren finis and then we evaluate the asserts Parsers? Seems like parserChildren should not contain the assert children?

testAssert.foreach { ap =>
pstate.withTempSuccess(ap.parse1)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I guess I now see why it's okay for childrenParsers to also include the assertion parsers. If all the non-assert parsers succeed then we'll don't evaluate the assertion parsers.

However, I think there is still an issue. Say all the non-assertion parsers succeed, then we start evaluating the assertion parsers, and assume one of them fails. The pstate.processorStatus ne Success will be true, and then we'll evaluate all the assertion parsers, including ones we've already evaluated.

So I think this approach still feels like it needs a tweak. Almost feels like the assert parsers need to be completely separate from the other parsers or something.

setSuccess()
func(this)
if (processorStatus eq Success)
_processorStatus = priorProcessorStatus
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditionally resetting back to the previous status feels like it might be confusing? Feels like if this is temporarily ignoring status it should always reset back to whatever the status was before. Maybe the way this wants to work is it it returns the status of func, and the caller can choose to do with it whatever they want? Something more like

def withTempSuccess(func: (PState) => Unit): ProcessorResult = {
  val priorStatus = processorStatus
  setSuccess()
  func(this)
  val funcStatus = processorStatus
  _processorStatus = priorStatus
  funcStatus
}

Also, Is there any value in passing in a lambda? An alternative would be to just pass in a Parser, and then instead of calling func(this), it could call parser.parse1(this). That ensures that this is always called with a Parser instead of a function that just happens to accept a PState, which is probably safer?

if (pstate.processorStatus ne Success)
if (pstate.processorStatus ne Success) {
if (testAssert.nonEmpty) {
testAssert.foreach { ap =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had another thought about this, this will evaluate all assertions even if one fails. Is that the correct behavior, or should it stop after the first assertion failure? Seems like if one assertion fails it shouldn't continue to evaluate other assertions?

@@ -89,10 +90,19 @@ class SeqComp private (context: SchemaComponent, children: Seq[Gram])
_.isInstanceOf[NadaParser]
}

lazy val assertExpressionChildren = parserChildren.filter {
_.isInstanceOf[AssertExpressionEvaluationParser]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the spec some more, it says:

an attempt to evaluate a discriminator MUST be made even if preceding statements or the parse of the schema component ended in a Processing Error.

I think this implies asserts should not be evaluated if the preceding statements fails? Which kindof makes sense since an assert only causes backtracking and does not discriminate points of uncertainty?

So it feels like the logic needs to differentiate between asserts and discriminators? Both should be evaluated at the end, but only discriminators should be evaluated if the prior parses succeed?

Another question, are discriminators always evaluated before asserts? Or are they evaluated depending on the order defined in the DFDL schema? For example, if you have:

<xs:appinfo source="http://www.ogf.org/dfdl/">
  <dfdl:assert test="..." />
  <dfdl:discriminator test="..." />
  <dfdl:assert test="..." />
</xs:appinfo>

Should that evaluate the first assert, then the discriminiator, then second assert. I don't know if the spec clarifies that or if it's implied, but I think we always evaluate asserts before discrims regardless of how they are defined in the schema?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants