-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Evaluate assert/discriminator expressions after groupContent: WIP #987
base: main
Are you sure you want to change the base?
Conversation
This is primarily WIP as it needs more test coverage and I have been informed that I need to hold off on working on this for a little while. It is fully functional, just needs more tests for edge cases surrounding evaluating expressions after the content of sequences, groups, and choices. Also need to add tests for the setVariable change. |
According to 9.5 of the DFDL spec assert and discriminators with expressions should be processed after the content of their enclosing sequence, group, or choice. Before these expressions were always being processed before the content. This commit also moves the setVariable expression evaluaiton to the correct place, which is before the enclosing group. DAFFODIL-1971, DAFFODIL-1590
Tests for the following: - Sequence body succeeds but discriminator fails - Sequence body fails and discriminator fails - Sequence body fails but discriminator succeeds and references an element in the partial sequence body infoset - Sequence body fails and discriminator fails and references an element in the partial sequence body infoset
94fe3ae
to
ef3ec79
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the same assert could be evaluated multiple times?
Also, have you run this against the regression suite? I wonder if there are any schemas that rely on the current behavior and could mess up how they use discriminators?
@@ -217,7 +217,7 @@ trait ProvidesDFDLStatementMixin extends ThrowsSDE with HasTermCheck { | |||
final lazy val patternStatements: Seq[DFDLStatement] = patternAsserts ++ patternDiscrims | |||
|
|||
final lazy val lowPriorityStatements: Seq[DFDLStatement] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea where "low priority" comes from? I'm not familiar with that as a DFDL term. Wondering if something like "expressionStatements" would be more clear?
new SeqCompParser( | ||
context.runtimeData, | ||
parserChildren.toVector, | ||
assertExpressionChildren.toVector, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the paserChildren
vector contains the assertExpressionChildren
parsers, which I think means the assert parsers will be evaluated twice? Once when all the parserChildren
are evaluated and again after the parserChildren finis and then we evaluate the asserts Parsers? Seems like parserChildren should not contain the assert children?
testAssert.foreach { ap => | ||
pstate.withTempSuccess(ap.parse1) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I guess I now see why it's okay for childrenParsers
to also include the assertion parsers. If all the non-assert parsers succeed then we'll don't evaluate the assertion parsers.
However, I think there is still an issue. Say all the non-assertion parsers succeed, then we start evaluating the assertion parsers, and assume one of them fails. The pstate.processorStatus ne Success
will be true, and then we'll evaluate all the assertion parsers, including ones we've already evaluated.
So I think this approach still feels like it needs a tweak. Almost feels like the assert parsers need to be completely separate from the other parsers or something.
setSuccess() | ||
func(this) | ||
if (processorStatus eq Success) | ||
_processorStatus = priorProcessorStatus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conditionally resetting back to the previous status feels like it might be confusing? Feels like if this is temporarily ignoring status it should always reset back to whatever the status was before. Maybe the way this wants to work is it it returns the status of func
, and the caller can choose to do with it whatever they want? Something more like
def withTempSuccess(func: (PState) => Unit): ProcessorResult = {
val priorStatus = processorStatus
setSuccess()
func(this)
val funcStatus = processorStatus
_processorStatus = priorStatus
funcStatus
}
Also, Is there any value in passing in a lambda? An alternative would be to just pass in a Parser
, and then instead of calling func(this)
, it could call parser.parse1(this)
. That ensures that this is always called with a Parser
instead of a function that just happens to accept a PState
, which is probably safer?
if (pstate.processorStatus ne Success) | ||
if (pstate.processorStatus ne Success) { | ||
if (testAssert.nonEmpty) { | ||
testAssert.foreach { ap => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just had another thought about this, this will evaluate all assertions even if one fails. Is that the correct behavior, or should it stop after the first assertion failure? Seems like if one assertion fails it shouldn't continue to evaluate other assertions?
@@ -89,10 +90,19 @@ class SeqComp private (context: SchemaComponent, children: Seq[Gram]) | |||
_.isInstanceOf[NadaParser] | |||
} | |||
|
|||
lazy val assertExpressionChildren = parserChildren.filter { | |||
_.isInstanceOf[AssertExpressionEvaluationParser] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading the spec some more, it says:
an attempt to evaluate a discriminator MUST be made even if preceding statements or the parse of the schema component ended in a Processing Error.
I think this implies asserts should not be evaluated if the preceding statements fails? Which kindof makes sense since an assert only causes backtracking and does not discriminate points of uncertainty?
So it feels like the logic needs to differentiate between asserts and discriminators? Both should be evaluated at the end, but only discriminators should be evaluated if the prior parses succeed?
Another question, are discriminators always evaluated before asserts? Or are they evaluated depending on the order defined in the DFDL schema? For example, if you have:
<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:assert test="..." />
<dfdl:discriminator test="..." />
<dfdl:assert test="..." />
</xs:appinfo>
Should that evaluate the first assert, then the discriminiator, then second assert. I don't know if the spec clarifies that or if it's implied, but I think we always evaluate asserts before discrims regardless of how they are defined in the schema?
According to 9.5 of the DFDL spec assert and discriminators with expressions should be processed after the content of their enclosing sequence, group, or choice. Before these expressions were always being processed before the content.
This commit also moves the setVariable expression evaluaiton to the correct place, which is before the enclosing group.
DAFFODIL-1971, DAFFODIL-1590