Parser Fixes #128
Merged
During the recursive TextBlock instantiation and token move operation, we may have to navigate back to our parent block. At that point the parent block is not yet completely created and may contain empty leading subblocks. We must be able to deal with this inconsistent state, so we now skip over these empty blocks and the parser no longer crashes. Unfortunately, I am not absolutely sure that this is the correct fix in all cases: we may end up computing the wrong original version. Further investigation and review are therefore needed.
ANTLR returns a fixed EOF token. We have to convert it to our custom token implementation; otherwise, we crash if our root block is empty or contains only omitted tokens.
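The conversion described above might look roughly like the following. This is only a sketch: `Token`, `CustomToken`, `SHARED_EOF`, `EOF_TYPE` and `toCustomToken` are hypothetical stand-ins written for illustration, not the ANTLR classes or the project's actual token implementation.

```java
final class EofConversionSketch {
    /** Hypothetical stand-in for the token interface handed out by the lexer. */
    interface Token {
        int getType();
        String getText();
    }

    /** Placeholder token type for end-of-file (assumption, not taken from the project). */
    static final int EOF_TYPE = -1;

    /** Stand-in for the single, fixed EOF instance that the lexer always returns. */
    static final Token SHARED_EOF = new Token() {
        public int getType() { return EOF_TYPE; }
        public String getText() { return "<EOF>"; }
    };

    /** Hypothetical custom token that additionally knows its absolute offset. */
    static final class CustomToken implements Token {
        private final int type;
        private final String text;
        private final int absoluteOffset;

        CustomToken(int type, String text, int absoluteOffset) {
            this.type = type;
            this.text = text;
            this.absoluteOffset = absoluteOffset;
        }

        public int getType() { return type; }
        public String getText() { return text; }
        int getAbsoluteOffset() { return absoluteOffset; }
    }

    /** Wraps any foreign token (such as the shared EOF) in a CustomToken. */
    static CustomToken toCustomToken(Token token, int absoluteOffset) {
        if (token instanceof CustomToken) {
            return (CustomToken) token; // already our own implementation
        }
        return new CustomToken(token.getType(), token.getText(), absoluteOffset);
    }
}
```

With a conversion step like this, code that casts every token to the custom type also works for an empty root block whose only token is EOF.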
We have to use the absolute offset to completely replace all lexed tokens with whitespace.
The incremental parser now considers fewer TextBlocks to be equal. This fixes some problems, but we may re-use fewer elements than we used to! There might be a better way to fix the problem described below. We might have considered TextBlocks to be equal even though they have a different number of tokens. The extra tokens then ended up as duplicates in the re-used TB, resulting in an invalid TB model. We only check for re-usable tokens on the current sub-node level (the level where the lexer created these new tokens) and might then mistake tokens for equal although they actually belong to different sub-blocks in a correctly nested TB tree. There seems to be no obvious way to remove these outdated, remaining tokens. This fixes the problems observed in TestParsingScenarios#testReparsingWithoutModiication*().
We now inspect a replace operation and check for the actual textual differences. Only these differences are applied to the textblocks model. Subdiffs leverage our re-use capabilities during incremental parsing. For example, when a text region is replaced with very similar text, we can now re-use many tokens/textblocks. Before, the whole text range (including the corresponding model elements) had to be re-created.
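One simple way to narrow a replace operation to its actual textual difference is to trim the common prefix and suffix of the old and new text. The sketch below illustrates that idea only; it is not necessarily the algorithm used in this commit, and `Replace` and `narrow` are hypothetical names.

```java
final class SubDiffSketch {
    /** Hypothetical value type: replace `length` characters at `offset` with `text`. */
    static final class Replace {
        final int offset;
        final int length;
        final String text;

        Replace(int offset, int length, String text) {
            this.offset = offset;
            this.length = length;
            this.text = text;
        }
    }

    /** Shrinks a replace operation to the region that really differs. */
    static Replace narrow(String document, Replace op) {
        String oldText = document.substring(op.offset, op.offset + op.length);
        String newText = op.text;
        int max = Math.min(oldText.length(), newText.length());

        // Count the characters that are identical at the start of both texts.
        int prefix = 0;
        while (prefix < max && oldText.charAt(prefix) == newText.charAt(prefix)) {
            prefix++;
        }

        // Count the identical characters at the end, without overlapping the prefix.
        int suffix = 0;
        while (suffix < max - prefix
                && oldText.charAt(oldText.length() - 1 - suffix)
                   == newText.charAt(newText.length() - 1 - suffix)) {
            suffix++;
        }

        return new Replace(
                op.offset + prefix,
                oldText.length() - prefix - suffix,
                newText.substring(prefix, newText.length() - suffix));
    }
}
```

For instance, replacing the whole of `int a = 1;` with `int b = 1;` narrows to a one-character replace at offset 4, so the untouched tokens around it remain candidates for re-use.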
This commit improves upon 883e78e and also consumes omitted tokens in empty regions without any lexed tokens.
Four new incremental parser tests are now failing. They create a new root object, although they can be expected to reuse the old one.
We now correctly handle the substring case where the start offset lies within another token (e.g., because some tokens were skipped due to lexing problems).
Correctly report expected tokens (if known). This fixes the strange "Found X but expected <BUG>" messages. Implement a workaround to make error locations reported by the lexer absolute. They used to be relative to the parsed region. This fixes issue #121.
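As a rough illustration of both fixes, the sketch below translates a region-relative offset into an absolute one and only mentions the expected token when it is known. The method names and the message wording are assumptions made for this example, not the project's actual code.

```java
final class ErrorLocationSketch {
    /**
     * Converts an offset reported relative to the re-parsed region into an
     * offset relative to the start of the whole document.
     */
    static int toAbsoluteOffset(int regionStartOffset, int relativeOffset) {
        if (regionStartOffset < 0 || relativeOffset < 0) {
            throw new IllegalArgumentException("offsets must not be negative");
        }
        return regionStartOffset + relativeOffset;
    }

    /** Builds an error message; the expected token is only included if it is known. */
    static String describe(String found, String expectedOrNull) {
        if (expectedOrNull == null) {
            return "Found " + found;
        }
        return "Found " + found + " but expected " + expectedOrNull;
    }
}
```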
The old implementation was only capable of re-evaluating property inits in non-empty conditions and alternatives. The new implementation is based on TcsUtil.wasExecuted() and does not suffer from this problem.
This forces the referencedElements to always be set. Before, incremental parsing and token reuse could leave a reference already resolved while the referencedElements of the tokens were still empty.
There may be more than one reference to a single property within a textblock (e.g., the auto-parentheses within operator templates have the same sequence element as the property itself). We therefore need to find out whether a property is still 'referenced' by any other token in the re-used TextBlock. Only if it is no longer referenced from anywhere are we free to unset it on the corresponding model element.
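The check can be pictured as a scan over the tokens that remain in the re-used block. In this sketch `Token`, its `sequenceElement` field (a plain string here) and `canUnset` are hypothetical simplifications of the real model.

```java
import java.util.List;

final class PropertyReferenceSketch {
    /** Hypothetical token that remembers which sequence element produced it. */
    static final class Token {
        final String value;
        final String sequenceElement;

        Token(String value, String sequenceElement) {
            this.value = value;
            this.sequenceElement = sequenceElement;
        }
    }

    /**
     * Returns true only if none of the remaining tokens still refers to the
     * given sequence element, i.e. the property may be unset on the model element.
     */
    static boolean canUnset(List<Token> remainingTokens, String sequenceElement) {
        for (Token token : remainingTokens) {
            if (sequenceElement.equals(token.sequenceElement)) {
                return false; // still referenced, e.g. by an auto-parenthesis token
            }
        }
        return true;
    }
}
```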
When deciding whether we can stop lexing or have to keep going, we checked whether we had two off-channel tokens at hand. Having two tokens of this same kind implied that our re-lexed stream had converged with the unmodified tokens. This check failed when we commented out a line: we stopped lexing too early and did not properly lex the last token at the end of the commented line. This commit fixes this by checking the more fine-grained token type.
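The difference between the two checks can be sketched as follows. The channel constant, the token types and the `Token` class are placeholders chosen for the example; the real lexer's values may differ.

```java
final class ConvergenceSketch {
    /** Placeholder channel number for off-channel (hidden) tokens. */
    static final int HIDDEN_CHANNEL = 99;

    /** Hypothetical token types for the example. */
    static final int WHITESPACE = 1;
    static final int COMMENT = 2;

    static final class Token {
        final int type;
        final int channel;

        Token(int type, int channel) {
            this.type = type;
            this.channel = channel;
        }
    }

    /** Old check: any two off-channel tokens count as converged. */
    static boolean convergedByChannel(Token relexed, Token unmodified) {
        return relexed.channel == HIDDEN_CHANNEL && unmodified.channel == HIDDEN_CHANNEL;
    }

    /** New check: the off-channel tokens must also have the same token type. */
    static boolean convergedByType(Token relexed, Token unmodified) {
        return convergedByChannel(relexed, unmodified) && relexed.type == unmodified.type;
    }
}
```

A freshly lexed comment next to an old whitespace token passes the channel check but not the type check, which matches the commented-line case described above.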
We now consider more TextBlocks to be equal and thus improve our re-use capabilities. The trick is to check whether all old and new tokens within a block correspond to each other.
I cannot say for sure which block should actually be reused. I've therefore decided to simplify the testcase. That way we can at least assert our current behaviour.
Also check for the equality of the sequence elements instead of the mere equality of token values.
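A pairwise comparison along these lines could look like the sketch below. It treats blocks with a different number of tokens as not equal and compares both the token value and the sequence element. `Token` and `tokensCorrespond` are hypothetical names, and the sequence element is reduced to a string for brevity.

```java
import java.util.List;
import java.util.Objects;

final class BlockEqualitySketch {
    /** Hypothetical token with its text value and originating sequence element. */
    static final class Token {
        final String value;
        final String sequenceElement;

        Token(String value, String sequenceElement) {
            this.value = value;
            this.sequenceElement = sequenceElement;
        }
    }

    /** Returns true if the old and new tokens of a block match one by one. */
    static boolean tokensCorrespond(List<Token> oldTokens, List<Token> newTokens) {
        if (oldTokens.size() != newTokens.size()) {
            return false; // a different token count can never be a clean re-use
        }
        for (int i = 0; i < oldTokens.size(); i++) {
            Token oldToken = oldTokens.get(i);
            Token newToken = newTokens.get(i);
            boolean sameValue = Objects.equals(oldToken.value, newToken.value);
            boolean sameElement =
                    Objects.equals(oldToken.sequenceElement, newToken.sequenceElement);
            if (!sameValue || !sameElement) {
                return false;
            }
        }
        return true;
    }
}
```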
I've been working on some bugs deep down in our incremental parser and would be glad if someone could review my changes.
For more information, see ticket #113 and the hudson job of this branch.
Thanks,
Stephan