-
Notifications
You must be signed in to change notification settings - Fork 0
Parser
The parser logic has been implemented based on the fundamental knowledge gained from the course Compiler Design. Regardless, it was necessary to tailor some of the learned techniques to our needs. This documentation gives in particular a introduction into the deflexion of the default approach. First, we name the important classes and their basic functionality. Thereafter, we show where we had to change the prototype-implementation presented in various lectures. At last, we'll outline how the interaction between the modules happens and how a typical application of the parser works. Long story short, we've successfully implemented a SLR parser
There exist several important packages and classes, each of them providing a special kind of functionality.
- swp_compiler_ss13.javabite.parser.grammar.Grammar<T,NT> is the most important class of the parser. This class represents a grammar with T as terminals and NT as NonTerminals, respectively. The implementation provides functions like closure-computation, first- and follow-set determination.
- swp_compiler_ss13.javabite.parser.targetgrammar.TargetGrammar is a special subclass of Grammar. This class represents the exact programmatic translation of the grammar of interest.
- swp_compiler_ss13.javabite.parser.grammar.SLRAutomaton is a automaton, which can derive the concrete derivation of a given word with a look ahead of 1.
- swp_compiler_ss13.javabite.parser.astGenerator.ASTGenerator has been developed to translate a derivation ( bottom-up-left-to-right). The result is a Abstract Syntax tree
Some approaches have to be fitted or completely replaced to use it in our implementation with the given grammar. We note some of them:
- It's not traceable to construct the ast when the translation is given in a right-most fashion. We had to translate the right-most derivation in the left-most derivation first in the TargetGrammar class.
- The SLRAutomaton had to be modified in a way, that the automaton is able to deal with epsilon productions. Another way is to remove the epsilon-productions from the grammar, what results in another grammar. We did want to keep the grammar. In the most common literature, the automaton has not to deal with epsilon productions since they are removed before. Thus, the automaton had to be adjusted.
- The closure, first- and follow set computation can be very complex for large grammars. We decide to differ from a pure functional implementation and implemented cached and/ or pre-computed values as often as possible. This decreases the readability but increases the performance dramatically.
A typical application can be drafted as follows:
- A word of the grammar is given in form of a sequence of tokens
- We abstract from the concrete token and just consider the type ( e.g. "myInt" and "myDouble" as a ID is treated the same way).
- We use the abstract token stream to get a concrete derivation of the word with the help of TargetGrammar.
- Luckily, we piggy-bagged the information in the abstract token. Now, we have the derivation with the concrete tokens ( e.g. the tokens "myInt" and "myDouble" is again distinguishable).
- We use the TargetGrammar to rotate the derivation to gain a left-to-right fashioned derivation.
- We use a semantic translation to construct the ast from the derivation in ASTGenerator.