-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#992: added design doc for script options #477
base: master
Are you sure you want to change the base?
Conversation
3522853
to
0a2684d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you create these? If it is drawio, please use the extension drawio.png otherwise it is not clear how to change them and the tooling won't recognize them.
|
||
#### Legacy Parser | ||
|
||
The legacy parser (V1) parser searches for one specific script option. The parser starts from beginning of the script code. If found, the parser immediately removes the script option from the script code and returns the option value. It validates the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a narrow interpretation of arc42, the removal would be part of the runtime view and not the building blocks.
|
||
Tags: V2 | ||
|
||
### General Parser Integration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for the DB
b6dca79
to
207e27f
Compare
@@ -0,0 +1,27 @@ | |||
#!/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should at some Point replace this with a nox session, please create an issue
## Constraints | ||
|
||
- The parser implementation must be in C++. | ||
- The chosen parser implementation is [ctpg](https://github.com/peter-winter/ctpg), which supports the definition of Lexer and Parser Rules in C++ code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a constraint, but a decision that follows from the constraints
|
||
![Components](diagrams/OveralScriptOptionalsBuildingBlocks.drawio.png) | ||
|
||
At the very high level there can be distinguished between the generic "Script Options Parser" module which parses a UDF script code and returns the found script options, and the "Script Options Parser Handler" which converts the Java UDF specific script options. In both modules there are specific implementation for the legacy parser and the new CTPG based parser. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add reason, why we need to implementation
|
||
### Script Options Parser | ||
|
||
The parser component can be used to parse any script code (Java, Python, R) for any script options. It provides simplistic interfaces, which are different between the two versions, which accept the script code as input and return the found script option(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add reason, why the interfaces are different. Here, they work inherently different
|
||
As the parser needs to find script options in any given script code, the generated parser must accept any strings which are not script options and ignore those. In order to achieve this, the lexer rules need to be as simple as possible, in order to avoid collisions. | ||
|
||
It is important to emphasize that in contrast to the legacy parser, the caller is responsible for removing the script options from the script code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a good reason?
|
||
![LegacyParserHandler](diagrams/LegacyParserHandler.drawio.png) | ||
|
||
The `ScriptOptionsLinesParserLegacy` class uses the Parser to search for Java specific script options and forwards the found options to class `ConverterLegacy`, which uses a common implementation for the conversion of the options. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The `ScriptOptionsLinesParserLegacy` class uses the Parser to search for Java specific script options and forwards the found options to class `ConverterLegacy`, which uses a common implementation for the conversion of the options. | |
The `ScriptOptionsLinesParserLegacy` class uses the Parser to search for Java specific script options and forwards the found options to the class `ConverterLegacy`, which uses a common implementation for the conversion of the options. |
common implementation is misleading, because it suggests both handler use it
|
||
![CTPGParserHandler](diagrams/CTPGParserHandler.drawio.png) | ||
|
||
The `ScriptOptionsLinesParserCTPG` class uses the new CTPG basedParser to search for **all** Java specific script options at once. Then it forwards the found options to class `ConverterV2`, which uses a common implementation for the conversion of the options. `ConverterV2` also implements the functions to convert Jvm otions and JAR options. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The `ScriptOptionsLinesParserCTPG` class uses the new CTPG basedParser to search for **all** Java specific script options at once. Then it forwards the found options to class `ConverterV2`, which uses a common implementation for the conversion of the options. `ConverterV2` also implements the functions to convert Jvm otions and JAR options. | |
The `ScriptOptionsLinesParserCTPG` class uses the new CTPG based Parser to search for **all** Java specific script options at once. Then it forwards the found options to class `ConverterV2`, which uses a common implementation for the conversion of the options. `ConverterV2` also implements the functions to convert Jvm otions and JAR options. |
common again
`ScriptOptionsLinesParserCTPG` uses an instance of `ScriptImporter` to import foreign scripts. Because the new parser collects all script options at once, but backwards compatibility with existing UDF scripts must be ensured, there is an additional level of complexity in the import script algorithm. The algorithm is described in the following pseudocode snippet: | ||
``` | ||
function import(script_code, options_map) | ||
import_option = options_map.find("import") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import_option = options_map.find("import") | |
import_options = options_map.find("import") |
Should this be plural?
static void doSomething() {} | ||
} | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add the expected results
|
||
### Parser Implementation V1 | ||
|
||
The legacy parser (V1) parser searches for one specific script option. The parser starts from beginning of the script code. If found, the parser immediately removes the script option from the script code and returns the option value. It validates the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The legacy parser (V1) parser searches for one specific script option. The parser starts from beginning of the script code. If found, the parser immediately removes the script option from the script code and returns the option value. It validates the | |
The legacy parser (V1) parser searches for one specific script option. The parser starts from the beginning of the script code. If found, the parser immediately removes the script option from the script code and returns the option value. It validates the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rest of the sentence is missing
### Lexer and Parser Rules Option | ||
`dsn~lexer-parser-rules~1` | ||
|
||
Lexer and Parser rules to recognize `%optionKey`, `optionValue`, with whitespace characters as separator. The Parser rules will define the grammar to correctly identify Script Options, manage multiple options with the same key, and handle duplicates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mention simple lexer rules to deal with whitespace and escaping correct
### Ignore lines without script options | ||
`dsn~ignore-lines-without-script-options~1` | ||
|
||
In order to avoid lower performance compared to the old implementation, the parser must run only on lines which contain a `%` character. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to avoid lower performance compared to the old implementation, the parser must run only on lines which contain a `%` character. | |
In order to avoid lower performance compared to the old implementation, the parser must run only on lines which contain a `%` character after only whitespaces. |
|
||
Tags: V2 | ||
|
||
### Lexer and Parser Rules Whitespace Sequences |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to continue here
related to exasol/script-languages-release#992