Add support for TDL :begin ... :end environments #183
Comments
Arguably those are not part of "DELPH-IN TDL" (as I'm calling the TDL subset in use by our grammars), despite the use of the

I've come to understand TDL as a "modal" description language: parts of the syntax can be turned on or off per file (e.g., lexical rules and morphological patterns), and the interpretation of the same TDL forms can differ depending on how the file was loaded (e.g., whether it is a type file or an instance file). In the LKB, these modes are determined by the Lisp function used to read the TDL file, whereas PET uses the environments defined in K&S 1994b. ACE, and I believe agree, adopt the PET-style environments (presumably they are easier to parse than writing a full Lisp interpreter in C or C#).

These things are perhaps the closest to Stephan's long-desired "universal configuration" for grammars, so I think we can adopt a subset of them for DELPH-IN TDL, and then also include them in PyDelphin. Indeed, this was part of my vague plan if I ever get type hierarchies (#93, #94) or unification working.

The top-level definitions for environments in K&S 1994b are as follows:

    <start> -> { <block> | <statement> }*
    <block> -> "begin" ":control" "." { <type-def> | <instance-def> | <start> }* "end" ":control" "."
             | "begin" ":declare" "." { <declare> | <start> }* "end" ":declare" "."
             | "begin" ":domain" <domain> "." { <start> }* "end" ":domain" "domain" "."
             | "begin" ":instance" "." { <instance-def> | <start> }* "end" ":instance" "."
             | "begin" ":lisp" "." { <Common-Lisp-Expression>}* "end" ":lisp" "."
             | "begin" ":template" "." { <template-def> | <start> }* "end" ":template" "."
             | "begin" ":type" "." { <type-def> | <start> }* "end" ":type" "."

The only ones used in our grammars (having surveyed Jacy, ERG, and GG) are I also note that K&S 1994b uses The
Great, thanks for the clear analysis.
Ok, I think I'm happy to consider these part of DELPH-IN TDL. They are syntactically and semantically similar enough to the K&S 1994b definition of TDL, and used by enough processors, that it doesn't seem like a PET-specific feature (although I'm not prepared to include

I've updated the wiki with a syntax description that includes only the forms I've seen, but it does allow type definitions to appear directly in the blocks and not just in included files; similarly, environment blocks and file includes can appear in any TDL file. Actually doing so, however, would break compatibility, so the recommendation is to keep the current convention of defining environments in top-level files.

In PyDelphin, these are used similarly to parsing XML elements with Python's xml.etree.ElementTree.iterparse(), in that you'll see an event for the start of the environment when

>>> from io import StringIO
>>> from delphin import tdl
>>> g = tdl.iterparse(StringIO('''
... :begin :type.
... t := a & b & [ ATTR "val" ].
... :include "file.tdl".
... :end :type.'''))
>>> event, env, lineno = next(g)
>>> event
'BeginEnvironment'
>>> env.entries
[]
>>> next(g)
('TypeDefinition', <TypeDefinition object 't' at 140577670422880>, 3)
>>> next(g)
('FileInclude', <delphin.tdl.FileInclude object at 0x7fdaca1be0b8>, 4)
>>> next(g)
('EndEnvironment', <delphin.tdl.TypeEnvironment object at 0x7fdaca1ca400>, 5)
>>> env.entries
[<TypeDefinition object 't' at 140577670422880>, <delphin.tdl.FileInclude object at 0x7fdaca1be0b8>]

This should be sufficient for traversing a grammar from its top-level TDL file.
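
As a rough illustration of that kind of traversal, here is a minimal sketch that groups included files by the environment enclosing them. It only assumes the `(event, object, lineno)` tuple shape and the event names shown in the session above. The function name `collect_includes` is hypothetical, and the event objects are replaced with plain strings as stand-ins, so nothing here relies on the attributes of the real PyDelphin objects.

```python
from collections import defaultdict


def collect_includes(events):
    """Group include paths by the label of the enclosing environment.

    `events` is any iterable of (event, obj, lineno) tuples. In this
    sketch `obj` is a plain string (an assumption for illustration),
    not a PyDelphin object.
    """
    stack = []                    # open environments, innermost last
    includes = defaultdict(list)  # environment label -> include paths
    for event, obj, lineno in events:
        if event == 'BeginEnvironment':
            stack.append(obj)
        elif event == 'EndEnvironment':
            if not stack:
                raise ValueError(
                    f'line {lineno}: :end without a matching :begin')
            stack.pop()
        elif event == 'FileInclude':
            # Includes outside any environment are filed under None.
            label = stack[-1] if stack else None
            includes[label].append(obj)
        # Other events (e.g. 'TypeDefinition') are ignored here.
    if stack:
        raise ValueError(f'unclosed environment: {stack[-1]}')
    return dict(includes)


# Stand-in events mirroring the session above.
events = [
    ('BeginEnvironment', 'type', 2),
    ('TypeDefinition', 't', 3),
    ('FileInclude', 'file.tdl', 4),
    ('EndEnvironment', 'type', 5),
]
print(collect_includes(events))
```

A grammar loader could then decide how to read each included file based on the environment it was found in, which is the "mode" distinction discussed earlier in the thread.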
TDL is also used in the setup files for PET and ACE, with slightly different expectations.
PyDelphin cannot parse these files.
For example, for zhong/cmn/zhs/zhs-pet-mal.tdl:
delphin.exceptions.TdlParsingError: At ?:9 (type/rule definition)
Syntax error:
:begin :instance :status lexical-filtering-rule.
In an ideal world it would be nice to be able to parse these and use them to decide which files to parse to build a grammar model, ...
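
To make the failing line concrete, here is a minimal sketch of how such an environment header could be recognized on its own. It is not PyDelphin's parser. It assumes that any tokens after the environment name come as a colon-prefixed key followed by a single value, which is inferred only from the one line in the error message above. The names `ENV_LINE` and `parse_env_line` are hypothetical.

```python
import re

# Matches lines such as ":begin :type." or
# ":begin :instance :status lexical-filtering-rule."
# The leading colon on begin/end is optional, since K&S 1994b
# writes "begin" while the grammar files write ":begin".
ENV_LINE = re.compile(
    r'^\s*:?(?P<kind>begin|end)\s+'
    r':(?P<env>[A-Za-z][\w-]*)'
    r'(?P<rest>(?:\s+[^\s.]+)*)'
    r'\s*\.\s*$'
)


def parse_env_line(line):
    """Return (kind, env, options), or None if `line` is not a header.

    `options` maps each colon-prefixed key to the token after it
    (or None if no value follows). This key/value reading is an
    assumption, not a documented rule.
    """
    m = ENV_LINE.match(line)
    if m is None:
        return None
    options = {}
    key = None
    for tok in m.group('rest').split():
        if tok.startswith(':'):
            key = tok[1:]
            options[key] = None
        elif key is not None:
            options[key] = tok
            key = None
        else:
            raise ValueError(f'unexpected token {tok!r} in {line!r}')
    return m.group('kind'), m.group('env'), options


print(parse_env_line(':begin :instance :status lexical-filtering-rule.'))
```

With something along these lines, the header could be surfaced as an environment with extra options rather than raising a syntax error, leaving it to the caller to decide what an option like `:status` means.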