From 4258dccb8ec49bbcfbb4c7d9de475ef13dc2b5cf Mon Sep 17 00:00:00 2001 From: oe Date: Fri, 20 Aug 2010 13:57:09 +0000 Subject: [PATCH] --- README | 482 --------------------------------------------------------- 1 file changed, 482 deletions(-) delete mode 100644 README diff --git a/README b/README deleted file mode 100644 index 81aaca3..0000000 --- a/README +++ /dev/null @@ -1,482 +0,0 @@ ------------------------------------------------------------------------------- -LinGO Grammar Matrix - -$Id: README,v 1.13 2008-05-23 01:44:21 sfd Exp $ - -NB: This version of the Matrix requires LKB version -$Date: 2008-05-23 01:44:21 $ or later. - -Changes since v 0.6: - -1. Head types - - We had previously hesitated to posit head types, as we - expect the exact subhierarchy under the type head to be - language-specific, even for those head types that are - found cross-linguistically. However, this hesitation - was hampering our ability to develop lexical types for - the Matrix. In this version, we have included 10 basic - head types, as well as types for all possible - disjunctive combinations for the basic 10. See further - documentation near 'head' in matrix.tdl. Note that the - disjunctive types are stored in a separate file - head-types.tdl. - - We still have not posited any head features. In order - to add features to one of the head types (disjunctive - or otherwise) already defined in the Matrix, you'll - need to use the new 'type addendum' syntax. (This is - why a current LKB is required.) Type addenda can be - used to add information (not override information) - associated with previously defined types. They have :+ - in place of :=. Type addenda can add parent types, - constraints, or documentation strings, as long as they - are consistent with the existing definition. As with - subtypes, it is good practice to refrain from stating - redundant information on type addenda. The LKB will - print an error if a parent is declared redundantly. No - such checking exists for constraints. - - NB: The :+ syntax is not yet supported in the PET - system. - -2. Lexical types (mostly from v 0.7) - - The Matrix now has a set of underspecified lexical types. - Lexical entries are classified along multiple dimensions, - including part-of-speech and subcategorization. Particular - grammars can cross-classify these types to create the - lexical types they need. We're particularly interested to - learn about cases where additional types parallel to those - posited in this part of the Matrix are required. - -3. anti-synsem - - The type anti-synsem is still present, but no longer explicitly - used in the Matrix. (In previous versions of the Matrix, - the mother of the head-subj phrase was SUBJ < anti-synsem > - instead of SUBJ < >. This and similar constraints are part - of English-specific analyses.) - -4. COMPS and the head-subject phrase - - In response to Ellingsen 2003, we have removed the requirement - that complements be realized before subjects cross-linguistically. - We expect this to be true in many, but not all, languages. - -5. MC and ROOT - - Again in response to Ellingsen 2003, we have removed many - of the constraints on the value of MC (`main clause'). - This feature is meant to be used for phenomena which are - restricted to either main (MC +) or subordinate (MC -) - clauses. The particular constraints present in earlier versions - of the Matrix were part of English-specific analyses. - - The feature ROOT has been removed. Its intended purpose - was redundant with the combination of MC and root conditions. - -6. Changes to the script - - The script (lkb/script) has been changed somewhat, including: - - -- (index-for-generator) is run at load time, - since most Matrix grammars are small and since - it is now relatively easy to create grammars which - can be used for parsing and generation. We have - found that the generator can be very useful in - detecting flaws in a grammar. For a grammar with - a large lexicon, this may be too slow. If so, - just comment it out in the script. - - -- There is a short expression at the end which - can be used to change the default string (or indeed - list of strings) in the parse dialogue to something - appropriate for your language. Uncomment this expression - and customize it appropriately. - - -- The default script no longer uses a cache file for - the lexicon. For lexicons with >1000 words, the - cache may be a good idea. See the comments in the - script file for how to invoke caching. - -7. labels.tdl - - With the addition of head types, we are now able to include - a basic set of node labels. As noted in labels.tdl, these - are provided only for the convenience of the grammar developer, - and do not have any theoretical status. As such, even though - they are provided as part of the Matrix distribution, they - should be customized (edited) without hesitation. - -8. Other minor technical changes - - The value of INSTLOC is now string (formerly "instloc") for - compatibility with recent versions of the LKB. - - The head daughters of head-modifier phrases are no longer - required to have empty COMPS lists. Furthermore, additional - features are matched between modifiers' MOD values and - the head daughters' SYNSEMs. - - The feature ARG-S has been moved from the type local - to the type word-or-lexrule and renamed as ARG-ST to - differentiate it a bit better from ARGS. - -9. Some semantic types (adv-relation, prep-mod-relation, - verb-ellipsis-relation, and unspec-compound-relation) - have been removed, as these types did not add any features - nor express any constraints. In their place, existing - argn relations should be used, with appropriate PRED values. - adv-relation and verb-ellipsis-relation should be replaced - with arg1-ev-relation, prep-mod-relation with arg12-ev-relation - and unspec-compound-relation with arg12-relation. - ------------------------------------------------------------------------------- - -LinGO Grammar Matrix v 0.6, October 15, 2003 (dpf) - -This is a minor tuning of version 0.5, including a refinement of the KEYS -attributes, more normalization of predicate names (especially for messages), -some bug fixes in the syntactic rule schemata, and a few additional lexical -types. Details on these changes will be available in the soon-to-be-released -Matrix Users' Guide. - -For those who have already developed a grammar baaed on the Matrix, the -following changes will have to be made manually in your language-specific -files in order to make them consistent with this version: - -1. Changes in feature geometry - - a. SYNSEM.LOCAL.LKEYS ==> - SYNSEM.LKEYS - - The feature LKEYS has been moved up from LOCAL to SYNSEM, to shorten - this frequently mentioned path. (NB: As a related change, the type - 'local-basic' which formerly introduced LKEYS has been deleted.) - - b. LKEYS.--KEYREL ==> - LKEYS.KEYREL - - LKEYS.--ALTKEYREL ==> - LKEYS.ALTKEYREL - - These two attributes are the only pointers to relations in the RELS - list for lexical types, and since they are not shortcuts, the leading - hyphens have been dropped. (NB: Since the attributes --COMPKEY and - --OCOMPKEY are just shortcuts, the leading hyphens for these two names - remain as a reminder). - - c. CAT.HEAD.KEYS.MESSAGE ==> - CONT.MSG - - Since the message value of a headed phrase is not always identified - with that of its head daughter, it was an error to make the attribute - MESSAGE a head feature. This attribute is now moved to CONT, and its - name shortened for convenience to MSG. - - -2. Strings and symbols: RULE-NAME - - The type 'symbol', which along with 'string' was a subtype of 'atom', has - been dropped, since the distinction between symbols and strings is not - useful, and was a source of potential confusion. So any attributes whose - values were of type 'symbol' should be changed to be of type 'string', - and values assigned to these attributes should be converted accordingly. - In particular, in subtypes of 'rule', the value of the attribute RULE-NAME - should be changed to be enclosed in double quotes; e.g. - [ RULE-NAME 'subj-head ] ==> - [ RULE-NAME "subj-head" ] - -3. KEYS attributes - - The attributes KEY and ALTKEY can have as values subsorts of the type - 'predsort' (the same kinds of values allowed for the attribute PREDSORT - in semantic relations). These attributes enable a word or phrase to be - semantically selected by a predicate, and as head features they propagate - up from the lexical head of the phrase. For example, a verb can select - for a prepositional phrase headed by a particular preposition, as long as - the preposition has lexically assigned a specific value (a subtype of - predsort) to its SYNSEM.LOCAL.CAT.HEAD.KEYS.KEY attribute, and the verb - similarly constrains the KEY value of its PP complement (accessed via the - SYNSEM.LKEYS.--COMPKEY or --OCOMPKEY of the verb). Note that the values - of KEY and ALTKEY are of the same type as the values of the PRED attribute - within semantic relations, but it is not always the case that the KEY - value of a sign is identified with the PRED value of one of the relations - in its RELS list. Typically, closed-class lexical entries may identify - KEY and PRED values, but open-class lexical entries won't, since their - KEY value will be some underspecified subtype of 'predsort' (e.g. - 'noun_rel' or 'verb_rel') - -4. Relations and messages - - The revisions introduced in version 0.5 for improved MRSs have led to a - potential confusion in naming of relations and their PRED values, so we - introduce a simple naming convention where all subtypes of the type - 'relation' bear the suffix "-relation" as part of their name, and all - values of the attribute PRED (subsorts of 'predsort') within relations - bear the suffix "_rel" as part of their name. - - In keeping with the reduction to a small number of subtypes of 'relation', - the value of the attribute MSG is now always the relation subtype 'message' - with appropriate values in the PRED attribute of the 'message' relation, - drawing from subtypes of the type 'predsort'. For example, in the type - 'imperative-clause', the following change has been made: - - imperative-clause := clause & - [ SYNSEM.LOCAL.CAT.HEAD.KEYS.MESSAGE command ]. ==> - - imperative-clause := clause & - [ SYNSEM.LOCAL.CONT.MSG.PRED command_m_rel ]. - -5. Rules - - This version incorporates several corrections and improvements to the - definitions of lexical and syntactic rules proposed by colleagues working - on the Japanese and Norwegian grammars, as follows: - - a. In the definition of 'lex-rule', the order of appending of the RELS - lists has been reversed, for convenience. - b. The type 'basic-head-subj-phrase' no longer inherits from the type - 'head-compositional' - this was an error preventing coherent MRSs. - c. The type 'basic-extracted-comp-phrase' no longer identifies the LEX value - of mother and daughter - this too was an error making the rule unusable. - d. The type 'basic-head-mod-phrase-simple' no longer identifies the value - of HOOK on mother and nonhead daughter, since this is no longer uniform - for scopal and intersective modifiers. Instead this identification is - done in the type 'scopal-mod-phrase'; in contrast, the type - 'isect-mod-phrase' now inherits from 'head-compositional', identifying - the HOOK values of mother and head daughter. - e. In a related change, the type 'extracted-adj-phrase' is now restricted - to extracting intersective modifiers, so that the value of HOOK can be - correctly constrained. - -6. Lexeme types - - In order to capture the usual configuration of semantic constraints for - open-class lexical entries, the types 'lex-item', 'norm-lex-item', and - 'lexeme' have been added. Some closed-class lexical entries, like those - for determiners in English, do not conform to the constraints in - 'norm-lex-item', but most lexical entries will. We further add the - constraint that the outputs of lexeme-to-lexeme rules will conform to the - constraints in 'norm-lex-item'. We look forward to feedback, as always. - ------------------------------------------------------------------------------- -Grammar matrix v 0.5, August 15, 2003 (dpf) - -This is an upgrade of version 0.4 of the grammar matrix, with some -further normalization of relation names and MRS feature geometry to be -consistent with the Copestake et al. paper, "Introduction to MRS", -being readied for publication. - -If you have already developed a grammar based on the matrix, you will -need to make at least one set of manual adjustments to your -language-specific grammar files, since the location of the KEYS -attribute has changed, and the constraints on its attributes have also -changed. The KEYS attributes had been used in matrix-derived grammars -for two distinct purposes, first to simplify the notation when defining -lexical types, and second to express constraints on semantic selection -within phrases. The first usage was a convenient shorthand notation -which is irrelevant to phrasal signs, while the second is crucial in -constraining phrases. These two notions are now distinct in the -matrix, with the attribute LKEYS now containing these 'shorthand' -attributes convenient for defining lexical types, and the attribute -KEYS now made a HEAD feature. The attributes in KEYS are also more -strictly constrained, with KEY and ALTKEY no longer taking whole -relations as values, but only semantic sorts (see the User Guide for -elaboration). Likewise, the MESSAGE attribute now simply takes a -'message' type (or the distinguished type 'no-msg') as its value, -rather than a difference list. - -Obligatory changes to make to language-specific grammar files: - -(1) Where your grammar used the KEY and ALTKEY attributes to constrain - the properties of a selected constituent (complement, specifier, - subject, or modifier), change these values of KEYS.KEY and - KEYS.ALTKEY to be subtypes of the type 'semsort'. See the User - Guide for elaboration. -(2) Change these paths for SYNSEM.LOCAL.KEYS.KEY and ...ALTKEY to be - SYNSEM.LOCAL.CAT.HEAD.KEYS.KEY and ...ALTKEY -(3) Where your grammar used the KEY and ALTKEY attributes to constrain - the value of a lexical type's own semantic relations, change these - paths for SYNSEM.LOCAL.KEYS.KEY and ...ALTKEY to be - SYNSEM.LOCAL.LKEYS.--KEYREL and ...--ALTKEYREL -(4) Change the value of KEYS.MESSAGE by removing the diff-list brackets. -(5) Change the paths SYNSEM.LOCAL.KEYS.MESSAGE to - SYNSEM.LOCAL.CAT.HEAD.KEYS.MESSAGE -(6) Change the values for --COMPKEY and --OCOMPKEY to be the semantic - sort of the relevant complement, rather than the type of a relation - (again, see the User Guide for elaboration of semantic sorts). -(7) Change the paths SYNSEM.LOCAL.KEYS.--COMPKEY and ...--OCOMPKEY to - SYNSEM.LOCAL.LKEYS.--COMPKEY and ...--OCOMPKEY - -In addition, you may need to make further adjustments, depending on -whether you have made explicit reference to the affected features or -types, which have been changed as follows: - -(a) Deleted feature - - The feature E-INDEX was introduced into the matrix for v 0.4,based - on its use in the ERG at the time for treating the semantics of - predicative PPs and gerunds. However, improved analysis of English - has removed the current motivation for this attribute in HOOK, so - it has been deleted from the matrix in order to be consistent with - the emerging MRS documentation. - -(b) Renaming of type 'mrs-thing', and changes to its subtypes - - The name of the supertype of 'individual' and 'handle' has been - renamed from 'mrs-thing' to 'semarg' (for 'semantic argument'). - Also, one of its subtypes 'non-expl' has been deleted, since it was - confusingly redundant with the type 'event-or-ref-index'. - Corresponding adjustments have been made to the type hierarchy under - 'semarg', though the leaf types remain the same. - -(c) Renaming of other relations - - To support a more consistent naming convention for relations, any - relation or predicate whose name formerly ended in "-rel" now has a - name which is like the previous one except that the hyphen ("-") is - always replaced with an underscore ("_"). An explanation of the - naming conventions can be found in the Matrix User Guide. - ------------------------------------------------------------------------------- -Grammar matrix v 0.4, March 10, 2003 - -This is a minor upgrade of the first version of the grammar matrix -(v 0.3), designed to standardize the feature geometry and naming -conventions for MRS feature structures, and to enable stronger -principles of semantic composition, as presented in Copestake, -Lascarides, and Flickinger (2001). - -If you have already developed a grammar based on the matrix, you will -need to make the following manual adjustments to your language-specific -grammar files: - -(1) Renamed features - Summary: Naming conventions now made consistent with soon-to-be-published - standard reference on MRS. - Recommended procedure: Do a global replace for each of the following in - all of your *.tdl files: - LISZT --> RELS - H-CONS --> HCONS - TOP --> LTOP - HNDL --> LBL - SC-ARG --> HARG - OUTSCPD --> LARG - SOA --> MARG - RESTR --> RSTR - BV --> ARG0 - EVENT --> ARG0 - INST --> ARG0 - LABEL --> WLINK - -(2) Introduction of HOOK attribute - Summary: The externally visible attributes of an MRS are now grouped - within a single attribute called HOOK, which is consistently used in - constructions to identify the properties of the semantic head daughter - with those of the phrase. The features in HOOK include the familiar - LTOP (formerly TOP), INDEX, and E-INDEX, as well as a new feature XARG - which is unified with the semantic index of the controlled argument of - a phrase (to simplify the definition of e.g. equi and raising types) - Recommended procedure: In each of your *.tdl files, search for each - occurrence of the three features LTOP, INDEX, and E-INDEX, and insert - HOOK into the path preceding each feature. In some cases, you will see - that you can simplify the re-entrancies in your feature structures by - referring to HOOK instead of individually referring to each of the three - attributes separately. In addition, consider revising your lexical types - for equi and raising predicates to make use of the new XARG feature, - which should enable you to avoid reference to arguments of arguments. - -(3) Naming of argument roles (ARG1, ARG2, ARG3, ARG4) - Summary: Each relation now assigns its first (least oblique) argument - to ARG1, its next argument to ARG2, and so on. The major change from - the first version of the matrix is to assign objects of transitive verbs - to ARG2 rather than ARG3, and similarly for objects of prepositions. - Recommended procedure: In each of your *.tdl files, search for ARG3, and - consider replacing it with ARG2. Check all other role name assignments - to ensure that role names are assigned consistently. - -(4) Basic relation types - Summary: The inventory of basic relation types has been simplified. - Recommended procedure: Review the subtypes that your grammar defined for - the original basic relation types, and revise them to employ the new - relation types, consistent with the changes made in step (3) above. Note - that a basic relation type has been added for quantifiers: quant-rel. - -(5) Deleted features (--TOPKEY) - Summary: Some semantics-related features proved to be unnecessary - Recommended procedure: None required, unless your grammar makes use of - the feature --TOPKEY, in which case you may choose to introduce this - feature as part of your language-specific inventory of features. - - ------------------------------------------------------------------------------- -[Original notes for v 0.3] - -This is an extremely preliminary first cut at the grammar -matrix. It has not been tested except by being loaded into -the LKB. It contains the following: - --- basic types which define the feature geometry --- types for MRS semantics --- underspecified supertypes of lexical rules --- underspecified supertypes of phrase structure rules - -Of these, the last were the most hastily thrown together. -They are basically taken from the syntax.tdl file of the LinGO -English grammar, and then simplified by removing constraints -that are either likely to be specific to English or are -related to the LinGO analysis of coordination. - -These phrase structure rule types are of necessity underconstrained -and merely instantiating them will surely lead to a grammar -with gross overgeneration. Thus, it is expected that they -will be either augmented directly, or via subtypes that fill -in some of the missing constraints. One clear example of this -is the lack of constraints on the HEAD values in phrase structure -rules. Since it's not clear what the appropriate 'universal' -head type hierarchy will/could be, I've refrained from even -defining types like 'verbal'... - -Similarly, certain parts of the type hierarchy might need to be -modified. In an ideal world, the matrix type hierarchy would only -need to be extended at the bottom for individual grammars. However, -it is not clear that this is possible or desirable even in principle, -and it is certainly not the case for this preliminary first version! -(Defaults may help here...) - -The single biggest gap in the matrix is the utter lack of -lexical types. I hope that it can be useful even with this -huge lacuna. Since the matrix holds so closely to the LinGO -grammar, the lexical types of the LinGO grammar should be -used as models for creating lexical types. Note in particular -that the rules assume lexical threading of NON-LOCAL features. -Beware that some feature names (notably HNDL) and many type names -differ between the matrix and the LinGO grammar, even when they -are logically and mnemonically related. - -Future versions of the matrix should include further documentation -as well as more types (especially lexical types). Revisions to -the types included in this version should also be anticipated, -since it seems extremely unlikely that this first guess as to -what's universally useful will turn out to be entirely correct. - -Stephan Oepen has kindly cleaned up the collateral .lsp files -included in this distribution of the matrix. Take a look at -lkb/script for information on how various files (including .tdl -files) are included, and which aspects of the grammar should -be encoded in which .tdl files. - -April 5, 2002 (erb) - -Added improvements to supertypes for lexical rules. "Derivational" -lexical rules are now lexeme-to-lexeme, rather than word-to-word -(a hold-over from PAGE). Lexeme-to-lexeme rules can be spelling -changing, and apply _inside_ lexeme-to-word, or inflectional rules, -as expected. - -June 18, 2002 (oe) - -Fix generator support (by adding a suitable `mrsglobals.lsp'); more -cleaning up of `script' and related collateral files.