Fill in unused identifiers on semantic structures #274

goodmami · 2020-01-20T02:37:58Z

For a long time, PyDelphin has included a slot on MRS, DMRS, and EDS for the 'identifier' field ('ident' in the original DTDs), which is basically unused. The field only gets filled in if it is encoded in a representation that is read in. There are a few comments about it in the LKB code. Here's one from lingo/lkb/src/rmrs/dtd-notes.txt:

ident is an attribute on rmrs's to identify which utterance they
belong with. The HoG currently uses a wrapper around the RMRS, with
identifying information there instead. Hinoki uses the ident
identifier but may switch to a wrapper, in which case ident may be
removed. In any case it is optional.

(note that XMT uses the HoG strategy)

And in lingo/lkb/src/tsdb/lisp/redwoods.lisp, it seems to be formatted using a few other fields:

      for ident = (format nil "~a @ ~a~@[ @ ~a~]" i-id result-id i-comment)
      [...]
              (mrs::output-rmrs1
               (mrs::mrs-to-rmrs mrs)
               'mrs::xml out nil nil i-input ident)
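For reference, here is a rough Python rendering of that format directive. It is only a sketch: the function name `format_ident` is hypothetical and not part of PyDelphin or the LKB, and it assumes `None` stands in for Lisp's `nil`.

```python
from typing import Optional


def format_ident(i_id: int, result_id: int, i_comment: Optional[str] = None) -> str:
    """Mimic (format nil "~a @ ~a~@[ @ ~a~]" i-id result-id i-comment).

    Hypothetical helper for illustration only.
    """
    ident = f"{i_id} @ {result_id}"
    # The ~@[ ... ~] directive emits its body only when the argument is non-nil.
    if i_comment is not None:
        ident += f" @ {i_comment}"
    return ident


print(format_ident(10, 0))
print(format_ident(10, 0, "some comment"))
```

So the identifier is always `i-id @ result-id`, with the i-comment appended only when one is present.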

And in lingo/lkb/src/rmrs/dmrs.lisp, it (as far as I can tell) uses the first column of a [incr tsdb()] file:

              (let ((scount (extract-fine-system-number fsout))
              [...]
                        (setf (dmrs-ident dmrs) (format nil "~A" scount))
[...]
(defun extract-fine-system-number (str)
  ;;; compare extract-fine-system-sentence
  (let ((apos (position #\@ str)))
        (if apos
            (parse-integer (subseq str 0 apos) :junk-allowed t))))
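As I read it, the function takes whatever precedes the first `@` and parses an integer from it, returning nothing if there is no `@`. A minimal Python approximation follows. The regular expression is my stand-in for `parse-integer` with `:junk-allowed t`, so edge cases (e.g., which characters count as whitespace) may differ from the Lisp behavior.

```python
import re
from typing import Optional

# Optional leading whitespace, optional sign, then digits; trailing junk is ignored.
_INT_PREFIX = re.compile(r"\s*([+-]?\d+)")


def extract_fine_system_number(s: str) -> Optional[int]:
    """Approximate the Lisp function of the same name (illustrative sketch)."""
    apos = s.find("@")
    if apos == -1:
        # No '@' in the string: the Lisp version returns nil here.
        return None
    match = _INT_PREFIX.match(s[:apos])
    return int(match.group(1)) if match else None


print(extract_fine_system_number("10 @ 0 @ comment"))
```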

In PyDelphin, not all codecs can handle identifiers (the PENMAN ones don't, nor do any EDS ones). These identifiers could be useful for, e.g., exporting a corpus of *MRS representations which encode which items they came from.

It seems like the appropriate form of the identifier may depend on the task. In some cases, just an i-id from a profile would be enough, while for others a parse-id and result-id may be needed to distinguish among multiple MRSs from one item.
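To make the trade-off concrete, here is a small sketch that uses a bare i-id when an item has a single result and falls back to a longer form when it has several. Everything here is hypothetical: the function name, the tuple layout, and in particular the `i-id @ parse-id @ result-id` format are assumptions for illustration, not an existing PyDelphin or LKB convention.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def make_identifiers(results: Iterable[Tuple[int, int, int]]) -> List[str]:
    """Build one identifier per (i_id, parse_id, result_id) tuple.

    Hypothetical helper: the longer format is an assumed convention.
    """
    rows = list(results)
    # Count how many results each item (i-id) contributes.
    counts = Counter(i_id for i_id, _, _ in rows)
    idents = []
    for i_id, parse_id, result_id in rows:
        if counts[i_id] == 1:
            # A single result for this item: the i-id alone is unambiguous.
            idents.append(str(i_id))
        else:
            # Several results for one item: add parse-id and result-id.
            idents.append(f"{i_id} @ {parse_id} @ {result_id}")
    return idents


print(make_identifiers([(10, 1, 0), (20, 2, 0), (20, 2, 1)]))
```

Mixing the two forms in one export may not be desirable in practice; a caller could just as well pick one form per task.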

goodmami added the enhancement label Jan 20, 2020