arg1 and arg2 repeating same text #2

Open
mishumausam opened this issue May 31, 2013 · 3 comments

Comments

schmmd commented May 31, 2013

On the sentence page, the arg1 text also appears inside one of the arg2 arguments.

schmmd commented May 31, 2013

The extraction is actually indexed this way, as the indexed document below shows. The SRL demo does not produce the same output.

http://nlpweb.cs.washington.edu/log/1945

<doc>
<str name="id">827699</str>
<str name="arg1">
There are strong indications that Abu Sayyaf, Jemaah Islamiyah, and other terrorist groups are really collaborating to have a common type of IED
</str>
<str name="arg1_exact">
There are strong indications that Abu Sayyaf, Jemaah Islamiyah, and other terrorist groups are really collaborating to have a common type of IED
</str>
<str name="rel">said</str>
<str name="rel_exact">said</str>
<arr name="arg2">
<str>
There are strong indications that Abu Sayyaf, Jemaah Islamiyah, and other terrorist groups are really collaborating to have a common type of IED, Ferro
</str>
</arr>
<arr name="arg2_exact">
<str>
There are strong indications that Abu Sayyaf, Jemaah Islamiyah, and other terrorist groups are really collaborating to have a common type of IED, Ferro
</str>
</arr>
<str name="arg1_postag">
EX VBP JJ NNS IN NNP NNP , NNP NNP , CC JJ JJ NNS VBP RB VBG TO VB DT JJ NN IN NNP
</str>
<str name="rel_postag">.</str>
<str name="arg2_postag">NNP VBD</str>
<arr name="arg1_types">
<str>ArabicName</str>
<str>StanfordPERSON</str>
</arr>
<arr name="arg2_types">
<str>StanfordPERSON</str>
</arr>
<str name="context"/>
<str name="context_exact"/>
<double name="confidence">0.9429314535406047</double>
<str name="sentence">
There are strong indications that Abu Sayyaf, Jemaah Islamiyah, and other terrorist groups are really collaborating to have a common type of IED, Ferro said.
</str>
<str name="sentence_exact">
There are strong indications that Abu Sayyaf, Jemaah Islamiyah, and other terrorist groups are really collaborating to have a common type of IED, Ferro said.
</str>
<str name="extractor">Srl</str>
<str name="url">
/scratch/dovetail/corpus/kdd3-sentences-parsed/tosir283.pdf.txt.sentences
</str>
<long name="_version_">1435956443506278401</long>
</doc>
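
A check along these lines could flag other indexed extractions with the same overlap. This is only a sketch: it assumes the document is available as the XML shown above, and `overlapping_args` is a hypothetical helper, not part of the extractor or the index.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the indexed document above (only the fields the check needs).
DOC = """<doc>
<str name="id">827699</str>
<str name="arg1">
There are strong indications that Abu Sayyaf, Jemaah Islamiyah, and other terrorist groups are really collaborating to have a common type of IED
</str>
<str name="rel">said</str>
<arr name="arg2">
<str>
There are strong indications that Abu Sayyaf, Jemaah Islamiyah, and other terrorist groups are really collaborating to have a common type of IED, Ferro
</str>
</arr>
</doc>"""


def overlapping_args(doc_xml):
    """Return every arg2 string that contains the full arg1 text."""
    doc = ET.fromstring(doc_xml)
    arg1_node = doc.find("./str[@name='arg1']")
    arg1 = (arg1_node.text or "").strip() if arg1_node is not None else ""
    arg2s = [(s.text or "").strip() for s in doc.findall("./arr[@name='arg2']/str")]
    # An empty arg1 would trivially match everything, so skip it.
    return [arg2 for arg2 in arg2s if arg1 and arg1 in arg2]


for arg2 in overlapping_args(DOC):
    print("arg1 is repeated inside arg2:", arg2)
```

For this document the check reports the single arg2, which is arg1 followed by ", Ferro".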

schmmd commented Jun 3, 2013

This is a problem with the SRL extractor. It is hard to troubleshoot because it stems from Unicode issues. Here are the Unicode sentence and the Clear dependencies.

There are “strong indications that Abu Sayyaf, Jemaah Islamiyah, and other terrorist groups are really collaborating to have a common type of IED,” Ferro said.

expl(are_VBP_1_6, There_EX_0_0); attr(are_VBP_1_6, indications_NNS_3_18); amod(indications_NNS_3_18, strong_JJ_2_10); ccomp(indications_NNS_3_18, collaborating_VBG_17_104); nn(Sayyaf_NNP_6_39, Abu_NNP_5_35); punct(Sayyaf_NNP_6_39, ,_,_7_45); conj(Sayyaf_NNP_6_39, Islamiyah_NNP_9_54); nn(Islamiyah_NNP_9_54, Jemaah_NNP_8_47); punct(Islamiyah_NNP_9_54, ,_,_10_63); cc(Islamiyah_NNP_9_54, and_CC_11_65); conj(Islamiyah_NNP_9_54, groups_NNS_14_86); amod(groups_NNS_14_86, other_JJ_12_69); amod(groups_NNS_14_86, terrorist_JJ_13_76); complm(collaborating_VBG_17_104, that_IN_4_30); nsubj(collaborating_VBG_17_104, Sayyaf_NNP_6_39); aux(collaborating_VBG_17_104, are_VBP_15_93); advmod(collaborating_VBG_17_104, really_RB_16_97); xcomp(collaborating_VBG_17_104, have_VB_19_121); aux(have_VB_19_121, to_TO_18_118); dobj(have_VB_19_121, type_NN_22_135); det(type_NN_22_135, a_DT_20_126); amod(type_NN_22_135, common_JJ_21_128); prep(type_NN_22_135, of_IN_23_140); pobj(of_IN_23_140, IED_NN_24_143); nn(Ferro_NNP_27_149, _NNP_26_147); ccomp(said_VBD_28_155, are_VBP_1_6); punct(said_VBD_28_155, ,_,_25_146); nsubj(said_VBD_28_155, Ferro_NNP_27_149); punct(said_VBD_28_155, ._._29_159)
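
To make the odd token easier to spot, the dependency string can be parsed and scanned for nodes with empty text. This is a rough sketch: the `text_POSTAG_index_offset` node layout is inferred from the output above rather than from a documented format, and `Node`, `parse_node`, and `parse_dependencies` are hypothetical names. It also assumes no governor token contains ", ".

```python
from typing import NamedTuple


class Node(NamedTuple):
    text: str
    postag: str
    index: int
    offset: int


def parse_node(serialized):
    # Split from the right so that tokens such as ",_,_25_146" parse correctly.
    text, postag, index, offset = serialized.rsplit("_", 3)
    return Node(text, postag, int(index), int(offset))


def parse_dependencies(serialized):
    """Parse 'label(gov, dep); label(gov, dep); ...' into (label, gov, dep) triples."""
    deps = []
    for item in serialized.split("; "):
        label, _, rest = item.partition("(")
        gov, dep = rest[:-1].split(", ", 1)  # rest[:-1] drops the closing ")"
        deps.append((label, parse_node(gov), parse_node(dep)))
    return deps


# The last six dependencies from the output above.
DEPS = (
    "pobj(of_IN_23_140, IED_NN_24_143); nn(Ferro_NNP_27_149, _NNP_26_147); "
    "ccomp(said_VBD_28_155, are_VBP_1_6); punct(said_VBD_28_155, ,_,_25_146); "
    "nsubj(said_VBD_28_155, Ferro_NNP_27_149); punct(said_VBD_28_155, ._._29_159)"
)

empty_tokens = [dep for _, _, dep in parse_dependencies(DEPS) if not dep.text]
print(empty_tokens)
```

The only empty node is the `NNP` token at index 26, offset 147, attached to `Ferro` by an `nn` edge. That offset appears to line up with the closing curly quotation mark in the Unicode sentence, which would be consistent with the extractor pulling "Ferro" and this stray token into the wrong argument.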
