
How to generate summary for my own data? #11

Open
cherukuravi opened this issue Nov 12, 2018 · 7 comments

Comments

@cherukuravi

Hi Shashi,

I am trying to generate a summary of my own text article using the pretrained embeddings provided in the link. I created a doc file with the article text, saved it as cnn.test.doc, and also updated the corresponding title file. But when I run the code it fails with the error shown below.

File "/Users/ravi/Desktop/Sidenet-1/data_utils.py", line 263, in populate_data
thissent = [int(item) for item in line.strip().split()]

I provided a plain-text document, but the code expects integers. I guess we need to provide the word IDs of the corresponding words in each sentence?

Could you please guide me on how to generate summaries for new text articles using this code?
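For reference, a minimal sketch of what populate_data appears to expect: each line of the .doc file holds the vocabulary IDs of one sentence's tokens, so raw text has to be mapped through the model's vocabulary first. The one-token-per-line vocab format and the unknown-token handling below are assumptions for illustration, not taken from the repo:

```python
def load_vocab(path):
    """Assumed format: one token per line; the line number is the token ID."""
    with open(path) as f:
        return {word.strip(): idx for idx, word in enumerate(f)}

def sentence_to_ids(sentence, vocab, unk_id=0):
    """Map a tokenized sentence to the space-separated ID line that
    populate_data's `int(item)` loop can parse."""
    return " ".join(str(vocab.get(tok, unk_id)) for tok in sentence.split())
```

For example, with a vocabulary of {"the": 1, "cat": 2}, sentence_to_ids("the cat sat", vocab) yields "1 2 0", where the out-of-vocabulary token falls back to the assumed unk_id.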

@shashiongithub
Collaborator

Could you send me an email? I will send you the code behind our demo.

@adisri2694

import subprocess

def stanford_processing(log, story, highlights):
    """Run Stanford CoreNLP (sentence segmentation, tokenization, NER tagging)
    on the story and highlights, appending status messages to the log."""
    story_corenlp = None
    highlights_corenlp = None
    try:
        log += timestamp() + " Start Stanford Processing (SSegmentation,Tokenization,NERTagging) ...\n"

        story_corenlp = subprocess.check_output(['./corenlp.sh', story])
        highlights_corenlp = subprocess.check_output(['./corenlp.sh', highlights])

        log += timestamp() + " Stanford Processing finished.\n"
    except Exception as e:
        log += timestamp() + " Stanford Processing failed.\n" + str(e) + "\n"

    return log, story_corenlp, highlights_corenlp

Is the corenlp.sh file the same one provided on the Stanford GitHub page, or is it a custom one?
When I use the Stanford one, it gives a "permission denied" error.
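(A common cause of "permission denied" here is simply that the script lacks the execute bit; `chmod +x corenlp.sh` fixes it. The same fix from Python, which subprocess.check_output needs before it can run ./corenlp.sh, is a small sketch using only the standard library:)

```python
import os
import stat

def make_executable(path):
    """Add the execute bit for user, group, and other (like `chmod +x path`)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```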

@shashiongithub
Collaborator

I think there should be one included with the demo code. It is not the one from the Stanford GitHub page.

@adisri2694

adisri2694 commented Feb 16, 2019 via email

@shashiongithub
Collaborator

corenlp.sh has this:

#!/bin/bash
wget --post-data "$1" 'localhost:9000/?properties={"annotators": "tokenize,ssplit,pos,lemma,ner", "ssplit.newlineIsSentenceBreak": "always", "outputFormat": "text"}' -O -
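That script assumes a Stanford CoreNLP server already listening on localhost:9000 (started with Stanford's documented `java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000`). As a sketch, the same request can be issued from Python instead of wget; the function names here are illustrative, and the POST itself of course only succeeds while the server is running:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

def corenlp_url(host="localhost", port=9000):
    """Build the same annotator URL that corenlp.sh passes to wget."""
    props = {
        "annotators": "tokenize,ssplit,pos,lemma,ner",
        "ssplit.newlineIsSentenceBreak": "always",
        "outputFormat": "text",
    }
    return "http://%s:%d/?properties=%s" % (host, port, quote(json.dumps(props)))

def annotate(text, host="localhost", port=9000):
    """POST the raw text to the server, like `wget --post-data "$1" ...`."""
    return urlopen(corenlp_url(host, port), data=text.encode("utf-8")).read()
```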

@OmerET8

OmerET8 commented Feb 20, 2019

Could you send me an email? I will send you the code behind our demo.

Could you please send it to me as well? How can I send you my email address?
Thank you very much!

@luyunan0404

Hi Shashi, I also want to test on my own dataset. Could you also send me the demo code showing how to preprocess the data to generate the files in your "preprocessed-input-directory"? Thanks!
