blue code
0 parents
commit fc1e364
Showing 1 changed file with 51 additions and 0 deletions.
@@ -0,0 +1,51 @@
'''
Calculate BLEU scores with the perl script
https://github.com/karpathy/neuraltalk/blob/master/eval/multi-bleu.perl

Make a list of references and a list of candidates (predictions).
format:
    all_references -> [[ref0_0, ref0_1, ... , ref0_4], [ref1_0, ref1_1, ... ref1_4], ...]
    all_candidates -> [cand0, cand1, ...]
'''
import os
import pickle as pkl  # needed by example_BLEU(); the original imported json, which was unused


def calc_BLEU(all_references, all_candidates):
    ref_num = 5  # number of reference captions per image
    # use the perl script to evaluate BLEU for a fair comparison with other research work
    # first write intermediate files: one candidate per line, one file per reference index
    print('writing intermediate files into eval/')
    with open('eval/output', 'w') as f:
        f.write('\n'.join(all_candidates))
    for q in range(ref_num):
        with open('eval/reference' + str(q), 'w') as f:
            f.write('\n'.join([x[q] for x in all_references]))
    # invoke the perl script to get BLEU scores
    print('invoking eval/multi-bleu.perl script...')
    owd = os.getcwd()
    os.chdir('eval')
    os.system('./multi-bleu.perl reference < output')
    os.chdir(owd)


def example_BLEU():  # example code for BLEU; this shows a 100% score
    path = 'data/flickr8k/'
    with open(path + 'flicker_8k_cap.test.pkl', 'rb') as f:
        test_cap = pkl.load(f)
    r_c = {}  # key: image file name, value: list of captions
    for i in test_cap:
        if i[1] in r_c:
            r_c[i[1]].append(i[0])
        else:
            r_c[i[1]] = [i[0]]
    rc_k = r_c.keys()
    r_c_l = []
    for k in rc_k:
        r_c_l.append(r_c[k])
    all_references = r_c_l
    all_candidates = [x[0] for x in r_c_l]  # use the first reference sentence as the candidate
    calc_BLEU(all_references, all_candidates)
    print('the scores are 100.0/100.0/100.0/100.0 ?')


'''
all_references = [['a dog', 'dogs', 'there is a dog', 'there are dogs', 'dog!'],
                  ['a cat', 'cats', 'there is a cat', 'there are cats', 'cat!']]
all_candidates = ['dog', 'cat']
calc_BLEU(all_references, all_candidates) # BLEU = 100.0/0.0/0.0/0.0
'''
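The 100.0/0.0/0.0/0.0 result in the commented example can be understood from how n-gram precision is counted: each candidate is a single word, so it has one unigram to match but no bigrams, trigrams, or 4-grams at all. The sketch below illustrates clipped n-gram precision at corpus level with plain whitespace tokenization. It is an illustration under those assumptions, not a re-implementation of multi-bleu.perl: the helper names `ngrams` and `ngram_precisions` are hypothetical, the brevity penalty is omitted, and reporting 0.0 when there are no n-grams of a given order is an assumption chosen to match the example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precisions(all_references, all_candidates, max_n=4):
    """Corpus-level clipped n-gram precisions, as percentages, for n = 1..max_n."""
    precisions = []
    for n in range(1, max_n + 1):
        correct, total = 0, 0
        for refs, cand in zip(all_references, all_candidates):
            cand_counts = ngrams(cand.split(), n)
            # For each n-gram, the largest count seen in any single reference.
            max_ref = Counter()
            for ref in refs:
                for gram, count in ngrams(ref.split(), n).items():
                    max_ref[gram] = max(max_ref[gram], count)
            # Clip each candidate count by the reference maximum.
            correct += sum(min(c, max_ref[g]) for g, c in cand_counts.items())
            total += sum(cand_counts.values())
        # Assumption: report 0.0 when the candidates have no n-grams of this order.
        precisions.append(100.0 * correct / total if total else 0.0)
    return precisions


all_references = [['a dog', 'dogs', 'there is a dog', 'there are dogs', 'dog!'],
                  ['a cat', 'cats', 'there is a cat', 'there are cats', 'cat!']]
all_candidates = ['dog', 'cat']
print(ngram_precisions(all_references, all_candidates))  # [100.0, 0.0, 0.0, 0.0]
```

Both unigrams ("dog" and "cat") appear in at least one reference, giving 2 matches out of 2, while the higher orders have nothing to count.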
fc1e364
Hi @AAmmy, I am confused by Xu's code: I never get results like yours or Xu's. Would you be willing to share your files with me, especially the flickr30k and COCO datasets? My Gmail: [email protected].
My steps to run Xu's code are as follows:
1. Run the script prepare_caffe_and_dictionary_flickr8k.py to extract features (VGG19_5_4).
2. Run evaluate_flickr8k.py to get three files (model.npz_bestll.npz, model.npz.pkl, model.npz).
3. Change parts of the code in alpha_visualization.ipynb and generate the captions and text.
4. Use metricx.py to score the generated text against the references.
Are these the same steps you follow?
I am sorry to bother you.
fc1e364
Hi @yaxingwang
Your steps are the same as mine.
I uploaded my flickr8k dataset to Google Drive.
https://drive.google.com/folderview?id=0B3mCQWzCpiCESVFXdjN0ZnloMFU&usp=sharing
I will share the code on GitHub later.
After training, please run generate_caps.py to check BLEU for every epoch.
Uploading the flickr30k and COCO datasets will take time because they are so large.
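When comparing BLEU across epochs, it can help to pull the numbers out of the summary line that multi-bleu.perl prints. The sketch below assumes that line has the form `BLEU = <score>, <p1>/<p2>/<p3>/<p4> (BP=...)`. The function name `parse_bleu_line` and the sample line are made up for illustration, and generate_caps.py may well handle this differently.

```python
import re

# Assumed shape of the summary line: "BLEU = 25.30, 60.1/31.2/18.0/10.5 (BP=...)"
BLEU_RE = re.compile(r"BLEU = ([\d.]+), ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)")


def parse_bleu_line(line):
    """Return the overall score and the four n-gram precisions, or None if no match."""
    match = BLEU_RE.search(line)
    if match is None:
        return None
    values = [float(group) for group in match.groups()]
    return {"bleu": values[0], "precisions": values[1:]}


# Illustrative line only; these numbers are not real results.
sample = "BLEU = 25.30, 60.1/31.2/18.0/10.5 (BP=0.987, ratio=0.987, hyp_len=1000, ref_len=1013)"
print(parse_bleu_line(sample))
```

Collecting one such dictionary per epoch makes it easy to pick the checkpoint with the best score.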
fc1e364
Hi @AAmmy,
Thanks. I am running your scripts, but the format of your features differs from mine. My features extracted from VGG are saved in three files (flicker_8k_train.dev.pkl, flicker_8k_align.dev.pkl, flicker_8k_test.dev.pkl). Your files (flicker_8k_cap.train.pkl, flicker_8k_cap.dev.pkl, flicker_8k_cap.test.pkl) contain indices instead of features, and the features are saved separately in 196x512_perfile_dense_name_resizecrop. Could you explain this?
Again, thank you for your contribution.
fc1e364
196x512_perfile_dense_name_resizecrop.zip contains the features for all train, dev and test images.
My pkl files do not contain features, only captions.
The original script loads all features once, whereas my script loads features at every update, so it does not need a large amount of memory, as described in the comment linked below.
For that reason, I keep the features on an SSD rather than an HDD.
kelvinxu/arctic-captions#20 (comment)
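The trade-off described above (reading features from disk at every update instead of holding them all in memory) can be sketched as follows. This is a minimal illustration, not the actual loader: it assumes one pickle file per image named `<image name>.pkl` inside a feature directory, which is a guess about the archive layout, and `load_batch_features` is a hypothetical helper.

```python
import os
import pickle


def load_batch_features(feature_dir, image_names):
    """Load features from disk only for the images in the current minibatch."""
    batch = []
    for name in image_names:
        # Assumed layout: one file per image, named "<image name>.pkl".
        with open(os.path.join(feature_dir, name + ".pkl"), "rb") as f:
            batch.append(pickle.load(f))
    return batch
```

Memory use then scales with the minibatch size rather than the dataset size, at the cost of one file read per image per update, which is why fast storage matters.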