Commit eb37057

Matthias Winkelmann authored and waterson committed
Fixed links to evaluation data in makefile (tensorflow#5402)
1 parent aec1fec commit eb37057

5 files changed: 17 additions & 12 deletions

research/swivel/.gitignore

Lines changed: 3 additions & 1 deletion
@@ -6,7 +6,9 @@ Mtruk.csv
 SimLex-999.zip
 analogy
 fastprep
-myz_naacl13_test_set.tgz
+*.dSYM
 questions-words.txt
+word_relationship.*
+tensorflow/
 rw.zip
 ws353simrel.tar.gz

research/swivel/README.md

Lines changed: 2 additions & 2 deletions
@@ -155,10 +155,10 @@ You can do some simple exploration using `nearest.py`:
     ...
 
 To evaluate the embeddings using common word similarity and analogy datasets,
-use `eval.mk` to retrieve the data sets and build the tools:
+use `eval.mk` to retrieve the data sets and build the tools. Note that wordsim is currently not compatible with Python 3.x.
 
     make -f eval.mk
-    ./wordsim.py -v vocab.txt -e vecs.bin *.ws.tab
+    ./wordsim.py --vocab vocab.txt --embeddings vecs.bin *.ws.tab
     ./analogy --vocab vocab.txt --embeddings vecs.bin *.an.tab
 
 The word similarity evaluation compares the embeddings' estimate of "similarity"
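The README now calls `wordsim.py` with the same long-form flags that the `analogy` binary already takes. The commit does not touch `wordsim.py` itself, so the following `argparse` stub is only a hypothetical sketch of the interface the updated command line implies; the real script's parser may look different.

    # Hypothetical sketch of the flag handling implied by the README change;
    # the actual wordsim.py in research/swivel may differ.
    import argparse

    parser = argparse.ArgumentParser(description='Word similarity evaluation.')
    parser.add_argument('--vocab', required=True,
                        help='text file listing one vocabulary token per line')
    parser.add_argument('--embeddings', required=True,
                        help='binary file of float32 row vectors (e.g. vecs.bin)')
    parser.add_argument('tab_files', nargs='+',
                        help='*.ws.tab files with word pairs and human scores')
    args = parser.parse_args()
    print(args.vocab, args.embeddings, len(args.tab_files))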

research/swivel/eval.mk

Lines changed: 11 additions & 8 deletions
@@ -59,9 +59,9 @@ simlex999.ws.tab: SimLex-999.zip
 mikolov.an.tab: questions-words.txt
 	egrep -v -E '^:' $^ | tr '[A-Z] ' '[a-z]\t' > $@
 
-msr.an.tab: myz_naacl13_test_set.tgz
-	tar Oxfz $^ test_set/word_relationship.questions | tr ' ' '\t' > /tmp/q
-	tar Oxfz $^ test_set/word_relationship.answers | cut -f2 -d ' ' > /tmp/a
+msr.an.tab: word_relationship.questions word_relationship.answers
+	cat word_relationship.questions | tr ' ' '\t' > /tmp/q
+	cat word_relationship.answers | cut -f2 -d ' ' > /tmp/a
 	paste /tmp/q /tmp/a > $@
 	rm -f /tmp/q /tmp/a
 
@@ -75,7 +75,7 @@ MEN.tar.gz:
 	wget http://clic.cimec.unitn.it/~elia.bruni/resources/MEN.tar.gz
 
 Mtruk.csv:
-	wget http://tx.technion.ac.il/~kirar/files/Mtruk.csv
+	wget http://www.kiraradinsky.com/files/Mtruk.csv
 
 rw.zip:
 	wget http://www-nlp.stanford.edu/~lmthang/morphoNLM/rw.zip
 
@@ -84,15 +84,18 @@ SimLex-999.zip:
 	wget http://www.cl.cam.ac.uk/~fh295/SimLex-999.zip
 
 questions-words.txt:
-	wget http://word2vec.googlecode.com/svn/trunk/questions-words.txt
+	wget http://download.tensorflow.org/data/questions-words.txt
 
-myz_naacl13_test_set.tgz:
-	wget http://research.microsoft.com/en-us/um/people/gzweig/Pubs/myz_naacl13_test_set.tgz
+word_relationship.questions:
+	wget https://github.com/darshanhegde/SNLPProject/raw/master/word2vec/eval/word_relationship.questions
+
+word_relationship.answers:
+	wget https://github.com/darshanhegde/SNLPProject/raw/master/word2vec/eval/word_relationship.answers
 
 analogy: analogy.cc
 
 clean:
 	rm -f *.ws.tab *.an.tab analogy *.pyc
 
 distclean: clean
-	rm -f *.tgz *.tar.gz *.zip Mtruk.csv questions-words.txt
+	rm -f *.tgz *.tar.gz *.zip Mtruk.csv questions-words.txt word_relationship.{questions,answers}
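For readers who do not want to trace the shell pipeline, the rewritten `msr.an.tab` recipe can be read as: turn each question line into tab-separated columns, take the second space-separated field of each answer line, and paste the two files side by side. The Python sketch below is a rough equivalent, assuming the two `word_relationship.*` files are already in the working directory; it is illustrative only and not part of the commit.

    # Illustrative Python equivalent of the new msr.an.tab recipe (not in the commit).
    # Mirrors: tr ' ' '\t' on the questions, cut -f2 -d ' ' on the answers,
    # then paste the two streams into a single tab-separated file.
    with open('word_relationship.questions') as questions, \
         open('word_relationship.answers') as answers, \
         open('msr.an.tab', 'w') as out:
      for q_line, a_line in zip(questions, answers):
        question = q_line.rstrip('\n').replace(' ', '\t')
        fields = a_line.rstrip('\n').split(' ')
        answer = fields[1] if len(fields) > 1 else fields[0]  # like cut -f2 -d ' '
        out.write(question + '\t' + answer + '\n')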

research/swivel/swivel.py

File mode changed from 100644 to 100755.

research/swivel/vecs.py

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ def __init__(self, vocab_filename, rows_filename, cols_filename=None):
         'unexpected file size for binary vector file %s' % rows_filename)
 
     # Memory map the rows.
-    dim = size / (4 * n)
+    dim = round(size / (4 * n))
     rows_mm = mmap.mmap(rows_fh.fileno(), 0, prot=mmap.PROT_READ)
     rows = np.matrix(
         np.frombuffer(rows_mm, dtype=np.float32).reshape(n, dim))
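The `vecs.py` change is a Python 3 fix: under Python 3, `/` is true division, so `size / (4 * n)` yields a float even when the division is exact, and recent NumPy versions reject a float dimension in `reshape`. Wrapping the expression in `round()` restores an integer (floor division with `//` would behave the same for exact multiples). A minimal sketch of the failure mode, with made-up sizes, purely to illustrate the semantics:

    import numpy as np

    n = 1000              # made-up vocabulary size
    size = 4 * n * 300    # bytes in a file of 300-dim float32 row vectors

    dim = size / (4 * n)            # Python 3 true division -> 300.0 (a float)
    # np.zeros(n * 300, np.float32).reshape(n, dim)   # TypeError on recent NumPy

    dim = round(size / (4 * n))     # the committed fix -> 300 (an int)
    rows = np.zeros(n * 300, np.float32).reshape(n, dim)
    print(rows.shape)               # (1000, 300)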
