
GloVe

Usage

Train

$ python3 main.py  --embedding_size 100 --context_size 10

To check whether two words are similar

$ python3 main.py --mode "are_Similar" --word1 "big" --word2 "long"

To get the closest words to a given word

$ python3 main.py --mode "get_ClosestWords" --word "big"

To use analogy mode and find the word that completes word1 : word2 :: word : ?

$ python3 main.py --mode "analogy" --word1 "big" --word2 "long" --word "small"

To plot the embeddings

$ python3 main.py --mode "plotEmbeddings"

References

Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. In Proceedings of EMNLP 2014.

Contributed by:

Summary

Introduction:

GloVe is a weighted least squares model that trains on global word-word co-occurrence counts and thus makes efficient use of corpus statistics. The model produces a word vector space with meaningful substructure, as evidenced by its state-of-the-art performance of 75% accuracy on a word analogy dataset. The name Global Vectors (GloVe) reflects the fact that the global corpus statistics are captured directly by the model.

Model:

The relationship between two words can be examined by studying the ratio of their co-occurrence probabilities with various probe words k. Compared to the raw probabilities, the ratio is better able to distinguish relevant words from irrelevant ones, and it is also better able to discriminate between the two relevant words.
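
As a toy illustration of this idea, the ratios can be computed directly from a small co-occurrence matrix; the counts below are invented purely for illustration and do not come from any real corpus.

```python
# Toy illustration of co-occurrence probability ratios P_ik / P_jk.
# The counts below are made up for illustration only.
import numpy as np

words = ["ice", "steam", "solid", "gas", "water"]
idx = {w: k for k, w in enumerate(words)}

# X[i, j] = number of times word j appears in the context of word i (invented values).
X = np.array([
    [0, 2, 30, 1, 40],    # contexts of "ice"
    [2, 0, 1, 25, 35],    # contexts of "steam"
    [30, 1, 0, 0, 5],
    [1, 25, 0, 0, 6],
    [40, 35, 5, 6, 0],
], dtype=float)

P = X / X.sum(axis=1, keepdims=True)   # P[i, k] = X_ik / X_i

def ratio(word_i, word_j, probe_k):
    """P(k | i) / P(k | j): large when the probe relates to word_i, small when it
    relates to word_j, and close to 1 when it is equally (ir)relevant to both."""
    return P[idx[word_i], idx[probe_k]] / P[idx[word_j], idx[probe_k]]

print(ratio("ice", "steam", "solid"))   # much greater than 1
print(ratio("ice", "steam", "gas"))     # much less than 1
print(ratio("ice", "steam", "water"))   # close to 1
```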

Since the ratio P_ik / P_jk depends on three words i, j, and k, the most general model takes the form

F(w_i, w_j, w̃_k) = P_ik / P_jk

We denote word vectors as w and separate context word vectors as w̃. We require that F be a homomorphism, so that it establishes a relation between the word vectors and the co-occurrence counts. It is given by

F(w_i^T w̃_k) = P_ik = X_ik / X_i

where the matrix of word-word co-occurrence counts is denoted by X, whose entries X_ij tabulate the number of times word j occurs in the context of word i, and X_i = Σ_k X_ik is the total number of context words seen with word i. The solution to these equations is the exponential function, F = exp, which gives w_i^T w̃_k = log(P_ik) = log(X_ik) - log(X_i). Absorbing log(X_i) into a bias b_i for word i and adding a bias b̃_k for w̃_k to restore symmetry, the equation becomes

w_i^T w̃_k + b_i + b̃_k = log(X_ik)
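
For reference, here is a minimal sketch of how such a co-occurrence matrix X can be accumulated from a tokenized corpus with a symmetric context window. The 1/distance weighting follows the GloVe paper; the function name and other details are illustrative assumptions and not necessarily how main.py builds X.

```python
# Minimal sketch of building the co-occurrence counts X from a toy corpus.
# `build_cooccurrence` and its arguments are illustrative names, not from main.py.
from collections import defaultdict

def build_cooccurrence(tokens, vocab, context_size=10):
    """Return X as a dict mapping (i, j) -> weighted co-occurrence count."""
    X = defaultdict(float)
    for center, word in enumerate(tokens):
        if word not in vocab:
            continue
        i = vocab[word]
        lo = max(0, center - context_size)
        hi = min(len(tokens), center + context_size + 1)
        for pos in range(lo, hi):
            if pos == center or tokens[pos] not in vocab:
                continue
            j = vocab[tokens[pos]]
            # Context words further from the center contribute less (1/d weighting).
            X[(i, j)] += 1.0 / abs(pos - center)
    return X

tokens = "the ice is solid and the steam is gas".split()
vocab = {w: k for k, w in enumerate(dict.fromkeys(tokens))}
X = build_cooccurrence(tokens, vocab, context_size=2)
```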

A main drawback of this model is that it weights all co-occurrences equally, even those that occur rarely or never; a weighted least squares regression model addresses these problems. Casting the above equation as a least squares problem and introducing a weighting function f(X_ij) into the cost function gives the model

J = Σ_{i,j=1}^{V} f(X_ij) (w_i^T w̃_j + b_i + b̃_j - log X_ij)^2

where V is the size of the vocabulary. The proposed weighting function is

f(x) = (x / x_max)^α  if x < x_max,  and  f(x) = 1  otherwise

We fix x_max = 100 for all experiments and found that α = 3/4 gives a modest improvement over a linear version with α = 1.
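
In the same notation, the weighting function and the cost J can be sketched with NumPy as below. The dense toy-sized arrays and the names used here are illustrative assumptions; a practical implementation (including main.py) would typically iterate only over the non-zero entries of X.

```python
# Sketch of the GloVe weighting function f and cost J for small dense matrices.
# W, W_tilde, b, b_tilde, and X are assumed to be toy-sized NumPy arrays.
import numpy as np

def weight(x, x_max=100.0, alpha=0.75):
    """f(x) = (x / x_max)^alpha if x < x_max else 1, applied element-wise."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

def glove_cost(W, W_tilde, b, b_tilde, X):
    """J = sum over pairs with X_ij > 0 of f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2."""
    mask = X > 0                                   # only co-occurring pairs contribute
    log_X = np.where(mask, np.log(np.where(mask, X, 1.0)), 0.0)
    scores = W @ W_tilde.T + b[:, None] + b_tilde[None, :]
    return np.sum(weight(X) * mask * (scores - log_X) ** 2)

# Toy usage with a random 5-word vocabulary and 3-dimensional vectors.
rng = np.random.default_rng(0)
V, d = 5, 3
W, W_tilde = rng.normal(size=(V, d)), rng.normal(size=(V, d))
b, b_tilde = rng.normal(size=V), rng.normal(size=V)
X = rng.integers(0, 50, size=(V, V)).astype(float)
print(glove_cost(W, W_tilde, b, b_tilde, X))
```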

A Sample of the Embedding Plot:

[Sample embedding plot produced by the plotEmbeddings mode]

Comparison with word2vec:

To compare with word2vec, GloVe controls for the main sources of variation by fixing the vector length, context window size, corpus, and vocabulary size, and then compares the models as a function of the most important remaining variable: training time.

[Comparison plot: GloVe vs. word2vec accuracy as a function of training time]

Conclusion:

GloVe is a new global log-bilinear regression model for the unsupervised learning of word representations that outperforms other models on word analogy, word similarity, and named entity recognition tasks.

Note:

The dataset used in this implementation is the same as the one used in the word2vec TensorFlow implementation, so that the results can be compared.