Natural Language Processing with Deep Learning
CS224N/Ling284

Christopher Manning

Lecture 2: Word Vectors, Word Senses, and Classifier Review

Lecture Plan

Lecture 2: Word Vectors and Word Senses

1. Finish looking at word vectors and word2vec (10 mins)
2. Optimization basics (8 mins)
3. Can we capture this essence more effectively by counting? (12 mins)
4. The GloVe model of word vectors (10 mins)
5. Evaluating word vectors (12 mins)
6. Word senses (6 mins)
7. Review of classification and how neural nets differ (10 mins)
8. Course advice (2 mins)

Goal: be able to read word embeddings papers by the end of class

1. Review: Main idea of word2vec

• Start with random word vectors

• Iterate through each word in the whole corpus

• Try to predict surrounding words using word vectors

[Figure: a context window of size 2 around the center word "into" in "... problems turning into banking crises as ...", with the model predicting each surrounding word: P(w_{t-2} | w_t), P(w_{t-1} | w_t), P(w_{t+1} | w_t), P(w_{t+2} | w_t)]

The prediction is the softmax of dot products between word vectors:

P(o \mid c) = \frac{\exp(u_o^\top v_c)}{\sum_{w \in V} \exp(u_w^\top v_c)}

• Update the vectors so you can predict better (a minimal training sketch follows below)

• This algorithm learns word vectors that capture word similarity and meaningful directions in the word space
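A minimal sketch of this loop in NumPy, assuming a toy six-word corpus, a full-softmax objective (no negative sampling), and made-up hyperparameters; it illustrates the predict-and-update cycle rather than a production word2vec implementation:

import numpy as np

rng = np.random.default_rng(0)

# Toy corpus (the example sentence from above) and vocabulary.
corpus = "problems turning into banking crises as".split()
vocab = sorted(set(corpus))
w2i = {w: i for i, w in enumerate(vocab)}
n_vocab, dim, window, lr = len(vocab), 10, 2, 0.05

# Start with small random word vectors:
# V holds center-word vectors, U holds outside-word vectors.
V = rng.normal(scale=0.1, size=(n_vocab, dim))
U = rng.normal(scale=0.1, size=(n_vocab, dim))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Iterate through each position in the corpus and predict surrounding words.
for epoch in range(50):
    for t, center in enumerate(corpus):
        c = w2i[center]
        for j in range(-window, window + 1):
            if j == 0 or not 0 <= t + j < len(corpus):
                continue
            o = w2i[corpus[t + j]]
            p = softmax(U @ V[c])            # P(. | c) over the vocabulary
            # Gradient of -log P(o | c), then a plain SGD update.
            dscores = p.copy()
            dscores[o] -= 1.0                # p - one_hot(o)
            dV = U.T @ dscores               # gradient w.r.t. the center vector v_c
            dU = np.outer(dscores, V[c])     # gradient w.r.t. all outside vectors
            V[c] -= lr * dV
            U -= lr * dU

With a real vocabulary, the softmax denominator sums over every word, which is why word2vec in practice uses negative sampling or hierarchical softmax instead.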


Word2vec parameters and computations

[Figure: the two parameter matrices and the prediction computation]

U: outside word vectors, one row per vocabulary word
V: center word vectors, one row per vocabulary word
U v_c: dot products of every outside vector with the center word's vector
softmax(U v_c): probabilities over the vocabulary
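In matrix form, the computation on this slide is a single matrix-vector product followed by a softmax. A small NumPy sketch with hypothetical sizes:

import numpy as np

rng = np.random.default_rng(0)
n_vocab, dim = 6, 10                 # hypothetical vocabulary size and dimension

U = rng.normal(size=(n_vocab, dim))  # outside vectors, one row per word
V = rng.normal(size=(n_vocab, dim))  # center vectors, one row per word

c = 2                                # index of the current center word
scores = U @ V[c]                    # dot product of each outside vector with v_c
probs = np.exp(scores - scores.max())
probs /= probs.sum()                 # softmax over the vocabulary

Note that probs does not depend on where an outside word sits in the window, which is why the model makes the same predictions at each position.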

Same predictions at each position

We want a model that gives a reasonably high probability estimate to all words that occur in the context (fairly often)

Word2vec maximizes its objective function by putting similar words nearby in space
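One concrete way to read "nearby in space" is cosine similarity between vectors. A sketch with made-up 3-dimensional vectors (real embeddings are learned and much higher-dimensional):

import numpy as np

def cosine(a, b):
    # Cosine similarity: 1.0 for parallel vectors, near 0 for unrelated ones.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical vectors; after training, words used in similar contexts
# (e.g. "banking" and "finance") end up with high cosine similarity.
banking = np.array([0.9, 0.1, 0.4])
finance = np.array([0.8, 0.2, 0.5])
turtle  = np.array([-0.3, 0.9, -0.2])

print(cosine(banking, finance))   # high: nearby in the vector space
print(cosine(banking, turtle))    # low: far apart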

