Distributed representation of text

He He

New York University

September 22, 2022


Logistics

- HW1 P1.2 clarification: x is the BoW vector.
- HW1 is due by Sep 29 (one week from now).


Table of Contents

- Review
- Introduction
- Vector space models
- Word embeddings
- Brown clusters
- Neural networks


Last week

- Generative vs. discriminative models for text classification
  - (Multinomial) naive Bayes
    - Assumes conditional independence
    - Very efficient in practice (closed-form solution; see the counting sketch below)
  - Logistic regression
    - Works with all kinds of features
    - Wins with more data
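To make the "closed-form" point concrete: the naive Bayes MLE is just normalized counts, so training needs no iterative solver. Below is a minimal sketch (not from the lecture; `train_nb` and all names are illustrative) of estimating multinomial NB parameters by counting, with add-alpha smoothing.

```python
# Minimal multinomial naive Bayes training by counting (illustrative sketch).
from collections import Counter, defaultdict

def train_nb(docs, labels, alpha=1.0):
    """Estimate P(y) and P(word | y) in closed form: normalized (smoothed) counts."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # word_counts[y][w] = count of w in class y
    for doc, y in zip(docs, labels):
        word_counts[y].update(doc)
    vocab = {w for counts in word_counts.values() for w in counts}
    priors = {y: n / len(labels) for y, n in class_counts.items()}
    likelihoods = {
        y: {w: (word_counts[y][w] + alpha)
               / (sum(word_counts[y].values()) + alpha * len(vocab))
            for w in vocab}
        for y in class_counts
    }
    return priors, likelihoods

priors, likelihoods = train_nb(
    docs=[["good", "movie"], ["bad", "plot"], ["good", "plot"]],
    labels=["pos", "neg", "pos"],
)
```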

- Feature vector of the text input
  - BoW representation
  - N-gram features (usually $n \le 3$); see the sketch below
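A minimal sketch of both feature maps, assuming whitespace tokenization (`bow_features` and `ngram_features` are hypothetical helper names, not course code):

```python
# Sparse count-based feature maps: bag-of-words and n-grams up to n = 3.
from collections import Counter

def bow_features(tokens):
    """Bag-of-words: each word type mapped to its count."""
    return Counter(tokens)

def ngram_features(tokens, max_n=3):
    """Counts of all contiguous n-grams for n = 1 .. max_n."""
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats[tuple(tokens[i:i + n])] += 1
    return feats

print(ngram_features("the movie was not bad".split()))
# n-grams capture local context ("not bad") that BoW throws away.
```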

- Control the complexity of the hypothesis class
  - Feature selection
  - Norm regularization (see the sketch below for both)
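As one concrete instantiation (an assumption; the lecture names no library), scikit-learn can chain both controls behind an n-gram featurizer: chi-squared feature selection keeps the most informative features, and `C` in `LogisticRegression` sets the inverse strength of the L2 norm penalty.

```python
# Feature selection + L2-regularized logistic regression over n-gram features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["good movie", "bad plot", "good plot", "bad movie"]  # toy data
labels = [1, 0, 1, 0]

clf = make_pipeline(
    CountVectorizer(ngram_range=(1, 3)),  # BoW + bigram/trigram counts
    SelectKBest(chi2, k=4),               # feature selection: keep top 4
    LogisticRegression(C=1.0),            # smaller C = stronger L2 penalty
)
clf.fit(texts, labels)
```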




Objective

Goal: come up with a good representation of text.

What is a representation?
- Feature map $\phi: \text{text} \to \mathbb{R}^d$, e.g., BoW, handcrafted features
- "Representation" often refers to learned features of the input

What is a good representation?
- Leads to good task performance (often requires less training data)
- Enables a notion of distance over text: $d(\phi(a), \phi(b))$ is small for semantically similar texts $a$ and $b$ (see the sketch below)
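A minimal sketch of this criterion with $\phi$ as BoW over a toy vocabulary (all names illustrative). It also shows why BoW falls short: "movie" and "film" get orthogonal dimensions, so near-paraphrases stay far apart, which is exactly what distributed representations aim to fix.

```python
# Cosine distance between BoW vectors of two semantically similar texts.
import numpy as np

vocab = ["film", "movie", "great", "terrible"]

def phi(text):
    """BoW feature map over the toy vocabulary."""
    tokens = text.split()
    return np.array([tokens.count(w) for w in vocab], dtype=float)

def cosine_distance(u, v):
    """1 - cosine similarity; small when vectors point the same way."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine_distance(phi("great movie"), phi("great film")))
# 0.5, even though the texts are near-paraphrases: BoW cannot see
# that "movie" and "film" are synonyms.
```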

