The Perceptron Algorithm, Margins


Maria-Florina Balcan

08/29/2018

The Perceptron Algorithm

A simple learning algorithm for supervised classification, analyzed via geometric margins in the 1950s [Rosenblatt'57]. Originally introduced in the online learning scenario.

- Online Learning Model
- Perceptron Algorithm
- Its guarantees under large margins

The Online Learning Model

- Examples arrive sequentially.
- We need to make a prediction; afterwards we observe the outcome.

For i = 1, 2, ...:

Phase i:
- The online algorithm receives example x_i.
- It makes a prediction h(x_i).
- It then observes the true label c*(x_i).

Mistake bound model

- Goal: minimize the number of mistakes.
- Analysis-wise, make no distributional assumptions; consider a worst-case sequence of examples.

The Online Learning Model. Motivation

- Email classification (the distribution of both spam and regular mail changes over time, but the target function stays fixed: last year's spam still looks like spam).

- Recommendation systems: recommending movies, etc.

- Predicting whether a user will be interested in a new news article or not.

- Ad placement in a new market.

Linear Separators

- Feature space X = R^d.

- Hypothesis class: linear decision surfaces in R^d.

- h(x) = w · x + w_0; if h(x) >= 0, then label x as +, otherwise label it as -.

[Figure: positive examples (marked X) and negative examples (marked O) separated by a linear separator with normal vector w.]

Trick: Without loss of generality w0 = 0.


Proof: One can simulate a non-zero threshold with a dummy input feature x_0 that is always set to 1.

Map x = (x_1, ..., x_d) to x' = (x_1, ..., x_d, 1). Then w · x + w_0 >= 0 iff (w_1, ..., w_d, w_0) · x' >= 0, where w = (w_1, ..., w_d).
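The trick above can be sketched in code (a minimal illustration; the names `augment`, `affine`, and `homogeneous` are mine, not from the lecture):

```python
# Simulating a non-zero threshold with a constant dummy feature.
# (Illustrative sketch; function names are not from the lecture.)

def augment(x):
    """Append the dummy feature (always 1) to example x."""
    return tuple(x) + (1.0,)

def affine(w, w0, x):
    """Original form: w . x + w0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

def homogeneous(w, w0, x):
    """Equivalent threshold-free form: (w1, ..., wd, w0) . (x1, ..., xd, 1)."""
    w_aug = tuple(w) + (w0,)
    return sum(wi * xi for wi, xi in zip(w_aug, augment(x)))
```

Both forms agree on every input, which is why we may take w_0 = 0 throughout.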

Linear Separators: Perceptron Algorithm

- Set t = 1, start with the all-zeros vector w_1.
- Given example x, predict positive iff w_t · x >= 0.
- On a mistake, update as follows:
  - Mistake on positive: w_{t+1} = w_t + x.
  - Mistake on negative: w_{t+1} = w_t - x.

Natural greedy procedure:

- If the true label of x is +1 and we are incorrect on x, we have w_t · x < 0, and w_{t+1} · x = (w_t + x) · x = w_t · x + ||x||^2, so there is more chance that w_{t+1} classifies x correctly.
- Similarly for mistakes on negative examples.
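The update rule above can be written as a short program (a sketch, assuming labels in {+1, -1} and examples given as tuples of floats; the function name `perceptron` is mine):

```python
# Online Perceptron: predict positive iff w_t . x >= 0; on a mistake
# on a positive example add x to w, on a negative example subtract x.
# Both cases collapse to w_{t+1} = w_t + y*x when y is +1/-1.

def perceptron(examples):
    """Run one pass over (x, y) pairs; return final weights and mistake count."""
    d = len(examples[0][0])
    w = [0.0] * d                              # w_1 = all-zeros vector
    mistakes = 0
    for x, y in examples:
        dot = sum(wi * xi for wi, xi in zip(w, x))
        prediction = 1 if dot >= 0 else -1
        if prediction != y:                    # mistake: update toward/away from x
            mistakes += 1
            w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, mistakes
```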


Note: w_t is a weighted sum of the incorrectly classified examples: w_t = a_{i_1} x_{i_1} + ... + a_{i_k} x_{i_k}, so w_t · x = a_{i_1} (x_{i_1} · x) + ... + a_{i_k} (x_{i_k} · x).

Important when we talk about kernels.
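This observation can be sketched as follows: since w_t is a sum of signed mistake examples, the prediction w_t · x can be computed from dot products x_i · x alone, which is what kernelization later exploits. (Sketch; `dual_predict` and its argument names are mine, and labels are assumed to be +1/-1.)

```python
# Dual-form prediction: store the examples we erred on instead of w itself.
# sign(w_t . x) = sign(sum_i y_i (x_i . x)) over past mistakes (x_i, y_i).
# (Illustrative sketch, assuming +1/-1 labels.)

def dual_predict(mistake_examples, x):
    """Predict the label of x using only dot products with past mistakes."""
    total = 0.0
    for xi, yi in mistake_examples:
        total += yi * sum(a * b for a, b in zip(xi, x))
    return 1 if total >= 0 else -1
```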

Perceptron Algorithm: Example

Example sequence (X marks the rounds on which the algorithm makes a mistake):

(-1, 2), label -   X
(1, 0), label +
(1, 1), label +   X
(-1, 0), label -
(-1, -2), label -   X
(1, -1), label +



w_1 = (0, 0)
w_2 = w_1 - (-1, 2) = (1, -2)
w_3 = w_2 + (1, 1) = (2, -1)
w_4 = w_3 - (-1, -2) = (3, 1)
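The trace above can be checked mechanically (a sketch; `perceptron_trace` is my own name, and labels are assumed to be +1/-1):

```python
# Record the weight vector after each mistake to reproduce the trace above.
def perceptron_trace(examples):
    w = (0.0, 0.0)                         # w_1 = (0, 0)
    history = [w]
    for x, y in examples:
        dot = w[0] * x[0] + w[1] * x[1]
        if (1 if dot >= 0 else -1) != y:   # mistake: w <- w + y*x
            w = (w[0] + y * x[0], w[1] + y * x[1])
            history.append(w)
    return history
```

On the sequence above this yields (0, 0), (1, -2), (2, -1), (3, 1), matching the trace.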
