
On Logistic Regression: Gradients of the Log Loss, Multi-Class Classification, and Other Optimization Techniques

Karl Stratos
June 20, 2018



Recall: Logistic Regression

Task. Given input $x \in \mathbb{R}^d$, predict either 1 or 0 (on or off).

Model. The probability of on is parameterized by $w \in \mathbb{R}^d$ as a dot product squashed under the sigmoid/logistic function $\sigma : \mathbb{R} \to [0, 1]$:

$$p(1 \mid x, w) := \sigma(w \cdot x) := \frac{1}{1 + \exp(-w \cdot x)}$$

The probability of off is

$$p(0 \mid x, w) = 1 - \sigma(w \cdot x) = \sigma(-w \cdot x)$$
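The last identity is worth verifying once; the following one-line derivation is added here for completeness (it is not on the original slide):

$$1 - \sigma(t) = 1 - \frac{1}{1 + e^{-t}} = \frac{e^{-t}}{1 + e^{-t}} = \frac{1}{e^{t} + 1} = \sigma(-t)$$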

Today's focus:
1. Optimizing the log loss by gradient descent
2. Multi-class classification to handle more than two classes
3. More on optimization: Newton, stochastic gradient descent
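The slides themselves contain no code. As a minimal sketch of the model above (the function names and the numerical-stability trick are illustrative assumptions, not from the source), computing $p(1 \mid x, w)$ with NumPy might look like:

import numpy as np

def sigmoid(z):
    # Numerically stable sigmoid: exp is only ever applied to -|z|,
    # so it cannot overflow for large |z|.
    e = np.exp(-np.abs(z))
    return np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

def predict_proba(w, x):
    # p(1 | x, w) = sigmoid(w . x)
    return sigmoid(np.dot(w, x))

w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 1.0, 1.0])
p_on = predict_proba(w, x)
print(p_on, 1.0 - p_on)        # p(1|x,w) and p(0|x,w)
print(sigmoid(-np.dot(w, x)))  # equals 1 - p_on, matching sigma(-w.x)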

