
Foundations of Machine Learning

Learning with Finite Hypothesis Sets

Mehryar Mohri Courant Institute and Google Research

mohri@cims.nyu.edu

Motivation

Some computational learning questions

- What can be learned efficiently?
- What is inherently hard to learn?
- A general model of learning?

Complexity

- Computational complexity: time and space.
- Sample complexity: amount of training data needed to learn successfully.
- Mistake bounds: number of mistakes before learning successfully.

Mehryar Mohri - Foundations of Machine Learning

page 2


This lecture

- PAC Model
- Sample complexity, finite H, consistent case
- Sample complexity, finite H, inconsistent case


Definitions and Notation

X : set of all possible instances or examples, e.g., the set of all men and women characterized by their height and weight.

c: X → {0, 1}: the target concept to learn; can be identified with its support {x ∈ X : c(x) = 1}.

C: concept class, a set of target concepts c.

D: target distribution, a fixed probability distribution over X. Training and test examples are drawn according to D.


Definitions and Notation

S : training sample.

H: set of concept hypotheses, e.g., the set of all linear classifiers.

The learning algorithm receives sample S and selects a hypothesis h_S from H approximating c.
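As a minimal sketch of these objects (hypothetical names and values, not from the lecture): a finite hypothesis set H of threshold classifiers over a small instance space X, a target concept c ∈ H, and a learner that returns some h_S ∈ H consistent with the labeled sample S.

```python
import random

# Instance space X: integers 0..9 (a toy stand-in for height/weight features).
X = list(range(10))

# Finite hypothesis set H: threshold concepts h_t(x) = 1 iff x >= t.
def make_threshold(t):
    return lambda x, t=t: 1 if x >= t else 0

H = {t: make_threshold(t) for t in range(11)}  # |H| = 11

# Target concept c (here taken from H): support is {x in X : c(x) = 1}.
c = H[6]

def learn(S):
    """Return a hypothesis h_S in H consistent with sample S, if one exists."""
    for t, h in H.items():
        if all(h(x) == y for x, y in S):
            return h
    return None

random.seed(0)
# Training sample S drawn i.i.d. from the (here uniform) target distribution D.
S = [(x, c(x)) for x in random.choices(X, k=20)]
h_S = learn(S)
assert h_S is not None and all(h_S(x) == y for x, y in S)
```

Because c itself belongs to H, a consistent h_S always exists here; how close h_S is to c on unseen points is exactly what the error definitions below measure.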


Errors

True error or generalization error of h with respect to the target concept c and distribution D:

R(h) = Pr_{x∼D}[h(x) ≠ c(x)] = E_{x∼D}[1_{h(x)≠c(x)}].

Empirical error: average error of h on the training sample S drawn according to distribution D,

R̂_S(h) = Pr_{x∼D̂}[h(x) ≠ c(x)] = E_{x∼D̂}[1_{h(x)≠c(x)}] = (1/m) Σ_{i=1}^{m} 1_{h(x_i)≠c(x_i)}.

Note: R(h) = E_{S∼D^m}[R̂_S(h)].
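A quick numeric sketch of the two definitions (made-up toy setup): with D uniform over ten points, the true error R(h) is an exact probability over X, while R̂_S(h) is the sample average of the disagreement indicator; averaging R̂_S(h) over many samples approaches R(h), as the Note states.

```python
import random

X = list(range(10))                  # instance space, D uniform over X
c = lambda x: 1 if x >= 6 else 0     # target concept
h = lambda x: 1 if x >= 5 else 0     # a hypothesis; disagrees with c only at x = 5

# True error R(h) = Pr_{x~D}[h(x) != c(x)]: exact under uniform D.
R = sum(h(x) != c(x) for x in X) / len(X)
assert R == 0.1

def empirical_error(h, S):
    # R_hat_S(h) = (1/m) * sum_i 1_{h(x_i) != c(x_i)}
    return sum(h(x) != y for x, y in S) / len(S)

random.seed(0)
m = 50
# E_{S~D^m}[R_hat_S(h)] ~= R(h): average the empirical error over many samples.
avg = sum(
    empirical_error(h, [(x, c(x)) for x in random.choices(X, k=m)])
    for _ in range(2000)
) / 2000
assert abs(avg - R) < 0.01
```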


PAC Model

(Valiant, 1984)

PAC learning: Probably Approximately Correct learning.

Definition: concept class C is PAC-learnable if there exists a learning algorithm L such that:

- for all c ∈ C, ε > 0, δ > 0, and all distributions D,

  Pr_{S∼D^m}[R(h_S) ≤ ε] ≥ 1 − δ,

- for samples S of size m = poly(1/ε, 1/δ) for a fixed polynomial.
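The definition can be probed empirically (a hedged sketch with made-up numbers, not the lecture's analysis): for a finite threshold class and a consistent learner, draw many samples S of size m and count how often R(h_S) ≤ ε; the fraction estimates Pr_{S∼D^m}[R(h_S) ≤ ε], which the definition requires to be at least 1 − δ.

```python
import random

X = list(range(100))
c = lambda x: 1 if x >= 60 else 0                            # target concept
H = [lambda x, t=t: 1 if x >= t else 0 for t in range(101)]  # finite threshold class

def true_error(h):
    # R(h) under the uniform distribution D over X.
    return sum(h(x) != c(x) for x in X) / len(X)

def learn(S):
    # A consistent learner: return the first hypothesis agreeing with S.
    return next(h for h in H if all(h(x) == y for x, y in S))

random.seed(1)
eps, m, trials = 0.1, 60, 500
good = 0
for _ in range(trials):
    S = [(x, c(x)) for x in random.choices(X, k=m)]
    if true_error(learn(S)) <= eps:
        good += 1
confidence = good / trials   # estimates Pr_{S~D^m}[R(h_S) <= eps]
assert confidence >= 0.9     # with m = 60, the event holds in nearly every trial
```

Increasing m drives the estimated confidence toward 1 for any fixed ε, which is the "sample complexity, finite H, consistent case" question previewed in the lecture outline.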


Remarks

- Concept class C is known to the algorithm.
- Distribution-free model: no assumption on D.
- Both training and test examples are drawn according to D.
- Probably: confidence 1 − δ.
- Approximately correct: accuracy 1 − ε.
- Efficient PAC-learning: L runs in time poly(1/ε, 1/δ).
- What about the cost of the representation of c ∈ C?

