Foundations of Machine Learning
Learning with Finite Hypothesis Sets
Mehryar Mohri
Courant Institute and Google Research
mohri@cims.nyu.edu
Motivation
Some computational learning questions:
- What can be learned efficiently?
- What is inherently hard to learn?
- Is there a general model of learning?
Complexity:
- Computational complexity: time and space.
- Sample complexity: amount of training data needed to learn successfully.
- Mistake bounds: number of mistakes before learning successfully.
Mehryar Mohri - Foundations of Machine Learning
page 2
This lecture
- PAC model
- Sample complexity, finite H, consistent case
- Sample complexity, finite H, inconsistent case
Definitions and Notation
X: the set of all possible instances or examples, e.g., the set of all men and women characterized by their height and weight.
c: X → {0, 1}: the target concept to learn; it can be identified with its support {x ∈ X : c(x) = 1}.
C: the concept class, a set of target concepts c.
D: the target distribution, a fixed probability distribution over X. Training and test examples are drawn according to D.
Definitions and Notation
S: the training sample.
H: the set of concept hypotheses, e.g., the set of all linear classifiers.
The learning algorithm receives the sample S and selects a hypothesis h_S from H approximating c.
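The notation above can be made concrete with a toy sketch. This is an assumed example, not from the slides: instances are heights in cm, the target concept labels an instance 1 iff height ≥ 170, and H is the (hypothetical) set of threshold classifiers h_t(x) = 1 iff x ≥ t.

```python
import random

def c(x):
    """Target concept: support is {x : x >= 170} (assumed example)."""
    return 1 if x >= 170 else 0

def make_h(t):
    """A hypothesis from H: the threshold classifier at t."""
    return lambda x: 1 if x >= t else 0

# Draw a training sample S i.i.d. from a target distribution D
# (here uniform on [140, 200] cm, an arbitrary choice).
random.seed(0)
S = [(x, c(x)) for x in (random.uniform(140, 200) for _ in range(20))]

# A simple learning algorithm: set the threshold at the smallest
# positively labeled point.  Every negative point lies below 170,
# hence below that threshold, so h_S is consistent with S.
positives = [x for x, y in S if y == 1]
h_S = make_h(min(positives))
print(all(h_S(x) == y for x, y in S))  # consistent on S
```

The learner here is deliberately naive; the slides make no claim about which algorithm selects h_S, only that it maps S to some hypothesis in H.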
Errors
True error or generalization error of h with respect to the target concept c and distribution D:
$$R(h) = \Pr_{x \sim D}[h(x) \neq c(x)] = \mathop{\mathbb{E}}_{x \sim D}\big[1_{h(x) \neq c(x)}\big].$$
Empirical error: the average error of h on the training sample S drawn according to distribution D:
$$\widehat{R}_S(h) = \Pr_{x \sim \widehat{D}}[h(x) \neq c(x)] = \mathop{\mathbb{E}}_{x \sim \widehat{D}}\big[1_{h(x) \neq c(x)}\big] = \frac{1}{m} \sum_{i=1}^{m} 1_{h(x_i) \neq c(x_i)}.$$
Note: $R(h) = \mathop{\mathbb{E}}_{S \sim D^m}\big[\widehat{R}_S(h)\big]$.
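A minimal sketch of the two definitions, under an assumed toy setup: target c(x) = 1 iff x ≥ 0.5, hypothesis h(x) = 1 iff x ≥ 0.6, and D uniform on [0, 1]. Then the true error R(h) is exactly 0.1, the probability mass of the disagreement region [0.5, 0.6), while the empirical error is the average 0/1 loss on the sample.

```python
import random

c = lambda x: x >= 0.5   # target concept (assumed example)
h = lambda x: x >= 0.6   # hypothesis under evaluation

random.seed(1)
m = 100_000
S = [random.random() for _ in range(m)]  # sample from D = uniform[0, 1]

# Empirical error: (1/m) * sum_i 1_{h(x_i) != c(x_i)}.
R_S = sum(h(x) != c(x) for x in S) / m
print(round(R_S, 3))  # concentrates near the true error R(h) = 0.1
```

For a fixed h, the empirical error is an unbiased estimate of R(h), which is exactly the Note above; the next slides quantify how fast it concentrates.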
PAC Model
(Valiant, 1984)
PAC learning: Probably Approximately Correct learning.
Definition: a concept class C is PAC-learnable if there exists a learning algorithm L such that:
- for all c ∈ C, ε > 0, δ > 0, and all distributions D,
$$\Pr_{S \sim D^m}\big[R(h_S) \leq \epsilon\big] \geq 1 - \delta,$$
- for samples S of size m = poly(1/ε, 1/δ) for a fixed polynomial.
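The guarantee can be illustrated empirically with a hedged simulation, again using the assumed threshold setup c(x) = 1 iff x ≥ 0.5 on D = uniform[0, 1] (an example, not from the slides). The learner places its threshold at the smallest positive example; over many independent samples of size m, the fraction of runs with R(h_S) ≤ ε should be at least 1 − δ once m is large enough.

```python
import random

def trial(m):
    """One draw of S and the resulting true error of h_S."""
    S = [(x, x >= 0.5) for x in (random.random() for _ in range(m))]
    pos = [x for x, y in S if y]
    t = min(pos) if pos else 1.0  # no positives seen: predict all 0
    # h_S errs exactly on [0.5, t), so under uniform D its true
    # error is t - 0.5 (computable here because D is known).
    return t - 0.5

random.seed(2)
eps, delta, m = 0.1, 0.05, 50
errors = [trial(m) for _ in range(1000)]
frac_good = sum(e <= eps for e in errors) / 1000
print(frac_good >= 1 - delta)
```

For this learner, Pr[R(h_S) > ε] = (1 − ε)^m, which is below δ whenever m ≥ (1/ε) ln(1/δ); with ε = 0.1 and δ = 0.05 that gives m ≥ 30, so m = 50 comfortably satisfies the PAC condition here.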
Remarks
- The concept class C is known to the algorithm.
- Distribution-free model: no assumption on D.
- Both training and test examples are drawn according to D.
- Probably: confidence 1 − δ.
- Approximately correct: accuracy 1 − ε.
- Efficient PAC-learning: L runs in time poly(1/ε, 1/δ).
- What about the cost of the representation of c ∈ C?