Personalized Dose Finding Using Outcome Weighted Learning
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, Theory and Methods, pp. 1538–1540
Comment
Min Qian
Department of Biostatistics, Columbia University, New York, NY, USA
ABSTRACT
This comment discusses issues related to the article by Chen, Zeng, and Kosorok. We present several potential modifications of their outcome weighted learning approach, all based on a truncated $\ell_2$ loss. One advantage of the $\ell_2$ loss is that it is differentiable everywhere, which makes the resulting procedure more stable and computationally more tractable.
KEYWORDS Double robustness; Epanechnikov kernel; Personalized treatment
1. Introduction
We congratulate Chen, Zeng, and Kosorok (hereafter, CZK) for a stimulating and interesting article on the important topic of personalized dose finding. We found the article enjoyable to read, and we thank the editors for the opportunity to discuss it. Personalized medicine is an emerging area in medical research that holds great potential to improve the quality of patient care. With recent advances in biomedical science, massive amounts of data have been produced on individual patients, and using such high-dimensional data to design personalized treatments is the key to success. Several methods have been proposed to handle high-dimensional data when the number of treatment options is limited (e.g., Qian and Murphy 2011; Zhao et al. 2012; Lu, Zhang, and Zeng 2013). As discussed by CZK, however, these methods cannot be applied directly when the number of treatment options is infinite, as in the dose finding problem. CZK developed a novel outcome weighted learning method for personalized dose finding: they substituted a truncated $\ell_1$ loss for the weighted indicator loss in the original optimization problem, used an $\ell_2$ penalty to address overfitting, and provided an efficient optimization algorithm to facilitate computation. In our discussion, we present several modifications of the outcome weighted learning approach, all based on a truncated $\ell_2$ loss. One advantage of the $\ell_2$ loss is that it is differentiable everywhere, which makes the resulting procedure more stable and computationally more tractable. The proposed modifications are intended to demonstrate the potential of the machine learning framework proposed by CZK.
2. Preliminaries
We adopt the same notation as in CZK. Assume we have $n$ iid trajectories of $(X, A, R)$, where $X = (X_1, \ldots, X_d)^T \in \mathcal{X}$ denotes patient-level covariates, $A$ is the assigned treatment dose taking values in a bounded interval $\mathcal{A}$, and $R$ is a scalar "reward," with large values representing better outcomes. For any individualized dose rule (IDR) $f : \mathcal{X} \to \mathcal{A}$, the value of $f$, $V(f)$, is defined as the expected reward if $f$ is implemented in the study population. The optimal IDR, $f^{\mathrm{opt}}$, is the dose rule that yields the maximal expected reward, that is, $f^{\mathrm{opt}} = \arg\max_f V(f)$.

Denote $Q(x, a) \triangleq E(R \mid X = x, A = a)$, and let $p(a \mid X)$ be the randomization probability of $A = a$ given $X$. CZK showed that $V(f) = E_X[Q(X, f(X))] = \lim_{\phi \to 0^+} V_\phi(f)$, where
\[
V_\phi(f) \triangleq E\left[ \frac{R}{p(A \mid X)} \, \frac{1}{2\phi} \, I\big( |A - f(X)| \le \phi \big) \right]. \tag{1}
\]
Note that maximizing $V_\phi(f)$ is computationally intractable due to the discontinuity of the 0–1 loss. To address this difficulty, CZK proposed to use a surrogate truncated $\ell_1$ loss, yielding the approximated value function
\[
\tilde V_\phi(f) \triangleq E\left[ \frac{R}{p(A \mid X)} \, \frac{1}{\phi} \max\left( 1 - \frac{|A - f(X)|}{\phi}, \, 0 \right) \right]. \tag{2}
\]
Their Theorem 1 showed that $|\tilde V_\phi(f) - V(f)| \le C\phi$ under mild conditions. Below we extend this result to a general class of loss functions. For any measurable function $g : \mathbb{R} \to [0, \infty)$, IDR $f : \mathcal{X} \to \mathcal{A}$, and $\phi > 0$, denote
\[
V_{g,\phi}(f) \triangleq E\left[ \frac{R}{p(A \mid X)} \, \frac{1}{\phi} \, g\!\left( \frac{A - f(X)}{\phi} \right) \right].
\]
We have the following theorem.
Theorem 1. Suppose
\[
E_X\left[ \sup_{a, a' \in \mathcal{A}, \, a \ne a'} \frac{|Q(X, a) - Q(X, a')|}{|a - a'|} \right] = O(1). \tag{3}
\]
Assume $g : \mathbb{R} \to [0, \infty)$ satisfies $\int g(z)\,dz = 1$ and $\int |z| g(z)\,dz = O(1)$. Then for any individualized dose rule $f : \mathcal{X} \to \mathcal{A}$ and $\phi > 0$, there exists a constant $C > 0$ such that $|V_{g,\phi}(f) - V(f)| \le C\phi$.
CONTACT Min Qian mq@columbia.edu, Department of Biostatistics, Columbia University, New York, NY. © American Statistical Association
Proof. First note that
\[
V_{g,\phi}(f) = E\left[ \frac{R}{p(A \mid X)} \, \frac{1}{\phi} \, g\!\left( \frac{A - f(X)}{\phi} \right) \right]
= E_X\left[ \frac{1}{\phi} \int Q(X, a) \, g\!\left( \frac{a - f(X)}{\phi} \right) da \right]
= E_X\left[ \int Q(X, \phi z + f(X)) \, g(z)\,dz \right].
\]
Since $V(f) = E_X[Q(X, f(X))]$, under the conditions that $\int g(z)\,dz = 1$ and $g(z) \ge 0$ for all $z \in \mathbb{R}$, we have
\[
|V_{g,\phi}(f) - V(f)| = \left| E_X \int \big[ Q(X, \phi z + f(X)) - Q(X, f(X)) \big] g(z)\,dz \right|
\le E_X \int \big| Q(X, \phi z + f(X)) - Q(X, f(X)) \big| \, g(z)\,dz
\le \phi \, E_X\left[ \sup_{a, a' \in \mathcal{A}, \, a \ne a'} \frac{|Q(X, a) - Q(X, a')|}{|a - a'|} \right] \int |z| g(z)\,dz \le C\phi,
\]
where the last inequality follows from condition (3) and $\int |z| g(z)\,dz = O(1)$.
Remarks. 1. Condition (3) is a mild Lipschitz-type condition. It is easy to verify that it holds in the simulation scenarios presented in Section 5. 2. Any density function $g(\cdot)$ of a square integrable random variable satisfies $\int g(z)\,dz = 1$ and $\int |z| g(z)\,dz = O(1)$. As $\phi \to 0^+$, $|V_{g,\phi}(f) - V(f)| \to 0$. To ensure a good approximation of the original indicator function in finite samples, it is natural to consider densities that are symmetric around 0. In other words, $g(\cdot)$ can be viewed as a kernel function. Indeed, the indicator loss in (1) uses the uniform kernel, and the truncated $\ell_1$ loss in (2) corresponds to the triangular kernel.
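To illustrate the Remarks numerically, the following sketch (a toy example of our own construction, not taken from CZK) checks that the uniform, triangular, and Epanechnikov kernels all yield $V_{g,\phi}(f)$ close to $V(f)$ for small $\phi$, in a model with $Q(x, a) = 1 - (a - x)^2$, $X \sim U(0, 1)$, and $A \sim U(0, 2)$ so that $p(a \mid x) = 1/2$:

```python
import numpy as np

rng = np.random.default_rng(0)

# The three kernels discussed in the Remarks; each integrates to 1
# and has a finite first absolute moment, as Theorem 1 requires.
kernels = {
    "uniform":      lambda z: 0.5 * (np.abs(z) <= 1),
    "triangular":   lambda z: np.maximum(1.0 - np.abs(z), 0.0),
    "epanechnikov": lambda z: 0.75 * np.maximum(1.0 - z**2, 0.0),
}

# Toy generative model (our own, for illustration): X ~ U(0,1),
# A ~ U(0,2) so p(a|x) = 1/2, and E[R | X=x, A=a] = Q(x,a) = 1 - (a-x)^2.
n = 1_000_000
X = rng.uniform(0.0, 1.0, n)
A = rng.uniform(0.0, 2.0, n)
R = 1.0 - (A - X) ** 2 + rng.normal(0.0, 0.1, n)
f = X                      # candidate IDR f(x) = x, which is optimal here
V_true = 1.0               # V(f) = E[Q(X, f(X))] = 1

errs = {}
for name, g in kernels.items():
    for phi in (0.4, 0.1):
        # Monte Carlo estimate of V_{g,phi}(f) = E[ R/p(A|X) * g((A - f(X))/phi) / phi ]
        V_g_phi = np.mean(R / 0.5 * g((A - f) / phi) / phi)
        errs[(name, phi)] = abs(V_g_phi - V_true)
        print(f"{name:13s} phi={phi:.1f}  |V_g,phi(f) - V(f)| = {errs[(name, phi)]:.4f}")
```

The bound of Theorem 1 is conservative in this particular toy model: because $f$ is the optimal rule and the kernels are symmetric, the leading $O(\phi)$ term vanishes and the discrepancy shrinks at rate $\phi^2$.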
3. Learning IDR with Truncated $\ell_2$ Loss

In this section, we consider the truncated $\ell_2$ loss, which corresponds to the Epanechnikov kernel, that is, $g(u) = \frac{3}{4} \max(1 - u^2, 0)$. As with the truncated $\ell_1$ loss, the optimization problem can be solved using the DC (difference of convex functions) algorithm. In addition, since the loss is differentiable everywhere in the compact support, an explicit parameter updating formula can be derived. Denote
\[
\breve V_\phi(f) \triangleq E\left[ \frac{3R}{4\phi \, p(A \mid X)} \max\left( 1 - \frac{[A - f(X)]^2}{\phi^2}, \, 0 \right) \right]. \tag{4}
\]
Note that choosing $f$ to maximize $\breve V_\phi(f)$ is equivalent to minimizing
\[
\breve R_\phi(f) = E\left[ \frac{3R}{4\phi \, p(A \mid X)} \min\left( \frac{[A - f(X)]^2}{\phi^2}, \, 1 \right) \right].
\]
For the reason discussed in CZK, we assume $R \ge 0$ without loss of generality. Consider IDRs of the form $f(x; \beta) = \Phi(x)^T \beta$, where $\Phi(x)$ is a vector of basis functions of $x$. For example, $\Phi(x) = (1, x^T)^T$ represents a linear model of $x$. Denote $W_i = R_i / p(A_i \mid X_i)$, $i = 1, \ldots, n$. The parameter $\beta$ can be estimated by minimizing
\[
\hat R(\beta) = \frac{1}{n} \sum_{i=1}^n W_i \min\big( [A_i - \Phi(X_i)^T \beta]^2, \, \phi_n^2 \big) + \lambda_n \|\beta\|^2,
\]
where $\phi_n > 0$ and $\lambda_n \ge 0$ are tuning parameters: $\phi_n$ measures the closeness of the surrogate loss to the original indicator loss, and $\lambda_n$ controls the model complexity. It is easy to see that $\hat R(\beta)$ can be written as the difference of two convex functions, $\hat R(\beta) = \hat R_1(\beta) - \hat R_2(\beta)$, where
\[
\hat R_1(\beta) = \frac{1}{n} \sum_{i=1}^n W_i [A_i - \Phi(X_i)^T \beta]^2 + \lambda_n \|\beta\|^2
\quad \text{and} \quad
\hat R_2(\beta) = \frac{1}{n} \sum_{i=1}^n W_i \big( [A_i - \Phi(X_i)^T \beta]^2 - \phi_n^2 \big)_+.
\]
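The decomposition rests on the identity $\min(u, t) = u - (u - t)_+$, which can be verified numerically (a toy check with our own randomly generated numbers):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 3
X = rng.uniform(0.0, 1.0, (n, d))
Phi = np.hstack([np.ones((n, 1)), X])        # basis Phi(x) = (1, x^T)^T
A = rng.uniform(0.0, 2.0, n)
W = rng.uniform(0.5, 2.0, n)                 # stand-in nonnegative weights W_i
beta = rng.normal(0.0, 1.0, d + 1)
phi_n, lam_n = 0.5, 0.1

resid2 = (A - Phi @ beta) ** 2
ridge = lam_n * beta @ beta
R_hat = np.mean(W * np.minimum(resid2, phi_n**2)) + ridge   # truncated l2 objective
R1 = np.mean(W * resid2) + ridge                            # convex part
R2 = np.mean(W * np.maximum(resid2 - phi_n**2, 0.0))        # convex part subtracted off
gap = abs(R_hat - (R1 - R2))
print(gap)
```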
Using the DC algorithm, we estimate $\beta$ by first initializing $\beta^{(0)}$, then repeatedly updating via
\[
\beta^{(t+1)} = \arg\min_\beta \left\{ \hat R_1(\beta) - [\nabla \hat R_2(\beta^{(t)})]^T (\beta - \beta^{(t)}) \right\} \tag{5}
\]
until convergence, where $\nabla \hat R_2(\beta)$ is a subgradient of $\hat R_2(\beta)$.
Define the index set $\mathcal{S}^{(t)} = \{ i = 1, \ldots, n : |A_i - \Phi(X_i)^T \beta^{(t)}| \le \phi_n \}$. After algebraic simplification, (5) is equivalent to
\[
\begin{aligned}
\beta^{(t+1)} &= \arg\min_\beta \left\{ \sum_{i \in \mathcal{S}^{(t)}} W_i [A_i - \Phi(X_i)^T \beta]^2 + \sum_{i \in \{1, \ldots, n\} \setminus \mathcal{S}^{(t)}} W_i [\Phi(X_i)^T (\beta - \beta^{(t)})]^2 + n \lambda_n \|\beta\|^2 \right\} \\
&= \left( n \lambda_n I + \sum_{i=1}^n W_i \Phi(X_i) \Phi(X_i)^T \right)^{-1} \left( \sum_{i \in \mathcal{S}^{(t)}} W_i A_i \Phi(X_i) + \sum_{i \in \{1, \ldots, n\} \setminus \mathcal{S}^{(t)}} W_i \Phi(X_i) \Phi(X_i)^T \beta^{(t)} \right). \tag{6}
\end{aligned}
\]
4. A Doubly Robust Estimate
The above procedure assumes that the treatment assignment distribution $p(a \mid X)$ is known or can be estimated consistently. In the case of finitely many treatment options, an augmented inverse probability weighted estimator of $V(f)$ has been provided by Zhang et al. (2012). This estimator offers protection against misspecification of $p(a \mid X)$: it is doubly robust in the sense that the resulting estimate is consistent as long as either $p(a \mid X)$ or $Q(x, a)$ is correctly specified. Below we present a doubly robust estimate of $V(f)$ in the dose finding setting. Since $E[R - Q(X, A) \mid X, A] = 0$, the value of an IDR $f$ can be written as
\[
V(f) = E[Q(X, f(X))] + E\left[ \frac{R - Q(X, A)}{\phi \, p(A \mid X)} \, g_E\!\left( \frac{A - f(X)}{\phi} \right) \right]
\]
for any $\phi > 0$, where $g_E(z)$ denotes the Epanechnikov kernel. Note that the IDR that maximizes $V(f)$ does not change if $R$ is replaced by $R + c$ for any constant $c$ in the above display. For any $f : \mathcal{X} \to \mathcal{A}$, $\tilde Q : \mathcal{X} \times \mathcal{A} \to \mathbb{R}$, $\tilde p : \mathcal{X} \times \mathcal{A} \to \mathbb{R}^+$, and $\phi > 0$, define
\[
V_\phi^D(f; \tilde Q, \tilde p) \triangleq E[\tilde Q(X, f(X))] + E\left[ \frac{R - \tilde Q(X, A)}{\phi \, \tilde p(X, A)} \, g_E\!\left( \frac{A - f(X)}{\phi} \right) \right].
\]
Below we show that $V_\phi^D(f; \tilde Q, \tilde p)$ is a good approximation of $V(f)$ when $\tilde Q(x, a) = Q(x, a)$ or $\tilde p(x, a) = p(a \mid x)$.
Theorem 2. Suppose $\tilde Q : \mathcal{X} \times \mathcal{A} \to \mathbb{R}$ satisfies
\[
E_X\left[ \sup_{a, a' \in \mathcal{A}, \, a \ne a'} \frac{|\tilde Q(X, a) - \tilde Q(X, a')|}{|a - a'|} \right] = O(1). \tag{7}
\]
For any IDR $f : \mathcal{X} \to \mathcal{A}$ and $\phi > 0$, we have (i) $V_\phi^D(f; Q, \tilde p) = V(f)$ for any $\tilde p : \mathcal{X} \times \mathcal{A} \to \mathbb{R}^+$; and (ii) there exists a positive constant $C$ such that $|V_\phi^D(f; \tilde Q, p) - V(f)| \le C\phi$.

Proof. Part (i) follows from the fact that $E[R - Q(X, A) \mid X, A] = 0$. For (ii), note that $V_\phi^D(f; \tilde Q, p)$ can be decomposed as
\[
E\left[ \tilde Q(X, f(X)) \left\{ 1 - \frac{g_E\big( (A - f(X))/\phi \big)}{\phi \, p(A \mid X)} \right\} \right]
+ E\left[ \frac{\tilde Q(X, f(X)) - \tilde Q(X, A)}{\phi \, p(A \mid X)} \, g_E\!\left( \frac{A - f(X)}{\phi} \right) \right]
+ \breve V_\phi(f).
\]
For $\phi$ small enough that $[f(X) - \phi, f(X) + \phi] \subseteq \mathcal{A}$, the first term vanishes, since $E_A\big[ g_E\big( (A - f(X))/\phi \big) / (\phi \, p(A \mid X)) \mid X \big] = \int g_E(z)\,dz = 1$. By condition (7) and the same change of variables as in the proof of Theorem 1, the second term is bounded in absolute value by $C\phi$; and $|\breve V_\phi(f) - V(f)| \le C\phi$ follows from Theorem 1 applied with the Epanechnikov kernel. Combining these bounds proves (ii).

The above theorem suggests that as long as $Q(x, a)$ or $p(a \mid x)$ is consistently estimated, maximizing an empirical version of $V_\phi^D$ will give us a high-quality IDR.

Again consider IDRs of the form $f(x; \beta) = \Phi(x)^T \beta$. We propose to estimate $\beta$ by minimizing
\[
\hat R_D(\beta) = -\frac{4\phi_n^3}{3n} \sum_{i=1}^n \tilde Q\big( X_i, \Phi(X_i)^T \beta \big) + \frac{1}{n} \sum_{i=1}^n \tilde W_i \min\big( [A_i - \Phi(X_i)^T \beta]^2, \, \phi_n^2 \big) + \lambda_n \|\beta\|^2,
\]
where $\tilde Q(x, a)$ and $\hat p(x, a)$ are estimates of $Q(x, a)$ and $p(a \mid x)$, respectively, $\tilde W_i = [R_i - \tilde Q(X_i, A_i) + c] / \hat p(X_i, A_i)$, and $c$ is a constant chosen so that $\tilde W_i \ge 0$ for $i = 1, \ldots, n$; $\phi_n$ and $\lambda_n$ are tuning parameters. To make the optimization problem computationally tractable, we only consider $\tilde Q(x, a)$ that is differentiable and either convex or concave in $a$ (e.g., linear or quadratic in $a$). The problem can then be solved using the DC algorithm as discussed in Section 3.

5. Numerical Studies

In this section, we conduct simulation studies to evaluate the performance of the methods proposed in the previous sections. In the simulations below, the tuning parameter $\phi_n$ is fixed and $\lambda_n$ is selected using cross-validation. We consider four examples. Scenarios 1 and 2 are the same as those presented in CZK, where the treatment assignment distribution $A \sim U[0, 2]$ is known. Scenarios 3 and 4 are the same as scenarios 1 and 2, respectively, except that $p(a \mid x)$ is a truncated normal distribution on $(0, 2)$ with mean $(-0.5 + 0.5 X_1 + 0.5 X_2) 1\{X_3$ ...
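As a complement, the doubly robust objective of Section 4 can be evaluated as in the following sketch (our own variable names and toy conventions; $\tilde Q$ and $\hat p$ are user-supplied estimates, with the Epanechnikov normalization $4\phi_n^3/3$ folded into the $\tilde Q$ term):

```python
import numpy as np

def r_hat_D(beta, X, A, R, q_tilde, p_hat, phi_n=0.5, lam_n=0.01, c=1.0):
    """Sketch of the doubly robust surrogate objective of Section 4.

    q_tilde(X, a): estimate of Q(x, a), vectorized over rows of X;
    p_hat(X, a):   estimate of p(a|x), vectorized likewise;
    c:             constant making the weights W~_i nonnegative.
    """
    n = len(A)
    Phi = np.hstack([np.ones((n, 1)), X])       # linear basis Phi(x) = (1, x^T)^T
    f = Phi @ beta                              # f(X_i; beta) = Phi(X_i)^T beta
    W = (R - q_tilde(X, A) + c) / p_hat(X, A)   # W~_i = [R_i - Q~(X_i, A_i) + c] / p^(X_i, A_i)
    assert np.all(W >= 0), "increase c so that all weights are nonnegative"
    return (-(4.0 * phi_n**3 / 3.0) * np.mean(q_tilde(X, f))
            + np.mean(W * np.minimum((A - f) ** 2, phi_n**2))
            + lam_n * beta @ beta)
```

Under a correctly specified $\tilde Q$, rules closer to the optimal one attain smaller objective values; a DC iteration as in Section 3 would then minimize $\hat R_D(\beta)$ over $\beta$.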