Takeshi Emura (NCU) Joint work with Dr. Yi- Hau Chen and Dr. Hsuan -Yu Chen ( Sinica )

24
An extension of the compound covariate prediction under the Cox proportional hazard models Emura , Chen & Chen [ 2012, PLoS ONE 7(10) ] Takeshi Emura (NCU) Joint work with Dr. Yi-Hau Chen and Dr. Hsuan-Yu Chen (Sinica) 國國國國國國 國國國國國 1 2013/5/17

description

國立東華大學 應用數學系. An extension of the compound covariate prediction under the Cox proportional hazard models Emura , Chen & Chen [ 2012, PLoS ONE 7(10) ] . Takeshi Emura (NCU) Joint work with Dr. Yi- Hau Chen and Dr. Hsuan -Yu Chen ( Sinica ). Outline: - PowerPoint PPT Presentation

Transcript of Takeshi Emura (NCU) Joint work with Dr. Yi- Hau Chen and Dr. Hsuan -Yu Chen ( Sinica )

Page 1: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

1

An extension of the compound covariate prediction under the Cox proportional hazard models

Emura , Chen & Chen [ 2012, PLoS ONE 7(10) ]

Takeshi Emura (NCU)Joint work with Dr. Yi-Hau Chen and Dr. Hsuan-Yu Chen (Sinica)

國立東華大學 應用數學系

2013/5/17

Page 2: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

2

Outline:1) Review: Survival data with high-dimensional covariates2) Review: Ridge regression3) Review: Compound covariate (CC) method4) Some theory on CC method5) Extension of the CC method Propose a new method6) Data example

Page 3: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

3

Follow-up Period

Die) 0 ( jjt

) 1 ( jjt

censoring if death if

01

censoringor death tomeeither ti :

}...,,1);,,({:data Survival

i

i

iii

t

nit

x

npxx ipii possibly ,)...,,( 1x

Die

) Covariate Gene (

Page 4: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

4

Lung cancer data from Chen et al. 2007 NEJMPatient ID Survival_Status100 alive , 47 months (censored)101 alive, 49 months (censored)102 death, 20 months109 death, 26110 alive, 39 (censored)113 alive, 35 (censored)115 alive, 45 (censored)116 death, 9128 death, 21.....365 alive, 5 months (censored)

0 ,47 iit

1 ,20 iit

n=125 samples

Page 5: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

22 slides 5

1st Patient (ID = 100)

)...,,( :Gene 6721 iii xxx

P=672 covariates >> n = 125

Page 6: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

6

• Genetic information is useful in survival prediction:Breast cancer:(Jenssen et al., 2002; van de Vijver et al., 2002; van’t Veer et al., 2002; Zhao et al., 2011)

Lung cancer:(Beer et al., 2002; Chen et al., 2007; Shedden et al., 2008)

• Demand for statistical prediction methods adapted to high-dimensionality, especially on recent 10 years

Page 7: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

7

Hazard function:

• Cox proportional hazard model (Cox 1972)

•Partial likelihood estimator:

If , then If p > n, is not unique (infinitely many maxima)

pii thth Rβxβx ),exp()()|( 0

dtttdttttth iiii /),|Pr()|( xx

n

i tt l

in

p

i

il

L1

1

)exp()exp()( maximize:ˆ

xβxββRβ

pRβˆ0

ˆ ββ Pn

Page 8: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

8

0

Idea of Ridge estimator(Hoerl & Kennard 1970)

Infinitely many solution in OLS

Unique nearest point to 0

Bayesian Interpretation :Hastile, Tibshirani, Friedman (2009)The Element of Statistical Learning

2argminˆ ββ Xy

Page 9: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

9

• Penalized partial likelihood (Verweij & Houwelingen 1994)

)estimator Ridge(

||||)exp(

)exp()(:)(ˆ 2

1

βxβ

xβββ

n

i Rl l

iRidge

i

i

L

)(log)(:)0(ˆ 1 0ββ

βUβ

nL

)(ˆ 0β

Ridge estimator(shrink toward 0)

)(ˆ β

Page 10: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

10

• Prediction by the Ridge estimator

valueoff-cut theis where) prognosisPoor -- hazardHigh ( )(ˆ ) prognosis Good -- hazard Low ( )(ˆ

patient future afor Covariates :

cc

c

x

Page 11: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

11

• Ridge regression (Cox-regression with L2 penalty)

Verve & van Howelingen(1994 Stat. Med.), Zhao et al. (2011 PLoS ONE)

• Lasso (Cox-regression with L1 penalty)

Gui & Li (2005 Bioinformatics), Segal (2006 Biostatistics)

• Gene selection via univariate Cox-regression

Jenssen et al. (2002 Nature Med.), Chen et al. (2007 NEJM), name but a few

• Others (Principal component, partial lease square, etc.)

Among above methods, ridge regression has the best performance in terms of survival prediction (Bovelstad et al., 2007; van Weieringen e al., 2009; Bovelstad and Borgan, 2011)

Existing methods for high-dimensional survival data

Page 12: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

12

Objectives of our study:

Propose a new prediction method that generalizes the so-called compound covariate prediction

Compound covariate prediction method *Previously used in medical context Tukey (1993 Controlled Clinical Trial), Beer et al. (2002 Nature Med.) Chen et al. (2007 NEJM), Radamacher et al (2002 J. of Theoretical Bio.) Matsui (2006 BMC Bioinformatics)

* Less known in statistical literature

Page 13: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

13

Compound covariate predictionStep1: For each gene , fit a univariate Cox model

Step2: A set of p regression coefficients

Remark: This is possible even when p > nStep 3: Compound covariate prediction

)exp()(/),|Pr( 0 ijjjijii xthdtxttdtttt

n

i tt ljj

ijjjp

i

ilx

x

11 )exp(

)exp(maxargˆ where,)ˆ,...,ˆ()0(ˆ

β

prognosis)(Poor )0(ˆ ; prognosis) (Good )0(ˆ cc xβxβ

),...,1( pj

,)...,,( genesith patient w future aFor 1 pxxx

Page 14: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

14

Compound covariate method• is a univariate method to resolve the high

dimensionality

• empirically perform well in microarray studies

To motivate our proposal, we first give some theoretical results on the compound covariate method

Page 15: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

15

•Assumption: The Cox model holds with

at the true parameter

•Remark: Under the multivariate Cox model assumption, the univariate Cox model does not hold, i.e,

)exp()()exp()()|( 1100 ippiii xxththth xβx0ββ ),,( ,01,00 p

.)exp()(

]|)}exp()([exp{log )|(

0

10

ijjj

iiij

xth

xtHEt

xth

Page 16: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

16

•Univariate Cox model for each gene

is a misspecified model ( a working model ) Ref: Struthers & Kalbfleisch (1986) Misspecified proportional hazard models, Biometrika 73 pp.363-9.

•Univariate partial likelihood equation

)exp()(/),|Pr( 0 ijjjijii xthdtxttdtttt

),...,1( pj

n

in

jji

n

jjji

ijijjj

xttI

xxttIx

nU

1

1

1

)exp()(

)exp()(1)(0 toSolution :ˆ

)()(0 toSolution *jj

Pjjj Uu

Page 17: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

17

Remark I: If all genes are independent

Remark II:

Above results deduced from : Struthers & Kalbfleisch (1986 Biometrika) ; Bretagnolle & Huber-Carol(1988 Scand. JS)

)Assumption in the value(true ˆ0

*jj

Pj

|||| ),sign()sign( 0*

0*

jjjj

)...,,( 1 pxxx

. and between is )0( Then,

.)0,,0( and ),,()0(Let

0*

**1

*

0ββ

0β p

Page 18: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

18

Proposed estimator•Univariate compound likelihood ( unique maxima )

•Multivariate likelihood ( infinitely many maxima when p > n )

•Idea: Mixture of univariate and multivariate likelihood

We call it “compound shrinkage estimator”

n

i Rl l

in

i

i

L1

1

)exp()exp()(

xβxββ

]1,0[,)(log)1()(logargmax)(ˆ 01 aLaLaa nn βββ

p

j

n

i tt ljj

ijjn

i

ilx

xL

1 1

0

)exp()exp(

)(

β

Page 19: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

19

a a1

0

)0(βregressionCox temultivaria afor

solutionsmany Infinitely

)(log)1()(logargmax)(ˆ:estimator shrinkage Compound

01 βββ nn LaLaa

Ridge and Lasso both shrink toward zero

) true( 0β

0ββ

βUβ

)(log)(:ˆ 1nL

Page 20: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

20

• Proposition 2: (in our paper)

• Plug-in variance estimator

*Reasonable performance even when p > n.

).(argmax ˆ with ))(,())ˆ(ˆ( 00 aCVaNan βΣ0ββ

)(}/)(){()( 1 βAβVβAβΣ an

an

an

an n

pnnan

an daaCVd IβhβhβVβA )(}/)(){()()( 1221

ninformatioFisher observed)(

, offunction Estimating)(

function Score)( where,/)()(

βV

βUβUβ

an

an

ann

aaCVdad

ah

))ˆ(ˆ(ˆ aan βΣ

(CV = Cross-Validated likelihood of Verveij & Houwelingen 1993)

Page 21: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

21

Numerical comparison Compare 4 methods1. Compound covariate (CC) estimator

2. Compound shrinkage (CS) estimator

3. Ridge estimator

4. Lasso estimator

* or is obtained by cross-validation (Verveij & Houwelingen 1993 Stat.Med.)

p

jjnL

1

1 ||)(log:)ˆ(ˆ ββ

p

jjnL

1

21 )2/()(log:)ˆ(ˆ ββ

)(log)1()(log:)ˆ(ˆ 01 βββ nn LaLaa

estimators regressionCox univariate ˆ where,)ˆ,...,ˆ(ˆ1 jp β

a

R compound.Cox package (Emura and Chen, 2012)

R penalized package (Goeman et al., 2012)

R penalized package

Page 22: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

22

}n1,...,i,ˆ{ ofmedian theis where

), prognosisPoor ( ˆ ; ) prognosis Good ( ˆ

i

ii

c

cc

xβxβ

n=62, p=97 test set

•Data: Lung cancer data (Chen et al., 2007 NEJM)

LassoRidge

shrinkage compoundcovariate compound

ˆ

β

} 62,...,1 { iix

n=63 , p=97Training set

Predict

Poor prognosisGood prognosis

n=125, p=672

Page 23: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

23

Survival curves for Poor vs. Good prognosis groups in n=62 testing data; p-value for Log-rank test

Page 24: Takeshi  Emura  (NCU) Joint work with Dr. Yi- Hau  Chen and Dr.  Hsuan -Yu Chen ( Sinica )

24

Survival curves for Poor, Medium, Good prognosis groups for n=62 testing data; p-value for Log-rank trend test

Thank you for your attention