Rough-set-based ADR signaling from SRS data with missing values


Transcript of Rough-set-based ADR signaling from SRS data with missing values

Page 1: Rough-set-based ADR signaling from SRS data with missing values

Intelligent Computing Laboratory
Advisor: Prof. 林文揚
Author: 藍琳
Presenter: 王敏賢

On the Feasibility of Rough-Set-based ADR Signaling from Spontaneous Reporting Data with Missing Values

Page 2: Rough-set-based ADR signaling from SRS data with missing values

What is an ADR?

• Adverse Drug Reaction (ADR)

• ADR rule: Predc, drug → symptom

• e.g. sex=“Female”, drug=“d1” → symptom=“s1”

Page 3: Rough-set-based ADR signaling from SRS data with missing values

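An ADR case
• Thalidomide, marketed in Germany in the 1950s, was regarded at the time as one of the safest and fastest-acting sedatives and was often used to suppress nausea during pregnancy.
• It caused more than 12,000 malformed babies and was found in several countries to readily cause polyneuritis.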

Page 4: Rough-set-based ADR signaling from SRS data with missing values

Birth defects caused by Thalidomide

Page 5: Rough-set-based ADR signaling from SRS data with missing values

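Spontaneous Reporting System (SRS)

• A system that collects voluntarily submitted adverse event reports
• Example: FDA Adverse Event Reporting System (FAERS)

• All report data are stored in a line-oriented format and released periodically.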
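As a rough illustration of reading line-oriented report data, the sketch below parses delimiter-separated lines and turns empty fields into explicit missing values. The sample text, the "$" delimiter, and the column names ISR, AGE, and GENDER are assumptions made for this example, not a description of the actual FAERS file layout.

```python
import csv
import io

# Hypothetical sample; the "$" delimiter and column names are assumptions.
SAMPLE = """ISR$AGE$GENDER
1001$45$F
1002$$M
1003$60$
"""

def read_reports(text, delimiter="$"):
    """Parse delimiter-separated lines into dicts; empty fields become None."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [{key: (value if value != "" else None) for key, value in row.items()}
            for row in reader]

reports = read_reports(SAMPLE)
missing_age = sum(1 for r in reports if r["AGE"] is None)
print(len(reports), "reports,", missing_age, "with missing age")
```

Mapping empty fields to None at load time keeps missing values distinguishable from genuine values in every later step.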

Page 6: Rough-set-based ADR signaling from SRS data with missing values

FAERS open data

Page 7: Rough-set-based ADR signaling from SRS data with missing values

FAERS open data

Page 8: Rough-set-based ADR signaling from SRS data with missing values

FAERS periodically publishes study reports

http://goo.gl/VJzvCG

Page 9: Rough-set-based ADR signaling from SRS data with missing values

Drug Safety Labeling Changes

http://goo.gl/FEoWsj (2008-01 to 2014-10, one list per month)

Page 10: Rough-set-based ADR signaling from SRS data with missing values

The 2×2 contingency table

Predc.        Symptom   Other symptoms   Total
Drug          a         b                a + b
Other drugs   c         d                c + d
Total         a + c     b + d            N = a + b + c + d

• Used for ADR signal detection
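The four cells can be counted directly from a list of reports. The following is a minimal sketch under assumed data structures: each report is a dict with hypothetical keys "sex", "drugs", and "symptoms", and the function name `contingency` and the `in_stratum` parameter are introduced here for illustration only.

```python
def contingency(reports, drug, symptom, in_stratum=lambda report: True):
    """Count (a, b, c, d) for one drug-symptom pair within an optional stratum."""
    a = b = c = d = 0
    for report in reports:
        if not in_stratum(report):
            continue  # report falls outside the Predc. condition
        has_drug = drug in report["drugs"]
        has_symptom = symptom in report["symptoms"]
        if has_drug and has_symptom:
            a += 1
        elif has_drug:
            b += 1
        elif has_symptom:
            c += 1
        else:
            d += 1
    return a, b, c, d

# Hypothetical reports, complete (no missing values).
reports = [
    {"sex": "Female", "drugs": {"d1"}, "symptoms": {"s1"}},
    {"sex": "Female", "drugs": {"d1"}, "symptoms": {"s2"}},
    {"sex": "Female", "drugs": {"d2"}, "symptoms": {"s1"}},
    {"sex": "Female", "drugs": {"d2"}, "symptoms": {"s3"}},
    {"sex": "Male", "drugs": {"d1"}, "symptoms": {"s1"}},
]
table = contingency(reports, "d1", "s1", in_stratum=lambda r: r["sex"] == "Female")
print(table)
```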

Page 11: Rough-set-based ADR signaling from SRS data with missing values

ADR signal measures

• Frequentist methods
  – Proportional Reporting Ratio (PRR): $\mathrm{PRR} = \dfrac{a/(a+b)}{c/(c+d)}$
  – Reporting Odds Ratio (ROR): $\mathrm{ROR} = \dfrac{a/c}{b/d}$

• Bayesian methods
  – Bayesian Confidence Propagation Neural Network (BCPNN)
  – Multi-item Gamma Poisson Shrinker (MGPS)
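The two frequentist measures follow directly from the cells a, b, c, d of the 2×2 contingency table. Here is a minimal sketch; the counts are hypothetical, and no guard against zero cells is included, so a zero in b, c, or the row totals would raise ZeroDivisionError.

```python
import math

def prr(a, b, c, d):
    """Proportional Reporting Ratio: [a / (a + b)] / [c / (c + d)]."""
    return (a / (a + b)) / (c / (c + d))

def ror(a, b, c, d):
    """Reporting Odds Ratio: (a / c) / (b / d), equivalently (a * d) / (b * c)."""
    return (a / c) / (b / d)

# Hypothetical counts, not taken from any real dataset.
a, b, c, d = 20, 80, 10, 190
print(round(prr(a, b, c, d), 3), round(ror(a, b, c, d), 3))
assert math.isclose(ror(a, b, c, d), (a * d) / (b * c))
```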

Page 12: Rough-set-based ADR signaling from SRS data with missing values

ADR signal measures

[Figure: chart over quarters 04Q1 to 13Q3 (x-axis); y-axis ticks from 0 to 35]

Page 13: Rough-set-based ADR signaling from SRS data with missing values

The world is not always perfect!!

Page 14: Rough-set-based ADR signaling from SRS data with missing values

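Problems with SRS data
• The data are not fully rigorous, and their reliability cannot be verified.
• In data mining, records with missing values strongly affect the results.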

Page 15: Rough-set-based ADR signaling from SRS data with missing values

Missing Value

Page 16: Rough-set-based ADR signaling from SRS data with missing values

Traditional methods for handling missing values

• Deletion methods:
  – Listwise deletion
  – Pairwise deletion

Number of reports (bar chart values):
Total             5,104,523
Listwise          3,528,835
Pairwise-age      3,720,493
Pairwise-gender   4,709,159
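The difference between the two deletion methods can be sketched as follows. The records and the keys "age" and "gender" are hypothetical; listwise deletion keeps only reports complete on every attribute of interest, while pairwise deletion keeps every report that is complete on the one attribute currently being analyzed.

```python
def listwise_deletion(reports, attrs):
    """Keep reports with no missing value in any of the given attributes."""
    return [r for r in reports if all(r[a] is not None for a in attrs)]

def pairwise_deletion(reports, attr):
    """Keep reports that have a value for the single attribute under analysis."""
    return [r for r in reports if r[attr] is not None]

# Hypothetical reports; None marks a missing value.
reports = [
    {"age": 34, "gender": "F"},
    {"age": None, "gender": "M"},
    {"age": 51, "gender": None},
    {"age": None, "gender": None},
    {"age": 68, "gender": "M"},
]
print("Total:", len(reports))
print("Listwise:", len(listwise_deletion(reports, ["age", "gender"])))
print("Pairwise-age:", len(pairwise_deletion(reports, "age")))
print("Pairwise-gender:", len(pairwise_deletion(reports, "gender")))
```

Pairwise deletion never keeps fewer reports than listwise deletion, which matches the ordering of the counts on this slide.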

Page 17: Rough-set-based ADR signaling from SRS data with missing values

ROUGH SET BASED METHOD

Page 18: Rough-set-based ADR signaling from SRS data with missing values

Rough Set Theory

• Proposed in 1982 by the Polish mathematician Zdzisław I. Pawlak (1926-2006); a tool for analyzing data that carry uncertainty.

• Used to derive the upper and lower approximations of a crisp set.

Page 19: Rough-set-based ADR signaling from SRS data with missing values

Some basic terminology

Case   Height   Weight   Gender
1      170      60       Male
2      165      55       Female
3      155      45       Female
4      150      65       Male

S = {U, A}
• U = {1, 2, 3, 4} (the set of cases)
• A = {Height, Weight, Gender} (the set of attributes)

Page 20: Rough-set-based ADR signaling from SRS data with missing values

Lower and Upper Approximations

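• Given an information system S = {U, A}, let X and P be subsets of U and A, respectively. With [e]_P denoting the equivalence class of case e under the attributes in P, the lower and upper approximations of X with respect to P are defined as:

$\underline{P}X = \{ e \in U \mid [e]_P \subseteq X \}$

$\overline{P}X = \{ e \in U \mid [e]_P \cap X \neq \emptyset \}$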

Page 21: Rough-set-based ADR signaling from SRS data with missing values

Lower and Upper Approximations

[Figure: the set X drawn between its lower approximation (contained in X) and its upper approximation (containing X)]

Page 22: Rough-set-based ADR signaling from SRS data with missing values

Example

Case   Height   Weight   Age
1      170      75       18
2      165      50       30
3      165      60       18
4      145      75       18
5      145      50       30
6      170      45       45
7      145      50       45
8      170      45       30

X = {1, 2, 6, 8}, P = {Weight, Age}

Equivalence classes: {1, 4}, {2, 5}, {3}, {6}, {7}, {8}

$\underline{P}X = \{ e \in U \mid [e]_P \subseteq X \} = \{6, 8\}$

$\overline{P}X = \{ e \in U \mid [e]_P \cap X \neq \emptyset \} = \{1, 2, 4, 5, 6, 8\}$
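The example can be checked mechanically. The sketch below groups cases into equivalence classes over P = {Weight, Age} and collects the classes contained in X (lower approximation) and the classes that intersect X (upper approximation). The function names are chosen for this illustration.

```python
from collections import defaultdict

# The eight cases from the example table.
table = {
    1: {"Height": 170, "Weight": 75, "Age": 18},
    2: {"Height": 165, "Weight": 50, "Age": 30},
    3: {"Height": 165, "Weight": 60, "Age": 18},
    4: {"Height": 145, "Weight": 75, "Age": 18},
    5: {"Height": 145, "Weight": 50, "Age": 30},
    6: {"Height": 170, "Weight": 45, "Age": 45},
    7: {"Height": 145, "Weight": 50, "Age": 45},
    8: {"Height": 170, "Weight": 45, "Age": 30},
}

def equivalence_classes(table, P):
    """Group cases that share the same values on every attribute in P."""
    classes = defaultdict(set)
    for case, row in table.items():
        classes[tuple(row[a] for a in P)].add(case)
    return list(classes.values())

def approximations(table, P, X):
    """Return (lower, upper) approximations of X with respect to P."""
    lower, upper = set(), set()
    for cls in equivalence_classes(table, P):
        if cls <= X:   # class entirely inside X
            lower |= cls
        if cls & X:    # class shares at least one case with X
            upper |= cls
    return lower, upper

lower, upper = approximations(table, ["Weight", "Age"], {1, 2, 6, 8})
print(sorted(lower), sorted(upper))  # [6, 8] [1, 2, 4, 5, 6, 8]
```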

Page 23: Rough-set-based ADR signaling from SRS data with missing values

ROUGH SET STRATEGIES FOR DATA WITH MISSING VALUES

Page 24: Rough-set-based ADR signaling from SRS data with missing values

The original contingency table

• When the information system is complete, the four values a, b, c, and d are exact.

Predc.        Symptom   Other symptoms
Drug          a         b
Other drugs   c         d

Page 25: Rough-set-based ADR signaling from SRS data with missing values

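Contingency table with approximation ranges

The specific attribute   Symptom                                          Other symptoms
Drug                     $[\,|\underline{P}X_a|,\ |\overline{P}X_a|\,]$   $[\,|\underline{P}X_b|,\ |\overline{P}X_b|\,]$
Other drugs              $[\,|\underline{P}X_c|,\ |\overline{P}X_c|\,]$   $[\,|\underline{P}X_d|,\ |\overline{P}X_d|\,]$

• Rough set theory is used to derive the upper and lower approximations of each crisp set.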

Page 26: Rough-set-based ADR signaling from SRS data with missing values

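Two interpretations of missing values

• Lost (?):
  – A value that originally existed but was lost or deleted.
  – Should not be ignored.

• Don’t care (*):
  – The missing attribute value is dispensable; any value is acceptable in its place.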

Page 27: Rough-set-based ADR signaling from SRS data with missing values

Characteristic relation & Characteristic set

• Lost (?)
  – Similarity characteristic relation:

    $(x, y) \in R_S(P)$ if and only if $a(x) = a(y)$ for all $a \in P$ such that $a(x) \neq\ ?$.

  – Similarity characteristic set:

    $K_S(P, x) = \{ y \mid (x, y) \in R_S(P) \}$

Page 28: Rough-set-based ADR signaling from SRS data with missing values

Characteristic relation & Characteristic set

• Don’t care (*):
  – Tolerance characteristic relation:

    $(x, y) \in R_T(P)$ if and only if $a(x) = a(y)$ or $a(x) = *$ or $a(y) = *$, for all $a \in P$.

  – Tolerance characteristic set:

    $K_T(P, x) = \{ y \mid (x, y) \in R_T(P) \}$
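To make the contrast between the two interpretations concrete, the sketch below computes characteristic sets for a small table under both relations. The three-case tables and the function names are hypothetical; the definitions follow the similarity relation for lost values ("?") and the tolerance relation for don't-care values ("*").

```python
LOST, DONT_CARE = "?", "*"

def similarity_set(table, P, x):
    """K_S(P, x): y must match x on every attribute of P where x is not lost."""
    return {y for y, row in table.items()
            if all(row[a] == table[x][a] for a in P if table[x][a] != LOST)}

def tolerance_set(table, P, x):
    """K_T(P, x): on every attribute of P, values match or either one is '*'."""
    return {y for y, row in table.items()
            if all(row[a] == table[x][a]
                   or table[x][a] == DONT_CARE
                   or row[a] == DONT_CARE
                   for a in P)}

P = ["Age", "Gender"]
# Hypothetical data: the same table, with the missing value read two ways.
lost_table = {
    1: {"Age": "?", "Gender": "g1"},
    2: {"Age": "a1", "Gender": "g1"},
    3: {"Age": "a1", "Gender": "g2"},
}
dont_care_table = {
    1: {"Age": "*", "Gender": "g1"},
    2: {"Age": "a1", "Gender": "g1"},
    3: {"Age": "a1", "Gender": "g2"},
}
for x in lost_table:
    print(x, sorted(similarity_set(lost_table, P, x)),
          sorted(tolerance_set(dont_care_table, P, x)))
```

In this toy table the similarity relation is not symmetric (case 2 belongs to K_S(P, 1), but case 1 does not belong to K_S(P, 2)), whereas the tolerance relation is symmetric.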

Page 29: Rough-set-based ADR signaling from SRS data with missing values

Lower and Upper approximations

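• Singleton approximation

$\underline{P}X = \{ x \in U \mid K(P, x) \subseteq X \}$

$\overline{P}X = \{ x \in U \mid K(P, x) \cap X \neq \emptyset \}$

• Subset approximation

$\underline{P}X = \bigcup \{ K(P, x) \mid x \in U,\ K(P, x) \subseteq X \}$

$\overline{P}X = \bigcup \{ K(P, x) \mid x \in U,\ K(P, x) \cap X \neq \emptyset \}$

• Concept approximation

$\underline{P}X = \bigcup \{ K(P, x) \mid x \in X,\ K(P, x) \subseteq X \}$

$\overline{P}X = \bigcup \{ K(P, x) \mid x \in X,\ K(P, x) \cap X \neq \emptyset \}$

[Figure: the set X drawn between its lower approximation and its upper approximation]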
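The three approximation styles differ only in which cases are scanned (all of U or only X) and in what is collected (the case itself or its whole characteristic set). A minimal sketch, assuming the characteristic sets are already available as a dict K mapping each case x to K(P, x); the sets below are hypothetical.

```python
def singleton(K, X):
    """Collect the cases x in U whose characteristic set fits or touches X."""
    lower = {x for x, k in K.items() if k <= X}
    upper = {x for x, k in K.items() if k & X}
    return lower, upper

def subset(K, X):
    """Union of characteristic sets K(P, x), scanning every x in U."""
    lower = set().union(*(k for k in K.values() if k <= X))
    upper = set().union(*(k for k in K.values() if k & X))
    return lower, upper

def concept(K, X):
    """Union of characteristic sets K(P, x), scanning only x in X."""
    lower = set().union(*(K[x] for x in X if K[x] <= X))
    upper = set().union(*(K[x] for x in X if K[x] & X))
    return lower, upper

# Hypothetical characteristic sets for U = {1, 2, 3, 4}.
K = {1: {1, 2}, 2: {2}, 3: {2, 3, 4}, 4: {4}}
X = {1, 2}
for name, method in (("singleton", singleton), ("subset", subset), ("concept", concept)):
    lower, upper = method(K, X)
    print(name, sorted(lower), sorted(upper))
```

With these inputs the lower approximations coincide, while the upper approximations differ: {1, 2, 3} for singleton, {1, 2, 3, 4} for subset, and {1, 2} for concept.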

Page 30: Rough-set-based ADR signaling from SRS data with missing values

Rough Set: Basic Idea

Incomplete SRS data

Known rule: Predc, drug → reaction

Attribute set P

Strength computation
• global
• local

Characteristic set K(P, x)
• tolerance (don’t care)
• similarity (lost)

Approximation PX
• singleton
• subset
• concept

• Analyze the feasibility of the 12 different methods (2 strength computations × 2 characteristic sets × 3 approximations)
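The count of 12 methods is the product of the three independent choices: 2 strength computations, 2 characteristic sets, and 3 approximations. The short sketch below enumerates the combinations; the ordering is arbitrary and is not meant to reproduce the method numbering used in the experiments.

```python
from itertools import product

STRENGTH = ("global", "local")
CHARACTERISTIC_SET = ("tolerance", "similarity")
APPROXIMATION = ("singleton", "subset", "concept")

# Every combination of the three choices; the order here is arbitrary.
methods = list(product(STRENGTH, CHARACTERISTIC_SET, APPROXIMATION))
for strength, char_set, approx in methods:
    print(f"{approx} approximation, {char_set} characteristic set, {strength} strength")
print(len(methods), "combinations")
```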

Page 31: Rough-set-based ADR signaling from SRS data with missing values

Example: singleton approximation with global strength computation

ISR Age Gender Drug PT

1 ? ? d1 s1

2 a2 ? d2,d3 s1,s2

3 a1 g1 d1 s1

4 a1 g1 d2,d3 s1,s2

5 ? ? d2,d3 s1,s2

6 ? g2 d1 s1

7 ? g1 d1 s1

8 a1 g1 d3 s1,s2

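Similarity characteristic relation:

$(x, y) \in R_S(P)$ if and only if $a(x) = a(y)$ for all $a \in P$ such that $a(x) \neq\ ?$.

Characteristic sets:

$K_S(P, 1) = \{1, 3, 6, 7\}$, $K_S(P, 2) = \{2\}$, $K_S(P, 3) = \{3\}$, $K_S(P, 4) = \{4\}$

$K_S(P, 5) = \{2, 4, 5\}$, $K_S(P, 6) = \{6\}$, $K_S(P, 7) = \{3, 7\}$, $K_S(P, 8) = \{8\}$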

Page 32: Rough-set-based ADR signaling from SRS data with missing values

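Example: singleton approximation with global strength computation

Gender = g1    PT = s1           Other PT
Drug = d2      X_a = {4}         X_b = {}
Other drugs    X_c = {3, 7, 8}   X_d = {}

Characteristic sets:

$K_S(P, 1) = \{1, 3, 6, 7\}$, $K_S(P, 2) = \{2\}$, $K_S(P, 3) = \{3\}$, $K_S(P, 4) = \{4\}$

$K_S(P, 5) = \{2, 4, 5\}$, $K_S(P, 6) = \{6\}$, $K_S(P, 7) = \{3, 7\}$, $K_S(P, 8) = \{8\}$

Singleton approximation:

$\underline{P}X = \{ x \in U \mid K(P, x) \subseteq X \}$

$\overline{P}X = \{ x \in U \mid K(P, x) \cap X \neq \emptyset \}$

Results:

$\underline{P}X_a = \{4\}$, $\overline{P}X_a = \{4\}$

$\underline{P}X_b = \overline{P}X_b = \emptyset$

$\underline{P}X_c = \{3, 7, 8\}$, $\overline{P}X_c = \{1, 3, 7, 8\}$

$\underline{P}X_d = \overline{P}X_d = \emptyset$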
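As a worked check for the cell X_c = {3, 7, 8}, the sketch below applies the singleton definitions to the eight similarity characteristic sets of this example, entered here as literals, and converts the two approximations into a count range for cell c.

```python
# Similarity characteristic sets K_S(P, x) of the eight example reports.
K = {
    1: {1, 3, 6, 7},
    2: {2},
    3: {3},
    4: {4},
    5: {2, 4, 5},
    6: {6},
    7: {3, 7},
    8: {8},
}
X_c = {3, 7, 8}  # Gender = g1, other drugs, PT = s1

# Singleton approximations: test each case's characteristic set against X_c.
lower_c = {x for x, k in K.items() if k <= X_c}
upper_c = {x for x, k in K.items() if k & X_c}
c_range = (len(lower_c), len(upper_c))
print(sorted(lower_c), sorted(upper_c), c_range)  # [3, 7, 8] [1, 3, 7, 8] (3, 4)
```

Case 1 enters the upper approximation because its gender is lost and K_S(P, 1) overlaps X_c, which widens the count for cell c from 3 to the range [3, 4].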

Page 33: Rough-set-based ADR signaling from SRS data with missing values

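Example: singleton approximation with global strength computation

Gender = g1    PT = s1   Other reactions
Drug = d2      [1, 1]    0
Other drugs    [3, 4]    0

With an underline denoting the lower bound and an overline the upper bound of each cell:

$\dfrac{\underline{a}\,(\underline{c} + \underline{d})}{\overline{c}\,(\overline{a} + \overline{b})} \le \mathrm{PRR} \le \dfrac{\overline{a}\,(\overline{c} + \overline{d})}{\underline{c}\,(\underline{a} + \underline{b})}$

$0.75 = \dfrac{1\,(3 + 0)}{4\,(1 + 0)} \le \mathrm{PRR} \le \dfrac{1\,(4 + 0)}{3\,(1 + 0)} = 1.333$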
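The PRR range can be reproduced from the cell ranges with plain interval arithmetic on PRR = a(c + d) / (c(a + b)). In the sketch below, the function name and the (low, high) tuple representation are illustrative; using the upper bounds of a and b in the denominator of the lower bound (and the lower bounds in the upper bound) is an assumption that this example cannot confirm, since a and b have no spread here.

```python
def prr_bounds(a, b, c, d):
    """Bound PRR = a(c + d) / (c(a + b)) given (low, high) ranges for each cell.

    Assumption: numerator and denominator are bounded independently.
    """
    (a_lo, a_hi), (b_lo, b_hi), (c_lo, c_hi), (d_lo, d_hi) = a, b, c, d
    lower = a_lo * (c_lo + d_lo) / (c_hi * (a_hi + b_hi))
    upper = a_hi * (c_hi + d_hi) / (c_lo * (a_lo + b_lo))
    return lower, upper

# Cell ranges from the table: a = [1, 1], b = 0, c = [3, 4], d = 0.
lower, upper = prr_bounds((1, 1), (0, 0), (3, 4), (0, 0))
print(round(lower, 3), round(upper, 3))  # 0.75 1.333
```

Because c is bounded separately in the numerator and the denominator, the resulting range is conservative: it is at least as wide as the set of PRR values attainable by any single consistent choice of counts.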

Page 34: Rough-set-based ADR signaling from SRS data with missing values

Experiment

Rule   Drug name   Symptom                                      Applicable group (Age or Gender)   Marketed year in US   Year withdrawn in US
R1-1   AVANDIA     MYOCARDIAL INFARCTION                        Age 18+                            1990                  2010
R1-2   AVANDIA     DEATH                                        Age 18+                            1990                  2010
R1-3   AVANDIA     CEREBROVASCULAR ACCIDENT                     Age 18+                            1990                  2010
R2     TYSABRI     PROGRESSIVE MULTIFOCAL LEUKOENCEPHALOPATHY   Age 18+                            2004                  2005
R3     ZELNORM     CEREBROVASCULAR ACCIDENT                     Female                             2002                  2007

Page 35: Rough-set-based ADR signaling from SRS data with missing values

Experimental results

[Figure: Method 1 M(s, g, g) for R1-2. Series: PRR_ld, PRR_lower, PRR_pd, PRR_upper, Threshold=2, A_ld, A_rs, A_pd. X-axis: quarters 04Q1 to 13Q3. Y-axes: PRR, A Value.]

Page 36: Rough-set-based ADR signaling from SRS data with missing values

[Figure: Method 1 M(s, g, g) for R1-2. Series: ROR_ld, ROR_lower, ROR_pd, ROR_upper, Threshold=2, A_ld. X-axis: quarters 04Q1 to 13Q3. Y-axes: ROR, A Value.]

Page 37: Rough-set-based ADR signaling from SRS data with missing values

Q & A