Generalized Linear Model [GLM]

Asst. Prof. Nikom Thanomsieng

Department of Biostatistics and Demography

Faculty of Public Health, Khon Kaen University

Email: nikom@kku.ac.th

Generalized Linear Model

The generalized linear model (GLM) was first introduced by Nelder & Wedderburn (1972).

Continuous response, continuous predictors: Regression

Continuous response, categorical predictors: ANOVA

A Generalized Linear Model [GLM] consists of 3 components:

- Random component

- Systematic component

- Link function

E(Y)=α + β1x1 +… + βkxk

- Random component

This component concerns the probability distribution of the response variable. It is also referred to as the "type of exponential family".

E(Y)=α + β1x1 +… + βkxk

- Systematic component

This component specifies a linear function of the independent variables used to predict the response variable.

The linear combination of these explanatory variables is called the "linear predictor".

E(Y) = α + β1x1 +… + βkxk

An explanatory variable Xi may take any form, depending on the model. For example, X3 = X1 X2 (X3 is the interaction between X1 and X2) or X3 = X1².
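As a minimal sketch of how such derived terms enter the linear predictor, the Python snippet below builds X3 = X1 X2 and X3 = X1² from two explanatory variables and evaluates α + β1x1 + β2x2 + β3x3. The function name and every numeric value are hypothetical, chosen only for illustration; none of them come from the slides.

```python
def linear_predictor(alpha, betas, xs):
    """Return alpha + sum(beta_j * x_j) for one observation."""
    return alpha + sum(b * x for b, x in zip(betas, xs))

# Hypothetical values for one observation (not from the slides).
x1, x2 = 2.0, 3.0
alpha, betas = 1.0, [0.5, -0.25, 0.1]

# Interaction term: X3 = X1 * X2
eta_interaction = linear_predictor(alpha, betas, [x1, x2, x1 * x2])

# Quadratic term: X3 = X1 squared
eta_quadratic = linear_predictor(alpha, betas, [x1, x2, x1 ** 2])

print(eta_interaction, eta_quadratic)
```

The model stays linear in the coefficients even though X3 is a nonlinear function of the original variables.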

Link function: the part that describes the relationship between the random component and the systematic component. It links the random part to the systematic part.

That is, it links

μ = E(Y)

to the explanatory variables specified as the linear predictor.

If the random part is μ = E(Y), the resulting model is

g(μ) = α + β1x1 +… + βkxk

The function g(.) is called the "link function".

The link function can be written out for ease of reading. For example, if

g(μ) = μ

the model is for the mean itself, and this form is called the "identity link".

The linear model is then written as

μ = α + β1x1 +… + βkxk

Link function, loglinear model: the link function is

g(μ) = log(μ)

so the linear model is written as

log(μ) = α + β1x1 +… + βkxk

Link function, logit model: the link function is

$$g(\mu) = \log\left[\frac{\mu}{1-\mu}\right]$$

so the linear model is written as

$$\log\left[\frac{\mu}{1-\mu}\right] = \alpha + \beta_1 x_1 + \dots + \beta_k x_k$$
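The three link functions above (identity, log, logit) can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names are assumptions of this sketch, not part of any package mentioned in the slides.

```python
import math

def identity_link(mu):
    """g(mu) = mu"""
    return mu

def log_link(mu):
    """g(mu) = log(mu), defined for mu > 0"""
    return math.log(mu)

def logit_link(mu):
    """g(mu) = log(mu / (1 - mu)), defined for 0 < mu < 1"""
    return math.log(mu / (1.0 - mu))

def inverse_logit(eta):
    """Map a linear predictor back to a mean between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-eta))

mu = 0.25  # illustrative mean, not taken from the slides
print(identity_link(mu), log_link(mu), logit_link(mu))
print(inverse_logit(logit_link(mu)))  # recovers mu up to rounding
```

The inverse of the link is what turns a fitted linear predictor back into a fitted mean.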

Table 1 Types of models for statistical analysis

Random component   Link          Systematic component   Model
Normal             Identity      Continuous             Regression
Normal             Identity      Categorical            Analysis of variance
Normal             Identity      Mixed                  Analysis of covariance
Bernoulli          Logit         Mixed                  Logistic regression
Poisson            Log           Mixed                  Log linear
Multinomial        Generalized   Mixed                  Multinomial response

STATA link functions are

Link function            glm option
----------------------------------------
identity                 link(identity)
log                      link(log)
logit                    link(logit)
probit                   link(probit)
complementary log-log    link(cloglog)
odds power               link(opower #)
power                    link(power #)
negative binomial        link(nbinomial)
log-log                  link(loglog)
log-complement           link(logc)

STATA distribution families are

Family                   glm option
----------------------------------------
Gaussian(normal)         family(gaussian)
Inverse Gaussian         family(igaussian)
Bernoulli/binomial       family(binomial)
Poisson                  family(poisson)
Negative binomial        family(nbinomial)
Gamma                    family(gamma)

Example: snoring and heart disease. The data are shown in the table.

Snoring   HD    NHD    Total
0         24    1355   1379
2         35    603    638
4         21    192    213
5         30    224    254

$$\log\left[\frac{\mu}{1-\mu}\right] = \alpha + \beta x$$

glm hd1 snore, family(binomial n) link(logit)

Example: GLM

. input snore hd1 hd0

         snore   hd1    hd0
  1.     0       24     1355
  2.     2       35     603
  3.     4       21     192
  4.     5       30     224
  5. end

. generate n=hd0+hd1

. glm hd1 snore, family(binomial n) link(logit)

Iteration 0:   log likelihood = -11.539348
Iteration 1:   log likelihood = -11.530734
Iteration 2:   log likelihood = -11.530733

Generalized linear models                          No. of obs      =         4
Optimization     : ML: Newton-Raphson              Residual df     =         2
                                                   Scale param     =         1
Deviance         =  2.808911793                    (1/df) Deviance =  1.404456
Pearson          =  2.874323296                    (1/df) Pearson  =  1.437162

Variance function: V(u) = u*(1-u/n)                [Binomial]
Link function    : g(u) = ln(u/(n-u))              [Logit]
Standard errors  : OIM

Log likelihood   = -11.53073319                    AIC             =  6.765367
BIC              =  .0363230709

------------------------------------------------------------------------------
         hd1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       snore |   .3973366   .0500107     7.95   0.000     .2993175    .4953557
       _cons |  -3.866248   .1662144   -23.26   0.000    -4.192022   -3.540474
------------------------------------------------------------------------------
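To make the estimation step less opaque, here is a rough Python sketch of Newton-Raphson for the grouped logit model, applied to the snoring data. It is an illustration of the general technique named in the output ("ML: Newton-Raphson"), not a reproduction of Stata's internal code; the function names `fit_logit` and `deviance`, the zero starting values, and the fixed iteration count are assumptions of this sketch.

```python
import math

# Grouped data from the snoring example: score, cases (hd1), non-cases (hd0).
snore = [0, 2, 4, 5]
hd1 = [24, 35, 21, 30]
hd0 = [1355, 603, 192, 224]
n = [a + b for a, b in zip(hd1, hd0)]

def fit_logit(x, y, n, iterations=25):
    """Newton-Raphson for logit(pi) = a + b*x with binomial counts y out of n."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        pi = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Score vector (gradient of the log likelihood).
        g0 = sum(yi - ni * pj for yi, ni, pj in zip(y, n, pi))
        g1 = sum(xi * (yi - ni * pj) for xi, yi, ni, pj in zip(x, y, n, pi))
        # Information matrix entries (negative Hessian).
        w = [ni * pj * (1.0 - pj) for ni, pj in zip(n, pi)]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information * step = score) and update.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def deviance(x, y, n, a, b):
    """Binomial deviance: 2 * sum of observed * ln(observed / fitted)."""
    total = 0.0
    for xi, yi, ni in zip(x, y, n):
        mu = ni / (1.0 + math.exp(-(a + b * xi)))
        total += yi * math.log(yi / mu)
        total += (ni - yi) * math.log((ni - yi) / (ni - mu))
    return 2.0 * total

cons, slope = fit_logit(snore, hd1, n)
print(round(slope, 4), round(cons, 4))
print(round(deviance(snore, hd1, n, cons, slope), 4))
```

The printed slope, constant, and deviance can be compared against the `snore`, `_cons`, and `Deviance` entries in the glm output.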

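Analysis of Fit

- Deviance or Log Likelihood

- Depends on the random component

- For a logit analysis (constant-only model), as follows, where n1 and n0 are the numbers of observations with y = 1 and y = 0, and n = n1 + n0:

$$\text{Deviance} = -2\left[n_1\ln(n_1) + n_0\ln(n_0) - n\ln(n)\right]$$

$$\text{Log Likelihood} = n_1\ln(n_1) + n_0\ln(n_0) - n\ln(n)$$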

Age     chd   Phat       l1
20      0     0.043479   -0.0444523
23      0     0.059621   -0.0614728
24      0     0.066153   -0.0684424
…       …     …          …
69      1     0.912465   -0.091606
Total                    -53.6765477

Computing the Log Likelihood and Deviance when the model has only a constant

- Deviance (D) is a statistic computed from the log likelihood

- It is used to assess the goodness of fit of the equation

Example: a study of age and risk factors for CHD

$$\text{Log Likelihood} = n_1\ln(n_1) + n_0\ln(n_0) - n\ln(n) = 43\ln(43) + 57\ln(57) - 100\ln(100)$$

= 161.7316 + 230.45392 - 460.51702

= -68.331491

$$\text{Deviance} = -2\left[n_1\ln(n_1) + n_0\ln(n_0) - n\ln(n)\right] = -2(-68.331491)$$

= 136.66298
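The constant-only arithmetic is easy to check by hand or in code. The sketch below, in plain Python, recomputes the log likelihood and deviance from the counts 43 and 57 used in the example, and cross-checks them against the equivalent form in which every subject gets the fitted probability n1/n. The variable names are this sketch's own.

```python
import math

n1, n0 = 43, 57          # counts with chd = 1 and chd = 0
n = n1 + n0

# Closed form for the constant-only logit model.
log_likelihood = n1 * math.log(n1) + n0 * math.log(n0) - n * math.log(n)
deviance = -2.0 * log_likelihood

# Cross-check: the constant-only model fits pi = n1 / n for every subject.
pi = n1 / n
check = n1 * math.log(pi) + n0 * math.log(1.0 - pi)

# The implied intercept on the logit scale is ln(n1 / n0).
cons = math.log(pi / (1.0 - pi))

print(round(log_likelihood, 6), round(deviance, 5), round(cons, 7))
```

The two forms agree because n1 ln(n1/n) + n0 ln(n0/n) expands to n1 ln(n1) + n0 ln(n0) - n ln(n).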

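$$\text{Log Likelihood} = \sum_{i=1}^{n}\left[y_i\ln(\hat{\pi}_i) + (1-y_i)\ln(1-\hat{\pi}_i)\right]$$

For example, at age 20:

$$\hat{\pi}_i = \frac{e^{-5.309453 + 0.1109211(20)}}{1 + e^{-5.309453 + 0.1109211(20)}} = 0.04347874$$

Log Likelihood = -53.6765477

$$\text{Deviance} = -2\sum_{i=1}^{n}\left[y_i\ln(\hat{\pi}_i) + (1-y_i)\ln(1-\hat{\pi}_i)\right]$$

= -2(-53.67654)

= 107.3531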
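As a small check on the per-subject terms, the Python sketch below computes the fitted probability and the log-likelihood contribution for the two table rows that are fully visible (age 20 with chd = 0, and age 69 with chd = 1), using the reported coefficients -5.309453 and 0.1109211. The full sum over all 100 subjects cannot be reproduced here because the slides elide most of the data; the function names are assumptions of this sketch.

```python
import math

# Coefficients reported for the chd-on-age logit model in this example.
CONS, B_AGE = -5.309453, 0.1109211

def fitted_probability(age):
    """pi-hat = exp(eta) / (1 + exp(eta)) with eta = CONS + B_AGE * age."""
    eta = CONS + B_AGE * age
    return math.exp(eta) / (1.0 + math.exp(eta))

def log_lik_contribution(y, pi_hat):
    """One term of the sum: y*ln(pi) + (1 - y)*ln(1 - pi)."""
    return y * math.log(pi_hat) + (1 - y) * math.log(1.0 - pi_hat)

pi_20 = fitted_probability(20)
l_20 = log_lik_contribution(0, pi_20)                    # age 20, chd = 0
l_69 = log_lik_contribution(1, fitted_probability(69))   # age 69, chd = 1

print(round(pi_20, 6), round(l_20, 6), round(l_69, 6))
```

Summing such terms over every subject gives the model log likelihood, and multiplying that sum by -2 gives the deviance.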

Model Statistics

Akaike information criterion (AIC)

A smaller AIC indicates a better-fitting model.

$$\text{AIC} = \frac{-2L(M_k) + 2p}{n}$$

$$\text{AIC} = \frac{-2(-53.676546) + 2(2)}{100} = 1.1135309$$
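The AIC reported by glm in these examples is scaled by the number of observations. A minimal Python sketch of that calculation follows, assuming p counts the estimated coefficients (constant included) and n is the number of observations; the function name is hypothetical.

```python
def glm_aic(log_likelihood, p, n):
    """AIC scaled by sample size: (-2 * log likelihood + 2 * p) / n."""
    return (-2.0 * log_likelihood + 2.0 * p) / n

# CHD example: 100 subjects, constant and age.
aic_chd = glm_aic(-53.676546, p=2, n=100)

# Snoring example: 4 grouped observations, constant and snore.
aic_snore = glm_aic(-11.53073319, p=2, n=4)

print(round(aic_chd, 6), round(aic_snore, 6))
```

Both values can be compared with the AIC lines in the corresponding glm outputs.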

. glm chd age, family(binomial) link(logit)

Iteration 0:   log likelihood = -53.710416
Iteration 1:   log likelihood = -53.676576
Iteration 2:   log likelihood = -53.676546
Iteration 3:   log likelihood = -53.676546

Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(u/(1-u))              [Logit]
Standard errors  : OIM

Log likelihood   = -53.67654635                    AIC             =  1.113531
BIC              =  98.14275232

------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1109211   .0240598     4.61   0.000     .0637647    .1580776
       _cons |  -5.309453   1.133655    -4.68   0.000    -7.531376   -3.087531

Log likelihood ratio

Fit the equation with only the constant

. glm chd , f(b) l(l)

Iteration 0:   log likelihood = -68.373484
Iteration 1:   log likelihood = -68.331492
Iteration 2:   log likelihood = -68.331491

-------------+----------------------------------------------------------------
       _cons |  -.2818511   .2019893    -1.40   0.163    -.6777429    .1140406
------------------------------------------------------------------------------

The log likelihood is -68.331491,

which gives Deviance = -2(-68.331491) = 136.6629827

Fit the equation with the constant and age

. glm chd age, f(b) l(l)

Iteration 0:   log likelihood = -53.710416
Iteration 1:   log likelihood = -53.676576
Iteration 2:   log likelihood = -53.676546
Iteration 3:   log likelihood = -53.676546

-------------+----------------------------------------------------------------
         age |   .1109211   .0240598     4.61   0.000     .0637647    .1580776
       _cons |  -5.309453   1.133655    -4.68   0.000    -7.531376   -3.087531
------------------------------------------------------------------------------

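$$G = -2\ln\left[\frac{\text{likelihood without the variable}}{\text{likelihood with the variable}}\right]$$

$$G = -2\ln\left[\frac{\left(\frac{n_1}{n}\right)^{n_1}\left(\frac{n_0}{n}\right)^{n_0}}{\prod_{i=1}^{n}\hat{\pi}_i^{\,y_i}(1-\hat{\pi}_i)^{(1-y_i)}}\right]$$

$$G = 2\left\{\sum_{i=1}^{n}\left[y_i\ln(\hat{\pi}_i) + (1-y_i)\ln(1-\hat{\pi}_i)\right] - \left[n_1\ln(n_1) + n_0\ln(n_0) - n\ln(n)\right]\right\}$$

$$G = 2\left\{-53.677 - \left[43\ln(43) + 57\ln(57) - 100\ln(100)\right]\right\} = 29.31$$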
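The likelihood ratio statistic is just twice the difference between the two log likelihoods reported in this example. The Python sketch below recomputes G from -68.331491 (constant only) and -53.676546 (constant and age), and adds a p-value under the usual assumption that G follows a chi-square distribution with 1 degree of freedom; the identity used for that tail probability is noted in the comment.

```python
import math

ll_null = -68.331491     # constant only
ll_model = -53.676546    # constant and age

# G = -2 * ln(L_without / L_with) = -2 * (ll_null - ll_model)
G = -2.0 * (ll_null - ll_model)

# Upper tail of a chi-square with 1 df: P(X > G) = erfc(sqrt(G / 2)),
# because X is the square of a standard normal variable.
p_value = math.erfc(math.sqrt(G / 2.0))

print(round(G, 2), p_value)
```

A p-value this small is what a four-decimal display would show as 0.0000.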

. logit chd age

Iteration 0:   log likelihood = -68.331491

Logit estimates                                   Number of obs   =        100
                                                  LR chi2(1)      =      29.31
                                                  Prob > chi2     =     0.0000
Log likelihood = -53.676546                       Pseudo R2       =     0.2145

-------------+----------------------------------------------------------------
         age |   .1109211   .0240598     4.61   0.000     .0637647    .1580776
       _cons |  -5.309453   1.133655    -4.68   0.000    -7.531376   -3.087531
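The Pseudo R2 in the logit output can be recovered from the same two log likelihoods. The sketch below assumes it is McFadden's measure, 1 - L(model)/L(constant only), which matches the reported 0.2145; the function name is this sketch's own.

```python
def mcfadden_pseudo_r2(ll_model, ll_null):
    """McFadden's pseudo R-squared: 1 - ll_model / ll_null."""
    return 1.0 - ll_model / ll_null

pseudo_r2 = mcfadden_pseudo_r2(-53.676546, -68.331491)
print(round(pseudo_r2, 4))
```

Unlike R-squared in linear regression, this is a relative improvement in log likelihood rather than a proportion of variance explained.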
