Evaluating Content Validity, Structural Validity, and Hypotheses Testing

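竹林由武 (Yoshitake Takebayashi), third-year doctoral student, Graduate School of Integrated Arts and Sciences, Hiroshima University

Evaluating patient-reported outcome measures: how to use the COSMIN checklist, the new international standard for reliability and validity

The Japanese Psychological Association (public interest incorporated association)

Study group on statistics with the data analysis environment R for researchers in psychology and medicine

10th meeting

2012/5/18 (Sat) 13:20-17:45

Tokyo Medical and Dental University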

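Validity
・Content validity (including face validity)
・Construct validity
  ・Structural validity
  ・Hypotheses testing
  ・Cross-cultural validity
・Criterion validity (concurrent, predictive)

Outline

Content validity (Box D)

Structural validity (Box E)

Hypotheses testing (Box F)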



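Content validity (Box D)

Face validity: the degree to which the items of an instrument indeed look as though they adequately reflect the construct to be measured (i.e., no item is obviously off-target at first impression)

Content validity: the degree to which the content of an instrument adequately reflects the construct to be measured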

Evaluation items (items common to all boxes are omitted)

Relevance of the items to the construct

1. Do all items reflect aspects of the construct? e.g., are there no unnecessary items?

2. Are all items relevant to the target population? e.g., age, sex, disease characteristics, country, setting

3. Are all items relevant to the purpose of the instrument? e.g., differentiation from other diseases, severity assessment, screening

Comprehensiveness of the items with respect to the construct

4. Is the construct comprehensively reflected in the items? e.g., are no items missing?

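Rating the items: each item is rated on a 4-point scale (Excellent / Good / Fair / Poor)

1. Do all items reflect aspects of the construct?
   Excellent: assessed; Fair: assessed, but the construct is poorly described; Poor: not assessed

2. Are all items relevant to the target population?
   Excellent: assessed by 10 or more people; Good: assessed by 5-9 people; Fair: assessed by fewer than 5 people; Poor: not assessed

3. Are all items relevant to the purpose of the instrument?
   Excellent: assessed; Good: the purpose is not described but can be inferred; Poor: not assessed

4. Is the construct comprehensively reflected in the items?
   Excellent: assessed; Good: assessed, but no theoretical background is given; Poor: not assessed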

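Procedure

① Review information on the conceptual model and the population: literature review

② Review information on the content of the instrument: all information about the instrument should be disclosed (equipment, measurement method/procedure, scoring method; for a questionnaire, all items, the instructions, and the response format)

③ Select an expert panel: for self-reported outcomes, patients are the experts on their disease

④ Evaluate the correspondence between the content of the instrument and the construct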

Example: a QOL scale for patients with chronic obstructive pulmonary disease (COPD)

Scale content was generated from qualitative, unstructured interviews conducted with patients with COPD in the UK and focus groups with patients in the USA. The interviewees and focus group participants were encouraged to talk at length about their experience of COPD and the impact of the disease on all areas of their life. Audio recordings were made of the interviews and focus group discussions. These were transcribed, and each interview was subjected to content analysis by at least two experienced qualitative researchers to identify statements expressing the impact of LCOPD on patient’s lives.

The needs-based model of QoL were employed for the LCOPD [13]. This model asserts that QoL is dependent on an individual’s ability to fulfil his or her fundamental needs and that QoL is good when most needs are met and poor when they are not.

Model selection ⇒ the needs-based model of QOL was adopted

Item collection through interviews with patients

Item review based on expert judgment

McKenna SP et al: Qual Life Res (2011) 20:1043–1052

Example (continued): a QOL scale for patients with chronic obstructive pulmonary disease (COPD)

It was clear from the interviews that COPD has a considerable adverse impact on many aspects of the lives of affected individuals. Figure 1 shows the conceptual framework for the LCOPD, illustrating how the issues raised during the interviews relate to needs and quality of life impact.

A QOL model for patients with COPD was built from the patient interviews

Cognitive debriefing interviews were conducted with 19 patients in the UK and 16 in the USA. Demographic details of the sample are shown in Table 1. The questionnaires were well received by participants who found them relevant, comprehensible, easy and quick to complete.

Patients confirmed the relevance and comprehensiveness of the items

McKenna SP et al: Qual Life Res (2011) 20:1043–1052

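Example: patient reports are important for comprehensiveness

Jones, P et al: Primary Care Respiratory Journal (2009); 18(3): 208-215

Item generation procedure
・Definition of the concept through a literature review
・Telephone interviews with experts to elicit items that indicate patients' health
・Items based on patient interviews, followed by confirmation of the items by experts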



Structural validity (Box E)

Structural validity: the degree to which the scores of an instrument adequately reflect the dimensionality of the construct to be measured

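Evaluation items

1. Is the scale based on a reflective model?

4. Is the sample size used in the analysis adequate?

6. Classical test theory: was an exploratory or confirmatory factor analysis performed?

7. Item response theory: was IRT performed to determine the unidimensionality of the items?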


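Evaluating the items: 1. Is the scale based on a reflective model?

Example construct: life stress (η)
Formative model: X1 (job loss), X2 (death of a family member), X3 (divorce) → η, with an error term ε on η
Reflective model: η → Y1 (worry about the future), Y2 (sleep disturbance), Y3 (increased heart rate), each with an error term ε

Formative
・Causality: items ⇒ construct
・Measurement model: principal component analysis
・A change in the construct does not necessarily involve a change in every item

Reflective
・Causality: construct ⇒ items
・Measurement model: CTT, IRT
・A change in the construct affects all items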

If the scale is based on a formative model rather than a reflective model, the remaining items in this box are skipped.

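Evaluating the items: 4. Is the sample size used in the analysis adequate?

Rating and sample size

□Excellent: at least 7 × the number of items and at least 100

□Good: at least 5 × the number of items and at least 100, or at least 7 × the number of items but fewer than 100

□Fair: at least 5 × the number of items but fewer than 100

□Poor: fewer than 5 × the number of items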
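The sample-size thresholds above can be written as a small decision rule. The sketch below is only an illustration of the criteria as listed on this slide; the function name rate_fa_sample_size is hypothetical and is not part of COSMIN or of any R package.

```r
# Hypothetical helper: rates the sample size of a factor analysis
# using the thresholds listed on the slide.
rate_fa_sample_size <- function(n, n_items) {
  if (n < 5 * n_items) return("Poor")                      # fewer than 5 x items
  if (n >= 7 * n_items && n >= 100) return("Excellent")    # 7 x items and >= 100
  if (n >= 100 || n >= 7 * n_items) return("Good")         # 5 x items and >= 100, or 7 x items but < 100
  "Fair"                                                   # 5 x items but < 100
}

rate_fa_sample_size(n = 200, n_items = 20)
rate_fa_sample_size(n = 60, n_items = 10)
```

Under these rules a 20-item scale needs at least 140 respondents for an Excellent rating.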

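Evaluating the items: 6. Classical test theory

□Excellent: exploratory or confirmatory factor analysis was performed, and the choice between them was appropriate

□Good: exploratory factor analysis was performed although confirmatory factor analysis would have been more appropriate

□Fair: (no criterion)

□Poor: no factor analysis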

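Factor analysis model

$$Y_1 = \lambda_{11} F_1 + \varepsilon_1$$
$$Y_2 = \lambda_{21} F_1 + \varepsilon_2$$
$$Y_3 = \lambda_{31} F_1 + \varepsilon_3$$

Y: observed variable; F: common factor; λ: factor loading; ε: unique factor

[Path diagram: F1 → Y1, Y2, Y3 with loadings λ11, λ21, λ31 and unique factors ε1, ε2, ε3]

Assumptions under classical test theory: ① the errors have mean 0; ② the unique factors are uncorrelated with each other; ③ the common and unique factors are uncorrelated

Factor loading: the influence of the common factor on an observed variable (item score)

Unique factor: the influence of each unique factor on an observed variable (item score)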
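Under the three assumptions above, the one-factor model implies the covariance matrix Σ = λλ' + Ψ, where Ψ is the diagonal matrix of unique variances. The sketch below checks this numerically. The loadings, sample size, and seed are hypothetical values chosen for illustration, and the common factor is assumed to have variance 1.

```r
# Hypothetical loadings (lambda11, lambda21, lambda31); factor variance fixed to 1
lambda <- c(0.8, 0.7, 0.6)
psi <- 1 - lambda^2                      # unique variances, so that Var(Y_i) = 1

# Covariance matrix implied by the model: Sigma = lambda lambda' + Psi
sigma_implied <- lambda %*% t(lambda) + diag(psi)

# Simulate data that satisfy the assumptions and compare
set.seed(1)
n <- 5000
f <- rnorm(n)                                            # common factor
e <- sapply(sqrt(psi), function(s) rnorm(n, sd = s))     # unique factors, mean 0, mutually independent
y <- outer(f, lambda) + e                                # Y_i = lambda_i1 * F1 + e_i

round(sigma_implied, 2)
round(cov(y), 2)
```

The sample covariance matrix should be close to the implied one, differing only by sampling error.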

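Factor analysis models: EFA and CFA

Exploratory factor analysis (EFA): the aim is to generate hypotheses about the factor structure
・Development of a new scale
・Weak theoretical basis for the number of factors and the factor correlations
・Item reduction ⇒ creation of a short form

Confirmatory factor analysis (CFA): the aim is to test hypotheses about the factor structure

[Path diagrams: items 1-6 and factors f1 and f2, shown for EFA and for CFA]

From the viewpoint of examining validity, performing CFA is appropriate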

・Example (choosing between EFA and CFA)

Exploratory factor analysis (EFA) was used to examine the dimensionality of the item set measuring the underlying construct, because the results suggested insufficient model fit [27, 28].

S. A. M. Stevelink et al: Qual Life Res (2013) 22:137–144

① CFA is used to examine the factor structure based on theory
② The CFA results are poor
③ EFA is used for an exploratory examination

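Confirmatory factor analysis

・Structural equation modeling: factor analysis based on measurement equations + path analysis based on structural equations

Measurement equations: describe the influence of latent variables on observed variables

Structural equations: latent variable → latent variable, observed variable → observed variable, observed variable → latent variable

[Path diagram: two-factor CFA. F1 → Y4, Y5, Y6 (loadings λ11, λ21, λ31; errors ε4, ε5, ε6) and F2 → Y1, Y2, Y3 (loadings λ12, λ22, λ32; errors ε1, ε2, ε3)]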

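Covariance structure analysis

Estimating the path coefficients (factor loadings): find the path coefficients that minimize the discrepancy between the covariance matrix of the observed data and the covariance matrix of the model (maximum likelihood or weighted least squares)

$$\min_{\theta} f\bigl(S, \Sigma(\theta)\bigr)$$

where S is the covariance matrix of the observed data and Σ(θ) is the covariance matrix implied by the model with parameters θ.

Examining fit: evaluate how well the model fits the data

[Path diagram: data compared with the model F1 → Y1, Y2, Y3 with loadings λ11, λ21, λ31 and errors ε1, ε2, ε3]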
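A minimal sketch of this minimization, assuming the maximum likelihood discrepancy function F_ML = log|Σ| + tr(SΣ⁻¹) − log|S| − p and a one-factor model with three indicators. The covariance matrix S, the starting values, and the function names f_ml and implied_cov are hypothetical and chosen only for illustration. In practice a package such as lavaan carries out this estimation.

```r
# ML discrepancy between the sample covariance S and the model-implied Sigma
f_ml <- function(S, Sigma) {
  p <- nrow(S)
  log(det(Sigma)) + sum(diag(S %*% solve(Sigma))) - log(det(S)) - p
}

# One-factor model: Sigma(theta) = lambda lambda' + diag(psi)
implied_cov <- function(theta) {
  lambda <- theta[1:3]
  psi <- exp(theta[4:6])          # exp() keeps the unique variances positive
  lambda %*% t(lambda) + diag(psi)
}

# Hypothetical sample covariance matrix of three standardized items
S <- matrix(c(1.00, 0.56, 0.48,
              0.56, 1.00, 0.42,
              0.48, 0.42, 1.00), nrow = 3, byrow = TRUE)

start <- c(0.5, 0.5, 0.5, log(0.5), log(0.5), log(0.5))
fit <- optim(start, function(theta) f_ml(S, implied_cov(theta)), method = "BFGS")

lambda_hat <- fit$par[1:3]   # estimated factor loadings
fit$value                    # minimized discrepancy
```

A one-factor model with three indicators is just-identified, so the minimized discrepancy is essentially zero here. With more indicators the minimum is positive and forms the basis of the fit statistics.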

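Covariance structure analysis: fit indices

Absolute indices: the similarity between the covariance matrices of the data and the model

Incremental indices: the degree to which the analysis model improves the fit to the data compared with the independence model

Parsimonious indices: the closeness of the model to the data, taking the complexity of the model into account

Index   Meaning                                                            Criterion
SRMR    size of the variance that the model could not explain              .08 or lower
CFI     improvement in discrepancy, taking degrees of freedom into account .95 or higher
RMSEA   size of the discrepancy per degree of freedom                      .05 or lower

Mueller, R. O. & Hancock, G. R. (2010): The Reviewer's Guide to Quantitative Methods in the Social Sciences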
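To show how two of these indices relate to the chi-square statistics, the sketch below computes CFI and RMSEA from the chi-square values of the analysis model and the independence (baseline) model. It uses one common form of the formulas (with N − 1 in the RMSEA denominator; some software uses N instead). The function name fit_indices and the input values are hypothetical. SRMR is omitted because it requires the residual matrix rather than the chi-square.

```r
# Hypothetical helper: CFI and RMSEA from chi-square statistics
fit_indices <- function(chisq_m, df_m, chisq_b, df_b, n) {
  d_m <- max(chisq_m - df_m, 0)          # noncentrality of the analysis model
  d_b <- max(chisq_b - df_b, d_m, 0)     # noncentrality of the baseline model
  cfi <- if (d_b == 0) 1 else 1 - d_m / d_b
  rmsea <- sqrt(d_m / (df_m * (n - 1)))  # discrepancy per degree of freedom
  c(CFI = cfi, RMSEA = rmsea)
}

# Hypothetical values: model chi-square 30 (df 20), baseline 520 (df 28), N = 201
idx <- fit_indices(chisq_m = 30, df_m = 20, chisq_b = 520, df_b = 28, n = 201)
round(idx, 3)
```

With these inputs the model would meet both the CFI and the RMSEA criterion in the table.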

Example (fit indices)

Four practical fit indices were used to evaluate model fit: the Tucker-Lewis index (TLI), the comparative fit index (CFI), the root mean square error of approximation (RMSEA), and the standardized root-mean-square residual (SRMR). Guidelines proposed by Hu and Bentler (13) suggest that models with TLI and CFI close to 0.95 or higher, RMSEA close to 0.06 or lower, and SRMR close to 0.08 or lower are representative of good-fitting models.

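Methods section (state explicitly the source of the fit criteria for each index)

Results section (evaluate whether the hypothesized model is adequate in comparison with other models)

Thombs et al: Arthritis & Rheumatism Vol. 59, No. 3, March 15, 2008, pp 438–443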

Example (applying a higher-order (second-order) factor model)

Thombs et al: Arthritis & Rheumatism Vol. 59, No. 3, March 15, 2008, pp 438–443

[Figure: second-order factor model. Second-order factor: Depressive symptom. First-order factors: IP, PA, S/V, DA. Item labels: hopeful, good, unfriendly, disliked, enjoy, happy, sleep, effort, get going, talked less, appetite, bothered, mind, lonely, fearful, sad, cry, depressed, blues, failure]

The correlations among the first-order factors are explained by a small number of second-order factors.
When to apply:
＞ when a superordinate construct is assumed
＞ when the factor correlations are high

Methods section (higher-order factor analysis)

Second-order factors are global factors composed of all of the first-order factors (e.g., depressed affect, somatic/vegetative, (lack of) positive affect, and interpersonal) that provide a mechanism to test the plausibility that a single overarching construct is being measured.

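Exploratory factor analysis

・Choosing the rotation method

Orthogonal rotation: assumes no correlation between the factors

Oblique rotation: assumes correlation between the factors

[Path diagrams: items 1-6 and factors f1 and f2, shown for each rotation]

Start with an oblique rotation, and re-examine with an orthogonal rotation if the factor correlations are low (Henson et al: Educ Psychol Meas 66: 393-416, 2006)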


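・Choosing the number of factors

Parallel analysis: the largest number of factors for which the eigenvalue of the real data exceeds the eigenvalue of random data

Minimum average partial correlation (MAP): with the first principal component partialled out, compute the partial correlation matrix of the observed variables and the average squared partial correlation; repeat this for each successive component, and take the number of components at which the average squared partial correlation is smallest as the number of factors

The number of factors should be judged using several methods, e.g., parallel analysis + MAP + interpretability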
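The logic of parallel analysis can be sketched in base R as follows. This is a simplified version based on principal components and the 95th percentile of the random-data eigenvalues. The function name parallel_pca, the simulated two-factor data, and all parameter values are hypothetical. For real analyses, fa.parallel() in the psych package is the usual choice.

```r
# Simplified parallel analysis on principal-component eigenvalues
parallel_pca <- function(x, n_sim = 200, probs = 0.95) {
  n <- nrow(x)
  p <- ncol(x)
  observed <- eigen(cor(x), only.values = TRUE)$values
  # eigenvalues of n_sim random normal data sets of the same size (p x n_sim matrix)
  simulated <- replicate(
    n_sim,
    eigen(cor(matrix(rnorm(n * p), n, p)), only.values = TRUE)$values
  )
  threshold <- apply(simulated, 1, quantile, probs = probs)
  exceeds <- observed > threshold
  # number of leading components whose eigenvalue exceeds the random-data threshold
  n_components <- if (all(exceeds)) p else which(!exceeds)[1] - 1
  list(observed = observed, threshold = threshold, n_components = n_components)
}

# Hypothetical data: six items generated from two independent factors
set.seed(123)
n <- 500
f <- matrix(rnorm(n * 2), n, 2)
loadings <- rbind(c(0.8, 0), c(0.7, 0), c(0.6, 0),
                  c(0, 0.8), c(0, 0.7), c(0, 0.6))
x <- f %*% t(loadings) + matrix(rnorm(n * 6, sd = 0.6), n, 6)

res <- parallel_pca(x)
res$n_components
```

As the slide recommends, the result would be read together with MAP and the interpretability of the solution rather than on its own.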


# Parallel analysis
library(psych)
pearson.cor <- cor(dat, use = "complete.obs")  # dat: data frame of item scores
fa.parallel(pearson.cor, n.obs = 1000)         # n.obs = sample size

# EFA of a two-factor model (maximum likelihood)
fit1 <- fa(dat, nfactors = 2, fm = "ml", rotate = "promax")
# nfactors = number of factors
# fm = estimation method
# rotate = rotation method
print(fit1, cut = 0, sort = TRUE)

# Minimum average partial correlation (MAP)
VSS(dat, n = 10, plot = TRUE)
# n = maximum number of factors

[Figure: Parallel Analysis Scree Plots. x-axis: Factor Number; y-axis: eigenvalues of principal components and factor analysis; legend: PC Actual Data, PC Simulated Data, FA Actual Data, FA Simulated Data]
・Example: Methods section

To minimize potential for over- or under-identification of factors, parallel analysis (e.g., Brown, 2006) and Velicer's MAP (Velicer, 1976) were computed. Parallel analysis computes randomly generated data sets to specifications and compares the obtained eigenvalues in the raw data to those obtained by chance (see O'Connor, 2000; Brown, 2006). Velicer's MAP is a step-wise process whereby components are partialed out of the correlation matrix sequentially. The step corresponding to the lowest partial squared correlation indicates the number of components (see Velicer, 1976; O'Connor, 2000). Parallel analysis using normally distributed random data generated 1000 datasets limited to the 95th percentile with principal components analysis (O'Connor, 2000).

N.T. Van Dam, M. Earleywine: Psychiatry Research 186 (2011) 128–132

Exploratory factor analysis ・Example: Results section

N.T. Van Dam, M. Earleywine: Psychiatry Research 186 (2011) 128–132

Parallel analysis suggested four roots with eigenvalues larger than what would be obtained by chance. Velicer's MAP revealed a smallest average squared partial correlation of 0.020 on step two suggesting two underlying components.

Maximum Likelihood estimation using promax rotation limited to two-, three-, and four-factor solutions was used to explore factor loadings. Examination of the three- and four factor solutions revealed inconsistencies with theoretical considerations and optimal psychometric properties (see Brown, 2006). Both solutions suggested a factor containing only three items (1, 5, 19) related to appetite and sleep. This factor excluded another item related to sleep (11) and one related to weight changes (18), suggesting substantive inconsistencies. The two-factor solution was both theoretically and psychometrically consistent, suggesting one factor related to negative mood and another factor related to functional impairment.

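Parallel analysis suggested 4 factors; MAP suggested 2 factors

The 3- and 4-factor solutions were inconsistent with theory

The 2-factor solution was the most consistent both theoretically and statistically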

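Evaluating the items: 7. Item response theory

□Excellent: IRT was performed to assess the dimensionality of the items

□Good: (no criterion)

□Fair: (no criterion)

□Poor: IRT was not performed to assess the dimensionality of the items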

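Assessing dimensionality in IRT:

IRT presupposes that the item set is unidimensional (a one-factor structure)
The statistical model of IRT = the factor analysis model for categorical variables

Factor analysis of categorical variables
1. Based on a correlation matrix for nominal and ordinal scales
2. Estimated by weighted least squares

[Path diagram: F1 → Y1, Y2, Y3 with loadings λ11, λ21, λ31 and errors ε1, ε2, ε3]

Name                                   Combination of variables
Polychoric correlation coefficient     ordinal variable - ordinal variable
Polyserial correlation coefficient     ordinal variable - continuous variable
Tetrachoric correlation coefficient    binary variable - binary variable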

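Illustration of correlation for categorical variables (tetrachoric correlation)

① Assume a continuous quantity underlying each nominal (binary) scale (latent variables y1 and y2 with thresholds t1 and t2)

② Cross-tabulate the ratings of the two raters

                            Rater 2
                            No depression   Depression   Total
Rater 1   No depression     a               b            a+b
          Depression        c               d            c+d
          Total             a+c             b+d          1

③ Estimate the ellipse that approximates the observed values of the cross-tabulation (two-step maximum likelihood estimation)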

In contrast to a classical CFA which uses the covariance matrix, CFA uses the polychoric correlations. We used the weighted least squares (WLS) method of estimation.

Example of reporting in a paper (categorical factor analysis)

Mokkink et al (2011): Multiple Sclerosis Journal 17(12) 1498–1503
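The three steps of the tetrachoric illustration can be sketched in base R. The thresholds are first taken from the marginal proportions, and the correlation of the assumed bivariate normal latent variables is then chosen so that the model probability of the "depression / depression" cell matches the observed one. The function name tetrachoric_2x2 and the cell counts are hypothetical, and the sketch assumes no empty cells. psych::tetrachoric() is the usual tool in practice.

```r
# Cells follow the table above: a = no/no, b = no/yes, c = yes/no, d = yes/yes
# (rows: rater 1, columns: rater 2)
tetrachoric_2x2 <- function(a, b, c, d) {
  n <- a + b + c + d
  # Step 1: thresholds on the latent scale from the marginal proportions
  t1 <- qnorm((a + b) / n)   # rater 1: P(latent y1 <= t1) = P("no depression")
  t2 <- qnorm((a + c) / n)   # rater 2
  # Model probability that both latent variables exceed their thresholds
  p_both <- function(rho) {
    integrate(function(x) {
      dnorm(x) * pnorm((t2 - rho * x) / sqrt(1 - rho^2), lower.tail = FALSE)
    }, lower = t1, upper = Inf)$value
  }
  # Step 2: find rho so that the model probability equals the observed d / n
  uniroot(function(rho) p_both(rho) - d / n,
          interval = c(-0.99, 0.99), tol = 1e-8)$root
}

# Hypothetical ratings of 100 patients by two raters
rho_hat <- tetrachoric_2x2(a = 40, b = 10, c = 10, d = 40)
rho_hat
```

The estimate is larger than the Pearson (phi) correlation of the same 2×2 table, because it refers to the assumed continuous variables rather than to the dichotomized ratings.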

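Categorical exploratory factor analysis

# Polychoric correlations
library(psych)                # polychoric() is provided by the psych package
poly.cor <- polychoric(dat2)  # dat2: data frame of ordinal item responses
poly.cor$rho

# EFA of a two-factor model
fit2 <- fa(poly.cor$rho, nfactors = 2,
           fm = "wls",        # weighted least squares
           rotate = "promax")
print(fit2, cut = 0, sort = TRUE)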

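Hypotheses testing (Box F)

Hypotheses testing: the degree to which the scores of an instrument are consistent with hypotheses derived from the assumption that the instrument validly measures the construct to be measured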

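Main aims of hypotheses testing:

Convergent validity: high correlations with measures of constructs that are theoretically strongly related

Discriminant validity: weak correlations with measures of constructs that are theoretically weakly related

Differences between groups: scores on the instrument differ between groups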

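Evaluation items

3. Is the sample used in the analysis adequate?

4. Were hypotheses about correlations or group differences formulated a priori?

5. Do the hypotheses include the direction of the correlations or mean differences?

6. Do the hypotheses include the magnitude of the correlations or mean differences?

7. Are the comparator instruments adequately described?

8. Are the measurement properties of the comparator instruments adequately described?

10. Were appropriate statistical methods and design used to test the hypotheses?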


Evaluating the items: 3. Is the sample size used in the analysis adequate?

□Excellent: 100 or more per analysis

□Good: 50-99 per analysis

□Fair: 30-49 per analysis

□Poor: fewer than 30 per analysis

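Evaluating the items: 4-6. A priori hypotheses

Item 4

□Excellent: multiple hypotheses formulated a priori

□Good: a minimal number of hypotheses formulated a priori

□Fair: hypotheses are vague, or not formulated but can be deduced

□Poor: unclear what was expected

Items 5-6

□Excellent: the direction (magnitude) of the correlations or differences is included in the hypotheses

□Good: the direction (magnitude) of the correlations or differences is not included in the hypotheses

□Fair: (no criterion)

□Poor: (no criterion)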

Criteria for the magnitude of correlations and mean differences

Family     Index      Small   Medium   Large
d family   d, g, Δ    .20     .50      .80
r family   r          .10     .30      .50
           R²         .02     .13      .26
           η²         .01     .06      .14
           ω²         .01     .09      .25

Cohen (1992) and others
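As an illustration of how these criteria might be applied, the sketch below computes Cohen's d with a pooled standard deviation and labels an effect size against the cutoffs in the table. The function names cohens_d and label_magnitude, the label "below small", and the example scores are hypothetical.

```r
# Cohen's d for two independent groups (pooled standard deviation)
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  s_pooled <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / s_pooled
}

# Label an effect size using the cutoffs from the table (default: d family)
label_magnitude <- function(value,
                            cutoffs = c(small = 0.20, medium = 0.50, large = 0.80)) {
  idx <- sum(abs(value) >= cutoffs)
  if (idx == 0) "below small" else names(cutoffs)[idx]
}

# Hypothetical scores of a patient group and a control group
patients <- c(2, 3, 4, 5, 6)
controls <- c(1, 2, 3, 4, 5)
d <- cohens_d(patients, controls)
label_magnitude(d)

# The same labelling for a correlation, using the r cutoffs
label_magnitude(0.35, cutoffs = c(small = 0.10, medium = 0.30, large = 0.50))
```

Stating the expected label (or range) before data collection is what items 5 and 6 of Box F ask for.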

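Evaluating the items: 7-8. Adequate description of the comparator instruments

Item 7

□Excellent: the construct measured by the comparator instrument is adequately described

□Good: most of the construct measured by the comparator instrument is adequately described

□Fair: the construct of the comparator instrument is poorly described

□Poor: the construct of the comparator instrument is not described

Item 8

□Excellent: the measurement properties of the comparator instrument are adequately described for a population similar to that of the study

□Good: the measurement properties of the comparator instrument are adequately described, but their applicability to the study population is uncertain

□Fair: some information on the measurement properties in some population, or a reference to it, is given

□Poor: no information on the measurement properties of the comparator instrument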

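Evaluating the items: 10. Appropriateness of the design and statistical methods

Item 10

□Excellent: the statistical methods were applied appropriately

□Good: the appropriateness of the statistical methods can be inferred (e.g., Pearson correlations were used, but the score distributions, means, and standard deviations are not presented)

□Fair: the statistical methods used were not optimal

□Poor: the statistical methods were not applied appropriately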

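Procedure

① Describe the construct

② Formulate the hypotheses

③ Describe the comparator instruments or comparison groups

④ Collect the data

⑤ Evaluate the consistency between the results and the hypotheses

⑥ Explain the results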

② Formulating the hypotheses: example

The hypotheses were based on the literature and theoretical considerations and were agreed on by all authors before they were tested. As in previous observations, we did not expect to find correlation coefficients of more than 0.50. If a relationship was anticipated, we expected to find correlation coefficients between 0.21 and 0.50. These cutoff values were arbitrarily chosen, but are in line with general recommendations for weak associations. 53,54

Hypotheses (including direction and magnitude)

Apeldoorn et al: Clin J Pain Volume 28, Number 4, May 2012

The criteria for magnitude are stated explicitly
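Once hypotheses are stated as expected ranges, checking them against the observed correlations is mechanical. The sketch below assumes the range 0.21 to 0.50 for an anticipated relationship, as in the quotation above. The range used for "no relationship expected", the comparator names, and the observed correlations are invented for illustration and do not come from Apeldoorn et al.

```r
# Hypothetical a priori hypotheses and observed correlations
hypotheses <- data.frame(
  comparator = c("comparator_1", "comparator_2", "comparator_3"),
  lower      = c(0.21, 0.21, -0.20),   # expected range of r, fixed before the analysis
  upper      = c(0.50, 0.50,  0.20),   # third row: assumed range for "no relationship"
  observed_r = c(0.34, 0.15,  0.05)
)

# A hypothesis is confirmed when the observed correlation falls in the expected range
hypotheses$confirmed <- with(hypotheses, observed_r >= lower & observed_r <= upper)
prop_confirmed <- mean(hypotheses$confirmed)

hypotheses
prop_confirmed
```

Reporting the proportion of confirmed hypotheses makes the evaluation in step ⑤ of the procedure explicit.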

Example: setting hypotheses

Several studies have concluded that there is a link between high Waddell scores and depression,15,17,31,37,50 but 1 other study found no association.33 In this study, depression was measured with the Dutch translation of the Beck Depression Inventory (BDI).67,68 The BDI consists of 21 graded items, ranging in severity from 0 to 3. It has good psychometric properties for the measurement of depression, but in patients with CLBP a confounding effect has been found for 3 items measuring somatic symptoms.69 In this study, we expected to find a positive association between high Waddell scores and elevated BDI scores.

Properties of the comparator instrument and the hypothesis

Apeldoorn et al: Clin J Pain Volume 28, Number 4, May 2012

Multitrait-multimethod matrix (MTMM)

Reliability, convergent validity, and discriminant validity are evaluated together from the correlation matrix of scale scores obtained by measuring multiple traits with multiple methods

http://stats.stackexchange.com/questions/9918/how-to-compute-correlation-between-within-groups-of-variables

Same trait, same method: index of reliability. The correlation would normally be 1, so a reliability coefficient (e.g., α) is entered here.

Same trait, different methods: index of convergent validity.

Different traits, different methods: index of discriminant validity. These correlations should be weaker than the same-trait, different-method correlations.

Different traits, same method: index of the influence of the measurement method (method factor). If these correlations are stronger than the same-trait, different-method correlations, the influence of the measurement method is strong.

Campbell, D. T. & Fiske, D.W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105

Correlation plot (correlations × 100; letter = trait, number = method)

      A1   B1   C1   A2   B2   C2   A3   B3   C3
A1   100   29   37   59   20   22   55   20   22
B1    29  100   48   24   58   28   23   64   24
C1    37   48  100   21   13   68   19   15   67
A2    59   24   21  100   21   19   66   26   29
B2    20   58   13   21  100    9   24   64   17
C2    22   28   68   19    9  100   23   24   75
A3    55   23   19   66   24   23  100   39   36
B3    20   64   15   26   64   24   39  100   34
C3    22   24   67   29   17   75   36   34  100

cor.plot(cor.matrix, numbers = TRUE)
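The Campbell and Fiske comparisons can be summarized numerically from the correlation matrix shown in the plot. The sketch below re-enters those values and averages the three kinds of off-diagonal correlations. It assumes, as in the lavaan model later in these slides, that the letter in each variable name denotes the trait and the digit the method. The object names are hypothetical.

```r
vars <- c("A1", "B1", "C1", "A2", "B2", "C2", "A3", "B3", "C3")
r <- matrix(c(
  100,  29,  37,  59,  20,  22,  55,  20,  22,
   29, 100,  48,  24,  58,  28,  23,  64,  24,
   37,  48, 100,  21,  13,  68,  19,  15,  67,
   59,  24,  21, 100,  21,  19,  66,  26,  29,
   20,  58,  13,  21, 100,   9,  24,  64,  17,
   22,  28,  68,  19,   9, 100,  23,  24,  75,
   55,  23,  19,  66,  24,  23, 100,  39,  36,
   20,  64,  15,  26,  64,  24,  39, 100,  34,
   22,  24,  67,  29,  17,  75,  36,  34, 100
), nrow = 9, byrow = TRUE, dimnames = list(vars, vars)) / 100

trait  <- substr(vars, 1, 1)   # A, B, C
method <- substr(vars, 2, 2)   # 1, 2, 3
same_trait  <- outer(trait, trait, "==")
same_method <- outer(method, method, "==")
upper <- upper.tri(r)          # count each pair once

mtmm_summary <- c(
  same_trait_diff_method = mean(r[upper & same_trait & !same_method]),   # convergent validity
  diff_trait_same_method = mean(r[upper & !same_trait & same_method]),   # method influence
  diff_trait_diff_method = mean(r[upper & !same_trait & !same_method])   # discriminant validity
)
round(mtmm_summary, 2)
```

In these data the same-trait, different-method correlations are on average the largest, which is the pattern expected when convergent and discriminant validity hold.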

MTMM with SEM

model 1: multitrait-multimethod model (additive model)

freely correlated traits and methods

[Path diagram: trait factors A, B, C and method factors 1, 2, 3; each observed variable (A1-C3) is connected to one trait factor and one method factor]

Byrne, B. M. (2011). Structural Equation Modeling with Mplus: Basic Concepts, Applications, and Programming (Multivariate Applications Series). Routledge Academic

MTMM with SEM (comparison models)

model 2: a model that assumes no traits
no traits, freely correlated methods
[Path diagram: observed variables A1-C3 with method factors 1, 2, 3 only]

model 3: a model that assumes no discrimination between the constructs
perfectly correlated traits, freely correlated methods
[Path diagram: observed variables A1-C3 with a single trait factor and method factors 1, 2, 3]

model 4: a model that assumes no differences between the measurement methods
freely correlated traits, perfectly correlated methods
[Path diagram: observed variables A1-C3 with trait factors A, B, C and a single method factor]
MTMM with SEM: comparing the models

When convergent validity is present: model 1 > model 2
[Path diagrams: model 1 (trait factors A, B, C and method factors 1, 2, 3) and model 2 (method factors only)]

Langer et al. (2010). Child Psychiatry and Human Development, 41, 549–561.

When discriminant validity is present: model 1 > the comparison models with a single trait factor or a single method factor
[Path diagrams: the model with a single trait factor and method factors 1, 2, 3; the model with trait factors A, B, C and a single method factor; and model 1]
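One common way to make the ">" comparisons concrete is a chi-square difference test between nested models. This is an assumption about how the comparison might be carried out; the slides only state which model should fit better. The function name chisq_diff_test and the chi-square values below are hypothetical.

```r
# Chi-square difference test for two nested models
# (the restricted model has more degrees of freedom than the full model)
chisq_diff_test <- function(chisq_restricted, df_restricted, chisq_full, df_full) {
  d_chisq <- chisq_restricted - chisq_full
  d_df <- df_restricted - df_full
  c(d_chisq = d_chisq,
    d_df = d_df,
    p = pchisq(d_chisq, df = d_df, lower.tail = FALSE))
}

# Hypothetical fit statistics: model 1 (traits and methods) vs model 2 (methods only)
res <- chisq_diff_test(chisq_restricted = 50, df_restricted = 20,
                       chisq_full = 20, df_full = 10)
res
```

A significant difference indicates that removing the trait factors worsens the fit, which is the evidence for convergent validity described above. With fitted lavaan models, anova() performs the same kind of comparison.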

MTMM with SEM

The MTMM model often fails to converge or yields improper solutions

Alternative ⇒ correlated uniqueness model (CUM)

[Path diagram: trait factors A, B, C with observed variables A1-C3 and error terms e1-e9]

Kenny, D. A. & Kashy, D. A. Psychological Bulletin, Vol 112(1), Jul 1992, 165-172.

MTMM with SEM

# MTMM model
library(lavaan)
model1 <- '
  # method factors
  m1 =~ 1*A1 + B1 + C1
  m2 =~ 1*A2 + B2 + C2
  m3 =~ 1*A3 + B3 + C3
  # trait factors
  t1 =~ 1*A1 + A2 + A3
  t2 =~ 1*B1 + B2 + B3
  t3 =~ 1*C1 + C2 + C3
  # covariances between method and trait factors fixed to 0
  m1 ~~ 0*t1
  m1 ~~ 0*t2
  m1 ~~ 0*t3
  m2 ~~ 0*t1
  m2 ~~ 0*t2
  m2 ~~ 0*t3
  m3 ~~ 0*t1
  m3 ~~ 0*t2
  m3 ~~ 0*t3
'
fit.model1 <- lavaan::cfa(model1, data = dat)
summary(fit.model1, fit.measures = TRUE)

# CUM (correlated uniqueness model)
model5 <- '
  # trait factors
  t1 =~ 1*A1 + A2 + A3
  t2 =~ 1*B1 + B2 + B3
  t3 =~ 1*C1 + C2 + C3
  # residual covariances within each method
  A1 ~~ B1
  A1 ~~ C1
  B1 ~~ C1
  A2 ~~ B2
  A2 ~~ C2
  B2 ~~ C2
  A3 ~~ B3
  A3 ~~ C3
  B3 ~~ C3
'
fit.model5 <- lavaan::cfa(model5, data = dat)
summary(fit.model5, fit.measures = TRUE)

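Results of the CUM analysis (output shown on the original slide)

Summary

Examining content validity
Check whether the scale consists of items whose relevance to and comprehensive coverage of the conceptual model are ensured, based on evaluation by an expert panel and by patients

Examining structural validity
Check the fit of the conceptual model by factor analysis

Hypotheses testing (examining convergent and discriminant validity)
Based on the theoretical relations with other scales, set hypotheses a priori about the direction and magnitude of the correlations and differences, and test them.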

References

Apeldoorn, A. T., Ostelo, R. W., Fritz, J. M., van der Ploeg, T., van Tulder, M. W., & de Vet, H. C. (2012). The cross-sectional construct validity of the Waddell score. Clinical Journal of Pain, 28, 309-317.

Byrne, B. M. (2011). Structural Equation Modeling with Mplus: Basic Concepts, Applications, and Programming (Multivariate Applications Series). Routledge Academic.

Campbell, D. T. & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.

Henson, R. K. & Roberts, J. K. (2006). Use of exploratory factor analysis in published research: common errors and some comment on improved practice. Educational and Psychological Measurement, 66, 393-416.

Jones, P., Harding, G., Wiklund, I., Berry, P., & Leidy, N. (2009). Improving the process and outcome of care in COPD: development of a standardized assessment tool. Primary Care Respiratory Journal, 18, 208-215.

Kenny, D. A. & Kashy, D. A. (1992). Analysis of the multitrait-multimethod matrix by confirmatory factor analysis. Psychological Bulletin, 112, 165-172.

Langer, D. A., Wood, J. J., Bergman, R. L., & Piacentini, J. C. (2010). A multitrait-multimethod analysis of the construct validity of child anxiety disorders in a clinical sample. Child Psychiatry and Human Development, 41, 549-561.

McKenna, S. P., Meads, D. M., Doward, L. C., Twiss, J., Pokrzywinski, R., Revicki, D., Hunter, C. J., & Glendenning, G. A. (2011). Development and validation of the living with chronic obstructive pulmonary disease questionnaire. Quality of Life Research, 20, 1043-1052.

Mokkink, L. B., Knol, D. L., & Uitdehaag, B. M. J. (2011). Factor structure of Guy's Neurological Disability Scale in a sample of Dutch patients with multiple sclerosis. Multiple Sclerosis Journal, 17, 1498-1503.

Mueller, R. O. & Hancock, G. R. (2010). Structural equation modeling. In G. R. Hancock & R. O. Mueller (Eds.), The reviewer's guide to quantitative methods in the social sciences (pp. 371-383). New York: Routledge.

Stevelink, S. A. M., Terwee, C. B., Banstola, N., & van Brakel, W. H. (2013). Testing the psychometric properties of the Participation Scale in Eastern Nepal. Quality of Life Research, 22, 137-144.

Thombs, B. D., Hudson, M., Schieir, O., Taillefer, S. S., & Baron, M. (2008). Reliability and validity of the center for epidemiologic studies depression scale in patients with systemic sclerosis. Arthritis Care & Research, 59, 438-443.

Van Dam, N. T. & Earleywine, M. (2011). Validation of the Center for Epidemiologic Studies Depression Scale-Revised (CESD-R): pragmatic depression assessment in the general population. Psychiatry Research, 186, 128-132.

de Vet, H. C. W., Terwee, C. B., Mokkink, L. B., & Knol, D. L. (2011). Measurement in medicine: A practical guide. Cambridge: Cambridge University Press.
