Study Design for Linkage, Association and TDT Studies


Page 1: Study Design for Linkage, Association and TDT Studies

Jan, 2003 YMGC Genotyping Core

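Study Design for Linkage, Association and TDT Studies

Ming-Wei Lin (林明薇), PhD
Department of Family Medicine, School of Medicine, National Yang-Ming University
Department of Medical Research and Education, Taipei Veterans General Hospital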

Page 2: Study Design for Linkage, Association and TDT Studies


Collins FS. (1992) Nature genetics 1:3-6

Page 3: Study Design for Linkage, Association and TDT Studies

Collins FS. (1992) Nature genetics 1:3-6

Page 4: Study Design for Linkage, Association and TDT Studies

Linkage Mapping for Disease Genes

• Linkage analysis (lod score method)
• Allele-sharing methods

Page 5: Study Design for Linkage, Association and TDT Studies

Gregor Mendel

• The principle of segregation of alleles.
• The principle of independent assortment.

Page 6: Study Design for Linkage, Association and TDT Studies

Linkage

Linkage describes the phenomenon whereby alleles at neighbouring loci that lie close to one another on the same chromosome are transmitted together more often than expected by chance.

Page 7: Study Design for Linkage, Association and TDT Studies


Linkage Family

Page 8: Study Design for Linkage, Association and TDT Studies

Recombinant Gametes

Crossing over between two neighbouring loci will produce recombinant gametes.

Page 9: Study Design for Linkage, Association and TDT Studies

Recombination Fraction

$$\theta = \frac{\text{number of recombinant gametes}}{\text{total gametes}}$$

Page 10: Study Design for Linkage, Association and TDT Studies

Recombination Fraction

• Recombination fraction is a measure of genetic distance.
• 1 cM corresponds to approximately a 1% chance of recombination between two loci.

Page 11: Study Design for Linkage, Association and TDT Studies


Page 12: Study Design for Linkage, Association and TDT Studies


Page 13: Study Design for Linkage, Association and TDT Studies

Estimation of Recombination Fraction

• Direct method: count recombinants.
• Maximum likelihood method:
  - Unknown phases
  - Incomplete penetrance
  - Heterogeneity

Page 14: Study Design for Linkage, Association and TDT Studies

Likelihood Odds

$$\text{Likelihood odds} = \frac{\text{Likelihood of data if loci linked at } \theta}{\text{Likelihood of data if loci unlinked}} = \frac{L(\theta < 0.5)}{L(\theta = 0.5)}$$

Page 15: Study Design for Linkage, Association and TDT Studies

Lod Score

$$\text{Lod score}(\theta) = \log_{10} \frac{L(\theta < 0.5)}{L(\theta = 0.5)}$$

Page 16: Study Design for Linkage, Association and TDT Studies

Phase Known Family

[Figure: pedigree of a phase-known family with marker genotypes; each offspring is scored as nonrecombinant (N) or recombinant (R).]

Page 17: Study Design for Linkage, Association and TDT Studies

Phase Known

$$L(\theta) = \theta^{r}(1-\theta)^{n-r}$$

r: number of recombinants

n: total number of meioses

Page 18: Study Design for Linkage, Association and TDT Studies

Lod Score: Phase Known

$$\text{LOD} = \log_{10}\frac{L(\theta)}{L(\theta = 0.5)} = \log_{10}\left[\frac{\theta^{r}(1-\theta)^{n-r}}{(0.5)^{n}}\right] = \log_{10}\left[2^{n}\,\theta^{r}(1-\theta)^{n-r}\right]$$
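The phase-known lod score can be evaluated directly from the counts r and n. The following is a minimal Python sketch, not part of the original slides; the function name `lod_phase_known` and the example counts (1 recombinant in 10 meioses) are illustrative assumptions.

```python
import math


def lod_phase_known(theta: float, r: int, n: int) -> float:
    """Lod score for r recombinants among n phase-known meioses.

    Implements log10[2^n * theta^r * (1 - theta)^(n - r)].
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if theta == 0.0 and r > 0:
        # Any observed recombinant rules out theta = 0.
        return float("-inf")
    log_lik = (n - r) * math.log10(1.0 - theta)
    if r > 0:
        log_lik += r * math.log10(theta)
    # Dividing by (0.5)^n is the same as adding n * log10(2).
    return n * math.log10(2.0) + log_lik


# Hypothetical family: 1 recombinant in 10 informative meioses.
example_lod = lod_phase_known(0.1, r=1, n=10)
print(round(example_lod, 3))
```

At θ = 0.5 the function returns 0, as expected when the linked and unlinked likelihoods coincide.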

Page 19: Study Design for Linkage, Association and TDT Studies

Phase Unknown Family

[Figure: pedigree of a phase-unknown family. Under the two possible parental phases (A and B), the same offspring are scored as nonrecombinant (N) or recombinant (R) in opposite ways.]

Page 20: Study Design for Linkage, Association and TDT Studies

Phase Unknown

$$L(\theta) = \tfrac{1}{2}\,\theta^{r}(1-\theta)^{n-r} + \tfrac{1}{2}\,\theta^{n-r}(1-\theta)^{r}$$

r: number of recombinants

n: total number of meioses

Page 21: Study Design for Linkage, Association and TDT Studies

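Lod Score: Phase Unknown

$$\text{LOD} = \log_{10}\frac{L(\theta)}{L(\theta = 0.5)} = \log_{10}\left\{\frac{\tfrac{1}{2}\left[\theta^{r}(1-\theta)^{n-r} + \theta^{n-r}(1-\theta)^{r}\right]}{(0.5)^{n}}\right\} = \log_{10}\left\{2^{n-1}\left[\theta^{r}(1-\theta)^{n-r} + \theta^{n-r}(1-\theta)^{r}\right]\right\}$$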
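The phase-unknown lod score averages over the two possible parental phases. The following is a minimal Python sketch under that formula; the function name `lod_phase_unknown` and the example counts are illustrative assumptions, not taken from the slides.

```python
import math


def lod_phase_unknown(theta: float, r: int, n: int) -> float:
    """Lod score when parental phase is unknown.

    Implements log10{2^(n-1) * [theta^r (1-theta)^(n-r)
                                + theta^(n-r) (1-theta)^r]}.
    Here r is the recombinant count under one of the two phases.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    phase_a = theta ** r * (1.0 - theta) ** (n - r)
    phase_b = theta ** (n - r) * (1.0 - theta) ** r
    likelihood = 0.5 * (phase_a + phase_b)
    if likelihood == 0.0:
        return float("-inf")
    return math.log10(likelihood) + n * math.log10(2.0)


# Hypothetical family: no apparent recombinants among 5 meioses.
print(round(lod_phase_unknown(0.0, r=0, n=5), 3))
```

With no recombinants and θ = 0, the result is (n - 1)·log10(2), one log10(2) less than the phase-known value, which reflects the information lost by not knowing the phase.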

Page 22: Study Design for Linkage, Association and TDT Studies

Lod Score - Maximum Likelihood Estimate (Z)

• Lod scores can be calculated at any value of θ between 0 and 0.5, but are conventionally reported at θ = 0, 0.01, 0.05, 0.1, 0.2, 0.3, and 0.4.
• Zmax is the maximum lod score; the value of θ at which it occurs is the maximum likelihood estimate (MLE) of θ.
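One simple way to locate Zmax is to evaluate the lod score over a grid of θ values and keep the largest. The sketch below assumes a phase-known family and a grid step of 0.001; the function names and the example counts (2 recombinants in 10 meioses) are hypothetical.

```python
import math


def lod(theta: float, r: int, n: int) -> float:
    """Phase-known lod score: log10[2^n * theta^r * (1-theta)^(n-r)]."""
    if theta == 0.0 and r > 0:
        return float("-inf")
    value = n * math.log10(2.0) + (n - r) * math.log10(1.0 - theta)
    if r > 0:
        value += r * math.log10(theta)
    return value


def maximise_lod(r: int, n: int, steps: int = 500):
    """Grid search over theta in [0, 0.5]; returns (theta_hat, z_max)."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    theta_hat = max(grid, key=lambda t: lod(t, r, n))
    return theta_hat, lod(theta_hat, r, n)


# Hypothetical data: 2 recombinants in 10 phase-known meioses.
theta_hat, z_max = maximise_lod(r=2, n=10)
print(theta_hat, round(z_max, 3))
```

For phase-known data the grid maximum should agree with the direct estimate r/n whenever r/n is at most 0.5.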

Page 23: Study Design for Linkage, Association and TDT Studies

Total Lod Score

Lod scores obtained from individual families can be added together to give the total lod score.
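Because families are independent, their likelihoods multiply and their lod scores therefore add, provided they are summed at the same value of θ. The sketch below illustrates this with made-up lod scores for three families; all numbers and variable names are hypothetical.

```python
# Recombination fractions at which every family was evaluated.
thetas = [0.01, 0.1, 0.2]

# Hypothetical per-family lod scores, one row per family,
# one column per theta value above.
family_lods = [
    [0.5, 1.2, 0.9],
    [-0.3, 0.8, 0.6],
    [1.0, 1.1, 0.7],
]

# Sum column by column so that scores are combined at the same theta.
total_lods = [sum(column) for column in zip(*family_lods)]

# The theta with the largest total lod score is the combined estimate.
best_index = max(range(len(thetas)), key=lambda i: total_lods[i])
best_theta, z_max = thetas[best_index], total_lods[best_index]
print(best_theta, round(z_max, 2))
```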

Page 24: Study Design for Linkage, Association and TDT Studies

Statistical Significance of the Lod Score

• lod score > 3: evidence of linkage
• 2 < lod score < 3: suggestive evidence of linkage
• -2 < lod score < 2: uninformative for linkage
• lod score < -2: exclusion of linkage
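These thresholds can be written as a small lookup. The sketch below is illustrative only; the function name is hypothetical, and because the slide does not say how scores exactly equal to 3, 2, or -2 are treated, the code assigns each boundary value to the more conservative category as an assumption.

```python
def interpret_lod(z: float) -> str:
    """Map a lod score to the interpretation given on the slide.

    Assumption: boundary values (exactly 3, 2, or -2) fall into the
    more conservative category, since the slide leaves them undefined.
    """
    if z > 3:
        return "evidence of linkage"
    if z > 2:
        return "suggestive evidence of linkage"
    if z >= -2:
        return "uninformative"
    return "exclusion of linkage"


for score in (3.4, 2.5, 0.7, -2.6):
    print(score, interpret_lod(score))
```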

Page 25: Study Design for Linkage, Association and TDT Studies


Lod Score

•Two-point lod score analysis

•Multipoint lod score analysis

Page 26: Study Design for Linkage, Association and TDT Studies


Page 27: Study Design for Linkage, Association and TDT Studies


Page 28: Study Design for Linkage, Association and TDT Studies

Is a Pedigree Useful for Linkage Analysis?

• Are critical individuals in the pedigree doubly heterozygous at the loci? (Informative)
• Can the offspring be scored as recombinants or nonrecombinants? (Phase)

Page 29: Study Design for Linkage, Association and TDT Studies


Parameters Assumed in Lod Score Analysis

• Transmission mode of disease

• Recombination fraction

• Trait allele frequencies

• Penetrance values for each possible disease phenotype

• Marker allele frequencies.

Page 30: Study Design for Linkage, Association and TDT Studies


Advantages of Lod Score Analysis

• Statistically, it is a more powerful approach than any nonparametric method.

• Utilizes every family member’s phenotypic and genotypic information.

• Provides an estimate of the recombination fraction.

• Provides a statistical test for linkage and for genetic (locus) heterogeneity.

Page 31: Study Design for Linkage, Association and TDT Studies

Limitations of Lod Score Method

• Assumes single-locus inheritance
• Requires specification of disease gene frequency and penetrance
• Has reduced power when the disease model is grossly misspecified

Page 32: Study Design for Linkage, Association and TDT Studies

Successful Examples Using Lod Score Method

• Cystic fibrosis: CFTR gene
• Huntington disease: HD gene
• Alzheimer disease: APP
• Hereditary breast cancer: BRCA1, BRCA2

Page 33: Study Design for Linkage, Association and TDT Studies

Complex Diseases

• No clear pattern of Mendelian inheritance
• A mix of genetic and environmental factors
• Incomplete penetrance
• Phenocopies
• Oligogenic or polygenic
• Heterogeneity
• High frequency of disease-causing allele

Page 34: Study Design for Linkage, Association and TDT Studies

Recurrence Risk Ratio (λ)

$$\lambda_r = \frac{\text{Frequency in relatives of affected person}}{\text{Population frequency}}$$

r denotes the degree of relationship
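As a worked illustration of this ratio, the sketch below divides a risk in relatives by the population frequency. The function name and the input values (6% risk in siblings, 0.4% in the population) are hypothetical and chosen only to show the arithmetic.

```python
def recurrence_risk_ratio(risk_in_relatives: float,
                          population_frequency: float) -> float:
    """Lambda_r: risk in relatives of an affected person divided by
    the population frequency of the disease."""
    if population_frequency <= 0:
        raise ValueError("population frequency must be positive")
    return risk_in_relatives / population_frequency


# Hypothetical inputs: 6% risk in siblings, 0.4% in the population.
lambda_s = recurrence_risk_ratio(0.06, 0.004)
print(round(lambda_s, 1))
```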

Page 35: Study Design for Linkage, Association and TDT Studies

Recurrence Risk Ratio

Genetic mapping is much easier for traits with high λs (λs > 10) than for those with low λs (λs < 2).

Page 36: Study Design for Linkage, Association and TDT Studies

Recurrence Risk Ratio of Different Diseases

Disease             λs
Cystic fibrosis     500
Type I diabetes     15
Schizophrenia       8.6
Type II diabetes    3.5

Page 37: Study Design for Linkage, Association and TDT Studies

Allele-sharing Methods

• Identical by state (IBS): two alleles of the same form.
• Identical by descent (IBD): two alleles descended from the same ancestral allele.

Page 38: Study Design for Linkage, Association and TDT Studies

Allele-sharing Methods

These methods test whether affected relatives inherit a region IBD (or IBS) more often than expected under random Mendelian segregation.

Page 39: Study Design for Linkage, Association and TDT Studies

[Figure: nuclear families with marker genotypes illustrating sib pairs that share 2, 1, or 0 alleles identical by descent (IBD = 2, IBD = 1, IBD = 0).]

Page 40: Study Design for Linkage, Association and TDT Studies

[Figure: pairs of genotypes illustrating identity by state]

IBS = 2: BC and BC
IBS = 1: AC and AB
IBS = 0: AD and BC

Page 41: Study Design for Linkage, Association and TDT Studies

Affected Sib-pair Methods

An affected sib pair may share 0, 1, or 2 alleles identical by descent (IBD) with probabilities of 0.25, 0.5, and 0.25, respectively, at any marker locus.

Page 42: Study Design for Linkage, Association and TDT Studies

[Figure: sib-pair genotypes illustrating IBD = 2 (25%), IBD = 1 (50%), and IBD = 0 (25%).]

Page 43: Study Design for Linkage, Association and TDT Studies

Affected Sib-pair Methods

If the marker locus is independent of the trait locus, the probabilities that an affected sib pair shares 0, 1, or 2 alleles IBD remain 0.25, 0.50, and 0.25.

Page 44: Study Design for Linkage, Association and TDT Studies

Affected Sib-pair Methods

If the marker locus is linked to the trait locus, an excess of affected sib pairs sharing two alleles IBD is expected.

Page 45: Study Design for Linkage, Association and TDT Studies


Allele-sharing Methods

•Affected Sib-pairs

•Affected Pedigree Member

Page 46: Study Design for Linkage, Association and TDT Studies

Pearson χ² statistic

Compares the observed numbers of sib pairs sharing 0, 1, or 2 alleles IBD with their expectations under the null hypothesis.

Page 47: Study Design for Linkage, Association and TDT Studies

Pearson χ² statistic

• Alternative hypothesis:

IBD sharing    0      1      2
Observed       n0     n1     n2

N = n0 + n1 + n2

• Null hypothesis:

IBD sharing    0      1      2
Expected       N/4    N/2    N/4
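The comparison of observed and expected counts can be sketched as a standard Pearson goodness-of-fit calculation. The function name and the example counts below are hypothetical; the critical value 5.991 is the usual 5% point of the chi-square distribution with 2 degrees of freedom, which is the reference distribution assumed here.

```python
def ibd_chi_square(n0: int, n1: int, n2: int) -> float:
    """Pearson chi-square comparing observed IBD counts with the
    null expectation N/4, N/2, N/4."""
    total = n0 + n1 + n2
    observed = (n0, n1, n2)
    expected = (total / 4, total / 2, total / 4)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for 100 affected sib pairs sharing 0, 1, 2 alleles IBD.
statistic = ibd_chi_square(15, 50, 35)

# Assumed reference: chi-square with 2 degrees of freedom, 5% level.
CRITICAL_VALUE_2DF_5PCT = 5.991
print(statistic, statistic > CRITICAL_VALUE_2DF_5PCT)
```

Note that this general test is two-sided in spirit: it flags any departure from 1:2:1, whereas linkage specifically predicts excess sharing.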

Page 48: Study Design for Linkage, Association and TDT Studies

Comments on Allele-Sharing Method

There is no need to specify any genetic parameters of the transmission model.

It is less powerful for detecting linkage than the lod score method if the genetic transmission model can be specified correctly.

It is poor at providing a precise location of the disease gene.

Page 49: Study Design for Linkage, Association and TDT Studies

Successful Examples Using Sib Pair Method

• Insulin-dependent diabetes
• Non-insulin-dependent diabetes
• Multiple sclerosis
• Alzheimer disease

Page 50: Study Design for Linkage, Association and TDT Studies

Thresholds for Mapping Complex Traits

                      Suggestive linkage         Significant linkage
Mapping method        Lod score    P value       Lod score    P value
Lod score             1.9          0.0017        3.3          0.000049
Sibs and half-sibs    2.2          0.00074       3.6          0.000022
Uncle-nephew          2.3          0.00056       3.7          0.000018
First cousin          2.3          0.00052       3.7          0.000016

Lander and Kruglyak (1995) Nature Genetics, 11, 241-247

Page 51: Study Design for Linkage, Association and TDT Studies


Types of Association Study

• Case-Control study

• Transmission disequilibrium test (TDT)

• Sibling control

Page 52: Study Design for Linkage, Association and TDT Studies

Case-Control study

[Figure: marker genotypes of controls (open symbols) and cases (filled symbols).]

Page 53: Study Design for Linkage, Association and TDT Studies


Linkage Disequilibrium

Linkage disequilibrium is the non-random association in a population of alleles at closely linked loci.
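One common way to quantify this non-random association between two biallelic loci is the coefficient D, the difference between the observed haplotype frequency and the product of the allele frequencies. The slides do not define D, so the sketch below is an added illustration; the function name, the allele labels, and the haplotype frequencies are all hypothetical.

```python
def ld_coefficient(hap_freqs: dict) -> float:
    """D = freq(AB) - freq(A) * freq(B) for two biallelic loci.

    hap_freqs maps the four haplotypes "AB", "Ab", "aB", "ab"
    to their population frequencies (which should sum to 1).
    """
    freq_a = hap_freqs["AB"] + hap_freqs["Ab"]
    freq_b = hap_freqs["AB"] + hap_freqs["aB"]
    return hap_freqs["AB"] - freq_a * freq_b


# Hypothetical haplotype frequencies showing association between A and B.
associated = {"AB": 0.4, "Ab": 0.1, "aB": 0.1, "ab": 0.4}
# Hypothetical frequencies at equilibrium (alleles combine at random).
equilibrium = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}

print(round(ld_coefficient(associated), 3))
print(round(ld_coefficient(equilibrium), 3))
```

A value of D near zero indicates that alleles at the two loci combine at random in the population.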

Page 54: Study Design for Linkage, Association and TDT Studies

Linkage Disequilibrium

A2---B1-----C2---X----D3-----E4----F2
A2---B1-----C2---X----D3-----E4
A2---B1-----C2---X----D3
B1-----C2---X----D3
C2---X----D3
C2---X

N generations

Page 55: Study Design for Linkage, Association and TDT Studies

TDT Study

To examine the transmission of a particular allele at a locus from heterozygous parents to their affected offspring.

Page 56: Study Design for Linkage, Association and TDT Studies

"Trios" for TDT study

[Figure: parent-offspring trios with marker genotypes. The alleles transmitted to the affected offspring serve as the "case"; the non-transmitted alleles serve as the "control".]
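The slides do not spell out a test statistic for the TDT. A commonly used form is the McNemar-type statistic (b - c)² / (b + c), where b and c count how often heterozygous parents transmit or do not transmit the allele of interest; it is referred to a chi-square distribution with 1 degree of freedom. The sketch below assumes that form; the function name and counts are hypothetical.

```python
def tdt_statistic(transmitted: int, not_transmitted: int) -> float:
    """McNemar-type TDT statistic for one allele.

    transmitted / not_transmitted: number of heterozygous parents who
    did / did not pass the allele of interest to an affected offspring.
    """
    informative = transmitted + not_transmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - not_transmitted) ** 2 / informative


# Hypothetical counts from 100 heterozygous parents.
statistic = tdt_statistic(transmitted=60, not_transmitted=40)

# Assumed reference: chi-square with 1 degree of freedom, 5% level.
CRITICAL_VALUE_1DF_5PCT = 3.841
print(statistic, statistic > CRITICAL_VALUE_1DF_5PCT)
```

Only heterozygous parents contribute, which matches the definition of the TDT as a test of transmission from heterozygous parents.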

Page 57: Study Design for Linkage, Association and TDT Studies


Using Sibling as Control

• Sibling association test (SIBASSOC)

• Sib transmission/disequilibrium test (S-TDT)

• Sibship disequilibrium test (SDT)

Page 58: Study Design for Linkage, Association and TDT Studies

What does a positive association imply?

• Direct causal effect
• Linkage disequilibrium
• Population stratification

Page 59: Study Design for Linkage, Association and TDT Studies


When to Use Association Study

• Candidate gene

• Positive evidence of linkage

• Candidate region allelic associations

Page 60: Study Design for Linkage, Association and TDT Studies

Successful Examples of Mapping Genes by Association Studies

• Autoimmune diseases associated with HLA: IDDM, multiple sclerosis, ankylosing spondylitis, rheumatoid arthritis
• Angiotensin-converting enzyme and heart disease
• Low-density lipoprotein receptor and heart disease
• Insulin locus and IDDM

Page 61: Study Design for Linkage, Association and TDT Studies


Sample Size Required

Linkage for Monogenic Traits

• One large family

• at least 40 informative meioses

• 20 cM marker density

• Expected lod score > 3

Page 62: Study Design for Linkage, Association and TDT Studies

Sample Size Required

Allele-Sharing

• λs = 2
• at least 600 affected sib pairs
• narrow down the region to 1 cM

Page 63: Study Design for Linkage, Association and TDT Studies


Sample Size Required

Linkage for Complex Traits

Sham, Lin et al (2000) Am J Human Genetics 66, 1661-1668.

Page 64: Study Design for Linkage, Association and TDT Studies

Genetic Markers

A completely informative marker locus at a recombination fraction of 0 from the disease locus.

Page 65: Study Design for Linkage, Association and TDT Studies

Genetic Models

Model                     f0       f1       f2      Kp      q
Common Recessive (CR)     0.005    0.005    0.50    0.01    0.100
Common Dominant (CD)      0.005    0.500    0.50    0.01    0.005
Minor gene 1 (MG1)        0.050    0.200    0.80    0.10    0.130
Minor gene 2 (MG2)        0.050    0.150    0.45    0.10    0.207

Kp: population risk; q: disease allele frequency
f0: penetrance for the genotype AA; f1: penetrance for the genotype Aa; f2: penetrance for the genotype aa
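The population risk Kp in each model can be checked against the penetrances and the disease allele frequency. The sketch below assumes Hardy-Weinberg genotype proportions with a as the disease allele, which the slides do not state explicitly; the function and variable names are hypothetical.

```python
def population_risk(q: float, f0: float, f1: float, f2: float) -> float:
    """Kp = (1-q)^2 * f0 + 2q(1-q) * f1 + q^2 * f2.

    Assumes Hardy-Weinberg proportions, with q the frequency of the
    disease allele a and f0, f1, f2 the penetrances of AA, Aa, aa.
    """
    p = 1.0 - q
    return p * p * f0 + 2.0 * p * q * f1 + q * q * f2


# (f0, f1, f2, stated Kp, q) copied from the Genetic Models table.
models = {
    "CR": (0.005, 0.005, 0.50, 0.01, 0.100),
    "CD": (0.005, 0.500, 0.50, 0.01, 0.005),
    "MG1": (0.050, 0.200, 0.80, 0.10, 0.130),
    "MG2": (0.050, 0.150, 0.45, 0.10, 0.207),
}

for name, (f0, f1, f2, stated_kp, q) in models.items():
    computed = population_risk(q, f0, f1, f2)
    print(name, round(computed, 4), stated_kp)
```

Under this assumption the computed values agree with the tabulated Kp to within rounding.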

Page 66: Study Design for Linkage, Association and TDT Studies

Pedigree Types

Page 67: Study Design for Linkage, Association and TDT Studies

Number of Pedigrees Required

Significance level = 0.0001, Power = 90%, Homogeneity

                 Model
Pedigree type    Common Recessive    Common Dominant    Minor Gene 1    Minor Gene 2
Type 1           18                  50                 285             626
Type 2           10                  16                 80              199
Type 3           16                  44                 245             582
Type 4           13                  7                  124             300

Page 68: Study Design for Linkage, Association and TDT Studies

Number of Pedigrees Required

Significance level = 0.0001, Power = 90%, Heterogeneity (= 0.5)

                 Model
Pedigree type    Common Recessive    Common Dominant    Minor Gene 1    Minor Gene 2
Type 1           74                  227                1139            2503
Type 2           35                  79                 305             771
Type 3           65                  191                965             2317
Type 4           44                  26                 468             1159

Page 69: Study Design for Linkage, Association and TDT Studies

Sample Size Required

Case-Control Study (significance level = 0.05, Power = 90%)

                      Allele Frequency (p)
Relative Risk (RR)    0.05     0.1      0.2
1.1                   41813    19747    8714
1.2                   10921    5142     2252
1.3                   5060     2375     1032
1.4                   2962     1385     597
1.5                   1969     918      392
2                     582      266      109
3                     188      82       30
4                     101      42       13
5                     65       6        -
10                    19       -        -

Page 70: Study Design for Linkage, Association and TDT Studies

Sample Size Required

Case-Control Study (significance level = 0.05, Power = 90%)

                    Allele Frequency (p)
Gene Effect Size    0.1      0.2     0.3     0.4
10 %                19747    8714    5037    3198
30 %                2375     1032    585     361
50 %                918      392     217     130

Page 71: Study Design for Linkage, Association and TDT Studies

Sample Size Required

TDT Study (significance level = 0.001, Power = 80%)

                        Frequency of allele A
Genotypic Risk Ratio    0.01             0.10            0.50           0.80
4.0                     1098 (0.048)     150 (0.346)     103 (0.500)    222 (0.235)
2.0                     5823 (0.029)     695 (0.245)     340 (0.500)    640 (0.267)
1.5                     19320 (0.025)    2218 (0.197)    949 (0.500)    1663 (0.286)

Risch & Merikangas (1996) Science, 273, 1516-1517.

Page 72: Study Design for Linkage, Association and TDT Studies

[Flowchart: overall study workflow]

Define phenotype
→ Identify evidence of genetic component
→ Define study design: extended families, sib pairs, or single affected member
→ Family, clinical information and DNA collection
→ Genotyping
→ Data analysis
→ Identify regions of interest
→ Physical mapping / gene identification