Study Design for Linkage, Association and TDT Studies
Jan, 2003 YMGC Genotyping Core
Ming-Wei Lin (林明薇), PhD
Department of Family Medicine, Faculty of Medicine, National Yang-Ming University
Department of Medical Research and Education, Taipei Veterans General Hospital
Collins FS. (1992) Nature genetics 1:3-6
Linkage Mapping for Disease Genes
• Linkage analysis (Lod score method)
• Allele-sharing methods
Gregor Mendel
• The principle of segregation of alleles.
• The principle of independent assortment.
Linkage
Linkage describes the phenomenon whereby alleles at neighbouring loci, being close to one another on the same chromosome, are transmitted together more frequently than expected by chance.
Linkage Family
Recombinant Gametes
Crossing over between two neighbouring loci will produce recombinant gametes.
Recombination Fraction
$$\theta = \frac{\text{number of recombinant gametes}}{\text{total number of gametes}}$$
Recombination Fraction
•Recombination fraction is a measure of genetic distance.
• 1 cM corresponds to a 1% chance of recombination between two loci.
Estimation of Recombination Fraction
• Direct Method: count recombinants.
• Maximum Likelihood Method: used with unknown phases, incomplete penetrance, or heterogeneity.
Likelihood Odds
$$\text{Likelihood odds} = \frac{\text{likelihood of data if loci are linked at } \theta}{\text{likelihood of data if loci are unlinked}} = \frac{L(\theta < 0.5)}{L(\theta = 0.5)}$$
Lod Score
$$\text{Lod score}(\theta) = \log_{10} \frac{L(\theta < 0.5)}{L(\theta = 0.5)}$$
Phase Known Family
[Figure: pedigree of a phase-known family with marker genotypes; each offspring is scored as nonrecombinant (N) or recombinant (R).]
Phase Known
$$L(\theta) = \theta^{r} (1-\theta)^{n-r}$$
r: number of recombinants
n: total number of meioses
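Lod Score: Phase Known
$$\mathrm{LOD} = \log_{10} \frac{L(\theta)}{L(\theta = 0.5)} = \log_{10} \left[ \frac{\theta^{r} (1-\theta)^{n-r}}{(0.5)^{n}} \right] = \log_{10} \left[ 2^{n}\, \theta^{r} (1-\theta)^{n-r} \right]$$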
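As a rough numerical illustration of the phase-known lod score, the Python sketch below evaluates log10[2^n θ^r (1-θ)^(n-r)] at several values of θ. The function name `lod_phase_known` and the example counts (1 recombinant among 6 meioses) are hypothetical and do not come from the slides.

```python
import math

def lod_phase_known(r: int, n: int, theta: float) -> float:
    """Lod score for r recombinants among n phase-known meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if theta == 0.0:
        # theta**r is zero when r > 0, so the lod score is minus infinity.
        return n * math.log10(2) if r == 0 else float("-inf")
    return (n * math.log10(2)
            + r * math.log10(theta)
            + (n - r) * math.log10(1 - theta))

# Hypothetical family: 1 recombinant among 6 informative meioses.
for theta in (0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"theta={theta:.2f}  lod={lod_phase_known(1, 6, theta):.3f}")
```

At θ = 0.5 the score is zero by construction, because the numerator and denominator of the likelihood ratio coincide.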
Phase Unknown Family
[Figure: pedigree of a phase-unknown family with marker genotypes; each offspring is scored as nonrecombinant (N) or recombinant (R) under each of the two possible parental phases, A and B.]
Phase Unknown
$$L(\theta) = \tfrac{1}{2}\, \theta^{r} (1-\theta)^{n-r} + \tfrac{1}{2}\, \theta^{n-r} (1-\theta)^{r}$$
r: number of recombinants
n: total number of meioses
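Lod Score: Phase Unknown
$$\mathrm{LOD} = \log_{10} \frac{L(\theta)}{L(\theta = 0.5)} = \log_{10} \left\{ \frac{\tfrac{1}{2} \left[ \theta^{r} (1-\theta)^{n-r} + \theta^{n-r} (1-\theta)^{r} \right]}{(0.5)^{n}} \right\} = \log_{10} \left\{ 2^{n-1} \left[ \theta^{r} (1-\theta)^{n-r} + \theta^{n-r} (1-\theta)^{r} \right] \right\}$$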
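The phase-unknown lod score averages the likelihood over the two equally likely parental phases. The sketch below is one way to compute it in Python; the function name `lod_phase_unknown` and the counts used are illustrative assumptions, not values from the slides.

```python
import math

def lod_phase_unknown(r: int, n: int, theta: float) -> float:
    """Lod score when parental phase is unknown.

    r is the recombinant count under one phase; under the other
    phase the same offspring give n - r recombinants.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    likelihood = 0.5 * (theta**r * (1 - theta)**(n - r)
                        + theta**(n - r) * (1 - theta)**r)
    if likelihood == 0.0:
        return float("-inf")
    return math.log10(likelihood / 0.5**n)

# Hypothetical family: 1 recombinant (under phase A) among 6 meioses.
for theta in (0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"theta={theta:.2f}  lod={lod_phase_unknown(1, 6, theta):.3f}")
```

The score is symmetric in r and n - r, which reflects the fact that the data cannot distinguish the two phases.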
Lod Score - Maximum Likelihood Estimate (Z)
• Lod scores can be calculated at any value of θ between 0 and 0.5, but are conventionally reported at θ = 0, 0.01, 0.05, 0.1, 0.2, 0.3, and 0.4.
• Zmax is the maximum lod score; the value of θ at which it occurs is the maximum likelihood estimate (MLE) of θ.
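To illustrate how Zmax and the MLE of θ might be located in practice, the sketch below tabulates a phase-known lod score at the conventional θ values and then searches a fine grid. The counts (2 recombinants in 10 meioses), the grid step, and all names are assumptions for illustration; θ = 0 is left out of the table because the lod score is minus infinity there whenever recombinants are observed.

```python
import math

def lod(r: int, n: int, theta: float) -> float:
    """Phase-known lod score; assumes 0 < theta <= 0.5."""
    return math.log10(2**n * theta**r * (1 - theta)**(n - r))

r, n = 2, 10  # hypothetical counts

# Lod scores at the conventionally reported recombination fractions.
conventional = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4]
table = {theta: lod(r, n, theta) for theta in conventional}

# Grid search over (0, 0.5] in steps of 0.001 for the maximum.
grid = [i / 1000 for i in range(1, 501)]
theta_hat = max(grid, key=lambda theta: lod(r, n, theta))
z_max = lod(r, n, theta_hat)

for theta, z in table.items():
    print(f"theta={theta:.2f}  Z={z:.3f}")
print(f"theta_hat={theta_hat:.3f}  Zmax={z_max:.3f}")
```

For phase-known data the maximum falls at r/n (capped at 0.5), so the grid search here is only a generic stand-in for the numerical maximisation that linkage programs perform.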
Total Lod Score
Lod scores obtained from individual families can be added together to calculate the total lod score.
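A small sketch of this additivity, assuming phase-known families and the same θ for every family (the family counts below are hypothetical): summing per-family lod scores gives the same result as computing one lod score from the pooled recombinant and meiosis counts.

```python
import math

def lod(r: int, n: int, theta: float) -> float:
    """Phase-known lod score; assumes 0 < theta <= 0.5."""
    return math.log10(2**n * theta**r * (1 - theta)**(n - r))

# Hypothetical families as (recombinants, informative meioses).
families = [(0, 5), (1, 8), (2, 10)]

def total_lod(families, theta: float) -> float:
    """Sum of per-family lod scores at a common theta."""
    return sum(lod(r, n, theta) for r, n in families)

theta = 0.1
pooled_r = sum(r for r, _ in families)
pooled_n = sum(n for _, n in families)
print(total_lod(families, theta), lod(pooled_r, pooled_n, theta))
```

The addition only makes sense when every family's score is evaluated at the same value of θ.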
Statistical Significance of the Lod Score
lod score > 3: evidence of linkage
2 < lod score < 3: suggestive evidence of linkage
-2 < lod score < 2: uninformative for linkage
lod score < -2: exclusion of linkage
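These thresholds can be expressed as a small lookup. The sketch below is only illustrative: the function name is hypothetical, and because the slide uses strict inequalities, the handling of scores exactly equal to 3, 2, or -2 is an assumption made here.

```python
def interpret_lod(z: float) -> str:
    """Map a lod score to the categories listed on the slide.

    Boundary values (exactly 3, 2, or -2) are not covered by the
    slide; here they fall into the less extreme category.
    """
    if z > 3:
        return "evidence of linkage"
    if z > 2:
        return "suggestive evidence of linkage"
    if z < -2:
        return "exclusion of linkage"
    return "uninformative for linkage"

for z in (3.6, 2.4, 0.8, -2.5):
    print(z, "->", interpret_lod(z))
```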
Lod Score
•Two-point lod score analysis
•Multipoint lod score analysis
Is a Pedigree Useful for Linkage Analysis?
• Are critical individuals in the pedigree doubly heterozygous at the loci? (Informative)
• Can the offspring be scored as recombinants or nonrecombinants? (Phase)
Parameters Assumed in Lod Score Analysis
• Transmission mode of disease
• Recombination fraction
• Trait allele frequencies
• Penetrance values for each possible disease phenotype
• Marker allele frequencies
Advantages of Lod Score Analysis
• Statistically, it is a more powerful approach than any nonparametric method when the genetic model is specified correctly.
• Utilizes every family member’s phenotypic and genotypic information.
• Provides an estimate of the recombination fraction.
• Provides a statistical test for linkage and for genetic (locus) heterogeneity.
Limitations of Lod Score Method
• Assumes single-locus inheritance
• Requires specification of disease gene frequency and penetrance
• Has reduced power when the disease model is grossly misspecified
Successful Examples Using Lod Score Method
• Cystic fibrosis: CFTR gene
• Huntington disease: HD gene
• Alzheimer disease: APP
• Hereditary breast cancer: BRCA1, BRCA2
Complex Diseases
• No clear pattern of Mendelian inheritance
• A mix of genetic and environmental factors
• Incomplete penetrance
• Phenocopies
• Oligogenic or polygenic
• Heterogeneity
• High frequency of disease-causing allele
Recurrence Risk Ratio (λ)
$$\lambda_r = \frac{\text{frequency in relatives of an affected person}}{\text{population frequency}}$$
r denotes the degree of relationship (for example, λs for siblings).
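A one-line calculation makes the ratio concrete. In the sketch below the function name and the input risks (a 6% risk in siblings against a 0.4% population frequency) are made-up illustrative figures, not data from the slides.

```python
def recurrence_risk_ratio(risk_in_relatives: float, population_risk: float) -> float:
    """Risk in relatives of an affected person divided by population risk."""
    if population_risk <= 0:
        raise ValueError("population_risk must be positive")
    return risk_in_relatives / population_risk

# Hypothetical inputs: 6% risk in siblings, 0.4% in the general population.
lambda_s = recurrence_risk_ratio(0.06, 0.004)
print(f"lambda_s is about {lambda_s:.1f}")
```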
Recurrence Risk Ratio
Genetic mapping is much easier for traits with high λs (λs > 10) than for those with low λs (λs < 2).
Recurrence Risk Ratio of Different Diseases
Disease             λs
Cystic fibrosis     500
Type I diabetes     15
Schizophrenia       8.6
Type II diabetes    3.5
Allele-sharing Methods
• Identical by state (IBS): two alleles of the same form.
• Identical by descent (IBD): two alleles descended from the same ancestral allele.
Allele-sharing Methods
Allele-sharing methods test whether affected relatives inherit a region IBD (or IBS) more often than expected under random Mendelian segregation.
[Figure: example pedigrees with marker genotypes in which a sib pair shares 2, 1, or 0 alleles identical by descent (IBD = 2, IBD = 1, IBD = 0).]
[Figure: example genotype pairs sharing alleles identical by state: BC and BC (IBS = 2), AC and AB (IBS = 1), AD and BC (IBS = 0).]
Affected Sib-pair Methods
An affected sib-pair may share 0, 1, or 2 alleles identical by descent (IBD) with probabilities of 0.25, 0.5, and 0.25, respectively, at any marker locus.
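The 0.25 / 0.5 / 0.25 probabilities can be checked by brute-force enumeration. The sketch below assumes parents carrying four distinguishable alleles (father A/B, mother C/D), so that sharing an allele label is the same as sharing it IBD; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Parents with four distinguishable alleles (an assumption for clarity).
father, mother = ("A", "B"), ("C", "D")

# Each child receives one paternal and one maternal allele: 4 equally likely outcomes.
children = list(product(father, mother))

counts = Counter()
for sib1, sib2 in product(children, repeat=2):
    # Count paternal and maternal alleles shared by the two sibs.
    shared = (sib1[0] == sib2[0]) + (sib1[1] == sib2[1])
    counts[shared] += 1

total = sum(counts.values())
ibd_probs = {k: Fraction(counts[k], total) for k in (0, 1, 2)}
print(ibd_probs)
```

These are the sharing probabilities for any sib pair at a locus unrelated to the trait, which is why they serve as the null expectation in affected sib-pair tests.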
[Figure: sib pairs sharing 2, 1, or 0 alleles IBD, with expected proportions of 25%, 50%, and 25%.]
Affected Sib-pair Methods
If the marker locus is independent of the trait locus, the probabilities that an affected sib-pair shares 0, 1, or 2 alleles IBD remain 0.25, 0.50, and 0.25.
Affected Sib-pair Methods
If the marker locus is linked to the trait locus, an excess of affected sib-pairs sharing two alleles IBD is expected.
Allele-sharing Methods
•Affected Sib-pairs
•Affected Pedigree Member
Pearson χ² Statistic
Compares the observed numbers of sib-pairs sharing 0, 1, or 2 alleles IBD with their expectations under the null hypothesis.
Pearson χ² Statistic
• Alternative hypothesis:
  IBD sharing    0      1      2
  Observed       n0     n1     n2
  N = n0 + n1 + n2
• Null hypothesis:
  IBD sharing    0      1      2
  Expected       N/4    N/2    N/4
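The comparison of observed and expected counts can be sketched as a goodness-of-fit calculation with 2 degrees of freedom. The counts below (15, 50, and 35 pairs) are invented for illustration, and the function names are hypothetical; the p-value uses the closed form exp(-x/2) that holds for a chi-square variable with 2 degrees of freedom.

```python
import math

def ibd_chi_square(n0: int, n1: int, n2: int) -> float:
    """Pearson chi-square of IBD counts against N/4, N/2, N/4."""
    total = n0 + n1 + n2
    observed = (n0, n1, n2)
    expected = (total / 4, total / 2, total / 4)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_2df(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-statistic / 2)

# Hypothetical sample of 100 affected sib pairs.
stat = ibd_chi_square(15, 50, 35)
print(f"chi-square = {stat:.2f}, p = {p_value_2df(stat):.4f}")
```

A two-sided goodness-of-fit test like this does not use the fact that linkage predicts excess sharing specifically, so one-sided tests on the sharing proportion are often preferred in practice.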
Comments on Allele-Sharing Method
There is no need to specify any genetic parameters of the transmission model.
It is less powerful for detecting linkage than the lod score method if the genetic transmission model can be specified correctly.
It is poor at providing a precise location of the disease gene.
Successful Examples Using Sib Pair Method
• Insulin-dependent diabetes
• Non-insulin-dependent diabetes
• Multiple sclerosis
• Alzheimer disease
Thresholds for Mapping Complex Traits
                      Suggestive linkage       Significant linkage
Mapping method        Lod score   P value      Lod score   P value
Lod score             1.9         0.0017       3.3         0.000049
Sibs and half-sibs    2.2         0.00074      3.6         0.000022
Uncle-nephew          2.3         0.00056      3.7         0.000018
First cousin          2.3         0.00052      3.7         0.000016
Lander and Kruglyak (1995) Nature Genetics, 11, 241-247
Types of Association Study
• Case-Control study
• Transmission disequilibrium test (TDT)
• Sibling control
Case-Control Study
[Figure: a group of unrelated unaffected individuals (open symbols) and a group of unrelated affected individuals (filled symbols), each labelled with a marker genotype.]
Linkage Disequilibrium
Linkage disequilibrium is the non-random association in a population of alleles at closely linked loci.
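One common way to quantify this non-random association, not shown on the slides, is the coefficient D = p(AB) - p(A) p(B), which is zero when alleles at the two loci are independent. The sketch below computes it from four haplotype frequencies; the function name, the allele labels, and the frequencies are hypothetical.

```python
def ld_coefficient(hap_freqs: dict) -> float:
    """D = p(AB) - p(A) * p(B) for two biallelic loci.

    hap_freqs maps the haplotypes 'AB', 'Ab', 'aB', 'ab' to
    frequencies that sum to 1.
    """
    p_ab = hap_freqs["AB"]
    p_a = hap_freqs["AB"] + hap_freqs["Ab"]   # frequency of allele A
    p_b = hap_freqs["AB"] + hap_freqs["aB"]   # frequency of allele B
    return p_ab - p_a * p_b

# Hypothetical haplotype frequencies with excess AB and ab haplotypes.
example = {"AB": 0.4, "Ab": 0.1, "aB": 0.1, "ab": 0.4}
print(ld_coefficient(example))
```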
Linkage Disequilibrium
A2---B1-----C2---X----D3-----E4----F2
A2---B1-----C2---X----D3-----E4
A2---B1-----C2---X----D3
B1-----C2---X----D3
C2---X----D3
C2---X
N generations
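The shrinking of the ancestral haplotype around X over N generations corresponds to the standard population-genetic result that disequilibrium between two loci decays by a factor of (1 - θ) per generation under random mating. The sketch below applies that formula; it is not taken from the slides, and the function name and starting values are illustrative assumptions.

```python
def ld_after_generations(d0: float, theta: float, generations: int) -> float:
    """Expected disequilibrium after a number of generations.

    Uses D_N = D_0 * (1 - theta)**N, assuming random mating and a
    constant recombination fraction theta between the two loci.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    return d0 * (1 - theta) ** generations

# Hypothetical starting value D_0 = 0.2, tracked for 100 generations.
for theta in (0.001, 0.01, 0.1):
    print(theta, round(ld_after_generations(0.2, theta, 100), 4))
```

Tightly linked markers (small θ) retain the association for many generations, which is why disequilibrium is informative only over short distances.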
TDT Study
The TDT examines the transmission of a particular allele at a locus from heterozygous parents to their affected offspring.
[Figure: four parent-offspring trios, each consisting of two parents and one affected child, labelled with marker genotypes.]
“Trios” for TDT study
Transmitted allele: “case”
Non-transmitted allele: “control”
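The slides do not give a test statistic, so the sketch below uses the McNemar-type form usually associated with the TDT, (b - c)² / (b + c), where b and c count how often heterozygous parents did and did not transmit the allele of interest to an affected child. The function names and the counts are hypothetical; the p-value uses the identity that the upper tail of a chi-square variable with 1 degree of freedom equals erfc(sqrt(x / 2)).

```python
import math

def tdt_statistic(transmitted: int, not_transmitted: int) -> float:
    """McNemar-type TDT statistic from heterozygous-parent transmissions."""
    informative = transmitted + not_transmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - not_transmitted) ** 2 / informative

def p_value_1df(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(statistic / 2))

# Hypothetical counts: allele transmitted 60 times, not transmitted 40 times.
stat = tdt_statistic(60, 40)
print(f"TDT chi-square = {stat:.2f}, p = {p_value_1df(stat):.4f}")
```

Only heterozygous parents contribute, since a homozygous parent transmits the same allele regardless of linkage.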
Using Sibling as Control
• Sibling association test (SIBASSOC)
• Sib transmission/disequilibrium test (S-TDT)
• Sibship disequilibrium test (SDT)
What does a positive association imply?
• Direct causal effect
• Linkage disequilibrium
• Population stratification
When to Use Association Study
• Candidate gene
• Positive evidence of linkage
• Candidate region allelic associations
Successful Examples of Mapping Genes by Association Studies
• Autoimmune diseases associated with HLA: IDDM, multiple sclerosis, ankylosing spondylitis, rheumatoid arthritis
• Angiotensin-converting enzyme and heart disease
• Low-density lipoprotein receptor and heart disease
• Insulin locus and IDDM
Sample Size Required
Linkage for Monogenic Traits
• One large family
• at least 40 informative meioses
• 20 cM marker density
• Expected lod score > 3
Sample Size Required
Allele-Sharing
• λs = 2
• At least 600 affected sib pairs
• Narrow down the region to 1 cM
Sample Size Required
Linkage for Complex Traits
Sham, Lin et al (2000) Am J Human Genetics 66, 1661-1668.
Genetic Markers
A completely informative marker locus at a recombination fraction of 0 from the disease locus.
Genetic Models
Model                    f0       f1       f2      Kp      q
Common Recessive (CR)    0.005    0.005    0.50    0.01    0.100
Common Dominant (CD)     0.005    0.500    0.50    0.01    0.005
Minor gene 1 (MG1)       0.050    0.200    0.80    0.10    0.130
Minor gene 2 (MG2)       0.050    0.150    0.45    0.10    0.207
Kp: population risk; q: disease allele frequency
f0: penetrance for the genotype AA; f1: penetrance for the genotype Aa; f2: penetrance for the genotype aa
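The population risk in each model can be approximately recovered from the penetrances and the disease allele frequency. The sketch below assumes Hardy-Weinberg genotype proportions and treats a as the disease allele with frequency q; both are assumptions made here, since the slides do not state how Kp was derived.

```python
def population_risk(f0: float, f1: float, f2: float, q: float) -> float:
    """Kp = f0*(1-q)^2 + f1*2q(1-q) + f2*q^2 under Hardy-Weinberg proportions."""
    p = 1 - q
    return f0 * p * p + f1 * 2 * p * q + f2 * q * q

# (f0, f1, f2, q, stated Kp) copied from the Genetic Models table.
models = {
    "CR":  (0.005, 0.005, 0.50, 0.100, 0.01),
    "CD":  (0.005, 0.500, 0.50, 0.005, 0.01),
    "MG1": (0.050, 0.200, 0.80, 0.130, 0.10),
    "MG2": (0.050, 0.150, 0.45, 0.207, 0.10),
}

computed = {name: population_risk(f0, f1, f2, q)
            for name, (f0, f1, f2, q, _) in models.items()}
for name, kp in computed.items():
    print(f"{name}: computed Kp = {kp:.4f}, stated Kp = {models[name][4]}")
```

Small differences between the computed and stated values are to be expected, since the tabulated parameters are rounded.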
Pedigree Types
Number of Pedigrees Required
α = 0.0001, Power = 90%, Homogeneity
Pedigree type    Common Recessive    Common Dominant    Minor Gene 1    Minor Gene 2
Type 1           18                  50                 285             626
Type 2           10                  16                 80              199
Type 3           16                  44                 245             582
Type 4           13                  7                  124             300
Number of Pedigrees Required
α = 0.0001, Power = 90%, Heterogeneity (0.5)
Pedigree type    Common Recessive    Common Dominant    Minor Gene 1    Minor Gene 2
Type 1           74                  227                1139            2503
Type 2           35                  79                 305             771
Type 3           65                  191                965             2317
Type 4           44                  26                 468             1159
Sample Size Required
Case-Control Study (α = 0.05, Power = 90%)
                      Allele frequency (p)
Relative risk (RR)    0.05     0.1      0.2
1.1                   41813    19747    8714
1.2                   10921    5142     2252
1.3                   5060     2375     1032
1.4                   2962     1385     597
1.5                   1969     918      392
2                     582      266      109
3                     188      82       30
4                     101      42       13
5                     65       6        -
10                    19       -        -
Sample Size Required
Case-Control Study (α = 0.05, Power = 90%)
                    Allele frequency (p)
Gene effect size    0.1      0.2     0.3     0.4
10%                 19747    8714    5037    3198
30%                 2375     1032    585     361
50%                 918      392     217     130
Sample Size Required
TDT Study (α = 0.001, Power = 80%)
                        Frequency of allele A
Genotypic risk ratio    0.01             0.10            0.50           0.80
4.0                     1098 (0.048)     150 (0.346)     103 (0.500)    222 (0.235)
2.0                     5823 (0.029)     695 (0.245)     340 (0.500)    640 (0.267)
1.5                     19320 (0.025)    2218 (0.197)    949 (0.500)    1663 (0.286)
Risch & Merikangas (1996) Science, 273, 1516-1517.
Define phenotype
Identify evidence of genetic component
Define study design: extended families, sib pairs, or single affected member
Family, clinical information and DNA collection
Genotyping
Data analysis
Identify regions of interest
Physical Mapping / Gene Identification