Post on 03-Jan-2016
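Lecture Outline
1 What is molecular breeding
2 Historical review
3 Genome-wide strategies
4 Genotyping
5 Phenotyping
6 Envirotyping (etyping)
7 Marker-trait association analysis
8 Marker-assisted selection
9 Decision support systems
10 Outlook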
Systems: From gels to chips and sequencing (GBS)
Throughput: From singles to millions
Resolution: 10-30 cM to many markers per gene
Cost (per data point): Several US dollars to 1/1000 cent
Evolution of Genotyping (1980-2010s)
Marker type: Morphological, Cytological, Protein, DNA
DNA markers: RFLP, RAPD, AFLP, SSR, SNP
Xu 2010, Molecular Plant Breeding, CABI Publisher
Molecular basis of DNA markers
A single-nucleotide polymorphism (SNP) is a DNA sequence variation occurring when a single nucleotide — A, T, C, or G — in the genome differs between members of a biological species. (Wiki)
Revised from: Xu 2010, Molecular Plant Breeding, CABI Publisher
Copy-number variations (CNVs), a form of structural variation, are alterations of the DNA of a genome that result in the cell having an abnormal number of copies of one or more sections of the DNA.
Presence/Absence Variation (PAV)
[Figure: chromosome distribution of presence and absence variants in samples A and B]
Presence/Absence Variation (PAV) results in many genes that cannot be mapped by regular linkage mapping with SNP markers.
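The Concept of the Haplotype and Its Development
Haplotypes formed by SNPs in functional regions
Haplotypes formed by SNPs within a gene
Haplotypes formed by SNPs within a chromosome
Haplotypes formed by SNPs across the whole genome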
A haplotype is a group of genes within an organism that was inherited together from a single parent. A haplotype can describe a pair of genes inherited together from one parent on one chromosome, or it can describe all of the genes on a chromosome that were inherited together from a single parent. This group of genes was inherited together because of genetic linkage.
The term "haplotype" can also refer to the inheritance of a cluster of single nucleotide polymorphisms (SNPs), which are variations at single positions in the DNA sequence among individuals.
             SNP1          SNP2          SNP3
Chromosome 1 AACACGCCA …. TTCGGGGTC …. AGTCGACCG ….
Chromosome 2 AACACGCCA …. TTCGAGGTC …. AGTCAACCG ….
Chromosome 3 AACATGCCA …. TTCGGGGTC …. AGTCAACCG ….
Chromosome 4 AACACGCCA …. TTCGGGGTC …. AGTCGACCG ….
Haplotype 1 (Individuals 01-03): CTCAAAGTACGGTTCAGGCA
Haplotype 2 (Individuals 04-06): CTCAAAGCACGGTTGAGGCA
Haplotype 3 (Individuals 07-09): CTCGAAGTACGGTTCAGGCA
Haplotype 4 (Individuals 10-12): CTCAAAGCACGGTTCAGGCA
SNPs: A/G, T/C, C/G
[Figure labels: SNPs, Tag SNPs]
From SNPs to Haplotypes and Tag SNPs
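The step from SNPs to haplotypes and tag SNPs can be sketched in a few lines of Python. The sequences are the four haplotypes from the slide; the function names `variable_sites` and `minimal_tag_snps` are illustrative, and the brute-force search is only an assumption about how one might pick tags for a window this small, not the method used in the lecture.

```python
from itertools import combinations

# The four distinct 20-bp haplotypes carried by the 12 individuals on the slide.
haplotypes = {
    "Haplotype 1": "CTCAAAGTACGGTTCAGGCA",
    "Haplotype 2": "CTCAAAGCACGGTTGAGGCA",
    "Haplotype 3": "CTCGAAGTACGGTTCAGGCA",
    "Haplotype 4": "CTCAAAGCACGGTTCAGGCA",
}


def variable_sites(seqs):
    """Return 0-based positions where the aligned sequences are not all identical."""
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def minimal_tag_snps(seqs, sites):
    """Return the smallest subset of sites whose alleles tell all haplotypes apart.

    Brute force over subsets; fine for a handful of SNPs, not for whole genomes.
    """
    n_distinct = len(set(seqs))
    for size in range(1, len(sites) + 1):
        for subset in combinations(sites, size):
            signatures = {tuple(s[i] for i in subset) for s in seqs}
            if len(signatures) == n_distinct:
                return list(subset)
    return list(sites)


seqs = list(haplotypes.values())
snp_sites = variable_sites(seqs)            # positions of the A/G, T/C and C/G SNPs
tag_sites = minimal_tag_snps(seqs, snp_sites)
print("SNP positions (1-based):", [i + 1 for i in snp_sites])
print("Tag SNP positions (1-based):", [i + 1 for i in tag_sites])
```

In this short window all three SNPs are needed to separate the four haplotypes, so no marker can be dropped; over longer regions with many correlated SNPs the same search would typically return a much smaller tag set.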
SNP Genotyping Platforms: Winner?
# Markers
Throughput
Cost
Data delivery
Service
● Three Illumina 1536-SNP chips: Illumina-Cornell-CIMMYT collaboration (Yan et al 2009; Yan et al 2010)
● Illumina MaizeSNP50 BeadChip:
Up to 56,110 SNPs, 1 SNP/40 kb
Covering 19,540 genes, 2 SNPs/gene
Functionally tested with over 30 diverse maize lines
Developed by Illumina in collaboration with TraitGenetics, INRA, and Syngenta
Genotyping by Arraying (Chips)
SNP genotyping by Array Tape
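The Douglas Scientific Array Tape platform includes:
Nexar Inline Liquid Handling System
Soellex Thermal Cycler
Araya Inline Fluorescence Scanner
Centrifuge
Kraken SNPline XL System
High throughput: processes 400 384-well arrays per day (about 150,000 data points)
Low running cost: very small reaction volumes save 80-90% of reagents
Modular design:
NEXAR micro-volume liquid transfer system
SOELLEX high-throughput PCR system
ARAYA scanning system
Particularly suited to analyzing large numbers of samples with a small number of markers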
Genotyping By Sequencing (GBS)
GBS technology enables the detection of a wider range of polymorphisms: SNPs plus small indels
No pre-discovery or validation
Applicable to any species or population
GBS approaches
Simply sequence the entire genomes of individuals: expensive
Several extant methods. Each enriches for a portion of the genome, which is then sequenced. Enrichment is most often achieved via restriction enzyme (RE) digestion. The existence of only 4, 6 or 8 bp recognition sites limits the “tunability” of extant methods.
Huang et al., 2010 Nature Genetics; Andolfatto et al., 2011 Genome Research; Elshire et al., 2011, PLoS ONE; Davey et al., 2011 Nature Reviews Genetics
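The limited "tunability" follows from simple arithmetic: in a random sequence with equal base frequencies, a recognition site of length n occurs on average once every 4^n bp, so the only available steps are 256 bp, 4,096 bp and 65,536 bp. The sketch below works this out; the random-sequence model and the genome size constant are assumptions made for illustration, not values taken from the slides.

```python
def expected_site_spacing(site_length: int) -> int:
    """Average distance (bp) between sites, assuming random sequence with equal base frequencies."""
    return 4 ** site_length


def expected_site_count(genome_size_bp: float, site_length: int) -> float:
    """Expected number of recognition sites in a genome of the given size under the same assumption."""
    return genome_size_bp / expected_site_spacing(site_length)


# Assumed genome size for illustration only (roughly the scale of the maize genome).
GENOME_SIZE_BP = 2_300_000_000

for n in (4, 6, 8):
    spacing = expected_site_spacing(n)
    count = expected_site_count(GENOME_SIZE_BP, n)
    print(f"{n}-bp cutter: one site per {spacing:,} bp, about {count:,.0f} sites")
```

Each step between cutter lengths changes the expected number of fragments by a factor of 16, which is why the sampled fraction of the genome cannot be adjusted smoothly with enzyme choice alone. Real genomes deviate from this model because of base composition, repeats and methylation sensitivity.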
Genotyping-By-Sequencing (GBS)
Created for high-throughput, semi-automated genotyping
[Figure: GBS workflow. Sample plants → Isolate DNA → Restriction digest → Ligate adaptors → Pool & amplify → Sequence. Labeled parts: sequencing adaptor, barcode, sticky ends, genomic DNA.]
Images: Qiagen, Illumina, Elshire et al 2011, PLoS ONE
Advantages:
• One step SNP discovery + genotyping
• Simple protocol; no reference required
• Large numbers of SNPs found cheaply
• Broadly applicable
Drawbacks:
• False SNPs from sequencing errors
• Missing data from stochastic sampling
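The "missing data from stochastic sampling" drawback can be made concrete with a small calculation. The sketch below assumes that read counts per site follow a Poisson distribution and that both alleles of a heterozygote are sampled with equal probability; both are simplifying assumptions for illustration, and the function names are hypothetical.

```python
import math


def prob_missing(mean_depth: float) -> float:
    """Probability that a site receives zero reads, assuming Poisson-distributed coverage."""
    return math.exp(-mean_depth)


def prob_het_called_hom(depth: int) -> float:
    """Probability that all `depth` reads at a heterozygous site come from the same allele."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return 0.5 ** (depth - 1)


for mean_depth in (0.5, 1, 2, 5):
    print(f"mean depth {mean_depth}: {prob_missing(mean_depth):.1%} of sites missing")

for depth in (1, 2, 4, 8):
    print(f"{depth} reads: heterozygote seen as homozygote with p = {prob_het_called_hom(depth):.3f}")
```

Under these assumptions, low-coverage GBS leaves a large share of sites uncalled and undercalls heterozygotes, which is why imputation or model-based calling usually follows.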
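1. Restriction digestion
2. Adaptor ligation
3. Pooling
4. Fragment size selection
5. Sequencing
6. Quality control
7. Sequence alignment
8. HMM fitting
9. Downstream analysis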
Andolfatto et al. 2011 Genome Research
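To illustrate what the HMM step contributes, here is a minimal two-state Viterbi decoder that smooths noisy per-marker calls into parental segments. It is a generic textbook sketch, not the model from the cited paper: the two states, the `error_rate` and `switch_prob` parameters and their default values are all assumptions chosen for illustration.

```python
import math


def viterbi_ancestry(observations, error_rate=0.1, switch_prob=0.01):
    """Return the most likely sequence of parental states ('A' or 'B').

    observations: string of noisy per-marker calls, each 'A' or 'B'.
    error_rate:   assumed probability that a call disagrees with the true state.
    switch_prob:  assumed probability of a state change between adjacent markers.
    """
    if not observations:
        return ""
    states = ("A", "B")

    def emit(state, obs):
        return math.log(1 - error_rate) if obs == state else math.log(error_rate)

    def trans(prev, cur):
        return math.log(1 - switch_prob) if prev == cur else math.log(switch_prob)

    # scores[s] is the best log-probability of any path ending in state s.
    scores = {s: math.log(0.5) + emit(s, observations[0]) for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for cur in states:
            best_prev = max(states, key=lambda p: scores[p] + trans(p, cur))
            new_scores[cur] = scores[best_prev] + trans(best_prev, cur) + emit(cur, obs)
            pointers[cur] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(states, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


calls = "AAAABAAAABBBBBBB"
print(calls)
print(viterbi_ancestry(calls))
```

With these parameters the isolated `B` call is treated as an error and smoothed away, while the run of `B` calls at the end is kept as a genuine switch, because one transition is cheaper than seven consecutive miscalls.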
GBS: Competitive Landscape
From P. S. Schnable
1 Commercialized by Floragenex Inc.
2 Not disclosed; Data2Bio’s proprietary technology
Maize GBS 2.7 Build
Trained on 32K taxa including extensive CIMMYT material (landraces and diverse breeding materials)
45K taxa now scored with build
960K core SNPs
Production Tags On Physical Map (TOPM) file for one step SNP calling available at panzea.org (imputation and calling in 15 min)
Ed Buckler, personal comm.
Resequencing to discover SNPs, haplotypes and tag SNPs
Tag SNPs can be developed to represent haplotypes. Each tag SNP represents one haplotype fragment.
A set of tag SNPs can be developed to represent whole genome diversity.
Sequencing Everything!!
Genotyping by Whole Genome Sequencing
Approaches to Reduce Cost and Increase Scale in Genotyping
Seed-based DNA genotyping
Efficient sample tracking
Selective genotyping and pooled DNA analysis
Integrated diversity analysis, genetic mapping and MAS
Developing breeding strategies for simultaneous improvement of multiple traits
① Soaking ② Sampling ③ Grinding ④ DNA extraction ⑤ PCR and genotyping ⑥ Tracking back and planting
Seed DNA-based Genotyping in Maize
Gao et al 2008 Mol Breed 22:477–494
Automatic seed chipping
Laser-assisted seed selection
[Figure: power (%) of selective genotyping plotted against QTL effect (20%, 15%, 10%, 5%, 3%, 1%) and tail size (15, 30, 50, 100), in four panels for population sizes N = 200, N = 500, N = 1000 and N = 3000]
Selective Genotyping: QTL Effects and Population/Tail Sizes
Sun et al 2010 Mol Breed 26:493–511
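The selection step behind selective genotyping can be sketched as follows. The code assumes that "tail size" means the number of individuals taken from each end of the phenotype distribution, which is an interpretation of the figure labels; `select_tails` and `genotyped_fraction` are hypothetical helper names.

```python
def select_tails(phenotypes, tail_size):
    """Return (low_tail, high_tail): indices of the tail_size lowest and highest phenotypes."""
    if tail_size < 1 or 2 * tail_size > len(phenotypes):
        raise ValueError("tail_size must be between 1 and half the population size")
    order = sorted(range(len(phenotypes)), key=lambda i: phenotypes[i])
    return order[:tail_size], order[-tail_size:]


def genotyped_fraction(population_size, tail_size):
    """Share of the population that is actually genotyped when both tails are taken."""
    return 2 * tail_size / population_size


# Toy phenotype values for six plants, for illustration only.
low, high = select_tails([5.0, 1.0, 9.0, 3.0, 7.0, 2.0], tail_size=2)
print("low tail:", low, "high tail:", high)

# Population and tail sizes taken from the figure labels.
for n in (200, 500, 1000, 3000):
    for tail in (15, 30, 50, 100):
        print(f"N = {n}, tail = {tail}: genotype {genotyped_fraction(n, tail):.1%} of plants")
```

The genotyped fraction shows where the cost saving comes from: with N = 3000 and 100 plants per tail, fewer than 7% of the phenotyped plants are genotyped.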
Bulked or Pooled DNA Analysis
PCR markers, Chip genotyping, DNA sequencing, RNA sequencing
[Figure: bulked DNA analysis. Plants are selected from the high tail and low tail of the population distribution (or as R plants and S plants), combined into DNA pools and genotyped. Allele frequency (0.0 to 1.0) of alleles A and B in each pool separates linked from unlinked markers.]
Xu, 2010, Molecular Plant Breeding, CABI
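The comparison of allele frequencies between pools can be sketched as below. Each pool is represented by counts of alleles A and B at one marker; the `threshold` used to flag a marker as linked is an arbitrary value chosen for illustration, and the helper names are hypothetical.

```python
def allele_frequency(a_count, b_count):
    """Frequency of allele A in a pool, from counts of alleles A and B."""
    total = a_count + b_count
    if total == 0:
        raise ValueError("no allele observations in this pool")
    return a_count / total


def pool_frequency_difference(high_pool, low_pool):
    """Allele A frequency in the high pool minus that in the low pool."""
    return allele_frequency(*high_pool) - allele_frequency(*low_pool)


def classify_marker(high_pool, low_pool, threshold=0.5):
    """Label a marker 'linked' if the pools differ by at least `threshold` (assumed cutoff)."""
    diff = abs(pool_frequency_difference(high_pool, low_pool))
    return "linked" if diff >= threshold else "unlinked"


# Toy (A, B) counts per pool, for illustration only.
print(classify_marker(high_pool=(90, 10), low_pool=(10, 90)))  # frequencies 0.9 vs 0.1
print(classify_marker(high_pool=(52, 48), low_pool=(49, 51)))  # both near 0.5
```

A marker linked to the trait shows allele frequencies pulled toward opposite extremes in the two pools, while an unlinked marker stays near 0.5 in both. In practice the cutoff would be set from the pool size and read depth rather than fixed in advance.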