Amplicon)sequencing)using)next gen)technology) - … · Amplicon)sequencing)using)next...
-
Upload
hoangkhanh -
Category
Documents
-
view
228 -
download
3
Transcript of Amplicon)sequencing)using)next gen)technology) - … · Amplicon)sequencing)using)next...
Amplicon sequencing using next-‐gen technology
STAMPS course 2013 Hilary Morrison
& Sharon Grim, Joe Vineis
Overview • Tag sequencing • Primer design • Amplicon sequencing on two Illumina plaIorms – fusion primers – opKmizaKon – piIalls – current protocols
WHAT YOU WILL NOT HEAR ABOUT
RETIRED
EXCHANGED
Targets
• Ribosomal RNA (SSU, LSU) • Internal transcribed spacers (ITS)—fungi • FuncKonal genes – nirS – butyrate kinase, butyrate transferase – others
Which variable regions to target?
V1 V3 V7 V8 V9
8F
V2 V4 V5 V6
V6 (967F-‐1046R) 60 bp V4 (518F-‐806R) 288 bp V4V5 (518F-‐1064R) 550 bp V9 (1380/1389-‐1513) for eukaryotes
341 518 806 926 967 1064 1513 1380
ITS (fungi)
18S
ITS1 ~350-‐400 bp ITS2 ~250-‐300 ITS1-‐ITS2 ~550-‐800
ITS1F ITS1R/ITS2F ITS2R
ITS1 ITS2 28S 5.8S
AmplificaKon process • DOs:
– High quality DNA template: e.g. 260/280 ~2 – Replicate reacKons using high-‐fidelity polymerase – NegaKve controls – Pool, clean, size-‐select
• Qiagen MinElute (FLX; Illumina v6), • Ampure (Titanium; Illumina v4v5); • Pippin Prep (Illumina pools)
– QuanKtaKon • e.g. PicoGreen, qPCR
– VisualizaKon • Bioanalyzer, Caliper, gel
• DON’Ts: – MDA or nested amplificaKon
Pre-‐Ampure Post-‐Ampure
What can go wrong? (sources of error) • ContaminaKon • PCR error (enzyme misincorporaKon) • Chimeras • PCR ectopic amplificaKon (non-‐rRNA) • MulKple Sequence Alignment • Distance measurements (gaps, etc.) • Clustering methods
Post sequencing • Basecalling • QC pipeline • Taxonomy assignment or clustering – QIIME – GAST (Sue Huse, 8/4) – mothur – RDP Pipeline (hjp://rdp.cme.msu.edu/) – Oligotyping – basic BLAST
• VAMPS
16S-‐SPECIFIC PRIMER DESIGN
Design primer suite for conserved sites
• Start with known primers • Check matches against SILVA or other rRNA database – Phyla missed? – ProporKon missed?
• Add variants or degenerate sites to expand coverage
Literature review
Database sources
3,194,778 Total 739,633 SSURef
Example: bacterial-‐archaeal universal primer v. 1
0.0%
10.0%
20.0%
30.0%
40.0%
50.0%
60.0%
70.0%
80.0%
90.0%
100.0%
0 mismatches 1 mismatch 2 mismatches 3 mismatches
Bacteria
Archaea
Eukarya
GGACTAC[ACG][CG]GGGTATCTAAT
Example: bacterial-‐archaeal universal primer v.2
0.0%
10.0%
20.0%
30.0%
40.0%
50.0%
60.0%
70.0%
80.0%
90.0%
100.0%
0 mismatches 1 mismatch 2 mismatches 3 mismatches
Bacteria
Archaea
Eukarya
GGACTAC[ACG][CG]GGGTATCTAAT GGACTAC[ACT][ACG]GGGT[AT]TCTAAT
0.0%
10.0%
20.0%
30.0%
40.0%
50.0%
60.0%
70.0%
80.0%
90.0%
100.0%
0 mismatches 1 mismatch 2 mismatches 3 mismatches
Bacteria
Archaea
Eukarya
V6 primers 967F is pool of 4 oligos with 2-‐16X degeneracy
CNACGCGAAGAACCTTANC CAACGCGMARAACCTTACC ATACGCGARGAACCTTACC CTAACCGANGAACCTYACC
1046R: 4 oligos or 1 oligo with 16X degeneracy CGACAGCCATGCANCACCT CGACAACCATGCANCACCT CGACGGCCATGCANCACCT CGACGACCATGCANCACCT
CGACRRCCATGCANCACCT
V6 Primer Sequence Target RefDB exact match
967F1-‐PP CNACGCGAAGAACCTTANC Bacteria 63.3%
967F-‐AQ CTAACCGANGAACCTYACC Deferribacteres, Aquificales 0.1%
967F-‐U12 CAACGCGMARAACCTTACC AddiKonal Proteobacteria 14.0%
967F-‐U3 ATACGCGARGAACCTTACC AddiKonal Bacteroides; Spirochaetes 12.7%
1046R CGACRRCCATGCANCACCT Bacteria 95.4%
Keep level of degeneracy low (2-‐16X) Are rare, but important groups missed? Primers with 1-‐2 mismatches open work—pay ajenKon to Tm
V4V5 Primer Sequence Target RefDB exact match
518F CCAGCAGCYGCGGTAAN Bacteria 97.0%
926R1 CCGTCAATTCNTTTRAGT Bacteria 96.2%
926R3 CCGTCAATTTCTTTGAGT Verrucomicrobia,
Delta proteobacteria, AcKnobacteria
2.3%
926R4 CCGTCTATTCCTTTGANT Epsilon proteobacteria 1.3%
Why not use CCGTCWATTYNTTTRANT?
128 variants and only 40 match a known bacterial sequence!
Fusion primer modificaKons • Phosphorothioate linkages • Barcoding: 5-‐12 nt • Indexing: 6-‐12 nt • Fusion with plaIorm-‐specific sequencing primer sequences (vs. ligaKon)
• Illumina fusion primers include bridge adapter regions
ILLUMINA TAG SEQUENCING
Illumina amplicon strategy
• Fused primer approach, not adapter ligaKon – Primers much longer than 454 versions – Include bridge adapter sequences – Minimize cost of barcoded primer sets
• Many labs now have Knight 515F/806R sets (V4; >1000 indices) • CombinaKon of 12 indices x 8 inline barcodes creates 96-‐plex set for other targets
Illumina amplicon targets
• How long are the reads? – Hiseq: 100/150 PE – Miseq 250 PE and longer
• How good is the quality towards the end? – Undetected errors will inflate alpha-‐diversity – Merged reads require overlap
100 nt PE HiSeq1000
V6R: 1046R
~60 bp complete overlap
967F
Read 1
Read 2
Longer reads on MiSeq
V5R: 926R
50-‐100 nt overlap
V4F: 518F
Read 1
Read 2
Start with v6
• Original 454 pyrotag target-‐lots of experience • Sequence overlap with v6v4 datasets • Cheaper to do • Greater sampling depth – 300-‐400 million clusters per lane possible – 3M reads at 96-‐plex – 300K reads at 960-‐plex
rRNA…TAATCTWTGGGVHCATCAGG-CCGACTGACTGA-TTGCGTGCGATC-TAGAGCATACGGCAGAAGACGAAC!
5' AATGATACGGCGACCACCGAGATCTACAC-TATGGTAATTGT-GTGCCAGCMGCCGCGGTAA…rRNA!
5’ adapter “pad/linker”*
*designed to have no homology to 16s or Illumina adapters/primers
“515F”
Knight lab sequencing primer 1
“806R-‐rc” “pad/linker-‐rc”* index-‐rc 3’ adapter-‐rc
Knight lab sequencing primer 2 RC is indexing sequencing primer
Fusion primer design
Learned during Miseq Training
• First 4 bases of read 1 used for cluster finding (no change)
• Next 8 bases (up through base 12) need to be high-‐complexity because of phasing/prephasing calculaKon
• This is also true of read 2 • Thus, need to have randomness in cycles 1-‐12 of both reads!
In-‐line bar codes vs. indices • In-‐line bar codes cut into sequencing reads, but: • Cluster finding requires random distribuKon of A,C,G,T at
start of read 1 • We designed our primers to provide this diversity
– 5’ NNNN (1-‐4) – Careful mix of bar codes (5-‐9)
• Didn’t work – Clusters are found, but amplicons are sKll ‘low complexity’ samples
– Diversity needed at first 12 bases of both read 1 and read 2 • Had to add high proporKon of PhiX or other ‘random’
library
5’ AATGATACGGCGACCACCGAGATCTACAC-‐ TCTTTCCCTACACGACGCTCTTCCGATCT-‐ NNNNTCAGC-‐ CNACGCGAAGAACCTTANC 3’
967F_PP_N4TCAGC: one of 4 primers with TCAGC barcode; forward read
Fusion primer design
5’ CAAGCAGAAGACGGCATACGAGAT-‐ CGTGAT-‐ GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-‐ CGACRRCCATGCANCACCT 3’
1046R_index_1 : single degenerate primer with index; reverse read
Bridge adapter Sequencing primer binding In-‐line barcode 16S PCR
Bridge adapter Index Sequencing primer binding (x2) 16S PCR
Similar for v4v5 suite
Primer synthesis • >80 nt long • PurificaKon needed – DesalKng only – PAGE (used for first set; expensive) – No other opKon because of oligo length
• Pooled 4 967F primers for each barcode (V6) • Pooled 3 926R primers for each index (V4V5) • Final set of 8 forward/barcoded and 12 reverse/indexed primers = 96 unique combinaKons for each domain
• Working stocks @10 uM each
ILLUMINA SEQUENCING
Effect of low-‐complexity libraries
• Clusters found correctly • Then read hits highly
conserved sequence • Quality scores reduced • MiKgate with lower cluster
density and HC library spike-‐in – over 50% PhiX on early Miseq runs
• Illumina has sopware fix for problem on Miseq – now ~2% PhiX on Miseq runs
Control gDNA: known community
Error rate
• Generated data from HMP mock community gDNA • Expected sequences recovered • Aligned unique sequences to reference • Sequencing errors, not quality scores
• Overall error: 0.52%
How does Illumina tag sequencing compare to pyrotag sequencing?
Analysis of v6 data from real samples
Sandra Mclellan samples: Illumina (HiSeq) overrepresents AcKnomyces; misses Arcobacter almost enKrely
Samples are: 3 HiSeq v6 amplicon datasets and 2 454 datasets (v6v4 and v6 only)
Does length of primers bias recovery?
Test alternaKves in 2-‐step PCR. – 16s primers only, followed by full-‐length fusion – 16s plus part of Illumina adapters, followed by adapters to bridge
Sample, condidons Barcode Index
SS_WWTP_1_25_11, 16s only 20 cycles GAGAC 7: GATCTG/CAGATC
SS_WWTP_4_11_11, 16s only 20 cycles ACGCA 8: TCAAGT/ACTTGA
SS_WWTP_1_25_11, split primers TCAGC 1:CGTGAT
SS_WWTP_4_11_11, split primers TCAGC 2:ACATCG
“Split” primers
967F_PP+seq CACGACGCTCTTCCGATCTACGTTCAGCCNACGCGAAGAACCTTANC
967F_AQ+seq CACGACGCTCTTCCGATCTCGTATCAGCCTAACCGANGAACCTYACC
967F_U2+seq CACGACGCTCTTCCGATCTGTACTCAGCCAACGCGMARAACCTTACC
967F_U3+seq CACGACGCTCTTCCGATCTTACGTCAGCATACGCGARGAACCTTACC
ACGTbridge+seq AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTACGT
CGTAbridge+seq AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTCGTA
GTACbridge+seq AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTGTAC
TACGbridge+seq AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTACG
1046R+seq GGAGTTCAGACGTGTGCTCTTCCGATCTCGACRRCCATGCANCACCT
BridgeIndex_1 CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCC
BridgeIndex_2 CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTCAGACGTGTGCTCTTCC
PCR condiKons Standard 3 x 33 ul amplicon reacKon plus negaKve control 1st round: either 16s only primers or 16s+parKal adapter Only 20 cycles (vs. 30) Pooled 3 reacKons Diluted 10 ul to 500 (1:50) Used 1 ul diluKon in 2nd round of PCR with 2nd set of primers, 5 cycles
PCR results
16s only, 20 cycles 16s only, +5 cycles 16s only, negaKve 16s only, 20 cycles 16s only, +5 cycles 16s only, negaKve Split, 20 cycles Split, +5 cycles Split, negaKve Split, 20 cycles Split, +5 cycles Split, negaKve
Sequencing results
Searched for match in raw data, read 1, each dataset >gi|343201492 Arcobacter cibarius TGGACTTGACATAGTAAGAACTTTCTAGAGATAGATTGGTGTCTGCTTGCAGAAACTTATATAC
Sample, condidons Count read1 Count Arcobacter %
SS_WWTP_1_25_11_Bv6v4 35,612 5,525 15.50%
SS_WWTP_1_25_11_ACGCA 214,950 2,309 1.10%
SS_WWTP_1_25_11_GACTC 240,520 2,606 1.10%
SS_WWTP_1_25_11_GTATC 392,570 5,445 1.40%
SS_WWTP_1_25_11_16s 6,106,654 900,824 14.80%
SS_WWTP_1_25_11_split 12,782,547 1,558,392 12.20%
SS_WWTP_4_11_11_16s 5,779,659 824,789 14.30%
SS_WWTP_4_11_11_split 17,102,934 2,108,534 12.30%
0"
5"
10"
15"
20"
25"
30"
35"
40"
45"
Beijerinckiaceae"
Alicyclobacillaceae"
Helicobacteraceae"
Xiphinematobacteriaceae"
Crenotrichaceae"
Rhodobacteraceae"
Auran@monadaceae"
Acetobacteraceae"
Xanthobacteraceae"
Tracheophyta"
Erythrobacteraceae"
Enterococcaceae"
Lactobacillaceae"
Phyllobacteriaceae"
Neisseriaceae"
Coxiellaceae"
Sinobacteraceae"
Listeriaceae"
Clostridiaceae"
Chi@nophagaceae"
RickeJsiaceae"
Bacteroidaceae"
Williamsiaceae"
Staphylococcaceae"
Rhodocyclaceae"
Streptococcaceae"
Nocardiaceae"
Nocardioidaceae"
Burkholderiaceae"
Bdellovibrionaceae"
Methylophilaceae"
Family"I"
Bacillaceae"
Brucellaceae"
Cryomorphaceae"
Microbacteriaceae"
Planctomycetaceae"
Bradyrhizobiaceae"
Moraxellaceae"
Legionellaceae"
Hyphomicrobiaceae"
Methylobacteriaceae"
Unassigned"
Rhodospirillaceae"
Oxalobacteraceae"
Paenibacillaceae"
Alcaligenaceae"
Caulobacteraceae"
Verrucomicrobiaceae"
Comamonadaceae"
Xanthomonadaceae"
Sphingobacteriaceae"
Sphingomonadaceae"
Rhizobiaceae"
Enterobacteriaceae"
Pseudomonadaceae"
Cytophagaceae"
Flavobacteriaceae"
MTG$vs$v4v5$
M51.MTG.RC"
M51.MTG.UC"
M51.v4v5.1step"
M51.v4v5.2step"
0"
5"
10"
15"
20"
25"
30"
35"
Micromonosporaceae"
Xanthobacteraceae"
Streptomycetaceae"
Nitrosomonadaceae"
Hyphomonadaceae"
Pseudonocardiaceae"
Methylocystaceae"
Hydrogenophilaceae"
Micrococcaceae"
Rhodospirillaceae"
Rhodocyclaceae"
SAR116"
Beijerinckiaceae"
Unassigned"
PiscirickeFsiaceae"
Paenibacillaceae"
Patulibacteraceae"
Oxalobacteraceae"
Psychromonadaceae"
Bacillaceae"
Erythrobacteraceae"
Acetobacteraceae"
Nocardioidaceae"
Bartonellaceae"
Phyllobacteriaceae"
Hyphomicrobiaceae"
Francisellaceae"
Bradyrhizobiaceae"
Neisseriaceae"
Mycobacteriaceae"
Alteromonadaceae"
Methylobacteriaceae"
Brucellaceae"
Enterobacteriaceae"
Cytophagaceae"
Verrucomicrobiaceae"
Microbacteriaceae"
Caulobacteraceae"
Mycoplasmataceae"
Alcaligenaceae"
Rhizobiaceae"
Xanthomonadaceae"
Nocardiaceae"
Sphingomonadaceae"
Sphingobacteriaceae"
Flavobacteriaceae"
Comamonadaceae"
Moraxellaceae"
Pseudomonadaceae"
MTG$vs$v4v6$
50ng.8cyc.R2"%"
367.v4v6"%"
Other evidence for ‘missing’ taxa
Arcobacter
Epsilons Rikenella
Primer Tm at 1.5-‐3.6mM Mg2+
Concentradon in working mix (uM)
Specific targets?
518F 64 10 Universal 926R1 52 8 Rikenella 926R3 57 1 926R4 55 1 Epsilons
Bacterial v4v5
Thermocycling: 94˚C, 3:00 25 cycles of: [94˚C, 0:30; 55-‐60˚C, 0:45; 72˚C, 1:00] 72˚C, 2:00 4˚C, hold
454 v6v4 and MiSeq v4v5 datasets
Sewage 19_SS 1 step 2 step Bv6v4
20_SS Bv6v4 21_SS Bv6v4
Leech I_GS1_R2 Bv6v4 REA_HDF Bv6v4
OPTIMIZING PCR CONDITIONS
or, how Sharon spent her spring
CondiKons to try
• Primer concentraKon – 0.2-‐0.4 uM
• Magnesium concentraKon – 1.5mM – 3.67mM
• Annealing temperature – Tm of primers suggest anywhere from 52˚C-‐60˚C
• Keep primer composiKon and cycle number constant
Leech
Ladd
er
sewage
Ladd
er
REA_HDF
55˚C 57˚C 60˚C 55˚C 57˚C 60˚C 55˚C 57˚C 60˚C
0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3
A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B
Effect on amplificaNon of
Condidon Leech REA_HDF
Increase temperature Decrease Decrease
Increase primer Increase Increase
Increase magnesium Increase Increase
Annealing temperature, Primers, A: 1.5 mM MgSO4 B: 3.6 mM MgSO4 All NTC at 55˚C
19_SS
Ladd
er
20_SS
Ladd
er
21_SS
55˚C 57˚C 60˚C 55˚C 57˚C 60˚C 55˚C 57˚C 60˚C
0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3
A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B
Effect on amplificaNon of
Condidon 19_SS 20_SS 21_SS
Increase temperature Decrease Decrease Decrease
Increase primer Increase Increase Increase
Increase magnesium Increase Increase Increase
Annealing temperature, Primers, A: 1.5 mM MgSO4 B: 3.6 mM MgSO4 All NTC at 55˚C
(+)/(-‐)
55˚C 57˚C
0.2 0.3 0.2 0.3
A B A B A B A B
Arco_1
Ladd
er
Arco_3
Ladd
er
209_5M
55˚C 57˚C 60˚C 55˚C 57˚C 60˚C 55˚C 57˚C 60˚C
0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3
A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B A B
Effect on amplificaNon of
Condidon Arco_1 Arco_3 209_5M
Increase temperature Decrease Decrease N/A
Increase primer No Effect Minor increase N/A
Increase magnesium Increase Increase N/A
Annealing temperature, Primers, A: 1.5 mM MgSO4 B: 3.6 mM MgSO4 All NTC at 55˚C
(+)/(-‐)
55˚C 57˚C
0.2 0.3 0.2 0.3
A B A B A B A B
Archaeal v4v5 • Try annealing temperature, primer concentraKon, magnesium
concentraKon • Hold constant 1X HiFi Buffer, 0.1 U Taq/uL, template • Use Av6 and Bv6 as posiKve control for template
Primer Tm at 1.5-‐3.6mM Mg2+
Concentradon in working mix (uM)
arc517F1 63 10 arc517F2 59 10 arc517F3 58 10 arc517F4 56 10 arc517F5 56 10 arc958R 57 10
Thermocycling: 94˚C, 3:00 30 cycles of: [94˚C, 0:30; 55-‐60˚C, 0:45; 72˚C, 1:00] 72˚C, 2:00 4˚C, hold
Av4v5
Ladd
er2
Av6 Bv6*
55˚C 60˚C
NTC
55˚C 60˚C
NTC
60˚C
NTC
0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3 0.2 0.3
A B A B A B A B A B A B A B A B A B A B
Annealing temperature, Primers, MgSO4 A: 1.5 mM MgSO4 B: 3.6 mM MgSO4 1X HiFi Buffer, 0.1 U Taq/uL. All NTC at 0.2 uM each primer, 1.5 mM MgCl2, 55˚C
Effect on amplificaNon of
Condidon Av4v5 Av6 Bv6*
Increase temperature Decrease No change N/A
Increase primer Increase Increase Increase
Increase magnesium Increase Increase Increase
Thermocycling: 94˚C, 3:00 30 cycles of: [94˚C, 0:30; 55-‐60˚C, 0:45; 72˚C, 1:00] 72˚C, 2:00 4˚C, hold
*In comparison, the usual cocktail has 1X HiFi Buffer 2 mM MgSO4 0.2 mM each dNTP 0.4 uM each primer 0.02 U/uL Taq Plenty of DNA In 33 uL total
Arcobacter
Epsilons Rikenella
Primer Tm at 1.5-‐3.6mM Mg2+ Specific targets?
518F 64 Universal 926R1 52 Rikenella 926R3 57 926R4 55 Epsilons
Component 454 Bv4v5 Bv6 Av6 Av4v5
X HiFi 1 1 1 1 1
mM MgSO4 3.67 2 2 1.5 1.5
mM dNTPs 0.2 0.2 0.2 0.2 0.2
U/uL Taq 0.1 0.1 0.02 0.1 0.1
uM each primer (set) 0.3 0.4 0.2 0.3 0.3 annealing temp 60 60 60 60 55
cycles 30 25 25 25 30
If you see your products amplify with less Taq... tell us your secrets.
2 mM Mg2+, 0.02 U/ul Taq, 0.2 uM primers
3.67 mM Mg2+, 0.1 U/ul Taq, 0.37 uM primers
Current protocols bac v6 arc v6 bac v4v5 arc v4v5
HiFi buffer 1X 1X 1X 1X MgSO4 2 mM 2 mM 3 mM 3 mM dNTPs 0.2 mM 0.2 mM 0.2 mM 0.2 mM
combined primers 0.2 uM 0.3 uM 0.3 uM 0.3 uM PlaKnum HiFi 4 units 10 units 10 units 10 units Template 5-‐20 ng 5-‐20 ng 5-‐20 ng 5-‐20 ng Volume 100 ul 100 ul 100 ul 100 ul
PCR 3min 94o 94o 94o 94o 30 sec 94o 94o 94o 94o 45 sec 60o 60o 57o 57o 1 min 72o 72o 72o 72o 2 min 72o 72o 72o 72o
Cycles, step 1 25 30 30 30
Pooling, clean up Qiagen MinElute
plate Qiagen MinElute
plate Ampure, 1:1 Ampure, 1:1
Step 2 4 ul step 1 eluate 0 0 0 Cycles, step 2 5 0 0 0 QuanKtaKon Picogreen Picogreen Picogreen Picogreen Size selecKon Pippin Prep Pippin Prep Ampure Ampure
Typical amplificaKon results
4900
1900 2900
100
1100
300
500 700
Archaeal v4v5 pooled, on Caliper GX Bacterial v6 pooled, on Caliper GX
Size selecKon
v6 amplicons (Pippin)
v4v5 amplicons (Ampure)
Illumina amplicon QC
• Filter out reads based on: – ChasKty filter – Ns
• Then ajempt to merge forward and reverse reads • Keep only those that match perfectly over the region of overlap (v6) or have fewer than 3 mismatches (v4v5)
• Create consensus using basecall with higher Qscore • Trim away adapters, primers • Product goes into usual GAST pipeline, but as unique sequences plus frequency info, not all reads
2012-‐13 amplicon sequencing goals
• Complete 454 projects in progress √ • ReKre GS-‐FLX √ • ConKnue to use HS1000 for metagenomics, short reads, V6 √ • Use 96+ mulKplexed V6 amplicons on HiSeq (paired end
perfect overlaps) √ • Use mulKplexed V4 amplicons (no overlap) X • Run V4-‐V5 amplicons on MiSeq (parKal overlap for QC and
merging) √ • ReKre PGM √
Final comments • Amplicon sequencing is moving target
– Not as well supported by next-‐gen companies as shotgun sequencing – Databases growing – Sample sources expanding (animal microbiomes, extreme environments, low
biomass targets, marine/freshwater/sediments/rock surfaces) – Read lengths increasing; paired end reads possible – Processing s/w –many approaches to filtering, trimming
• So— – Look at your raw data – Assess expected coverage by primer sets – OpKmize amplicon protocols and processing – Evaluate accuracy, not just quality (run controls) – Don’t hesitate to discard LQ reads – Expect technological advances