Genome Sequence determination

陳中庸

E-mail: cychen@cycu.edu.tw

Web site: www.cychen.idv.tw

Complete Microbial Genomes

Genome what now?

Sequencing is…
- Determining the full nucleotide sequence of one strain of an organism
- Making predictions of genes within that sequence and predicting the function of those genes

HARD!!!!

Sequencing requires…
- Time
- Money
- People
- Computers

Genome what now?

Before sequencing…
- Nature of an organism
- Genetic code
- Genome size
- Genome structure

Sequencing means…
- Bioinformatics
- Functional assays
- More…

Organism Selection

Library Creation

Sequencing

Assembly

Gap Closure

Finishing

Annotation

Which steps are computationally expensive?

Which steps have not already been exceptionally well studied?

Which step has not been subjected to a variety of approaches?

Organism Selection

Nature of an organism: Pathogen?

Genetic code

Genome size

Genome structure

Vibrio vulnificus

Strain: YJ016

Genome Size: 5.2 Mb

Source: Southern Taiwan

Significance: Virulence

Strategy: Whole Genome Shotgun

Sequencing Coverage: 10X

Genetic code: Special Code?

Genetic Code Tables
http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c

Genetic code no. and life forms:

1 Standard Code

2 Vertebrate Mitochondrial Code

3 Yeast Mitochondrial Code

4 Mold, Protozoan, & Mycoplasma/Spiroplasma Code

5 Invertebrate Mitochondrial Code

6 Ciliate, Dasycladacean and Hexamita Nuclear Code

9 Echinoderm and Flatworm Mitochondrial Code

10 Euplotid Nuclear Code

11 Bacterial and Plant Plastid Code

12 Alternative Yeast Nuclear Code

13 Ascidian Mitochondrial Code

14 Alternative Flatworm Mitochondrial Code

15 Blepharisma Nuclear Code

16 Chlorophycean Mitochondrial Code

21 Trematode Mitochondrial Code

22 Scenedesmus obliquus Mitochondrial Code

23 Thraustochytrium Mitochondrial Code
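To illustrate why the choice of genetic code matters for gene prediction, here is a minimal sketch comparing translation under table 1 (Standard Code) and table 4 (Mold, Protozoan, & Mycoplasma/Spiroplasma Code), where TGA encodes tryptophan instead of a stop. The 64-letter amino acid strings follow the NCBI layout (codons enumerated with bases in T, C, A, G order); the function names and the example sequence are made up for illustration.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Table 4 differs from the standard amino acid assignments only at TGA (stop -> W).
TABLE_4 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_table(amino_acids):
    """Map each of the 64 codons to its one-letter amino acid ('*' = stop)."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


def translate(dna, table):
    """Translate from position 0 until the first stop codon (toy example)."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


orf = "ATGAAATGAGGTTAA"  # hypothetical sequence with an internal TGA
print(translate(orf, build_table(STANDARD)))  # MK
print(translate(orf, build_table(TABLE_4)))   # MKWG
```

Under the standard code the reading frame stops at TGA, so a gene finder run with the wrong table would truncate many predicted proteins in an organism that uses table 4.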

Genome size: How many Megabases?

Genome structure: Linear/Circular Chromosome? How many?

How to sequence a complete genome?

• Bacterial genome sizes range from about 0.6 Mb (Mycoplasma genitalium) to ~13 Mb (myxobacteria)

• The read length of a DNA sequencing reaction is only ~600 bp (= 0.0006 Mb) ⇒ the genome must be subdivided

• If the genome is subdivided into pieces small enough for sequencing, then:
  • the individual sequences/fragments need to be put back into their "native" order
  • therefore, overlaps between fragments are necessary to re-assemble the pieces

⇒ There are two main sequencing strategies:
1. whole genome shotgun sequencing
2. ordered shotgun sequencing

c = Coverage;
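The formula from this slide did not survive in the transcript, so the following is only a sketch under the usual textbook assumptions: coverage is c = N × L / G (N reads of length L over a genome of size G), and with uniformly random read placement a given base is left uncovered with probability about e^(-c) (the Lander-Waterman approximation). The function names are illustrative.

```python
import math


def coverage(num_reads, read_len, genome_size):
    """Average sequencing depth c = N * L / G."""
    return num_reads * read_len / genome_size


def reads_needed(target_coverage, read_len, genome_size):
    """Number of reads required to reach a target coverage."""
    return math.ceil(target_coverage * genome_size / read_len)


def fraction_uncovered(c):
    """Expected fraction of bases hit by no read, assuming random placement."""
    return math.exp(-c)


genome = 5_200_000   # 5.2 Mb, as in the Vibrio vulnificus example
read_len = 600       # ~600 bp reads
n = reads_needed(10, read_len, genome)
print(n)                                   # 86667
print(fraction_uncovered(5))               # about 0.0067 at 5X
print(fraction_uncovered(10) * genome)     # expected uncovered bases at 10X
```

Real libraries are not perfectly random (some regions clone poorly), so actual gaps are more numerous than this idealized estimate suggests.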

A. Two ends are overlapped
B. Non-overlapped
C. Plasmid percentage in contigs

Library Creation

1. Teamwork
2. Quality control (QC)
3. Timetable
4. Budget
5. Paper

Standard Operating Procedures of a Genome Project

[Flowchart: four stages, A. Decision, B. Library, C. Sequencing, D. Finish. Steps shown: mapping, PCR confirmation, DNA purification, PFG, FISH, shotgun library, picking, deciding the number of plates, plasmid DNA, printing labels, sequencing reactions (dye primers, dye terminators), gel running (377, 3700), assembly, annotation. The steps are tied to Protocols 1 to 12, with five QC checkpoints along the way.]

Library (1): Random Shearing of Genomic DNA

1. Restriction enzyme:
   Sau3AI (GATC): affected by CG methylase
   MboI (GATC): affected by dam methylase, not affected by CG methylase

2. Sonication: sonication → Bal31 repair → T4 DNA polymerase → sizing → recovery → ligation

3. GeneMachine: easy sizing by filter

Library (2): Library clones & Sequencing clones

Genome: Chromosome I (3.3 Mb) and Chromosome II (1.8 Mb)

Shotgun libraries (each sequenced from both ends):
- Library 1: 2.5-3.5 kb inserts, 7X coverage
- Library 2: 5.5-7.5 kb inserts, 3X coverage
- Library 3: 30 kb inserts, cosmid library, 10X clone coverage, 0.4X sequence coverage

Assemble the reads using the phred/phrap/consed software → Contig 1, Contig 2, Contig 3

Close the gaps by primer walking, PCR, or re-sequencing

Annotation

Library (2): Library clones & Sequencing clones

Genome size: 5,000,000 bp; insert read: 1,000 bp per clone

5,000,000 / 1,000 = 5,000 clones ≈ 52 x 96-well plates

10x redundancy: 52 x 10 x 96-well plates of library clones

Sequencing both ends: 2 x 52 x 10 x 96-well plates ≈ 1,000 plates of sequencing clones
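The plate arithmetic above can be sketched as a small function, which makes it easy to redo the estimate for another genome size or redundancy. The function and parameter names are illustrative; fractional plates are kept so the rounding is visible.

```python
def plate_estimate(genome_bp, bp_per_clone, redundancy, wells=96, ends=2):
    """Return (clones at 1x, plates at 1x, library plates, sequencing plates)."""
    clones_1x = genome_bp / bp_per_clone        # clones to tile the genome once
    plates_1x = clones_1x / wells               # 96-well plates for 1x
    library_plates = plates_1x * redundancy     # plates of library clones
    sequencing_plates = library_plates * ends   # one reaction per clone end
    return clones_1x, plates_1x, library_plates, sequencing_plates


clones, plates, lib_plates, seq_plates = plate_estimate(5_000_000, 1_000, 10)
print(clones)                 # 5000.0
print(round(plates, 1))       # 52.1
print(round(lib_plates))      # 521
print(round(seq_plates))      # 1042
```

The result, a little over 1,000 sequencing plates, matches the round figure used for planning.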

Sequencing (1): Time table

1. Instrument throughput (one run = one 96-well plate):
   377: 2 runs per day
   3700: 6 runs per day (POP6) or 8 runs per day (POP5)
   3730: 12 runs per day

2. 377 x 2 sets = 4 runs per day; 3700 x 2 sets = 6 x 1 + 8 x 1 = 14 runs per day; total 18 runs per day

3. 1000 plates / 18 runs per day ≈ 56 working days ≈ 11 weeks (about 3 months)

4. Today, 3730 x 4 sets = 48 runs per day; 1000 plates / 48 ≈ 21 days
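The time-table estimate can be reproduced with a short sketch. The runs-per-day figures are the ones listed on the slide; the dictionary keys, the function name, and the choice to round partial days up are assumptions made here.

```python
import math

# Runs per day per instrument, one run = one 96-well plate (from the slide).
RUNS_PER_DAY = {"377": 2, "3700_POP6": 6, "3700_POP5": 8, "3730": 12}


def days_needed(plates, fleet):
    """fleet maps instrument name -> number of machines; returns (runs/day, days)."""
    runs_per_day = sum(RUNS_PER_DAY[name] * count for name, count in fleet.items())
    return runs_per_day, math.ceil(plates / runs_per_day)


# Two 377s plus two 3700s (one on POP6, one on POP5).
print(days_needed(1000, {"377": 2, "3700_POP6": 1, "3700_POP5": 1}))  # (18, 56)
# Four 3730s.
print(days_needed(1000, {"3730": 4}))                                 # (48, 21)
```

These are machine-days with no downtime, failed runs, or weekends, so they are a lower bound on the calendar time.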

Sequencing (2): Cost

Library cost: 50,000 per genome

Item                    Cost per sample (subtotal)    With 10% fail rate
Plasmid purification    10.30                         11.33
Sequencing              81.96                         90.156
Maintenance fee         4.70                          4.70
Total                   96.96                         106.186

Sequencing for 5 Mb: 500 x 2 x 96 x 106.186 = 10,193,856

Shotgun library: 50,000

Total: 10,243,856
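As a check on the cost arithmetic, here is a minimal sketch that rebuilds the per-sample and project totals from the slide's figures. The currency is not stated in the source; the names are illustrative, and applying the 10% fail rate to purification and sequencing but not to maintenance simply mirrors the numbers in the table.

```python
# Per-sample costs from the slide (currency not stated in the source).
PER_SAMPLE = {"plasmid_purification": 10.3, "sequencing": 81.96, "maintenance": 4.7}
# The slide inflates these two items by the fail rate and leaves maintenance unchanged.
FAIL_RATE_APPLIES = {"plasmid_purification", "sequencing"}


def cost_per_sample(fail_rate=0.10):
    """Per-sample cost including repeats for failed samples."""
    total = 0.0
    for item, cost in PER_SAMPLE.items():
        total += cost * (1 + fail_rate) if item in FAIL_RATE_APPLIES else cost
    return total


def project_cost(plates=500, ends=2, wells=96, library_cost=50_000):
    """Sequencing cost for all samples plus the one-off shotgun library cost."""
    samples = plates * ends * wells
    return samples * cost_per_sample() + library_cost


print(round(cost_per_sample(), 3))  # 106.186
print(round(project_cost()))        # 10243856
```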

Hardware facilities

ABI 377

ABI 3700

MegaBACE 4000

ABI 3730XL

The automated production line for sample preparation at the Whitehead Institute, Center for Genome Research. The system consists of custom-designed factory-style conveyor belt robots that perform all functions from purifying DNA from bacterial cultures through setting up and purifying sequencing reactions.

[Figure: Reads vs. Assembled Contigs. x-axis: assembled reads (0 to 100,000); y-axis: assembled contigs (0 to 1,500); the 5X coverage point is marked; data labels: 166, 279, 243, 245, 328, 359.]

[Figure: Reads and Assembled Size. x-axis: assembled reads (0 to 100,000); y-axis: assembled size (Mbp, 3.5 to 5.3); the 5X coverage point is marked; data labels: 5.17, 5.13, 5.12, 5.10, 5.08, 5.07.]

How does assembly software work?
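The slides that answered this question were images and are not in the transcript. As a stand-in, here is a toy greedy overlap-merge sketch: repeatedly find the pair of sequences with the longest exact suffix-prefix overlap and merge them. This is not how phrap works internally (real assemblers use base quality scores, tolerate sequencing errors, and handle both strands); the reads and function names are invented for illustration.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b, or 0 if < min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlap >= min_len remains."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # whatever is left are the contigs; gaps lie between them
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


reads = ["AATGTTACC", "ATGGCGTGCA", "CGTGCAATGT"]
print(greedy_assemble(reads))  # ['ATGGCGTGCAATGTTACC']
```

When two reads do not overlap anything else, they stay as separate contigs, which is exactly the situation gap closure has to resolve.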


What is Gap Closure?

What are gaps?
- Unsequenced regions located between assembly-generated fragments of contiguous sequence (contigs)

What causes gaps?
- Host toxicity, secondary structure, ???

Back to "gap closure"
- Producing, purifying, and sequencing, or locating, the missing regions of DNA

How Can I Close Gaps?

- Genome walking: blind PCR extension of contigs
- Multiplex PCR: combinatorial trial of every contig pair
- Read pair analysis: use information stored by the assembler to suggest alignments, then PCR
- Comparative alignment
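Read pair analysis can be sketched as follows: when the two end reads of one clone land in different contigs, that clone spans the gap between them. The data structures here (a read-to-contig mapping and a list of mate pairs) and all names are hypothetical; real assemblers store this information in their own formats.

```python
from collections import Counter


def link_contigs(read_to_contig, mate_pairs, min_links=2):
    """Count clones whose two end reads fall in different contigs.

    Returns [((contig_a, contig_b), link_count), ...] for pairs supported by
    at least min_links clones, strongest support first.
    """
    links = Counter()
    for fwd, rev in mate_pairs:
        a = read_to_contig.get(fwd)
        b = read_to_contig.get(rev)
        if a is None or b is None or a == b:
            continue  # unassembled read, or both ends in the same contig
        links[tuple(sorted((a, b)))] += 1
    return [(pair, n) for pair, n in links.most_common() if n >= min_links]


read_to_contig = {
    "c1.f": "contig1", "c1.r": "contig2",
    "c2.f": "contig2", "c2.r": "contig1",
    "c3.f": "contig1", "c3.r": "contig1",
    "c4.f": "contig2", "c4.r": "contig3",
}
mate_pairs = [("c1.f", "c1.r"), ("c2.f", "c2.r"), ("c3.f", "c3.r"), ("c4.f", "c4.r")]
print(link_contigs(read_to_contig, mate_pairs))
# [(('contig1', 'contig2'), 2)]
```

Compared with multiplex PCR, which tries every contig pair, this narrows the PCR experiments to pairs that already have clone evidence.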

Comparative Alignment (the Bioinformatics Approach)

- Find locations where contigs are homologous to known sequences
- Determine if any contigs share homology in the same region of the same sequence
- Design primers
- Conduct PCR with those primers
- Sequence that product and use that sequence to close the gap

Blast Organism X(cross)-Comparison

- Compares contig ends to the NCBI "nr" database with BlastN
- Parses all hits and finds biologically possible contig pairs
- Using the flanking sequence and Primer3, designs primers that will produce a PCR product spanning that gap
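The "find possible contig pairs" step can be sketched like this: group the hits by database sequence, then report ends of different contigs that hit nearby regions of the same sequence. The hit tuples, the `max_gap` threshold, and the function name are assumptions for illustration; running BlastN, parsing its output, checking strand orientation, and calling Primer3 are all left out.

```python
from itertools import combinations


def candidate_pairs(hits, max_gap=5000):
    """hits: (contig_end, subject_id, s_start, s_end), contig_end = (contig, 'L' or 'R').

    Returns (end_a, end_b, subject_id, gap_bp) for ends of different contigs that
    hit the same subject no more than max_gap bases apart.
    """
    by_subject = {}
    for contig_end, subject, s_start, s_end in hits:
        lo, hi = sorted((s_start, s_end))  # hits on the minus strand have start > end
        by_subject.setdefault(subject, []).append((lo, hi, contig_end))

    pairs = []
    for subject, regions in by_subject.items():
        for (lo1, hi1, end1), (lo2, hi2, end2) in combinations(sorted(regions), 2):
            if end1[0] == end2[0]:
                continue  # both ends belong to the same contig
            gap = max(0, lo2 - hi1)  # overlapping hits count as gap 0
            if gap <= max_gap:
                pairs.append((end1, end2, subject, gap))
    return pairs


hits = [
    (("contig1", "R"), "refA", 1000, 1500),
    (("contig2", "L"), "refA", 2700, 2200),
    (("contig1", "L"), "refA", 90000, 90500),
    (("contig3", "L"), "refB", 100, 600),
]
print(candidate_pairs(hits))
# [(('contig1', 'R'), ('contig2', 'L'), 'refA', 700)]
```

Each reported pair is only a hypothesis based on the reference organism's gene order; the PCR product is what confirms it.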

Using the flanking sequence and Primer3, design primers that produce a PCR product spanning that gap

TTATGCTATCGAATTCCGACG GTCTGCAGGTCTTCCGACGTAG

Information to reduce gaps

1. The distance between both end sequences
2. Cosmid anchors
3. Known genes
4. Comparison with other genomes
5. Good luck

Finishing Standards

1. General rules for finishing
   Phase 1: draft sequence assembled in contigs
   Phase 2: contigs ordered and linked
   Phase 3: assembled as one contig with a low error rate (0.01)

2. Strategy of finishing
   A. Primer walking
   B. Re-sequencing individual clones
   C. PCR and sequencing
   D. Screening new clones
   E. Subcloning
   F. Deletion and sequencing
   G. Changing the sequencing chemistry
   H. Restriction map
   I. End sequencing

Shotgun sequencing

- Analogy: shredding several copies of Essential Cell Biology, then putting them back together via overlapping phrases
- Really only good for small genomes; used in 1995 for the genome of Haemophilus influenzae
- Problem: repetitive nucleotide sequences, which make up a large part of vertebrate genomes (analogy: phrases like "the human genome" and the difficulties they cause)

[Figure: Repetitive sequences make correct assembly difficult]

Annotation: Multiple Genes

Annotation                                                        Gene name  Copy number
Methyl-accepting chemotaxis protein                               Tar        54
EAL domain                                                        Rtn        28
Acetyltransferases, including N-acetylases of ribosomal proteins  RimL       14
Permeases of the drug/metabolite transporter (DMT) superfamily    RhaT       20
Permeases of the major facilitator superfamily                    ProP       20

[Figure: Timeline of large-scale genomic analyses. Shown are selected components of work on several non-vertebrate model organisms (red), the mouse (blue) and the human (green) from 1990; earlier projects are described in the text. SNPs, single nucleotide polymorphisms; ESTs, expressed sequence tags.]

SCIENCE VOL. 277, p1453-1462, 1997


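[Figure: timeline, 1998 to 2003, YMGRC/NHRI. Milestones in the order extracted:]

- Set up genome center
- NLBL mapped, over 300 clones
- Veterans General Hospital / Yang-Ming University (榮陽) team
- Ten million bases completed
- Taiwan's first bacterial genome project: Vibrio vulnificus
- Ganoderma (lingzhi) project
- Second bacterial genome project: crucifer black rot bacterium
- Third bacterial genome project: Mycoplasma
- Fourth bacterial genome project: Klebsiella pneumoniae
- Fifth bacterial genome project: 固甲浣菌
- Academia Sinica rice genome
- Food Industry Research and Development Institute Monascus genome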

Vibrio vulnificus

http://genome.nhri.org.tw/vv/

Global features of the Vibrio vulnificus YJ016 genome

Feature                           Chr 1       Chr 2       Plasmid
Size (bp)                         3,340,475   1,857,025   48,508
Total number of sequencing reads  54,662      33,696      2,676
G+C percentage                    46.4        47.2        44.9
Total number of ORFs              3,147       1,625       62
Average ORF size (bp)             938         1,026       659
Percentage coding                 88.4%       89.7%       84.2%
Number of rRNA operons            9           1           0
Number of tRNAs                   87          12          0
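As a sanity check on the table, the coding percentage can be approximated as (number of ORFs × average ORF size) / replicon size. This is only a rough sketch: it ignores overlapping ORFs and the rounding of the average ORF size, and the names are illustrative.

```python
# Figures copied from the global features table.
REPLICONS = {
    "Chr1":    {"size_bp": 3_340_475, "orfs": 3147, "avg_orf_bp": 938},
    "Chr2":    {"size_bp": 1_857_025, "orfs": 1625, "avg_orf_bp": 1026},
    "plasmid": {"size_bp": 48_508,    "orfs": 62,   "avg_orf_bp": 659},
}


def coding_percent(replicon):
    """Approximate percentage of the replicon covered by ORFs."""
    return 100 * replicon["orfs"] * replicon["avg_orf_bp"] / replicon["size_bp"]


for name, replicon in REPLICONS.items():
    print(name, round(coding_percent(replicon), 1))
```

The estimates come out close to the reported 88.4%, 89.7%, and 84.2%, which suggests the table columns were recovered in the right order.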

[Figure: GC% of V. vulnificus Chromosome 1 & 2. Two panels (Chromosome 1, Chromosome 2); x-axis: position in bp (0 to 3.5e+06 for Chromosome 1, 0 to 2e+06 for Chromosome 2); y-axis: GC%, 25 to 55.]

[Figure: GC skew of V. vulnificus Chromosome 1 & 2. Two panels (Chromosome 1, Chromosome 2); x-axis: position in bp (0 to 3.5e+06 for Chromosome 1, 0 to 2e+06 for Chromosome 2); y-axis: GC skew, -0.25 to 0.25 for Chromosome 1 and -0.2 to 0.2 for Chromosome 2.]
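Plots like the GC% and GC skew figures are typically computed over sliding windows. The sketch below assumes the common definitions GC% = 100 × (G + C) / window length and GC skew = (G − C) / (G + C); the window size and step used for the original figures are not given, so the values here are placeholders and the function name is illustrative.

```python
def gc_profile(seq, window, step):
    """Return [(start, gc_percent, gc_skew), ...] over sliding windows."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        gc = g + c
        gc_percent = 100 * gc / window
        gc_skew = (g - c) / gc if gc else 0.0  # avoid division by zero in AT-only windows
        profile.append((start, gc_percent, gc_skew))
    return profile


toy = "GGGGCCAATT" + "CCCCGGAATT"  # made-up sequence: G-rich half, then C-rich half
for start, pct, skew in gc_profile(toy, window=10, step=10):
    print(start, pct, round(skew, 3))
# 0 60.0 0.333
# 10 60.0 -0.333
```

In bacterial chromosomes the sign of the skew usually switches near the replication origin and terminus, which is why such plots are used to locate them.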

[Figure: Comparison of the similarity between the V.v. and V.c. genomes (ACT/blastn at E-value = 1, score > 500). Panels compare VV1 and VV2 against VC1 and VC2.]

[Figure: Circular presentation of the Vibrio vulnificus YJ016 genome: Chromosome 1 (3.3 Mb), Chromosome 2 (1.85 Mb), Plasmid pYJ016 (48.5 kb).]

Comparison of predicted genes of V. vulnificus YJ016, V. cholerae El Tor N16961, and E. coli K12

Category                                                        V.v.  Paralogous genes  Vulnificus-specific  V.c.  E. coli
Cellular Processes                                              783   257(67)           168(66)              631   839
  Cell envelope biogenesis and outer membrane                   156   51(17)            33(12)               124   207
  Cell motility and secretion                                   184   88(14)            11(7)                153   159
  Cell division and chromosome partitioning                     34    12(5)             3(2)                 30    29
  Posttranslational modification, protein turnover, chaperones  108   31(16)            33(17)               91    117
  Inorganic ion transport and metabolism                        148   27(11)            58(20)               100   188
  Signal transduction mechanisms                                153   48(4)             30(8)                133   139
Information Storage and Processing                              618   204(71)           58(20)               415   651
  DNA replication, recombination and repair                     209   69(27)            19(6)                130   227
  Transcription                                                 220   69(20)            23(6)                132   261
  Translation, ribosomal structure and biogenesis               189   66(24)            16(8)                153   163
Metabolism                                                      941   379(145)          110(55)              724   1303
  Lipid metabolism                                              78    25(11)            21(8)                58    86
  Nucleotide transport and metabolism                           75    28(13)            3(3)                 62    86
  Coenzyme metabolism                                           118   32(11)            8(3)                 108   125
  Carbohydrate transport and metabolism                         194   113(39)           17(6)                103   347
  Amino acid transport and metabolism                           227   67(31)            32(21)               200   353
  Energy production and conversion                              173   68(32)            17(9)                125   215
  Secondary metabolism biosynthesis, transport and catabolism   76    46(8)             12(5)                68    91
Poorly Characterized                                            2543                                         972
  Function unknown                                              236   ND                ND                   199   302
  General function prediction only                              269   ND                ND                   193   335
  Hypothetical protein                                          2038  ND                ND                   580   799

Some more technological approaches… (some of which really work!)

• Sequencing by hybridization (annealing)
• Sequencing by "ligase-edited" annealing
• Pyrosequencing

Note: there are also higher-tech versions of "classic" Sanger sequencing in the works (see http://www.helicosbio.com)

Several companies are pursuing massively parallel (= cheaper) new DNA sequencing strategies, including some that involve single-molecule analyses.

Some of the main players are given below:
- 454 Life Sciences (http://www.454.com/enabling-technology/the-system.asp)
- Solexa (now part of Illumina) (http://www.illumina.com/pages.ilmn?ID=203)
- Helicos BioSciences (http://www.helicosbio.com)
- VisiGen Biotechnologies (http://www.visigenbio.com/technology.html)

Solexa sequencing technology
