Post on 05-Jan-2016
A Whirlwind Tour of Bioinformatics
Kun-Mao Chao (趙坤茂)
National Taiwan University
http://www.csie.ntu.edu.tw/~kmchao/
The Best? The Cheapest?
The Best Entrance The Cheapest
Bio-X? X-Informatics?
Bio-X Bioinformatics X-Informatics
Source: NIH, Bioinformatics Journal, NPS
Interdisciplinary Pioneers
Leonardo da Vinci, Isaac Newton, Archimedes of Syracuse
Source: Wikipedia
Amphibia, Triphibia
Source: Wikipedia, xplanes
My Journey
Band Alignment (Joint work with W. Pearson and W. Miller, 1992)
Figure labels: Seq. 1, Seq. 2
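A minimal sketch of the idea behind band alignment, assuming a simple linear gap penalty: the dynamic-programming table is filled only for cells (i, j) with |i - j| <= width, so the work grows with the band width rather than with the full table. The function name, the scoring values, and the symmetric band around the main diagonal are illustrative assumptions; this is not the algorithm of the 1992 paper, only a score-only illustration of restricting alignment to a band.

```python
def banded_global_score(a, b, width, match=1, mismatch=-1, gap=-2):
    """Best global alignment score using only cells with |i - j| <= width.

    Hypothetical helper for illustration; scoring values are assumptions.
    """
    n, m = len(a), len(b)
    if abs(n - m) > width:
        raise ValueError("band must contain the final cell (n, m)")
    neg_inf = float("-inf")
    # Row 0: leading gaps, limited to the columns inside the band.
    prev = {j: j * gap for j in range(0, min(m, width) + 1)}
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - width), min(m, i + width) + 1):
            # a[i-1] aligned against a gap (move down from the previous row).
            best = prev.get(j, neg_inf) + gap
            if j > 0:
                # b[j-1] aligned against a gap (move right within this row).
                best = max(best, cur.get(j - 1, neg_inf) + gap)
                # a[i-1] aligned with b[j-1] (diagonal move).
                s = match if a[i - 1] == b[j - 1] else mismatch
                best = max(best, prev.get(j - 1, neg_inf) + s)
            cur[j] = best
        prev = cur
    return prev[m]


if __name__ == "__main__":
    print(banded_global_score("ACGT", "AGT", width=1))
```

Cells outside the band are treated as unreachable, so a band that is too narrow can miss the true optimum; widening the band trades running time for safety.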
Alignment in an Arbitrary Region (Joint work with R. C. Hardison and W. Miller, 1993)
Aligning Very Similar Sequences (Joint work with J. Zhang, J. Ostell and W. Miller, 1997)
Generalized Global Alignment (Joint work with X. Huang, 2003)
Tag SNPs & Haplotype Inference (Joint work with Y.-T. Huang et al., 2006)
Yao-Ting Huang, Ting Chen, Kui Zhang, Chia-Jung Chang, Kun-Mao Chao
Sequence Comparison: Theory and Methods (Joint work with L. Zhang, 2009)
Bioinformatics for Biologists
Edited by Pavel Pevzner and Ron Shamir
Cambridge University Press, 2011
A Brief Introduction
Central Dogma of Molecular Biology
Source: http://www.ncbi.nlm.nih.gov
From Genes to Proteins
Source: http://www.ornl.gov
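As a rough illustration of the path from gene to protein, the sketch below transcribes a DNA coding strand into mRNA and translates it codon by codon with the standard genetic code. The function names and the example sequence are assumptions made for illustration and do not come from the slides; real genes also involve steps such as splicing that are ignored here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the mRNA for a DNA coding strand (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]  # raises KeyError on non-ACGU input
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")  # hypothetical example sequence
    print(mrna, translate(mrna))
```

Each amino acid is written with its one-letter code, and `*` in the table marks a stop codon.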
Double Helix
Source: http://www.nature.com
A Brief History of Genetics
• 1859 Charles Darwin published “On the Origin of Species.”
• 1865 Genes are particulate factors. [Gregor Mendel]
• 1869 Discovery of nucleic acid [Friedrich Miescher]
• 1903 Chromosomes are hereditary units. [Walter Sutton]
• 1910 Genes lie on chromosomes. [Thomas Hunt Morgan]
• 1913 Chromosomes are linear arrays of genes. [Alfred Sturtevant]
• 1931 Recombination occurs by crossing over. [Harriet Creighton and Barbara McClintock]
A Brief History of Genetics (cont’d)
• 1944 DNA is the genetic material. [Oswald Avery, Colin MacLeod and Maclyn McCarty]
• 1953 DNA is a double helix. [James Watson and Francis Crick]
• 1961-1967 Genetic code is triplet. [Marshall Nirenberg, Har Gobind Khorana, Sydney Brenner & Francis Crick]
• 1977 DNA was sequenced for the first time. [Fred Sanger, Walter Gilbert, and Allan Maxam]
• 21st century: Many genomes completely sequenced
MIT OpenCourseWare: Biology 7.012 Introduction to Biology
Multiple Nobel Laureates
Milestones of Bioinformatics
• 1962 Pauling's theory of molecular evolution
• 1965 Margaret Dayhoff's Atlas of Protein Sequences
• 1970 Needleman-Wunsch algorithm
• 1977 DNA sequencing and software to analyze it (Staden)
• 1981 Smith-Waterman algorithm developed
• 1981 The concept of a sequence motif (Doolittle)
• 1982 GenBank Release 3 made public
• 1982 Phage lambda genome sequenced
Milestones of Bioinformatics (cont’d)
• 1983 Sequence database searching algorithm (Wilbur-Lipman)
• 1985 FASTP/FASTN: fast sequence similarity searching
• 1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
• 1990 BLAST: fast sequence similarity searching
• 1991 EST: expressed sequence tag sequencing
• 1993 Sanger Centre, Hinxton, UK
• 1994 EMBL European Bioinformatics Institute, Hinxton, UK
Milestones of Bioinformatics (cont’d)
• 1995 First bacterial genomes completely sequenced
• 1996 Yeast genome completely sequenced
• 1997 PSI-BLAST
• 1998 Worm (multicellular) genome completely sequenced
• 1999 Fly genome completely sequenced
Milestones of Bioinformatics (cont’d)
• Human Genome Project (1990-2003)
• Mouse 2002
• Rat 2004
• Chimpanzee 2005
• Completed Genomes
Chimpanzee Genome
The Primate Family Tree
Source: Nature
Topics
• Sequencing and genotyping technologies
• Molecular sequence analysis
• Recognition of genes and regulatory elements
• Comparative genomics
• Gene expression
• Molecular structural biology
• Biological networks
• Systems biology
• Computational proteomics
• Molecular evolution
• Phylogenetic trees
• Population genetics
• Medical informatics
Bioinformatics Centers
• National Center for Biotechnology Information (NCBI, NIH):
– http://www.ncbi.nlm.nih.gov/
• European Bioinformatics Institute (EBI):
– http://www.ebi.ac.uk/
• DNA Data Bank of Japan (DDBJ):
– http://www.ddbj.nig.ac.jp/index-e.html
• UCSC Genome Browser Home
• RCSB Protein Data Bank
Bioinformatics Departments
Computational Biology and Bioinformatics, USC
Bioinformatics and Systems Biology, UCSD
The Broad Institute of MIT and Harvard
Computational and Genomic Biology, UC Berkeley
Biomedical Informatics Research, Stanford University
Comparative Genomics and Bioinformatics, Penn State
Penn Center for Bioinformatics
Max Planck Institute for Molecular Genetics
Bioinformatics and Computational Biology, Iowa State
Bioinformatics Journals
Bioinformatics
Journal of Computational Biology
Genome Research
Nature
Nucleic Acids Research
PLoS Computational Biology
Science
Nature & Science
Bioinformatics Conferences
The Annual International Conference on Research in Computational Molecular Biology (RECOMB)
The Symposium on Intelligent Systems for Molecular Biology (ISMB)
The European Conference on Computational Biology (ECCB)
Books
Books (cont’d)
• All grading for the 100+ homework problems in the book is automatically done through the popular online bioinformatics education website Rosalind. All problems represent programming challenges with randomized input given to students.
Bioinformatics Community
• The International Society for Computational Biology (ISCB)
– Senior Scientist Accomplishment Award
Ten Steps to Success in Bioinformatics by Webb Miller
1. Become a biologist.
2. Value your number of citations above your number of publications.
3. Collaborate, and do it with great collaborators.
4. Do not expect a warm welcome from everyone.
5. Be a good collaborator.
6. Distribute and maintain software and/or run web servers that you personally continue to use.
7. Alternate between working on specific datasets and writing general-purpose software.
8. Write some of your own software.
9. Don't give up.
10. Be excited about your work.