Bio info-r-matics

Posted on 14-Jul-2015

BIO-INFO-R-MATICS

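劉佳鑫

Institute of Statistical Science, Academia Sinica

Bioinformatics Program, Taiwan International Graduate Program (TIGP), Academia Sinica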

HI! STORY~~~C.S.

1939 Atanasoff-Berry Computer (first electronic digital computer)

1954 FORTRAN

1955 First disk storage (IBM)

1963 BASIC

1969 UNIX OS

1970 DRAM

1972 C

1981~85 DOS, Windows, C++

1994 Java

2007

MAY, 1952

Rosalind Elsie Franklin

Photo No. 51

HI! STORY~~~BIOLOGY

1801 Lamarckian Evolution

1822-1884 The theories of Heredity (Gregor Mendel)

1859 Darwinian Evolution: On the Origin of Species

1890-1962 Sir Ronald Aylmer Fisher

neo-Darwinian synthesis

population genetics

ANOVA

The Design of Experiments

Statistics !!

1939 Computer science

1943 Chromosome, gene, protein

1944 DNA was the agent responsible for transferring genetic information

1953 Structure of DNA (A = T, C ≡ G)

1990-2003 Human Genome Project

2008-2012 1000 Genomes Project

DATE! DATE! DATE!

Biology Computer Statistics

Genetics

Evolution

Cell biology

Molecular biology

Biochemistry

Medical science

Bioinformatics

Biology

BIOINFORMATICIAN

5 levels

Level 1: Analyze data on websites ...

Level 2: Install software and run it!

Level 3: Programming (C/R/Java/Perl/Python ...)

Level 4: Write scripts for known algorithms, combine them, and maintain them

Level 5: Develop new algorithms to solve questions in biology

LEVEL 1

Paste

Submit

LEVEL 2

Installation guide

DATA SCIENCE

Visualization

Parsing

LEVEL 3

LEVEL 4

# Bootstrap (Monte Carlo resampling) estimate of the mean with a percentile confidence interval
seed <- 1234
sample.size <- 500
resample.number <- 1000
alpha <- 0.05

set.seed(seed)  # make the random draws reproducible
original.sample <- runif(sample.size, min = 0, max = 1)

# Resample with replacement and record the mean of every run
resample.results <- data.frame(Run.Number = seq_len(resample.number), mean = NA_real_)
for (counter in seq_len(resample.number)) {
  temp <- sample(original.sample, size = length(original.sample), replace = TRUE)
  resample.results$mean[counter] <- mean(temp)
}

# Sort the runs by mean so that row positions act as empirical quantiles
resample.results <- resample.results[order(resample.results$mean), ]

# Row positions of the lower bound, median, and upper bound (rounded to whole rows)
lowerCI.row <- round(resample.number * alpha / 2)
upperCI.row <- round(resample.number * (1 - alpha / 2))
median.row <- round(resample.number / 2)

# Summary table: the first row holds the bootstrap means,
# the second row the run number that produced each of them
mc.table <- data.frame(
  median = c(resample.results$mean[median.row], resample.results$Run.Number[median.row]),
  lowerCI = c(resample.results$mean[lowerCI.row], resample.results$Run.Number[lowerCI.row]),
  upperCI = c(resample.results$mean[upperCI.row], resample.results$Run.Number[upperCI.row]),
  row.names = c("mean", "Run.Number")
)
print(mc.table)

Monte Carlo simulation

Hidden Markov model

Simulated annealing

Bayesian analysis

...

LEVEL 5

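Zhouyi (I Ching): above form and below form

"What is above form is called the Dao; what is below form is called the vessel (qi)."

"Form" (形): the phenomena of heaven and the shapes of the earth

"Dao" (道): the abstract principles that exist above the phenomena of heaven and the shapes of the earth (Metaphysics)

"Vessel" (器): the concrete things produced by the changes of heaven and earth and the interaction of yin and yang

Dao and vessel are inseparable: wherever there are heaven and earth, there is the principle of Taiji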

QUESTION

[Figure: E. coli, lysC, lysine, glucose]

QUESTION

Less than 10% of the human genome has known functions

• Only about 1% codes for protein

Identify functional elements in the Human genome

• Genetic approach

• Evolutionary approach

• Biochemical approach

AIM

GENETIC

Rely on sequence alterations

To establish the biological relevance of a DNA segment

GENETIC

Association ~~~~

• Pearson

• Spearman

• Logistic

Genome-Wide Association Studies (GWAS)

NHGRI GWAS Catalog
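
The three association measures listed on this slide can be tried out in base R. The following is only a minimal sketch on simulated data: the sample size, allele frequency, effect sizes, and all variable names are assumptions made for illustration, and a real GWAS would repeat such a test for every variant and correct for multiple testing.

```r
# Simulated data: every number and name below is hypothetical
set.seed(42)
n <- 400
genotype <- rbinom(n, size = 2, prob = 0.3)   # 0/1/2 copies of the minor allele
trait <- 0.5 * genotype + rnorm(n)            # quantitative trait, assumed additive effect
disease <- rbinom(n, size = 1, prob = plogis(-1 + 0.8 * genotype))  # case/control status

# Pearson: linear association between genotype and a quantitative trait
pearson <- cor.test(genotype, trait, method = "pearson")

# Spearman: rank-based association (exact = FALSE because genotypes are heavily tied)
spearman <- cor.test(genotype, trait, method = "spearman", exact = FALSE)

# Logistic regression: association between genotype and a binary phenotype
logistic <- glm(disease ~ genotype, family = binomial)

assoc <- c(
  pearson.r = unname(pearson$estimate),
  spearman.rho = unname(spearman$estimate),
  logistic.log.odds = unname(coef(logistic)["genotype"])
)
print(round(assoc, 3))
```

The logistic coefficient is a log odds ratio per additional copy of the allele; `exp()` of it gives the odds ratio.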

GENETICS

EVOLUTIONARY

Comparative genomics

Only 5% of mammalian genomes are under strong evolutionary constraint across multiple species (e.g., human, mouse, and dog)

Multiple alignment technology
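
To make the idea of cross-species constraint concrete, here is a toy sketch in base R. It assumes the sequences are already aligned, and the three sequences themselves are invented for illustration. It only counts columns that are identical in all species, which is a much cruder measure than the constraint scores used in real comparative genomics.

```r
# Toy alignment: hypothetical sequences of equal length, "-" marks a gap
alignment <- c(
  human = "ATGCCGTA-C",
  mouse = "ATGCCATA-C",
  dog   = "ATGACGTAGC"
)

# One row per species, one column per alignment position
columns <- do.call(rbind, strsplit(alignment, ""))

# A column counts as conserved when every species shows the same non-gap base
conserved <- apply(columns, 2, function(col) all(col == col[1]) && col[1] != "-")

conserved.fraction <- mean(conserved)
print(conserved.fraction)
```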

BIOCHEMICAL

Detect biochemical activity

Encyclopedia of DNA Elements (ENCODE) Project

50% of nucleotides in the human genome are readily recognizable as repeat elements.

ANOTHER TOPIC IN BIOINFORMATICS

The most popular bio-industry in human history

ANOTHER TOPIC IN BIOINFORMATICS

Enzyme kinetics

ANOTHER TOPIC IN BIOINFORMATICS

Metabolic control analysis

wiki

BIOLOGICAL NETWORKS

Metabolism network

KEGG

2004, NRG, Barabasi

2008, Science Signaling

Signal transduction network

Protein interaction network

BIOLOGICAL NETWORK ANALYSIS

Metabolic control analysis

Flux balance analysis

Sensitivity analysis

Network property analysis

Differential equation

Partial differential equation

Linear programming

Genetic algorithm
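
As a small illustration of the "Differential equation" item, the sketch below integrates a single Michaelis-Menten reaction, dS/dt = -Vmax * S / (Km + S), with the forward Euler method in base R. The parameter values, the step size, and the variable names are assumptions chosen for illustration; a real model would normally use a dedicated ODE solver rather than hand-written Euler steps.

```r
# Hypothetical parameters (not taken from the slides)
vmax <- 1.0    # maximum reaction rate
km <- 0.5      # Michaelis constant
s0 <- 2.0      # initial substrate concentration
dt <- 0.01     # Euler step size
t.end <- 10    # total simulated time

times <- seq(0, t.end, by = dt)
substrate <- numeric(length(times))
substrate[1] <- s0

# Forward Euler: S(t + dt) = S(t) - dt * Vmax * S(t) / (Km + S(t))
for (i in seq_along(times)[-1]) {
  rate <- vmax * substrate[i - 1] / (km + substrate[i - 1])
  substrate[i] <- substrate[i - 1] - dt * rate
}

# Product formed, assuming one product molecule per substrate molecule
product <- s0 - substrate
print(round(c(final.substrate = substrate[length(times)],
              final.product = product[length(times)]), 4))
```

Calling `plot(times, substrate, type = "l")` shows the depletion curve: nearly linear while S is much larger than Km, then approximately exponential.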

TAKE HOME MESSAGE

History

Components of bioinformatics

Levels of involvement in bioinformatics

Some topics in bioinformatics

Keep thinking……

Google!