Basic Introduction of BLAST
Jundi Wang
School of Computing
CSC691
09/08/2013
2
Overview
1. Introduction of BLAST
   Background of BLAST
   Programs in BLAST
   Function of BLAST
2. Application of BLAST
   BLAST web version
   Stand-alone BLAST
3
Background of BLAST
BLAST (Basic Local Alignment Search Tool):
1. The most widely used sequence similarity search tool.
2. BLAST is a family of programs that:
   a) Compare protein queries to protein databases
   b) Compare nucleotide queries to nucleotide databases
4
Background of BLAST
How BLAST finds similar sequences:
BLAST finds similar sequences by locating short matches between the two sequences. Starting from an initial match, BLAST extends it in both directions to build a local alignment.
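The idea of starting from a short match and growing a local alignment can be sketched in a few lines of Python. This is a simplified illustration, not the actual BLAST implementation: the function names, the word size of 4, the +1/-1 scoring, and the drop-off threshold are all assumptions chosen for readability, and the extension is ungapped.

```python
def find_seeds(query, subject, word_size=4):
    """Return (query_pos, subject_pos) pairs where a word matches exactly."""
    # Index every word of the subject sequence by its start position.
    index = {}
    for i in range(len(subject) - word_size + 1):
        index.setdefault(subject[i:i + word_size], []).append(i)
    # Look up every word of the query in that index.
    seeds = []
    for q in range(len(query) - word_size + 1):
        for s in index.get(query[q:q + word_size], []):
            seeds.append((q, s))
    return seeds


def extend_seed(query, subject, q, s, word_size=4, match=1, mismatch=-1, drop=2):
    """Extend a seed without gaps in both directions.

    Extension in one direction stops when the running score falls
    `drop` or more below the best score seen so far (assumed threshold).
    Returns (best_score, aligned_query_segment, aligned_subject_segment).
    """
    best = word_size * match

    # Extend to the right of the seed.
    current, best_right, k = best, 0, 0
    while q + word_size + k < len(query) and s + word_size + k < len(subject):
        same = query[q + word_size + k] == subject[s + word_size + k]
        current += match if same else mismatch
        k += 1
        if current > best:
            best, best_right = current, k
        elif best - current >= drop:
            break

    # Extend to the left of the seed.
    current, best_left, k = best, 0, 0
    while q - k - 1 >= 0 and s - k - 1 >= 0:
        same = query[q - k - 1] == subject[s - k - 1]
        current += match if same else mismatch
        k += 1
        if current > best:
            best, best_left = current, k
        elif best - current >= drop:
            break

    q_start, q_end = q - best_left, q + word_size + best_right
    s_start, s_end = s - best_left, s + word_size + best_right
    return best, query[q_start:q_end], subject[s_start:s_end]
```

For the toy sequences "TTACGTACGG" and "GGACGTACCC", find_seeds reports three overlapping word matches, and extending the first one yields the shared segment "ACGTAC".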
5
Programs in BLAST
There are several BLAST programs available for different analytic purposes.
Nucleotide-nucleotide BLAST (blastn)
This program, given a DNA query, returns the most similar DNA sequences from the DNA database that the user specifies.
Protein-protein BLAST (blastp)
This program, given a protein query, returns the most similar protein sequences from the protein database that the user specifies.
6
Programs in BLAST
Nucleotide 6-frame translation-protein (blastx)
This program compares the six-frame conceptual translation products of a nucleotide query sequence against a protein sequence database.
Nucleotide 6-frame translation-nucleotide 6-frame translation (tblastx)
This program translates the query nucleotide sequence in all six possible frames and compares it against the six-frame translations of a nucleotide sequence database.
7
Programs in BLAST
Protein-nucleotide 6-frame translation (tblastn)
This program compares a protein query against a nucleotide sequence database translated in all six reading frames.
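The choice among the five programs depends only on the type of the query, the type of the database, and whether nucleotide sequences are compared as translations. A minimal sketch of that decision follows; the function name choose_program and its string arguments are hypothetical and exist only for illustration.

```python
def choose_program(query_type, db_type, translated=False):
    """Pick a BLAST program name from the query and database types.

    query_type and db_type are "nucleotide" or "protein".
    `translated` only matters when both are nucleotide: it selects
    tblastx (compare six-frame translations) instead of blastn.
    """
    if query_type == "protein" and db_type == "protein":
        return "blastp"
    if query_type == "nucleotide" and db_type == "protein":
        return "blastx"   # query is translated in six frames
    if query_type == "protein" and db_type == "nucleotide":
        return "tblastn"  # database is translated in six frames
    if query_type == "nucleotide" and db_type == "nucleotide":
        return "tblastx" if translated else "blastn"
    raise ValueError("types must be 'nucleotide' or 'protein'")
```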
8
Six-Frame Translation
Once a gene has been sequenced, it is important to determine the correct open reading frame (ORF). Every region of DNA has six possible reading frames, three on each strand. The reading frame that is used determines which amino acids will be encoded by a gene. Typically only one reading frame is used in translating a gene (in eukaryotes). The ORF starts with a start codon (ATG) and ends with a stop codon (TAA, TAG, or TGA).
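The six reading frames can be produced with a short Python sketch: three offsets on the given strand and three on its reverse complement. The function names are illustrative assumptions, the standard genetic code is assumed, and stop codons are written as "*".

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames["+%d" % (offset + 1)] = translate(dna[offset:])
        frames["-%d" % (offset + 1)] = translate(reverse[offset:])
    return frames
```

For the short sequence ATGGCCTAA, frame +1 reads ATG GCC TAA and gives "MA*": a start codon, one alanine, and a stop codon.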
9
Six-Frame Translation
Example:
10
Function of BLAST
BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.
11
Application of BLAST
BLAST web version:
Advantages:
1. It is convenient to operate.
2. The databases are updated synchronously.
Weaknesses:
1. It is not well suited to analyzing large-scale data.
2. Programmers cannot customize the database.
http://www.ncbi.nlm.nih.gov/BLAST/
12
Application of BLAST
Stand-alone BLAST:
Advantages:
1. It can be used to analyze large-scale data.
2. Programmers can customize the database.
3. Programmers can download different versions for different operating systems.
Weaknesses:
1. It is difficult for users who do not have a computer science background.
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/
13
Application of BLAST
Statistics in BLAST
1. Score: A value calculated from the number of gaps and substitutions associated with each aligned sequence. A higher score indicates a better alignment.
2. E value: The number of alignments with a similar or better score that are expected to occur in a database search by chance. The lower the E value, the more significant the match.
14
Application of BLAST
3. Identities: The number of aligned positions at which the query sequence and the database sequence have the same residue.
4. Positives: The number of aligned positions at which the residues are identical or similar, that is, score positively in the substitution matrix.
5. Gaps: The number of gap positions introduced into the alignment of the query sequence and the database sequence.
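The statistics above can be illustrated on a single gapped alignment. The following Python sketch counts identities and gaps, computes a raw score, and evaluates the Karlin-Altschul formula E = K * m * n * exp(-lambda * S). The scoring values (match +1, mismatch -3, a flat gap penalty of -5) and the function names are assumptions for illustration; real BLAST uses affine gap costs and takes lambda and K from its scoring system. Positives are omitted because they require a protein substitution matrix.

```python
import math


def alignment_stats(query_aln, subject_aln, match=1, mismatch=-3, gap=-5):
    """Count identities and gaps and compute a raw score.

    Both arguments are aligned strings of equal length in which '-'
    marks a gap. The scoring parameters are illustrative assumptions.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    identities = gaps = score = 0
    for a, b in zip(query_aln, subject_aln):
        if a == "-" or b == "-":
            gaps += 1
            score += gap
        elif a == b:
            identities += 1
            score += match
        else:
            score += mismatch
    length = len(query_aln)
    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        "score": score,
        "percent_identity": 100.0 * identities / length,
    }


def expected_hits(raw_score, query_len, db_len, lam, k):
    """E value: K * m * n * exp(-lambda * S).

    lam and k depend on the scoring system; the caller must supply them.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)
```

Because the E value scales with the product of the query length and the database size, the same alignment score is less significant in a larger database.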
15
Application of BLAST (web version)
NCBI BLAST web page
Nucleotide Alignment
Protein Alignment
16
Application of BLAST (web version)
Query Sequence
Upload File
Query Subrange
Select Database
17
Application of BLAST (web version)
Select Algorithm
E value limitation
18
Application of BLAST (web version)
Click “Mouse” to check the detail
19
Application of BLAST (web version)
100% Identity
No Gap
The value of the score is the result of the score matrix
20
Application of BLAST (web version)
All compared sequences
NCBI Accession ID
21
Application of BLAST (Stand-alone Version)
Download and install Stand-alone BLAST
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/
Download the database from NCBI
ftp://ftp.ncbi.nlm.nih.gov/blast/db/
Download and install ActivePerl from ActiveState
http://www.activestate.com/activeperl
22
Application of BLAST (Stand-alone Version)
Build local database
1. Enter the BLAST folder and create a database (db) folder.
2. Extract the downloaded database into the db folder.
Link the database to BLAST
1. Execute cmd.exe and link the database to BLAST using Perl.
Modify the environment variables
1. Set the new path variable so that BLAST can be recognized.
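The extraction and environment steps can also be scripted. The sketch below is an alternative illustration in Python rather than the Perl-based procedure from the slides: it unpacks a downloaded .tar.gz database archive into a db folder and sets the BLASTDB environment variable, which BLAST+ consults when looking up database names. The function name and the paths passed to it are hypothetical.

```python
import os
import tarfile


def install_database(archive_path, db_dir):
    """Extract a downloaded database archive and point BLASTDB at it.

    archive_path: a .tar.gz file downloaded from the NCBI db directory.
    db_dir: the local database folder (created if it does not exist).
    Returns the sorted list of files now present in db_dir.
    """
    os.makedirs(db_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(db_dir)
    # This affects only the current process and programs it launches;
    # a permanent setting must be made in the operating system.
    os.environ["BLASTDB"] = os.path.abspath(db_dir)
    return sorted(os.listdir(db_dir))
```

Setting the variable inside the script is convenient when BLAST is launched from the same script; for interactive use in cmd.exe the variable still has to be set in the system settings.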
23
Application of BLAST (Stand-alone Version)
Create a query sequence in FASTA format.
Start with “>”
Followed by the name or description of the query sequence
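A FASTA record is a header line beginning with ">" followed by one or more lines of sequence. A minimal Python sketch for writing and reading such a file is shown here; the function names and the 60-character line width are assumptions made for illustration.

```python
def write_fasta(path, name, sequence, width=60):
    """Write one sequence to a FASTA file, wrapping lines at `width`."""
    with open(path, "w") as handle:
        handle.write(">" + name + "\n")
        for i in range(0, len(sequence), width):
            handle.write(sequence[i:i + width] + "\n")


def read_fasta(path):
    """Return a list of (name, sequence) tuples from a FASTA file."""
    records = []
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            else:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```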
24
Application of BLAST (Stand-alone Version)
Example: Compare the query sequence with the sequences from the “refseq_rna.00” database.
Different programs in the BLAST package
Link “refseq_rna.00” to BLAST
Name of database
25
Application of BLAST (Stand-alone Version)
The basic information of the current database
26
Application of BLAST (Stand-alone Version)
Execute “blastn” program
Import the query sequence
Import the target database
Report the result in a new file
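The four callouts above correspond to the program name and the -query, -db, and -out options of the BLAST+ blastn command. A hedged Python sketch that assembles and runs such a command is given below; the file names query.fasta and result.txt are hypothetical placeholders, and the helper names are not part of BLAST.

```python
import shutil
import subprocess


def build_blastn_command(query, database, output, evalue=None):
    """Assemble a blastn command line as a list of arguments."""
    command = ["blastn", "-query", query, "-db", database, "-out", output]
    if evalue is not None:
        # Optional E value threshold for reported hits.
        command += ["-evalue", str(evalue)]
    return command


def run_blastn(query, database, output, evalue=None):
    """Run blastn; requires BLAST+ to be installed and on the PATH."""
    if shutil.which("blastn") is None:
        raise FileNotFoundError("blastn was not found on the PATH")
    subprocess.run(
        build_blastn_command(query, database, output, evalue), check=True
    )
```

For example, build_blastn_command("query.fasta", "refseq_rna.00", "result.txt") produces the equivalent of typing blastn -query query.fasta -db refseq_rna.00 -out result.txt at the command prompt.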
27
Application of BLAST (Stand-alone Version)
The length of the compared sequence
NCBI Accession ID
All compared sequences
Statistical evaluation
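The report shown on the slide is the default text output. If the search is rerun with the BLAST+ option -outfmt 6, each hit is written as one tab-separated line with twelve standard columns, which is easier to process in a program. The sketch below parses one such line; the function name is an assumption, and the sample line used with it is made up for illustration, not a real search result.

```python
# Column names of the default BLAST+ tabular format (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_FIELDS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pident", "evalue", "bitscore")


def parse_tabular_hit(line):
    """Parse one tab-separated hit line into a dict with typed values."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError("expected %d columns, got %d" % (len(FIELDS), len(values)))
    hit = dict(zip(FIELDS, values))
    for key in INT_FIELDS:
        hit[key] = int(hit[key])
    for key in FLOAT_FIELDS:
        hit[key] = float(hit[key])
    return hit


# Made-up example line, for illustration only.
example = "query1\tsubject1\t100.000\t50\t0\t0\t1\t50\t101\t150\t1e-20\t93.5"
hit = parse_tabular_hit(example)
```

Here sseqid carries the identifier of the database sequence, length is the alignment length, and pident, evalue, and bitscore give the statistical evaluation.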
28
Application of BLAST (Stand-alone Version)
29
Application of BLAST (Stand-alone Version)
Summary
DNA Sequencing in a new species
NCBI BLAST
Database
Query
Import
Output
31
Thank You