Introduction to Galaxy (UEB-UAT Bioinformatics Course - Session 2.2 - VHIR, Barcelona)
1
Vall d’Hebron Institut de Recerca (VHIR)
Alex Sánchez
15/05/2014
Institut d’Investigació Sanitària acreditat per l’Instituto de Salud Carlos III (ISCIII)
Introduction to Galaxy
A web-based genome analysis platform
BIOINFORMATICS FOR
BIOMEDICAL RESEARCH
2
• Galaxy overview and Interface
• Getting Data in Galaxy
• Analyzing Data in Galaxy
– Quality Control
– Mapping Data
• History and workflow
• Galaxy Exercises
NGS Analysis Using Galaxy
3
What is Galaxy
• Galaxy is an open-source framework for integrating various computational tools and databases into a cohesive workspace.
But it can also be seen as:
• A web-based service integrating many popular tools and resources for comparative genomics.
And also:
• A completely self-contained application for building your own Galaxy-style sites.
4
http://galaxyproject.org
5
Galaxy Conceptual Framework
6
Galaxy Interface Sections
The left column contains links to the downloading, preparation and analysis tools.
The center column is where the menus and data will appear.
The right column shows the history of your analysis steps, lets you view data and results, and more.
Register User
7
Getting Data
Click Get Data
8
Getting Data: Table Browser
Get Table Main
9
Getting Data: UCSC Table Browser
clade: Mammal
genome: Human
assembly: [current]
group: Genes and…
track: UCSC Genes
table: knownGene
region: position, chrX
output format: BED, and check "Send output to Galaxy"
Get Output
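The BED output requested above is tab-separated text with one interval per line. As a minimal sketch of what such a line contains, the Python snippet below reads the first four BED columns. The helper name `parse_bed_line` and the example line are hypothetical and not taken from the slides; the coordinates and gene name are made up for illustration.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int  # BED starts are 0-based
    end: int    # BED ends are exclusive
    name: str


def parse_bed_line(line: str) -> BedRecord:
    """Parse one tab-separated BED line (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start and end")
    # The name column is optional in BED; use an empty string if absent.
    name = fields[3] if len(fields) > 3 else ""
    return BedRecord(fields[0], int(fields[1]), int(fields[2]), name)


# Made-up example line, not real knownGene output.
example = "chrX\t100\t250\texample_gene\t0\t+"
record = parse_bed_line(example)
print(record.chrom, record.name, record.end - record.start)
```

Because the start is 0-based and the end is exclusive, the interval length is simply end minus start.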
10
Getting Data: Upload File
Upload File
Execute
File Format
Species
Upload or paste file
11
Getting Data: Upload File
Enter multiple URLs in the "URL / Text" box
12
• Sequences and Alignment Format
• Galaxy overview and Interface
• Getting Data in Galaxy
• Analyzing Data in Galaxy
– Text Manipulation tools
– Filter and Sort
– Operate on Genomic Intervals
– Quality Control
– Mapping Data
• History and workflow
• Galaxy Exercises
NGS Analysis Using Galaxy
13
Text Manipulation Tools
14
Filter and Sort
15
Operate on Genomic Intervals
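To give an idea of what an interval operation does, here is a minimal Python sketch of intersecting two intervals. It is only an illustration of the concept, not the code Galaxy runs; the function name `intersect` and the coordinates are assumptions, and intervals are taken to be half-open (start included, end excluded) as in BED.

```python
from typing import Optional, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the overlapping part of two intervals, or None if they do not overlap."""
    chrom_a, start_a, end_a = a
    chrom_b, start_b, end_b = b
    if chrom_a != chrom_b:
        return None  # intervals on different chromosomes never overlap
    start = max(start_a, start_b)
    end = min(end_a, end_b)
    # With half-open intervals, touching intervals (end == start) share no bases.
    return (chrom_a, start, end) if start < end else None


# Made-up coordinates for illustration.
print(intersect(("chrX", 100, 200), ("chrX", 150, 300)))
```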
16
Fasta Manipulation
17
Analyzing Data: Next Generation Sequencing
18
Analyzing Data: Next Generation Sequencing
FASTQ file manipulation, such as format conversion, summary statistics, trimming reads, and filtering reads by quality score…
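The trimming and quality summary steps listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Galaxy tools themselves: it assumes Sanger-encoded qualities (ASCII offset 33), and the function names, the threshold of 20 and the example read are all made up.

```python
from typing import List, Tuple


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert a FASTQ quality string to Phred scores (Sanger offset assumed)."""
    return [ord(char) - offset for char in quality]


def trim_3prime(seq: str, qual: str, min_q: int = 20) -> Tuple[str, str]:
    """Remove bases from the 3' end while their quality is below min_q."""
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def mean_quality(qual: str) -> float:
    """Average Phred score of a read; 0.0 for an empty read."""
    scores = phred_scores(qual)
    return sum(scores) / len(scores) if scores else 0.0


# Made-up read: 'I' encodes Phred 40 and '#' encodes Phred 2 at offset 33.
seq, qual = "ACGTACGT", "IIIIII##"
print(trim_3prime(seq, qual))
print(mean_quality(qual))
```

A read could then be kept or discarded by comparing its mean quality, or its length after trimming, against a chosen cutoff.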
19
Analyzing Data: Next Generation Sequencing
Input: sanger FASTQ
Output: SAM format
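The SAM output of the mapping step is tab-separated text with eleven mandatory columns per alignment. As a rough sketch of how one alignment line could be read, the Python below splits those columns and decodes two flag bits. The helper `parse_sam_line` and the example line are hypothetical; the read name, position and sequence are made up.

```python
from typing import Any, Dict

# The eleven mandatory SAM columns, in order.
SAM_FIELDS = ("qname", "flag", "rname", "pos", "mapq", "cigar",
              "rnext", "pnext", "tlen", "seq", "qual")


def parse_sam_line(line: str) -> Dict[str, Any]:
    """Parse the mandatory columns of one SAM alignment line (hypothetical helper)."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) < len(SAM_FIELDS):
        raise ValueError("a SAM alignment line needs 11 mandatory columns")
    record: Dict[str, Any] = dict(zip(SAM_FIELDS, parts[:11]))
    for key in ("flag", "pos", "mapq", "pnext", "tlen"):
        record[key] = int(record[key])
    # Flag bit 0x4 marks an unmapped read, bit 0x10 a reverse-strand alignment.
    record["is_unmapped"] = bool(record["flag"] & 0x4)
    record["is_reverse"] = bool(record["flag"] & 0x10)
    return record


# Made-up alignment line for illustration.
example = "read1\t16\tchrX\t1000\t37\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII"
alignment = parse_sam_line(example)
print(alignment["rname"], alignment["pos"], alignment["is_reverse"])
```

Note that SAM positions are 1-based, unlike the 0-based starts used in BED files.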
20
Analyzing Data: Next Generation Sequencing
21
• Sequences and Alignment Format
• Galaxy overview and Interface
• Getting Data in Galaxy
• Analyzing Data in Galaxy
– Quality Control
– Mapping Data
• History and workflow
• Galaxy Exercises
NGS Analysis Using Galaxy
22
History: History Options
List saved and shared histories.
Work on the current history: create new, clone, share, create a workflow, set permissions, show deleted datasets, or delete the history.
List saved histories
23
Workflow
Creating a workflow allows the user to repeat an analysis using different datasets.