Genome browsers Apr 2013.ppt - Massachusetts Institute of ...

Genome browsers:

Discovering biology through genomics

BaRC Hot Topics - April 2013

George Bell, Ph.D.


• Genome browser tutorial materials
• Browser file formats
• Previous Hot Topics
• OpenHelix training materials (some free)
• BaRC scientists

Today's outline

• Genome browser introduction
• Popular types of genome browsers
  - UCSC Genome Browser
  - Integrative Genomics Viewer (IGV)
  - Ensembl
  - GBrowse (SGD, FlyBase, WormBase, TAIR, ZFIN, HapMap, Planarian at Whitehead)
• Browser file formats for custom data tracks
• Throughout the talk: Mining the genome


Genome browser components

• Genome sequence (partially or fully assembled)
• Graphics + data browsing/searching system
• Collection of data (qualitative and quantitative) linked to
  - genome coordinates
  - genome features linked to genome coordinates
• System to view custom data
• Algorithm to align sequences to genome


Practical hints

• Take careful note of the genome assembly for
  - All coordinates
  - All custom browser files
• Genome is updated infrequently
• Data in genome browser can be updated as often as daily
• Data displayed in genome browser is often generated by others
• Try out different genome browsers


UCSC: Demo and exercise 1

• Does the RefSeq gene catalog contain the correct isoforms of your favorite human gene?
• Provide evidence from primary sequence
• Examples: WASH2P, BMP4


UCSC Genome Browser


UCSC: Demo and exercise 2

• Get the promoter of your favorite gene (defined as 2 kb upstream to 2 kb downstream of the transcription start site)
  - Examples: BMP4, SERPIND1
• According to ENCODE, do any transcription factors bind this promoter?
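The promoter definition above (2 kb upstream to 2 kb downstream of the transcription start site) depends on the strand of the gene, which is easy to get wrong by hand. The following is a minimal sketch, not part of the original exercise: the function name `promoter_interval`, the chromosome name `chrN`, and the TSS position are all hypothetical, and the output is assumed to be wanted in BED-style (0-based, half-open) coordinates.

```python
def promoter_interval(chrom, tss, strand, upstream=2000, downstream=2000):
    """Return (chrom, start, end) in BED-style coordinates.

    tss is the 1-based position of the transcription start site.
    "Upstream" means lower coordinates on the + strand and
    higher coordinates on the - strand.
    """
    if strand == "+":
        first = tss - upstream
        last = tss + downstream
    elif strand == "-":
        first = tss - downstream
        last = tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")

    # Do not run off the start of the chromosome.
    first = max(1, first)

    # 1-based inclusive [first, last] becomes 0-based half-open [first - 1, last).
    return chrom, first - 1, last


# Hypothetical gene with its TSS at position 10,000 of a made-up chromosome.
print(promoter_interval("chrN", 10000, "+"))
print(promoter_interval("chrN", 10000, "-", upstream=2000, downstream=500))
```

The resulting interval can be pasted into a browser position box after converting back to 1-based coordinates, or saved as a one-line BED file.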


Integrative Genomics Viewer (IGV)


IGV: Demo and exercise 4

• Using the Illumina Body Map RNA-Seq data on IGV,
  - Does the heart subject have any variants in GATA4? Where?
  - Center the variant(s) in the display, zoom in all the way, and save that view as a session.
• Beyond IGV: Is this variant a known SNP?


IGV: Demo and exercise 3

• Using the Illumina Body Map RNA-Seq data on IGV,
  - Is GATA4 really expressed at a higher level in heart than in skeletal muscle?
  - Why isn't this comparison of mapped reads quantitative?
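One likely reason (offered here as an assumption, not as the presenter's stated answer) is that the two samples may have been sequenced to different depths, so raw read pileups are not directly comparable. The sketch below uses RPKM, a common normalization that is swapped in here for illustration. All read counts, the gene length, and the library sizes are made-up numbers, not Body Map values.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical numbers: the heart sample has more reads on the gene,
# but it was also sequenced much more deeply.
gene_length = 2000
heart = rpkm(read_count=400, gene_length_bp=gene_length, total_mapped_reads=80_000_000)
muscle = rpkm(read_count=100, gene_length_bp=gene_length, total_mapped_reads=10_000_000)

print("heart RPKM:", heart)
print("muscle RPKM:", muscle)
```

With these invented numbers the raw counts favor heart (400 vs. 100 reads), while the normalized values favor muscle, which is why a visual comparison of coverage alone can mislead.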


Ensembl: more than a browser

• An automated genome annotation pipeline
• Includes thorough homology analysis via Compara
• Hosts hand-curated gene annotation projects (Vega; Havana)
• All data can be downloaded in a variety of ways
• BioMart is a powerful web interface to the Ensembl databases


Ensembl gene pages


Ensembl: Demo and exercise 6

• Use BioMart to get a list of all human genes on chromosome 1 and corresponding mouse homologs


Ensembl: Demo and exercise 5

• Go to the Ensembl page for mouse Uox (urate oxidase)
• Download Uox homologs (in FASTA format) from as many species as possible
• Is this gene missing in any primates?


GBrowse (many MODs)


GBrowse: Demo and exercise 7

• Go to TAIR (The Arabidopsis Information Resource)
• Find GBrowse (under Tools)
• Find gene AT2G19420
• What non-coding gene overlaps it?
• Download a GFF file of these genes and view it in Excel.
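A GFF file is tab-delimited with nine columns (seqid, source, type, start, end, score, strand, phase, attributes), and its coordinates are 1-based and inclusive. As an alternative to Excel, the overlap question can be sketched in a few lines of Python. The records below are invented placeholders (`geneA`, `geneB`, `geneC` on a made-up `ChrN`), not the real TAIR annotation, and the helper names are hypothetical.

```python
# Hypothetical GFF records; real files come from the GBrowse download.
GFF_TEXT = """\
##gff-version 3
ChrN\texample\tgene\t1000\t2500\t.\t+\t.\tID=geneA
ChrN\texample\tncRNA\t2400\t3000\t.\t-\t.\tID=geneB
ChrN\texample\tgene\t5000\t6000\t.\t+\t.\tID=geneC
"""


def parse_gff(text):
    """Parse GFF text into a list of feature dictionaries."""
    features = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        features.append({
            "seqid": cols[0],
            "type": cols[2],
            "start": int(cols[3]),  # 1-based, inclusive
            "end": int(cols[4]),    # 1-based, inclusive
            "strand": cols[6],
            "id": attrs.get("ID", ""),
        })
    return features


def overlapping_ids(features, target_id):
    """Return IDs of features that overlap the feature named target_id."""
    target = next(f for f in features if f["id"] == target_id)
    return [
        f["id"] for f in features
        if f["id"] != target_id
        and f["seqid"] == target["seqid"]
        and f["start"] <= target["end"]
        and target["start"] <= f["end"]
    ]


features = parse_gff(GFF_TEXT)
print(overlapping_ids(features, "geneA"))
```

Note that the overlap test ignores strand, so a feature on the opposite strand still counts as overlapping.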


Demo and exercise 8

• Go to UCSC (WI only) or IGV
• Locate track files in \\BaRC_Public\Hot_Topics\Genome_browsers_Apr_2013
• Add the 4 tracks to the browser (mm9)
  - TargetScanMouse6_mm9.chr3.bed
  - TargetScanMouse6_mm9.chr3.bedgraph
  - CGH.mm9.chr3-4.wig
  - track type=bam name="Heart BAM" bigDataUrl= browsers_Apr_2013/HeartCellRNASeq.bam
• Look at some chr3 genes (ex: Pfn2, Serp1, Ssr3, Hdgf)
• Optimize the display modes of your custom tracks
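Text-based custom tracks such as BED can start with a `track` line that sets the name, description, and default display mode. The sketch below shows one way to generate such a file. The function name `bed_track`, the track name, and the intervals are hypothetical and unrelated to the TargetScan or CGH files used in the exercise.

```python
def bed_track(name, description, intervals, visibility="pack"):
    """Build the text of a BED custom track.

    intervals is a list of (chrom, start, end, label) tuples in
    BED coordinates (0-based start, exclusive end).
    """
    lines = [f'track name="{name}" description="{description}" visibility={visibility}']
    for chrom, start, end, label in intervals:
        if not 0 <= start < end:
            raise ValueError(f"bad interval: {chrom}:{start}-{end}")
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"


# Made-up intervals on chr3, for illustration only.
text = bed_track(
    "Example sites",
    "Hypothetical predicted sites",
    [("chr3", 1000, 1020, "siteA"), ("chr3", 5000, 5022, "siteB")],
)
print(text)
```

Changing `visibility` (for example to `dense` or `full`) is one way to try out the different display modes mentioned above.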


Viewing custom data

• Just about any data can be viewed in a genome browser as long as it is
  - Linked to genome coordinates
  - Organized in a standard format that is
    - qualitative (ex: bed, bam), or
    - quantitative (ex: wig, bedgraph)
• Different formats use different counting schemes (starting at 0 or 1), so off-by-one bugs are easy to make
• BAM files need to be sorted and indexed first
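The off-by-one issue above comes down to two conventions: BED and bedGraph use a 0-based start with an exclusive end, while GFF and wiggle positions are 1-based and inclusive. A minimal sketch of the conversion, with hypothetical helper names:

```python
def bed_to_one_based(start, end):
    """0-based half-open [start, end) -> 1-based inclusive [start + 1, end]."""
    return start + 1, end


def one_based_to_bed(start, end):
    """1-based inclusive [start, end] -> 0-based half-open [start - 1, end)."""
    return start - 1, end


# The first 100 bases of a chromosome in each convention.
bed_start, bed_end = 0, 100
gff_start, gff_end = bed_to_one_based(bed_start, bed_end)

# The interval length is computed differently in each scheme.
bed_length = bed_end - bed_start
gff_length = gff_end - gff_start + 1

print((gff_start, gff_end), bed_length, gff_length)
```

Only the start coordinate changes between the two schemes. The end value stays the same because an exclusive 0-based end and an inclusive 1-based end point to the same base.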


Other notable browsers

• JBrowse
• Golden Helix GenomeBrowse
• WashU Epigenome Browser
• UCSC Cancer Genome Browser
• 1000 Genomes Browser


