--- title: "BioCircos: Generating circular multi-track plots" output: rmarkdown::html_vignette: toc: true toc_depth: 2 vignette: > %\VignetteIndexEntry{BioCircos: Generating circular multi-track plots} %\VignetteEngine{knitr::rmarkdown} %\usepackage[utf8]{inputenc} date: "`r Sys.Date()`" author: "Loan Vulliard" --- ## Introduction This package allows to implement in 'R' Circos-like visualizations of genomic data, as proposed by the BioCircos.js JavaScript library, based on the JQuery and D3 technologies. We will demonstrate here how to generate easily such plots and what are the main parameters to customize them. Each example can be run independently of the others. For a complete list of all the parameters available, please refer to the package documentation. ## Motivation The amount of data produced nowadays in a lot of different fields assesses the relevance of reactive analyses and interactive display of the results. This especially true in biology, where the cost of sequencing data has dropped must faster than the Moore's law prediction. New ways of integrating different level of information and accelerating the interpretation are therefore needed. The integration challenge appears to be of major importance, as it allows a deeper understanding of the biological phenomena happening, that cannot be observed in the single analyses independently. This package aims at offering an easy way of producing Circos-like visualizations to face distinct challenges : * On the one hand, data integration and visualization: Circos is a popular tool to combine different biological information on a single plot. * On the other hand, reactivity and interactivity: thanks to the *htmlwidgets* framework, the figures produced by this package are responsive to mouse events and display useful tooltips, and they can be integrated in shiny apps. Once the analyses have been performed and the shiny app coded, it is possible for the end-user to explore a massive amount of biological data without any programming or bioinformatics knowledge. The terminology used here arises from genomics but this tool may be of interest for different situations where different positional or temporal informations must be combined. ## Installation To install this package, you can use CRAN (the central R package repository) to get the last stable release or build the last development version directly from the GitHub repository. ### From CRAN ```{r eval=FALSE, screenshot.force = FALSE} install.packages('BioCircos') ``` ### From Github ```{r eval=FALSE, screenshot.force = FALSE} # You need devtools for that if (!require('devtools')){install.packages('devtools')} devtools::install_github('lvulliard/BioCircos.R', build_vignettes = TRUE) ``` ## Generating Circos-like visualizations ### Principle To produce a BioCircos visualization, you need to call the *BioCircos* method, that accepts a *tracklist* containing the different *tracks* to be displayed, the genome to be displayed and plotting parameters. By default, an empty *tracklist* is used, and the genome is automatically set to use the chromosome sizes of the reference genome hg19 (GRCh37). ```{r} library(BioCircos) BioCircos() ``` ### Genome configuration A genome needs to be set in order to map all the coordinates of the tracks on it. For now, the only pre-configured genome available is *hg19* (GRCh37), for which the length of the main 22 genomic autosomal chromosome pairs and of the sexual chromosomes are available. The Y chromosome can be removed using the *ychr* parameter. Visual parameters are also available, such as by giving a vector of colors or a *RColorBrewer* palette to change the colors of each chromosome (parameter *genomeFillColor*), the space between each chromosome (*chrPad*) or their borders (*displayGenomeBorder*). The ticks, displaying the scale on each chromosome, can be removed with *genomeTicksDisplay*, and the genome labels (chromosome names) can be brought closer or further away from the chromosomes with *genomeLabelDy*. ```{r, screenshot.force = FALSE} library(BioCircos) BioCircos(genome = "hg19", yChr = FALSE, genomeFillColor = "Reds", chrPad = 0, displayGenomeBorder = FALSE, genomeTicksDisplay = FALSE, genomeLabelDy = 0) ``` To use your own reference genome, you need to define a named list of chromosomal lengths and use it as the *genome* parameter. The names and lengths should match the coordinates you plan on using later for your tracks. You may want to change the scale of the ticks on the chromosomes, to fit to your reference genome, with the *genomeTickScale* parameters. ```{r, screenshot.force = FALSE} library(BioCircos) myGenome = list("A" = 10560, "B" = 8808, "C" = 12014, "D" = 7664, "E" = 9403, "F" = 8661) BioCircos(genome = myGenome, genomeFillColor = c("tomato2", "darkblue"), genomeTicksScale = 4e+3) ``` Another use of a custom genome can be seen in the [Bar tracks section](#barSection). ### Tracklists The different levels of information will be displayed on different *tracks* of different types and located at different radii on the visualization. All the track-generating functions of this package return tracklists that can be added together into a single tracklist, to be given as the *tracks* argument of the *BioCircos* method. The different kinds of tracks are presented in the following sections. All tracks need to be named. ## Text track A first track simply corresponds to text annotations. The obligatory parameters are the track name and the text to be displayed. Some parameters such as the size, the opacity and the coordinates can be customized. ```{r, screenshot.force = FALSE} library(BioCircos) tracklist = BioCircosTextTrack('myTextTrack', 'Some text', size = "2em", opacity = 0.5, x = -0.67, y = -0.5) BioCircos(tracklist, genomeFillColor = "PuOr", chrPad = 0, displayGenomeBorder = FALSE, genomeTicksLen = 2, genomeTicksTextSize = 0, genomeTicksScale = 1e+8, genomeLabelTextSize = "9pt", genomeLabelDy = 0) ``` ## Background track Another simple track type correspond to backgrounds, displayed under other tracks, in a given radius interval. ```{r, screenshot.force = FALSE} library(BioCircos) tracklist = BioCircosBackgroundTrack("myBackgroundTrack", minRadius = 0.5, maxRadius = 0.8, borderColors = "#AAAAAA", borderSize = 0.6, fillColors = "#FFBBBB") BioCircos(tracklist, genomeFillColor = "PuOr", chrPad = 0.05, displayGenomeBorder = FALSE, genomeTicksDisplay = FALSE, genomeLabelTextSize = "9pt", genomeLabelDy = 0) ``` ## SNP track To map punctual information associated with a single-dimensional value on the reference genome, such as a variant or an SNP associated with a confidence score, SNP tracks can be used. It is therefore needed to specify the chromosome and coordinates where each points are mapped, as well as the corresponding value, which will be used to compute the radial coordinate of the points. By default, points display a tooltip when hovered by the mouse. ```{r, screenshot.force = FALSE} library(BioCircos) # Chromosomes on which the points should be displayed points_chromosomes = c('X', '2', '7', '13', '9') # Coordinates on which the points should be displayed points_coordinates = c(102621, 140253678, 98567307, 28937403, 20484611) # Values associated with each point, used as radial coordinate # on a scale going to minRadius for the lowest value to maxRadius for the highest value points_values = 0:4 tracklist = BioCircosSNPTrack('mySNPTrack', points_chromosomes, points_coordinates, points_values, colors = c("tomato2", "darkblue"), minRadius = 0.5, maxRadius = 0.9) # Background are always placed below other tracks tracklist = tracklist + BioCircosBackgroundTrack("myBackgroundTrack", minRadius = 0.5, maxRadius = 0.9, borderColors = "#AAAAAA", borderSize = 0.6, fillColors = "#B3E6FF") BioCircos(tracklist, genomeFillColor = "PuOr", chrPad = 0.05, displayGenomeBorder = FALSE, yChr = FALSE, genomeTicksDisplay = FALSE, genomeLabelTextSize = 18, genomeLabelDy = 0) ``` ## Arc track Arc tracks are displaying arcs along the genomic circle, between the radii given as the *minRadius* and *maxRadius* parameters. As for an SNP track, the chromosome and coordinates (here corresponding to the beginning and end of each arc) should be specified. By default, arcs display a tooltip when hovered by the mouse. ```{r, screenshot.force = FALSE} library(BioCircos) arcs_chromosomes = c('X', 'X', '2', '9') # Chromosomes on which the arcs should be displayed arcs_begin = c(1, 45270560, 140253678, 20484611) arcs_end = c(155270560, 145270560, 154978472, 42512974) tracklist = BioCircosArcTrack('myArcTrack', arcs_chromosomes, arcs_begin, arcs_end, minRadius = 1.18, maxRadius = 1.25, opacities = c(0.4, 0.4, 1, 0.8)) BioCircos(tracklist, genomeFillColor = "PuOr", chrPad = 0.02, displayGenomeBorder = FALSE, yChr = FALSE, genomeTicksDisplay = FALSE, genomeLabelTextSize = 0) ``` ## Link track Links track represent links between different genomic position. They are displayed at the center of the visualization, and out to a radius specified by the *maxRadius* parameter. The chromosomes and beginning and end positions of the regions to be linked are necessary, and labels can be added. By default, links display a tooltip when hovered by the mouse. ```{r, screenshot.force = FALSE} library(BioCircos) links_chromosomes_1 = c('X', '2', '9') # Chromosomes on which the links should start links_chromosomes_2 = c('3', '18', '9') # Chromosomes on which the links should end links_pos_1 = c(155270560, 154978472, 42512974) links_pos_2 = c(102621477, 140253678, 20484611) links_labels = c("Link 1", "Link 2", "Link 3") tracklist = BioCircosBackgroundTrack("myBackgroundTrack", minRadius = 0, maxRadius = 0.55, borderSize = 0, fillColors = "#EEFFEE") tracklist = tracklist + BioCircosLinkTrack('myLinkTrack', links_chromosomes_1, links_pos_1, links_pos_1 + 50000000, links_chromosomes_2, links_pos_2, links_pos_2 + 750000, maxRadius = 0.55, labels = links_labels) BioCircos(tracklist, genomeFillColor = "PuOr", chrPad = 0.02, displayGenomeBorder = FALSE, yChr = FALSE, genomeTicksDisplay = FALSE, genomeLabelTextSize = "8pt", genomeLabelDy = 0) ``` ## Bar tracks {#barSection} Bar plots may be added on another type of tracks. The start and end coordinates of each bar, as well as the associated value need to be specified. By default, the radial range of the track will stretch from the minimal to the maximum value of the track, but other boundaries may be specified with the *range* parameter. Here, to add a track to the tracklist at each iteration of the loop, we initialize the *tracks* tracklist with an empty *BioCircosTracklist* object. ```{r figBarTrack, fig.width=4, fig.height=4, fig.align = 'center', screenshot.force = FALSE} library(BioCircos) library(RColorBrewer) library(grDevices) # Define a custom genome genomeChr = LETTERS lengthChr = 5*1:length(genomeChr) names(lengthChr) <- genomeChr tracks = BioCircosTracklist() # Add one track for each chromosome for (i in 1:length(genomeChr)){ # Define histogram/bars to be displayed nbBars = lengthChr[i] - 1 barValues = sapply(1:nbBars, function(x) 10 + nbBars%%x) barColor = colorRampPalette(brewer.pal(8, "YlOrBr"))(length(genomeChr))[i] # Add a track with bars on the i-th chromosome tracks = tracks + BioCircosBarTrack(paste0("bars", i), chromosome = genomeChr[i], starts = (1:nbBars) - 1, ends = 1:nbBars, values = barValues, color = barColor, range = c(5,75)) } # Add background tracks = tracks + BioCircosBackgroundTrack("bars_background", colors = "#2222EE") BioCircos(tracks, genomeFillColor = "YlOrBr", genome = as.list(lengthChr), genomeTicksDisplay = F, genomeLabelDy = 0) ``` ## CNV tracks {#cnvSection} Conceptually close to bar tracks, and commonly used for purposes such as representation of copy number variants, the CNV tracks consist of arcs at a given radial distance showing a value associated with a genome stretch. The start and end coordinates of each arc, as well as the associated value need to be specified. ```{r figCNVTrack, screenshot.force = FALSE} library(BioCircos) # Arcs coordinates snvChr = rep(4:9, 3) snvStart = c(rep(1,6), rep(40000000,6), rep(100000000,6)) snvEnd = c(rep(39999999,6), rep(99999999,6), 191154276, 180915260, 171115067, 159138663, 146364022, 141213431) # Values associated with each point, used as radial coordinate # on a scale going to minRadius for the lowest value to maxRadius for the highest value snvValues = (1:18%%5)+1 # Create CNV track tracks = BioCircosCNVTrack('cnv_track', as.character(snvChr), snvStart, snvEnd, snvValues, color = "#CC0000", range = c(0,6)) # Add background tracks = tracks + BioCircosBackgroundTrack("arcs_background", colors = "#2222EE") BioCircos(tracks, genomeFillColor = "YlOrBr", genomeTicksDisplay = F, genomeLabelDy = 0) ``` ## Heatmap tracks {#heatSection} For a given genome stretch, heatmaps associate linearly numerical values with a color range. For two-dimensional heatmaps, you can stack up *heatmap tracks*, as done in the following example. ```{r, screenshot.force = FALSE} library(BioCircos) # Define a custom genome genomeChr = LETTERS[1:10] lengthChr = 5*1:length(genomeChr) names(lengthChr) <- genomeChr # Define boxes positions boxPositions = unlist(sapply(lengthChr, seq)) boxChromosomes = rep(genomeChr, lengthChr) # Define values for two heatmap tracks boxVal1 = boxPositions %% 13 / 13 boxVal2 = (7 + boxPositions) %% 17 / 17 tracks = BioCircosHeatmapTrack("heatmap1", boxChromosomes, boxPositions - 1, boxPositions, boxVal1, minRadius = 0.6, maxRadius = 0.75) tracks = tracks + BioCircosHeatmapTrack("heatmap1", boxChromosomes, boxPositions - 1, boxPositions, boxVal2, minRadius = 0.75, maxRadius = 0.9, color = c("#FFAAAA", "#000000")) BioCircos(tracks, genome = as.list(lengthChr), genomeTicksDisplay = F, genomeLabelDy = 0, HEATMAPMouseOverColor = "#F3C73A") ``` ## Line tracks The *Line tracks* display segments on a track. They are defined by the set of vertices that will be joined to produce the segments. If the vertices provided span multiple chromosomes, the segments between the last point on a chromosome and the first point on the next chromosome will be discarded. ```{r, screenshot.force = FALSE} chrVert = rep(c(1, 3, 5), c(20,10,5)) posVert = c(249250621*log(c(20:1, 10:1, 5:1), base = 20)) tracks = BioCircosLineTrack('LineTrack', as.character(chrVert), posVert, values = cos(posVert)) tracks = tracks + BioCircosLineTrack('LineTrack2', as.character(chrVert+1), 0.95*posVert, values = sin(posVert), color = "#40D4B9") tracks = tracks + BioCircosBackgroundTrack('Bg', fillColors = '#FFEEBB', borderSize = 0) BioCircos(tracks, chrPad = 0.05, displayGenomeBorder = FALSE, LINEMouseOutDisplay = FALSE, LINEMouseOverTooltipsHtml01 = "Pretty lines
This tooltip won't go away!") ``` ## Removing track Tracks can be removed from a track list by substracting the name of the corresponding track. ```{r, screenshot.force = FALSE} library(BioCircos) # Create a tracklist with a text annotation and backgrounds tracklist = BioCircosTextTrack('t1', 'hi') tracklist = tracklist + BioCircosBackgroundTrack('b1') # Remove the text annotation and display the result BioCircos(tracklist - 't1') ``` ## Multi-track example You can combine and overlap as many tracks as you want. ```{r figMultiTrack, fig.width=5, fig.height=5, fig.align = 'center', screenshot.force = FALSE} library(BioCircos) # Fix random generation for reproducibility set.seed(3) # SNP tracks tracks = BioCircosSNPTrack("testSNP1", as.character(rep(1:10,10)), round(runif(100, 1, 135534747)), runif(100, 0, 10), colors = "Spectral", minRadius = 0.3, maxRadius = 0.45) tracks = tracks + BioCircosSNPTrack("testSNP2", as.character(rep(1:15,5)), round(runif(75, 1, 102531392)), runif(75, 2, 12), colors = c("#FF0000", "#DD1111", "#BB2222", "#993333"), maxRadius = 0.8, range = c(2,12)) # Overlap point of interest on previous track, fix range to use a similar scale tracks = tracks + BioCircosSNPTrack("testSNP3", "7", 1, 9, maxRadius = 0.8, size = 6, range = c(2,12)) # Background and text tracks tracks = tracks + BioCircosBackgroundTrack("testBGtrack1", minRadius = 0.3, maxRadius = 0.45, borderColors = "#FFFFFF", borderSize = 0.6) tracks = tracks + BioCircosBackgroundTrack("testBGtrack2", borderColors = "#FFFFFF", fillColor = "#FFEEEE", borderSize = 0.6, maxRadius = 0.8) tracks = tracks + BioCircosTextTrack("testText", 'BioCircos!', weight = "lighter", x = - 0.17, y = - 0.87) # Arc track arcsEnds = round(runif(7, 50000001, 133851895)) arcsLengths = round(runif(7, 1, 50000000)) tracks = tracks + BioCircosArcTrack("fredTestArc", as.character(sample(1:12, 7, replace=T)), starts = arcsEnds - arcsLengths, ends = arcsEnds, labels = 1:7, maxRadius = 0.97, minRadius = 0.83) # Link tracks linkPos1 = round(runif(5, 1, 50000000)) linkPos2 = round(runif(5, 1, 50000000)) chr1 = sample(1:22, 5, replace = T) chr2 = sample(1:22, 5, replace = T) linkPos3 = round(runif(5, 1, 50000000)) linkPos4 = round(runif(5, 1, 50000000)) chr3 = sample(1:22, 5, replace = T) chr4 = sample(1:22, 5, replace = T) tracks = tracks + BioCircosLinkTrack("testLink", gene1Chromosomes = chr1, gene1Starts = linkPos1, gene1Ends = linkPos1+1, gene2Chromosomes = chr2, axisPadding = 6, color = "#EEEE55", width = "0.3em", labels = paste(chr1, chr2, sep = "*"), displayLabel = F, gene2Starts = linkPos2, gene2Ends = linkPos2+1, maxRadius = 0.42) tracks = tracks + BioCircosLinkTrack("testLink2", gene1Chromosomes = chr3, gene1Starts = linkPos3, gene1Ends = linkPos3+5000000, axisPadding = 6, displayLabel = F, color = "#FF6666", labels = paste(chr3, chr4, sep = "-"), gene2Chromosomes = chr4, gene2Starts = linkPos4, gene2Ends = linkPos4+2500000, maxRadius = 0.42) # Display the BioCircos visualization BioCircos(tracks, genomeFillColor = "Spectral", yChr = T, chrPad = 0, displayGenomeBorder = F, genomeTicksLen = 3, genomeTicksTextSize = 0, genomeTicksScale = 50000000, genomeLabelTextSize = 18, genomeLabelDy = 0) ``` ## Saving plot as a standalone document You can save your visualization as a standalone *html* document, thanks to the *htmlwidgets* library: ```{r eval=FALSE, screenshot.force = FALSE} library(BioCircos) library(htmlwidgets) # Prepare your plot and store it in an object tracklist = BioCircosBackgroundTrack("myBackgroundTrack", minRadius = 0.5, maxRadius = 0.8, borderColors = "#AAAAAA", borderSize = 0.6, fillColors = "#FFBBBB") plot <- BioCircos(tracklist, genomeFillColor = "PuOr", displayGenomeBorder = FALSE, genomeTicksDisplay = FALSE, genomeLabelDy = 0) # Save it as an html file saveWidget(plot, "plot.html") ``` You can then convert it to a *PDF* or *PNG* file programatically thanks to Chrome's headless mode (version 59 or later). The following needs to be adapted to the path to your local installation of Chrome: ```{r eval=FALSE, screenshot.force = FALSE} # Output in screenshot.png system("/Applications/Google\\ Chrome.app/Contents/MacOS/Google\\ Chrome --headless --disable-gpu --screenshot plot.html") # Output in output.pdf system("/Applications/Google\\ Chrome.app/Contents/MacOS/Google\\ Chrome --headless --disable-gpu --print-to-pdf plot.html") ``` ## Contact To report bugs, request features or for any question or remark regarding this package, please use the GitHub page or contact Loan Vulliard. ## Credits The creation and implementation of the **BioCircos.js** JavaScript library is an independent work attributed to Ya Cui and Xiaowei Chen. This work is described in the following scientific article: BioCircos.js: an Interactive Circos JavaScript Library for Biological Data Visualization on Web Applications. Cui, Y., et al. Bioinformatics. (2016). This package relies on several open source projects other R packages, and is made possible thanks to **shiny** and **htmlwidgets**. The package **heatmaply** was used as a model for this vignette, as well as for the **htmlwidgets** configuration. ## Session info ```{r sessionINFO, screenshot.force = FALSE} sessionInfo() ```