BioVenn – an R and Python package for the comparison and visualization of biological lists using area-proportional Venn diagrams

Tim HULSEN a,1
a Department of Professional Health Solutions & Services, Philips Research, Eindhoven, The Netherlands
1 Corresponding author. E-mail: tim.hulsen@. ORCID: 0000-0002-0208-8443

Abstract. One of the most popular methods to visualize the overlap and differences between data sets is the Venn diagram. Venn diagrams are especially useful when they are 'area-proportional', i.e. the sizes of the circles and the overlaps correspond to the sizes of the data sets. In 2007, the BioVenn web interface was launched, which is being used by many researchers. However, this web implementation requires users to copy and paste (or upload) lists of IDs into the web browser, which is not always convenient and makes it difficult for researchers to create Venn diagrams ‘in batch’, or to automatically update the diagram when the source data changes. This is only possible by using software such as R or Python. This paper describes the BioVenn R and Python packages, which are very easy-to-use packages that can generate accurate area-proportional Venn diagrams of two or three circles directly from lists of (biological) IDs. The only required input is two or three lists of IDs. Optional parameters include the main title, the subtitle, the printing of absolute numbers or percentages within the diagram, colors and fonts. The function can show the diagram on the screen, or it can write output to one of the supported file formats. The function also returns all thirteen lists. The BioVenn R package and Python package were created for biological IDs, but they can be used for other IDs as well. Finally, BioVenn can map Affymetrix and EntrezGene to Ensembl IDs. The BioVenn R package is available in the CRAN repository, and can be installed by running ‘install.packages("BioVenn")’.
The BioVenn Python package is available in the PyPI repository, and can be installed by running ‘pip install BioVenn’. The BioVenn web interface remains available online.

Keywords: Bioinformatics, Visualization, Venn diagram, Combinatorics, Set theory, Genomics, Data science, R, Python

Introduction

In many ‘big data’ projects, it can be very useful to see the overlap between different data sets, in terms of patient IDs, gene names, etc. One of the most popular methods to visualize the overlap between data sets is the Venn diagram: a diagram consisting of two or more circles in which each circle corresponds to a data set, and the overlap between the circles corresponds to the overlap between these data sets. Venn diagrams are especially useful when they are 'area-proportional', i.e. the sizes of the circles and the overlaps correspond to the sizes of the data sets. Several web-based tools have been created for drawing area-proportional Venn diagrams, such as the (now deprecated) tools VennMaster [1] and DrawEuler [2].
In 2003, the website Venndiagram.tk [3] was launched, followed in 2007 by the BioVenn web interface [4], which has been used by many researchers to create publication figures [5] and is still available at this moment. The “Bio” in BioVenn reflects its ability to map biological identifiers before determining sets and overlaps. Another useful feature of BioVenn is that it displays the contents of the thirteen resulting sets and overlaps, i.e. X, Y, Z, X only, Y only, Z only, XY, XZ, YZ, XY only, XZ only, YZ only and XYZ (for circles X, Y and Z).
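
The thirteen sets and overlaps can be derived from three ID lists with ordinary set operations. The following is a minimal Python sketch of that bookkeeping, not BioVenn's own code: the function name `thirteen_sets`, the dictionary keys and the toy IDs are illustrative assumptions.

```python
def thirteen_sets(list_x, list_y, list_z):
    """Return the thirteen sets and overlaps for three lists of IDs.

    Illustrative helper (not part of BioVenn). Converting to sets removes
    duplicate IDs; string comparison is case-sensitive.
    """
    x, y, z = set(list_x), set(list_y), set(list_z)
    xy, xz, yz = x & y, x & z, y & z      # pairwise overlaps (may include XYZ)
    xyz = x & y & z                       # IDs present in all three lists
    return {
        "x": x, "y": y, "z": z,
        "x_only": x - y - z,              # in X, but in neither Y nor Z
        "y_only": y - x - z,
        "z_only": z - x - y,
        "xy": xy, "xz": xz, "yz": yz,
        "xy_only": xy - z,                # in X and Y, but not in Z
        "xz_only": xz - y,
        "yz_only": yz - x,
        "xyz": xyz,
    }


# Toy input with made-up IDs, only to show the call.
sets = thirteen_sets(["a", "b", "c", "d"], ["c", "d", "e"], ["d", "e", "f"])
print({name: len(ids) for name, ids in sets.items()})
```

Note that "XY" contains every ID shared by X and Y, including those also present in Z, whereas "XY only" excludes the latter.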
However, the BioVenn web application requires users to copy and paste lists of IDs (or upload files with lists of IDs) into the web browser, which is not always convenient and makes it difficult for researchers to create Venn diagrams ‘in batch’. Moreover, when the source data changes, it needs to be copied and pasted into the web interface again. Using programming languages, it is possible to do batch processing and to quickly rerun a script when the source data has changed. Two of the most popular programming languages used within many scientific fields are R and Python. Several R and Python packages that can create Venn diagrams are already available; they are listed in the following two sections.

Existing R packages

colorfulVennPlot

The first package is ‘colorfulVennPlot’ [6]. This package can create 2-circle and 3-circle Venn diagrams, and uses ellipses for diagrams of 4 sets.
Only the 2-circle diagrams can be made area-proportional, but the user needs to calculate the circles’ sizes and overlap by using the separate ‘resizeCircles’ function.

eulerr

A second package is ‘eulerr’ [7], which can generate area-proportional Euler diagrams. An Euler diagram is a generalization of a Venn diagram, relaxing the criterion that all intersections need to be represented. In practice, both terms are used interchangeably. This package uses both ellipses and circles.

nVennR

A third package is ‘nVennR’ [8]. This package can create “quasi-proportional Venn and Euler diagrams” for an unlimited number of sets. For a large number of sets, the algorithm might be very slow, because it needs to run many simulation cycles. Because of the resulting complicated shapes, the diagrams might not be easy to read.

venn

A fourth package is ‘venn’ [9], which can generate Venn diagrams of up to 7 sets, but not in an area-proportional manner.
For more than three sets, it uses pre-set polygon shapes.

VennDiagram

The most popular package at this moment is ‘VennDiagram’ [10].
This package can generate Venn and Euler diagrams of up to five sets, but these are not area-proportional unless users calculate the radii and the distances between the circles themselves and pass these numbers to one of the draw.*.venn functions.

venneuler

A sixth package is ‘venneuler’ [11]. This package can create area-proportional Venn diagrams as well, if the sizes of the overlaps are passed to its venneuler function. The returned object also gives some mathematical information such as the residuals (percentage difference between input intersection area and fitted intersection area) and stress values.

vennplot

The seventh and final package is ‘vennplot’ [12]. It can create area-proportional Venn diagrams in 2D or 3D, with two or three circles or spheres.
The 3D functionality is interesting (the diagram can be rotated), but the mathematics behind it is actually the same as for the 2D plot.

Existing Python packages

matplotlib-venn

The most popular package at this moment is ‘matplotlib-venn’ [13]. Its ‘venn2’ and ‘venn3’ functions can create area-proportional Venn diagrams of two and three circles, respectively. However, they do not offer the ID mapping functionality of BioVenn, and the ‘drag-and-drop’ functionality for repositioning of titles and labels in the SVG mode of BioVenn is missing as well.

PyVenn

A second package is ‘PyVenn’ [14]. This package offers plotting of Venn diagrams of two to six circles, but these are not area-proportional as in BioVenn or matplotlib-venn: the shapes are always the same.

Methods

The PHP script that forms the basis for the BioVenn web interface was rewritten in the R and Python languages.
The only function in the package is “draw.venn” (R) or “draw_venn” (Python), and it follows these steps:

1. Remove duplicate IDs (note: BioVenn is case-sensitive)
2. Map EntrezGene and Affymetrix IDs to Ensembl IDs (using the ‘hsapiens_gene_ensembl’ dataset of the ‘biomaRt’ R package [15] or the ‘biomart’ Python package [16]; only used when the optional ‘map2ens’ parameter is set to ‘TRUE’ in R or ‘True’ in Python)
3. Generate lists of IDs for the thirteen possible sets, and count the number of IDs within each set
4. Calculate the radii of the circles so that the areas of the circles correspond to the size of the datasets they represent
5. Calculate the distances between the centers of the circles, so that the areas of the two-circle overlaps correspond to the size of the datasets they represent (see figure 1 of [4]); in cases where a 100% accurate three-circle diagram cannot be drawn, this method gives the optimal solution
6. Calculate the angles of the XYZ triangle, formed by connecting the three centers of the circles
7. Calculate the centers of the circles
8. Calculate the intersection points of the circles
9. Calculate the points where the numbers will be printed
10. Set output type to file or screen (depending on the output parameter)
11. Print the title and subtitle
12. Print the circles with the calculated centers and radii
13. Print the absolute numbers/percentages
14. Print the texts for the three circles
15. Write to the selected output type and filename
16. If SVG is selected as output type, do some post-processing in order to create the drag-and-drop functionality for repositioning titles and labels
17. Return the contents of the thirteen lists: X, Y, Z, X only, Y only, Z only, XY, XZ, YZ, XY only, XZ only, YZ only and XYZ.

Whereas the BioVenn web interface only supports PNG and SVG as output formats, the Python package also supports JPEG, PDF and TIFF. The R package supports all of these file formats as well as BMP.

Results

The BioVenn R/Python package can generate area-proportional Venn diagrams of two or three circles from lists of (biological) identifiers.
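
To illustrate what area-proportional drawing involves, the radius and centre-distance steps listed in the Methods can be sketched in Python as below. This is only a sketch under stated assumptions, not BioVenn's implementation: the function names are hypothetical, one ID is taken to correspond to one unit of area, and bisection is used here as a simple way to solve for the distance (the numerical method used by BioVenn itself is not described in this section). Only the two-circle case is covered; fitting three pairwise distances into one triangle is a separate step.

```python
import math


def radius_for(n):
    """Radius of a circle whose area equals n (assumed: one ID = one unit of area)."""
    return math.sqrt(n / math.pi)


def lens_area(r1, r2, d):
    """Area of the intersection of two circles whose centres are d apart."""
    if d >= r1 + r2:                      # circles do not overlap
        return 0.0
    if d <= abs(r1 - r2):                 # smaller circle lies inside the larger
        return math.pi * min(r1, r2) ** 2
    # Clamp the acos/sqrt arguments to guard against floating-point rounding.
    c1 = max(-1.0, min(1.0, (d * d + r1 * r1 - r2 * r2) / (2 * d * r1)))
    c2 = max(-1.0, min(1.0, (d * d + r2 * r2 - r1 * r1) / (2 * d * r2)))
    k = max(0.0, (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
    return r1 * r1 * math.acos(c1) + r2 * r2 * math.acos(c2) - 0.5 * math.sqrt(k)


def centre_distance(n1, n2, n12, tol=1e-9):
    """Distance between centres such that the overlap area is close to n12.

    The overlap area shrinks as the circles move apart, so bisection on the
    distance converges.
    """
    r1, r2 = radius_for(n1), radius_for(n2)
    lo, hi = abs(r1 - r2), r1 + r2        # full containment .. just touching
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lens_area(r1, r2, mid) > n12:
            lo = mid                      # overlap too large: move circles apart
        else:
            hi = mid                      # overlap too small: move circles closer
    return (lo + hi) / 2


# Two lists with 6 and 5 IDs that share 2 IDs.
d_xy = centre_distance(6, 5, 2)
print(d_xy, lens_area(radius_for(6), radius_for(5), d_xy))  # overlap area is approximately 2
```

For three circles, the three pairwise distances obtained this way do not always form a triangle whose three-way overlap matches the data exactly, which is why an exact three-circle diagram cannot always be drawn.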
BioVenn is a lightweight package, depending on only a small number of other packages, which makes it more likely that the package will keep working in the future. The only function in the first version is the ‘draw.venn’ function (in R; ‘draw_venn’ in Python), for which the only required input is two or three lists of identifiers. Optional parameters include the main title, the subtitle, the printing of absolute numbers or percentages within the diagram, colors and fonts. The function can show the diagram on the screen, or it can write output to one of the supported file formats. The SVG mode also supports drag-and-drop of titles and labels, which is new functionality compared to the original publication of the web interface [4]. The function also returns the lists of IDs for the thirteen possible sets. The BioVenn R/Python package was created for biological identifiers, but it can be used for other identifiers as well. Finally, BioVenn can map Affymetrix and EntrezGene IDs to Ensembl IDs.

R example

The following very simple R code creates the example plot of figure 1, and returns the data of table 1:

    library(BioVenn)
    # Three lists of Affymetrix probe set IDs
    list_x <- c("1007_s_at", "1053_at", "117_at", "121_at", "1255_g_at", "1294_at")
    list_y <- c("1255_g_at", "1294_at", "1316_at", "1320_at", "1405_i_at")
    list_z <- c("1007_s_at", "1405_i_at", "1255_g_at", "1431_at", "1438_at", "1487_at", "1494_f_at")
    # Draw the diagram with absolute numbers and keep the thirteen returned lists
    biovenn <- draw.venn(list_x, list_y, list_z, subtitle="Example diagram", nrtype="abs")

Python example

The Python code works in a very similar manner:

    from BioVenn import draw_venn
    # Three lists of Affymetrix probe set IDs
    list_x = ("1007_s_at", "1053_at", "117_at", "121_at", "1255_g_at", "1294_at")
    list_y = ("1255_g_at", "1294_at", "1316_at", "1320_at", "1405_i_at")
    list_z = ("1007_s_at", "1405_i_at", "1255_g_at", "1431_at", "1438_at", "1487_at", "1494_f_at")
    # Draw the diagram with absolute numbers and keep the thirteen returned lists
    biovenn = draw_venn(list_x, list_y, list_z, subtitle="Example diagram", nrtype="abs")

Note that the code in both R and Python could even be compressed into one line, by adding the lists directly to the draw.venn or draw_venn command.
For improved readability, the lists are defined on separate lines here.

Figure 1. Example BioVenn diagram. This example was created by just entering three lists of IDs and setting two other parameters (subtitle and nrtype).

Variable   Data type       Contents
$x         character [6]   1007_s_at, 1053_at, 117_at, 121_at, 1255_g_at, 1294_at
$y         character [5]   1255_g_at, 1294_at, 1316_at, 1320_at, 1405_i_at
$z         character [7]   1007_s_at, 1405_i_at, 1255_g_at, 1431_at, 1438_at, 1487_at, 1494_f_at
$x_only    character [3]   1053_at, 117_at, 121_at
$y_only    character [2]   1316_at, 1320_at
$z_only    character [4]   1431_at, 1438_at, 1487_at, 1494_f_at
$xy        character [2]   1255_g_at, 1294_at
$xz        character [2]   1007_s_at, 1255_g_at
$yz        character [2]   1255_g_at, 1405_i_at
$xy_only   character [1]   1294_at
$xz_only   character [1]   1007_s_at
$yz_only   character [1]   1405_i_at
$xyz       character [1]   1255_g_at

Table 1. Example output. This example was created by just entering three lists of IDs and setting two other parameters (subtitle and nrtype).

Biological ID mapping

To enable the mapping of Affymetrix or Entrez identifiers to Ensembl identifiers, the parameter ‘map2ens’ should be set to ‘TRUE’, e.g. in R:

    library(BioVenn)
    list_x <- c("1007_s_at", "1053_at", "117_at", "121_at", "1255_g_at", "1294_at")
    list_y <- c("1255_g_at", "1294_at", "1316_at", "1320_at", "1405_i_at")
    list_z <- c("1007_s_at", "1405_i_at", "1255_g_at", "1431_at", "1438_at", "1487_at", "1494_f_at")
    # map2ens=TRUE maps the IDs to Ensembl Gene IDs before the sets are determined
    biovenn <- draw.venn(list_x, list_y, list_z, subtitle="Example diagram", nrtype="abs", map2ens=TRUE)

This code creates the example plot of figure 2 and returns the data of table 2. Note that, in comparison to figure 1 and table 1, some lists contain more identifiers. This is because some Affymetrix IDs map to multiple Ensembl Gene IDs (possibly homologous genes). With the ‘map2ens’ parameter enabled, BioVenn automatically converts the Affymetrix IDs to their corresponding Ensembl Gene IDs, and draws the Venn diagrams using the Ensembl Gene IDs.
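
The effect of a one-to-many mapping on list sizes and overlaps can be shown with a small Python sketch. It does not query Ensembl or use the biomaRt/biomart packages: the lookup table, the probe and gene names, and the `map_ids` helper are all made up for illustration, and dropping unmapped IDs is an assumption of this sketch rather than documented BioVenn behaviour.

```python
def map_ids(ids, mapping):
    """Replace each ID by every ID it maps to (illustrative helper).

    IDs without an entry in the mapping are dropped here; that choice is an
    assumption of this sketch.
    """
    mapped = set()
    for identifier in ids:
        mapped.update(mapping.get(identifier, ()))
    return mapped


# Hypothetical probe-to-gene table: probe_a maps to two genes, and probe_a
# and probe_c both map to GENE_2.
probe_to_gene = {
    "probe_a": ["GENE_1", "GENE_2"],
    "probe_b": ["GENE_3"],
    "probe_c": ["GENE_2"],
}

list_x = ["probe_a", "probe_b"]
list_y = ["probe_c"]

genes_x = map_ids(list_x, probe_to_gene)
genes_y = map_ids(list_y, probe_to_gene)

print(len(list_x), "->", len(genes_x))   # 2 -> 3: the list grows after mapping
print(set(list_x) & set(list_y))         # no overlap at the probe level
print(genes_x & genes_y)                 # overlap at the gene level: GENE_2
```

In this toy case two lists that share no probe IDs still overlap at the gene level, which is the kind of difference a comparison on mapped IDs can reveal.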
This mapping is useful for researchers who want to do a gene-based comparison from expression data.

Figure 2. Example BioVenn diagram, created with biological ID mapping. This example was created by just entering three lists of IDs and setting three other parameters (subtitle, nrtype and map2ens).

Variable   Data type        Contents
$x         character [13]   ENSG00000234078, ENSG00000137332, ENSG00000230456, ENSG00000215522, ENSG00000204580, ENSG00000049541, ENSG00000287363, ENSG00000048545, ENSG00000182179, ENSG00000283726, ENSG00000125618, ENSG00000173110, ENSG00000225217
$y         character [8]    ENSG00000274233, ENSG00000070778, ENSG00000287363, ENSG00000048545, ENSG00000182179, ENSG00000283726, ENSG00000271503, ENSG00000126351
$z         character [15]   ENSG00000234078, ENSG00000137332, ENSG00000230456, ENSG00000215522, ENSG00000274233, ENSG00000215572, ENSG00000204580, ENSG00000287363, ENSG00000048545, ENSG00000255974, ENSG00000198077, ENSG00000130649, ENSG00000173153, ENSG00000182580, ENSG00000271503
$x_only    character [4]    ENSG00000049541, ENSG00000125618, ENSG00000173110, ENSG00000225217
$y_only    character [2]    ENSG00000070778, ENSG00000126351
$z_only    character [6]    ENSG00000215572, ENSG00000255974, ENSG00000198077, ENSG00000130649, ENSG00000173153, ENSG00000182580
$xy        character [4]    ENSG00000287363, ENSG00000048545, ENSG00000182179, ENSG00000283726
$xz        character [7]    ENSG00000234078, ENSG00000137332, ENSG00000230456, ENSG00000215522, ENSG00000204580, ENSG00000287363, ENSG00000048545
$yz        character [4]    ENSG00000274233, ENSG00000287363, ENSG00000048545, ENSG00000271503
$xy_only   character [2]    ENSG00000182179, ENSG00000283726
$xz_only   character [5]    ENSG00000234078, ENSG00000137332, ENSG00000230456, ENSG00000215522, ENSG00000204580
$yz_only   character [2]    ENSG00000274233, ENSG00000271503
$xyz       character [2]    ENSG00000287363, ENSG00000048545

Table 2. Example output, created with biological ID mapping.
This example was created by just entering three lists of IDs and setting three other parameters (subtitle, nrtype and map2ens).Overall comparisonTo compare the different R and Python packages, we created Venn diagrams of a dataset showing orthologous genes that are present in human (Homo sapiens), mouse (Mus musculus) or the African clawed frog (Xenopus laevis) (available at the OMA Browser PEVuZE5vdGU+PENpdGU+PEF1dGhvcj5BbHRlbmhvZmY8L0F1dGhvcj48WWVhcj4yMDIwPC9ZZWFy

[17]). Since human and mouse are more closely related to each other than either is to Xenopus, we expect the human and mouse circles to have a larger overlap. Furthermore, Xenopus has more genes, so its circle should be larger than the circles of human and mouse.

Figure 3 shows the Venn diagrams created by each of the R packages, in alphabetical order. For each of the plots, the colors red, green and blue were used, titles were removed, and numbers were printed in the diagram (if that option was available). The code used to generate the plots can be viewed at . We can see that the packages that create area-proportional diagrams (a, c, d, g, h) give a better impression of what the data looks like: the human and mouse circles indeed overlap more with each other than either does with the Xenopus circle, and the Xenopus circle is larger than the other two. The nVennR diagram (d) might be visually less appealing, but it displays the information correctly as well. The non-area-proportional diagrams (b, e, f) require careful reading of the numbers in the figure before they can be interpreted.

Figure 3. Venn diagrams created by each of the R packages: a) BioVenn, b) colorfulVennPlot, c) eulerr, d) nVennR, e) venn, f) VennDiagram, g) venneuler and h) vennplot.

Figure 4 shows the Venn diagrams created by each of the Python packages, in alphabetical order, with the same method as described above. Again, the area-proportional diagrams (a, b) can be understood much more easily than the non-area-proportional diagram (c).

Figure 4.
Venn diagrams created by each of the Python packages: a) BioVenn, b) matplotlib-venn and c) PyVenn.

Package name | BioVenn | colorfulVennPlot | eulerr | matplotlib-venn | nVennR | PyVenn | venn | VennDiagram | venneuler | vennplot
Programming language | R, Python (and web) | R | R (and web) | Python | R (and web) | Python | R | R (and Cytoscape and web) | R | R
Max. number of sets | 3 | 4 (>3 uses ellipses) | Unlimited (in theory) | 3 | Unlimited (in theory) | 6 | 7 | 5 | Unlimited (in theory) | 3
Area proportionality | Automatically | Manually (only for 2-circle diagrams) | Automatically | Automatically | Automatically | No | No | Manually | Manually | Automatically
Built-in biological ID mapping | Yes | No | No | No | No | No | No | No | No | No
Input format | Sets of IDs | Sets of IDs, numbers | Sets of IDs, numbers | Sets of IDs, numbers | Sets of IDs, numbers | Sets of IDs, numbers | Numbers | Sets of IDs, numbers | Sets of IDs, numbers | Sets of IDs, numbers
Output format | BMP (only in R), JPEG, PDF, PNG, SVG, TIFF, R/Python graphics | R graphics | R graphics | Python graphics | SVG, R graphics | Python graphics | R graphics | R graphics, TIFF | R graphics | R graphics
Drag-and-drop of titles, labels | Only in SVG mode | No | No | No | No | No | No | No | No | No
Shapes used | Circles | Circles/Ellipses | Circles/Ellipses | Circles | Polygons | Circles/Ellipses/Polygons | Circles/Ellipses/Polygons | Circles/Ellipses | Circles | Circles/Balls
Print absolute numbers / percentages | Both | Only absolute numbers | Both | Only absolute numbers | Only absolute numbers | Both | Only absolute numbers | Both | No | No
Set title(s) | Title and subtitle | Only title | Only title | No | No | No | No | Title and subtitle | No | No
Set circle colors | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Set circle texts | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Set background color | Yes | No | No | No | No | No | No | No | No | No
Set text colors | Yes | No | No | No | No | No | No | Yes | No | No
Set text fonts (family, face, size) | Yes | No | No | No | Only font size | Only font size | Only font size | Yes | No | No

Table 3. Venn diagram package comparison. Comparison of all currently available R and Python packages that can generate Venn diagrams.
Note that this table only lists built-in functionality; some functionality such as plotting to certain file formats might be possible by using other R or Python functions.

Table 3 shows a comparison of all features of BioVenn and the nine other packages mentioned above. BioVenn is the only package that is available in both R and Python (as well as through a web interface). There are packages that can generate Venn diagrams from more than three sets, but these are either not area-proportional or inaccurate. For three sets, it is sometimes impossible to create a completely accurate area-proportional Venn diagram; when more sets are added, this becomes an even larger issue. Only BioVenn has built-in biological ID mapping functionality, which earns it the prefix ‘bio’. Some programs accept not only lists of IDs as input, but also the sizes of the sets and their overlaps. In BioVenn, these numbers are calculated automatically from the ID lists, which ensures that the user cannot enter mathematically impossible numbers (e.g. overlaps larger than the sets themselves). BioVenn also supports a large number of output formats. It should be noted that this table only lists built-in functionality; some functionality such as plotting to certain file formats might be possible by using other R or Python functions (e.g. the ‘matplotlib.pyplot’ functions in Python). BioVenn is the only package that supports drag-and-drop of the titles and labels (in SVG mode), which can be very useful when a set or overlap is very small compared to the rest of the figure, or when a circle title (e.g. ‘Set X’, ‘Set Y’, ‘Set Z’) overlaps with a number. BioVenn uses only circles, whereas other packages also use ellipses, polygons or even 3D balls. Four packages (BioVenn, eulerr, PyVenn and VennDiagram) are able to print either absolute numbers or percentages in the diagram.
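To illustrate how set sizes and overlaps can be derived directly from ID lists, the following is a minimal sketch using plain Python sets. It is not BioVenn's own code: the function name venn_lists and the toy IDs are hypothetical, and only the thirteen list names (x, y, z, x_only, ..., xyz) are taken from Table 2.

```python
def venn_lists(x, y, z):
    """Derive the thirteen ID lists of a three-set comparison.

    Hypothetical helper for illustration; not part of the BioVenn API.
    Duplicate IDs within one input list are collapsed by the set conversion.
    """
    sx, sy, sz = set(x), set(y), set(z)
    regions = {
        "x": sx,
        "y": sy,
        "z": sz,
        "x_only": sx - sy - sz,
        "y_only": sy - sx - sz,
        "z_only": sz - sx - sy,
        "xy": sx & sy,
        "xz": sx & sz,
        "yz": sy & sz,
        "xy_only": (sx & sy) - sz,
        "xz_only": (sx & sz) - sy,
        "yz_only": (sy & sz) - sx,
        "xyz": sx & sy & sz,
    }
    # Sort each region so that the output order is deterministic.
    return {name: sorted(ids) for name, ids in regions.items()}


# Toy IDs, made up for this example.
lists = venn_lists(["a", "b", "c", "d"], ["c", "d", "e"], ["a", "d", "e", "f"])
counts = {name: len(ids) for name, ids in lists.items()}
print(counts)

# Each circle total equals the sum of the four disjoint regions inside it,
# so counts derived this way can never be mutually inconsistent.
assert counts["x"] == (counts["x_only"] + counts["xy_only"]
                       + counts["xz_only"] + counts["xyz"])
```

Because every count is computed from the same three sets, an overlap can never exceed the sets it belongs to, which is the consistency property described above.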
Finally, BioVenn offers the most flexibility in formatting: title, subtitle and circle texts can be changed (as well as their fonts and colors), and the background color and the circle colors can be set.

Conclusion

Although there are currently many tools available that can visualize sets and intersections using Venn and/or Euler diagrams or other ways (e.g. UpSetR [18], which employs a scalable matrix-based visualization), BioVenn still has added value.
The BioVenn R and Python packages are a useful addition to the existing web interface, and they have some unique advantages over existing packages that can create Venn diagrams, such as the mapping of biological IDs and the drag-and-drop functionality in SVG mode. Other useful functions are the area-proportionality, printing absolute numbers or percentages, and the possibility to change all colors (including text and background) and fonts. The BioVenn R package is available in the CRAN repository [19], and can be installed by running ‘install.packages("BioVenn")’. The Python package is available in the PyPI repository [20], and can be installed by running ‘pip install BioVenn’.
The BioVenn web interface remains available at .

The author would like to thank the numerous people who have sent their suggestions for improvements over the past years, which have resulted in a more precise web tool (and now also an R package as well as a Python package).

Competing interest statement

Dr. Hulsen is employed by Philips Research.

References

[1] H.A. Kestler, A. Muller, J.M. Kraus, M. Buchholz, T.M. Gress, H. Liu, D.W. Kane, B.R. Zeeberg, and J.N. Weinstein, VennMaster: area-proportional Euler diagrams for functional GO analysis of microarrays, BMC Bioinformatics 9 (2008), 67. PubMed ID: 18230172.
[2] G. Stapleton, Z. Leishi, J. Howse, and P. Rodgers, Drawing Euler Diagrams with Circles: The Theory of Piercings, IEEE Trans Vis Comput Graph 17 (2011), 1020-1032. PubMed ID: 20855916.
[3] T. Hulsen, VennDiagram.tk.
[4] T. Hulsen, J. de Vlieg, and W. Alkema, BioVenn - a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams, BMC Genomics 9 (2008), 488. PubMed ID: 18925949.
[5] Google Scholar Citations for 'BioVenn – a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams'.
[6] E. Noma and A. Manvae, colorfulVennPlot: Plot and add custom coloring to Venn diagrams for 2-dimensional, 3-dimensional and 4-dimensional data, Version 2.4.
[7] J. Larsson, eulerr: Area-Proportional Euler and Venn Diagrams with Ellipses, Version 6.1.0.
[8] J.G. Perez-Silva, M. Araujo-Voces, and V. Quesada, nVenn: generalized, quasi-proportional Venn and Euler diagrams, Bioinformatics 34 (2018), 2322-2324. PubMed ID: 29949954.
[9] A. Dusa, venn: Draw Venn Diagrams, Version 1.9.
[10] H. Chen and P.C. Boutros, VennDiagram: a package for the generation of highly-customizable Venn and Euler diagrams in R, BMC Bioinformatics 12 (2011), 35. PubMed ID: 21269502.
[11] L. Wilkinson and S. Urbanek, venneuler: Venn and Euler diagrams, Version 1.1-0.
[12] Z. Xu, R.W. Oldford, and M. Lysy, vennplot: Venn Diagrams in 2D and 3D, Version 1.0.
[13] K. Tretyakov, Matplotlib-Venn Python package at PyPI, Version 0.11.6.
[14] K. Grigorev, PyVenn Python package at PyPI, Version 0.1.3.
[15] S. Durinck, P.T. Spellman, E. Birney, and W. Huber, Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt, Nat Protoc 4 (2009), 1184-1191. PubMed ID: 19617889.
[16] S. Briois, Biomart Python package at PyPI, Version 0.9.2.
[17] A.M. Altenhoff, C.M. Train, K.J. Gilbert, I. Mediratta, T. Mendes de Farias, D. Moi, Y. Nevers, H.S. Radoykova, V. Rossier, A. Warwick Vesztrocy, N.M. Glover, and C. Dessimoz, OMA orthology in 2021: website overhaul, conserved isoforms, ancestral gene order and more, Nucleic Acids Res (2020). PubMed ID: 33174605.
[18] J.R. Conway, A. Lex, and N. Gehlenborg, UpSetR: an R package for the visualization of intersecting sets and their properties, Bioinformatics 33 (2017), 2938-2940. PubMed ID: 28645171.
[19] T. Hulsen, BioVenn R package at CRAN, Version 1.1.1.
[20] T. Hulsen, BioVenn Python package at PyPI, Version 1.1.1.