PeptidePicker: a Tool for Determining Most Appropriate ...



for Journal of Proteome Research

PeptidePicker: a Scientific Workflow with Web Interface for Determining Appropriate Peptides for Targeted Proteomics Experiments

Yassene Mohammed*1,2, Dominik Domański3, Angela M. Jackson1, Derek S. Smith1, André M. Deelder2, Magnus N. Palmblad*2, and Christoph H. Borchers*1

1 University of Victoria - Genome British Columbia Proteomics Centre, University of Victoria, Canada
2 Center for Proteomics and Metabolomics, Leiden University Medical Center, the Netherlands
3 Mass Spectrometry Laboratory, Institute of Biochemistry and Biophysics PAS, Warsaw, Poland

ABSTRACT

One challenge in Multiple Reaction Monitoring (MRM) proteomics analysis is to select the most appropriate surrogate peptides to represent a target protein. Several features are required for a good surrogate peptide: it should be unique within the target proteome, efficiently liberated during enzymatic digestion, and free of post-translational modifications or variants. We present here a software package to automatically generate the most appropriate surrogate peptides to represent target proteins in an LC/MRM-MS analysis. Our method integrates most of the information about the proteins, their tryptic peptides, and their suitability for MRM that is available online in UniProtKB, NCBI’s dbSNP, ExPASy, PeptideAtlas, and GPM. We introduce a new scoring algorithm that reflects our knowledge in choosing the best candidate peptides for an MRM experiment, based on the uniqueness of the peptide in the targeted proteome, the physicochemical properties of each peptide, and whether it was previously observed. The scoring mechanism can be modified to allow different weighting of the peptide-selection criteria if desired. The scientific workflow allows linking to a user's database to determine whether already-used peptides are indeed the most appropriate ones, by matching them against the selection criteria and other possible peptides, and by scoring them. The modularity of the developed workflow, i.e. different processing units each dedicated to one specific task, allows it to be easily extended and allows additional selection criteria to be incorporated. In order to hide the complexity of the underlying system, and to make it user friendly, we have developed a simple Web interface where the researcher provides the protein accession number, the subject organism, and peptide-specific options (for example, whether only the mature form of the protein should be considered). Currently the software is designed for human and mouse proteomes, but additional species can easily be added. In comparison with manual peptide selection, our method improved the selection by eliminating human error and considering all of a protein's isoforms, and it was faster. In terms of speed, our tool processed more than 50 proteins in 12 minutes, compared to 8 proteins a day for an expert. In all tests, the peptides chosen by the expert were a subset of the peptides selected by PeptidePicker, except where there was a human error.

Keywords: MRM, SRM, targeted proteomics, peptide selection, data integration, scientific workflow

INTRODUCTION

Multiple Reaction Monitoring (MRM), also known as Selected Reaction Monitoring (SRM), is a powerful method for performing targeted proteomics. The approach is based on the generation of specific quantitative assays for each protein of interest and can be used to accurately quantitate large sets of proteins at high throughput.¹ MRM studies are typically performed on triple quadrupole mass spectrometers. The first quadrupole is used to isolate a specific precursor peptide ion, the second quadrupole is used for collision-induced dissociation, and the third is used to isolate a characteristic fragment ion. In multiplexed analysis, specific precursor/product ion pairs are monitored while the peptide is eluting from the liquid chromatography system, which allows quantitation of specific peptides and, by inference, the corresponding proteins. Designing an MRM/SRM assay for a specific protein starts with the selection of the most appropriate peptides to represent the target protein. Depending on the study design, these peptides are selected according to multiple criteria so that they are unique to the target protein. In addition, they must follow other rules in order to facilitate their accurate and precise measurement in the mass spectrometer. The selection process has, thus far, been cumbersome, and is usually performed by scientists with proficiency in selecting peptides for MRM analyses. We present here a scientific workflow to automatically select the most appropriate surrogate peptides to represent a target protein in an LC/MRM-MS analysis.
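To illustrate the uniqueness criterion, the following minimal Python sketch digests protein sequences in silico with the commonly used trypsin rule (cleavage C-terminal to K or R, but not before P) and keeps only the peptides that occur in a single protein of a toy proteome. The function names, the two toy sequences, and the labels P1 and P2 are hypothetical and are not part of PeptidePicker; the sketch also ignores missed cleavages, isoforms, and peptide length limits.

```python
from collections import defaultdict


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, unless the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing peptide that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


def unique_peptides(proteome, target):
    """Return the tryptic peptides of `target` found in no other protein."""
    owners = defaultdict(set)
    for accession, sequence in proteome.items():
        for peptide in tryptic_digest(sequence):
            owners[peptide].add(accession)
    return [p for p in tryptic_digest(proteome[target]) if owners[p] == {target}]


# Hypothetical toy proteome: both proteins share the peptide AAAK.
TOY_PROTEOME = {"P1": "AAAKCCCRDDDK", "P2": "AAAKEEER"}

if __name__ == "__main__":
    print(unique_peptides(TOY_PROTEOME, "P1"))
```

In this toy case the shared peptide is discarded for both proteins, which is the behaviour one would expect from a uniqueness filter applied across the whole target proteome.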
The method integrates the data and information available online in UniProt², NCBI’s dbSNP³, ExPASy⁴, PeptideAtlas⁵, and the Global Proteome Machine (GPM).⁶ By linking one's own database of MRM peptides, the scientific workflow can also verify whether an already selected peptide is indeed the most appropriate surrogate peptide for the target protein. The workflow also integrates our knowledge and practice in using available information for choosing peptides. It also gives the researcher the freedom to switch some of the selection criteria on and off, according to the analysis. The method can also be extended or modified to include (or exclude) additional peptide-selection criteria. Compared to the manual process of choosing peptides by a scientist with proficiency in peptide selection for MRM experiments, our method is faster and more accurate. Because the selection criteria are enforced automatically, the tool can also be used to find and eliminate human error in previously selected lists of peptides.
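The idea of weighted selection criteria that can be switched on and off can be sketched as follows. This is only an illustration under stated assumptions: the criterion names, the weights, and the `Peptide` fields are hypothetical, and the actual PeptidePicker scoring algorithm is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Peptide:
    sequence: str
    unique_in_proteome: bool
    has_known_ptm_or_variant: bool
    previously_observed: bool


# Hypothetical weights; a user could change them to re-weight the criteria.
DEFAULT_WEIGHTS = {"unique": 3.0, "no_ptm_or_variant": 2.0, "observed": 1.0}


def score(peptide, weights=DEFAULT_WEIGHTS, enabled=None):
    """Sum the weights of all enabled criteria that the peptide satisfies."""
    enabled = set(weights) if enabled is None else set(enabled)
    checks = {
        "unique": peptide.unique_in_proteome,
        "no_ptm_or_variant": not peptide.has_known_ptm_or_variant,
        "observed": peptide.previously_observed,
    }
    return sum(
        weights[name]
        for name, passed in checks.items()
        if name in enabled and passed
    )


def rank(peptides, **kwargs):
    """Order peptides from highest to lowest score."""
    return sorted(peptides, key=lambda p: score(p, **kwargs), reverse=True)


if __name__ == "__main__":
    candidates = [
        Peptide("CCCR", True, True, False),
        Peptide("AAAK", True, False, True),
    ]
    for peptide in rank(candidates):
        print(peptide.sequence, score(peptide))
```

Passing `enabled={"unique"}` restricts the score to the uniqueness criterion, which mimics switching the other criteria off for a particular analysis.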
The software can also be used for periodic improvement of precursor/product ion lists, for example when the discovery of new post-translational modifications or SNPs might result in some peptides no longer meeting the selection criteria.

MATERIALS AND METHODS

Data formats

Our implementation uses different formats corresponding to the specific data repositories from which the information is retrieved. The protein data downloaded from UniProt² are in XML. The online search results from GPM⁶ are in HTML.
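As a rough illustration of why tagged XML is convenient here, the sketch below extracts the sequence and the positions of annotated modifications and variants from a small UniProt-style entry using Python's standard library. The embedded entry is hand-written and simplified, the accession P00000 and its annotations are made up, and the namespace URI is an assumption that should be checked against the downloaded file. Only single-position features are handled; range features are skipped.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; verify it against the actual downloaded XML.
NS = {"u": "http://uniprot.org/uniprot"}

# Hand-written, simplified entry (hypothetical accession and annotations).
SAMPLE_XML = """<uniprot xmlns="http://uniprot.org/uniprot">
  <entry>
    <accession>P00000</accession>
    <feature type="modified residue" description="Phosphoserine">
      <location><position position="5"/></location>
    </feature>
    <feature type="sequence variant" description="hypothetical variant">
      <location><position position="9"/></location>
    </feature>
    <sequence length="12">MKTASLLKVGER</sequence>
  </entry>
</uniprot>"""


def protein_sequence(xml_text):
    """Return the amino acid sequence of the first entry."""
    root = ET.fromstring(xml_text)
    return root.findtext("u:entry/u:sequence", namespaces=NS).strip()


def annotated_positions(xml_text,
                        feature_types=("modified residue", "sequence variant")):
    """Map 1-based residue positions to the feature type annotated there."""
    root = ET.fromstring(xml_text)
    positions = {}
    for feature in root.findall(".//u:feature", NS):
        feature_type = feature.get("type")
        if feature_type not in feature_types:
            continue
        position = feature.find("u:location/u:position", NS)
        if position is not None:  # range features have no <position> child
            positions[int(position.get("position"))] = feature_type
    return positions


if __name__ == "__main__":
    print(protein_sequence(SAMPLE_XML))
    print(annotated_positions(SAMPLE_XML))
```

A peptide that overlaps any of the returned positions could then be flagged as carrying a known modification or variant.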
Data from PeptideAtlas⁵ are in FASTA and XML formats. For the purpose of our workflow, using markup languages was essential because they are convenient to parse: values of interest can be extracted according to specific tags. Data in FASTA format (i.e., lists of protein sequences) are generally easier and faster to search, but do not reflect the complexity and standardization of XML formats. Our workflow uses both formats, whichever is appropriate. The intermediate results from each processing module of the scientific workflow are transferred to the next module as data lists, which can be saved as text files or as comma-separated values lists whenever needed. The ability to see the intermediate stages of the selection process allows the researcher to trace the origins and check the plausibility of the results at some later time point, which is useful if an unexpected result is generated. This is especially important because the content of the data repositories we use changes over time as new discoveries are made and as scientists upload new data.

Scientific Workflow Engine

We used the Taverna workflow engine to build one complete data processing workflow from the different data processing steps.⁷ Scientific workflows allow automatic transition between processing steps, and workflow engines such as Galaxy⁸, Moteur⁹, Kepler¹⁰, and Taverna⁷ have been introduced to facilitate interfacing modular processing steps, automating analysis pipelines, scaling them up to Big Data, and making analyses reproducible and shareable. Here we used Taverna 2.4, which offers various types of processors, such as WSDL/REST Web services, Tools, and XPath processors.⁷,¹¹ For our implementation, we used Beanshell processors, which enable execution of small Java code snippets as part of a workflow. Rshell processors were used to execute the R scripts for peptide selection and comparison as well as for advanced XML parsing. Finally, XPath processors were used for simple XML file parsing and for extracting the needed information.

Evaluation Data

To evaluate the accuracy of the method in selecting the right peptides, the method was tested using multiple lists of proteins. To test the correctness of the results and the processing speed on a desktop computer, we used a list of 50 proteins for which peptides had previously been selected manually. To test the robustness of the software, to simulate future use, and to test the software for Big Data scenarios, we also used it to select peptides for all of the proteins from human chromosome 21.

Related Work

Many current software packages select the “best” proteotypic peptides from a given protein based on the sequence information alone. In this context, the best peptides for MRM are considered to be those whose physicochemical properties are likely to result in high sensitivity in electrospray ionization (ESI)-MS experiments, and which will generate high fragment ion signal intensities for a given transition.
Software packages using this approach include the ESP predictor,12 Peptide Sieve,13 and PepFly14, 15 [Boja and Rodriguez 2012]. These software packages generally depend on “training sets” that consider hundreds of physicochemical peptide/amino acid properties, with peptides classified according to their precursor responses as derived from extracted ion chromatograms (XICs). Compared to manual empirical selection of MRM transitions, which are composed of high-responding precursor/product ion pairs instead of an XIC signal, the ESP predictor, however, correctly selected on average only two out of the top five experimentally validated MRM peptides per protein (#9).12 PeptideAtlas5, 16 has combined its database of observed peptides with PABST (Peptide Atlas Best SRM Transition)17 to select the best peptides. Its score is based on in silico prediction and experimental MS observation, with the addition of a number of weighting factors such as reactive amino acids and peptide length criteria. However, the PABST tool can still, for example, rank peptides highly if they have multiple gene locations, or if they contain cysteine and methionine residues but are predicted to yield high ion signals, even though peptides that contain cysteine or methionine are undesirable in an MRM assay because these amino acids can easily undergo chemical modification such as oxidation. This makes them "non-unique" because multiple peptide forms can be produced, all of which would need to be monitored for accurate quantitation. Brusniak et al. introduced Automated and Targeted Analysis with Quantitative SRM (ATAQS),18 an analysis pipeline that organizes, generates, and verifies transition lists, and performs post-acquisition analysis. Although ATAQS integrates information from online repositories, such as protein-protein interactions or cellular functions, this software does not take biological constraints into account or make use of available data on post-translational modifications, natural variants, biological protein processing, and isoform variants during the peptide selection.

Other approaches include MRMaid,19 the GPMDB MRM Worksheet,20 SRM/MRMAtlas,21 TIQAM22 (used by the previously mentioned ATAQS workflow), Skyline,23 and TPP-MaRiMba.24, 25 These, however, are computational tools that focus primarily on the design, optimization, and selection of appropriate MRM transitions, rather than the selection of the precursor peptide. MRMaid does suggest peptide targets in addition to transitions for the submitted protein.26 The suggestion of ions to be monitored is based on EBI’s PRIDE spectral database,27 with scores given for the likelihood of observing a given peptide.28 The score also takes into consideration the suitability of an MRM transition, based on the frequency and intensity of the observed fragment ions of the specific peptide. The software checks selected peptides for uniqueness in a given proteome, and the user can specify peptide lengths and the specific MS instrument used, and can indicate LC conditions so that predicted retention times can be taken into account. The software can also filter peptides based on additional criteria such as unwanted amino acids, but it does not go any further in using criteria that would filter out peptides based on biological context information available from UniProt, such as post-translational modifications or the presence of natural variants.19 Similarly, the MRM Worksheet in gpmDB and MRMAtlas in PeptideAtlas suggest MRM transitions based on spectral information contained in their respective databases but do less in terms of suggesting suitable peptides for MRM assay design.25, 29 Skyline, MaRiMba, and TIQAM/ATAQS focus more on the design of MRM transitions for selected sets of peptides and the empirical validation of this pool of peptides from user-generated experimental MRM data,

18, 24, 30. Although these tools can access information stored in spectral databases and can point out frequently observed fragment ions and peptides, their ability to select the “best” peptide based on biological criteria is limited. A search of PeptideAtlas displays the occurrences of SNPs and post-translationally modified residues in peptides, but their presence is not taken into consideration when calculating the different peptide suitability scores. The presence of such residues is an indicator of a peptide's suitability for an MRM assay, but it is not considered at this time by these software packages. Clearly there is a need for an automated software package that takes biological criteria into consideration during the peptide-selection stage of MRM assay development.
Our workflow integrates information from different data sources and builds on the information stored in UniProt, NCBI’s dbSNP, GPM, and PeptideAtlas. Our method considers post-translational modifications, natural variants, and post-translational peptide chain processing, and it allows specific amino acids to be permitted or excluded. Our software also takes into account other scores, including the PeptideAtlas Empirical and Predicted Suitability Scores.16 Because it is not limited to these sources, our software can be easily extended to consider additional data repositories, including the user's own peptide libraries. The ability to extend or reduce the workflow allows our method to match the demands of a specific laboratory or experiment. The reproducibility of the results is also an important part of method development and is facilitated by the use of Taverna as a workflow engine, as well as by the possibility of archiving all of the data retrieved from online repositories for later inspection.
RESULTS
Scientific workflows
Figure 1 shows the Taverna modular workflow that implements the peptide selection logic. The workflow requires a list of protein accession numbers as input, along with the target organism and specific selection criteria if they are different from the default values. Currently the tool supports the human and mouse proteomes, but it can be easily extended to include additional organisms. The software incorporates different tools to perform different steps: finding protein isoforms, in silico tryptic digestion, peptide filtering, peptide and protein information retrieval from online repositories, peptide selection criteria enforcement, and report generation. The workflow compares the peptides passing the selection criteria with the online contents of Global Proteome Machine and PeptideAtlas and indicates whether or not the peptides have been previously observed. All processing modules run locally, and all of the data used to generate the final list are stored for later manual or automated inspection if needed.
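The digestion and length-filtering steps listed above can be illustrated with a minimal sketch. It is not the workflow's implementation: the workflow delegates digestion to an external tool, whereas this sketch substitutes the common simplified trypsin rule (cleave C-terminal to K or R unless the next residue is P). The function names, the example sequence, and the handling of missed cleavages are assumptions made for illustration; the 7 to 20 residue window is the default stated under the acceptance criteria.

```python
def cleavage_fragments(sequence):
    """Split a protein sequence with a simplified trypsin rule (assumption):
    cleave after K or R, except when the next residue is P."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def tryptic_peptides(sequence, missed_cleavages=0):
    """Join neighbouring fragments to model up to `missed_cleavages` missed sites."""
    fragments = cleavage_fragments(sequence)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


def filter_by_length(peptides, min_len=7, max_len=20):
    """Keep peptides inside the length window (default 7 to 20 residues)."""
    return [p for p in peptides if min_len <= len(p) <= max_len]


if __name__ == "__main__":
    toy_protein = "AAAAAAKGGGRPLLLLLLRCCK"  # made-up sequence, not a real protein
    print(filter_by_length(tryptic_peptides(toy_protein, missed_cleavages=1)))
```

Keeping digestion and filtering as separate functions mirrors the modular design described above, in which a different digestion rule could be swapped in without touching the filter.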
The software mainly uses three technologies: Java,31 the R statistical language,32 and XML Path Language (XPath).33 The Java Beanshell modules consist mainly of command-line calls to the required software with the desired parameters, and they also handle connecting to online repositories and downloading data from them. Statistical analysis modules are written in R and are run locally. XPath parsing modules extract information snippets from XML files and report them as lists.
Peptide Acceptance Criteria
Surrogate proteotypic peptides should be efficiently liberated during trypsin digestion, free of PTMs and variants, and unique within the target proteome. The peptide selection can, however, be based on more stringent criteria, which are summarized below:
Peptide must be unique in the proteome of the studied organism. To ensure correct quantitation of proteins, the representative surrogate peptides must appear only once in the proteome.
We consider a peptide unique to a protein if it appears only once in the UniProt database of reviewed proteins for the organism under study.34
Previously observed peptides. PeptideAtlas and gpmDB collect experimental MS data and are therefore very good sources of information regarding peptides that can have sufficient sensitivity to be detected in a mass spectrometer.
Peptides in these repositories that match all of the other obligatory criteria are therefore favored, as it is most likely that they will be observed in an MS experiment. PeptideAtlas ranks observed peptides using the Empirical Suitability Score (ESS), which is derived from three quantities: the peptide identification probability; the Empirical Observability Score (EOS), which is a measure of the number of samples in which a particular peptide is seen relative to other peptides from the same protein; and the number of times the peptide is observed, which is the total number of observations in all modified forms and charge states. The ESS is also adjusted for unfavorable sequence characteristics such as semi-tryptic or missed cleavage sites, or multiple genome locations. For peptides with no or a low number of observations in its database, PeptideAtlas reports a Predicted Suitability Score (PSS), which is derived by combining publicly available algorithms (PeptideSieve,35 STEPP,36 ESPP,12 APEX,37 Detectability Predictor38) to define the likelihood of observing a peptide for a protein. Our workflow considers whether a peptide is in gpmDB or PeptideAtlas, and whether a peptide has ESS or PSS scores or not, and integrates this information into the final selection of peptides.
In silico predicted efficiency of trypsin digestion. We use the ExPASy PeptideCutter with the “sophisticated” model for trypsin in order to estimate the digestion efficiency. The cleavage probabilities in this model are based on a statistical model derived from the charts published by Keil.39 Although there are different studies and systems that outperform the Keil model, it has been the main reference model for comparison with newer methods. Due to the modularity of our software, different digestion rules or models can be included. At the implementation level, in order to consider possible peptides with missed cleavages, our tool runs the ExPASy PeptideCutter with digestion efficiencies increasing from a minimum of 10% to a maximum of 98%, and includes all of the resulting peptides if they obey the other obligatory criteria.
Peptide length. For specificity and for the efficient synthesis of the standard peptides used for accurate quantitation, the software considers peptides between 7 and 20 residues long, by default.
Short peptides (<6 residues) are less likely to be unique and might have protonated regions masked by the solvent in MS analysis. The upper limit of 20 to 25 residues is the practical limit of efficient peptide synthesis; in addition, larger peptides ionize and fragment less efficiently, and their large fragment ions will fall outside the range of most triple quadrupole analyzers. In our software, the upper and lower limits can be specified by the user as inputs.
Existence of the peptides in all of the isoforms of a protein. Depending on the context of the experiment, peptides can be required to exist in all isoforms, in the canonical sequence only, or in one single isoform only. Our workflow supports all three cases; the user chooses the corresponding UniProt accession number. For example, Nuclear receptor coactivator 3 (UniProt accession number Q9Y6Q9) has 5 isoforms. The software accepts Q9Y6Q9 as an input if the user requires a peptide that can measure all 5 isoforms, or one of Q9Y6Q9-1, Q9Y6Q9-2, Q9Y6Q9-3, Q9Y6Q9-4, or Q9Y6Q9-5 if only the canonical sequence (Q9Y6Q9-1) or one of the other isoforms (-2 to -5) needs to be measured. If the canonical sequence or only one isoform is selected, only that specific sequence is considered and any peptides shared with other isoforms are considered not unique. The user can control the behavior of this selection criterion by switching off the corresponding input. In this way peptides which are not in all isoforms of the protein can also be reported. This allows all of the observed peptides in MS/MS databases to be considered in the selection process, and therefore suggests the most sensitive MRM peptides, but the top selected peptides might not cover all existing isoforms of the protein.
Protein biological annotation. When selecting and scoring the peptides, the software takes into account the biological information regarding protein annotation present in the UniProt knowledge database.
Ideally, peptides should not be present in the signal peptide or propeptide sequences, which will be cleaved off in the mature form of the protein. The processing of the mature form of the protein into multiple chains also needs to be considered, in order to avoid selecting a peptide which might be split between two separate processed chains. Our software extracts all annotations about a protein available in UniProt and labels any peptide which is annotated, for example if it is included in or overlaps with a signal peptide or a propeptide. The user can also choose to include only those peptides which are part of the mature form of the protein in the final report.
Absence of methionine (M) and cysteine (C). Both M and C are highly susceptible to oxidation. Oxidation of M to sulfoxide (+16 Da mass shift) and sulfone derivatives (+32 Da mass shift) will split the total peptide MRM signal over multiple peptide forms, thereby reducing the assay sensitivity and requiring MRM quantitation of all of the different species. Oxidation of C can also form insoluble aggregates through the formation of disulfide bonds. Preference is given to those peptides which do not contain tryptophan (W), which, like M, is prone to oxidation but to a much lesser extent.
Absence of N-terminal glutamine (Q). Under acidic conditions, Q on the N-terminus cyclizes to form pyroglutamate.40
Absence of aspartic acid paired with proline or glycine (DP and DG). These combinations can undergo hydrolysis and cause peptide cleavage under acidic conditions.
Absence of asparagine-glycine (NG) and glutamine-glycine (QG). These are prone to deamidation at neutral and alkaline pH.
Absence of sequential proline (P) and serine (S) residues. P or S strings can cause significant deletions during peptide synthesis, resulting in low yield. This is especially important for proline residues, which can undergo cis/trans isomerization and reduce peptide purity.
Absence of significant natural variants.
These are non-synonymous single nucleotide polymorphisms (SNPs) that occur in the studied population at a high enough frequency to affect the biological question of interest. Amino acid substitutions that occur at a frequency higher than 1% are considered polymorphisms. Peptides containing such substitutions, or whose flanking lysine (K) or arginine (R) residues (required for tryptic cleavage) are affected by them, are taken into account during peptide selection. Natural variants such as disease mutations are usually observed less frequently; however, depending on the population being studied, the significance of such variants can be considered when choosing peptides for MRM analysis. The software checks the frequency of the allele changes for each SNP in the NCBI database of SNPs and considers it in scoring the peptides.
Absence of significant post-translationally modified residues (PTMs). PTMs, such as phosphorylation, glycosylation, and acetylation, need to be avoided on the peptide of interest because, as with oxidation, they would mean that different forms of the same peptide would need to be monitored for accurate quantitation. Such modifications also need to be avoided on the flanking K/R residues (next to the N-terminus of the peptide of interest) because they will affect tryptic cleavage of that peptide. Depending on their frequency, modifications may affect the production of the tryptic peptide and therefore affect its MRM measurement. In general, a possible PTM site should be avoided, but if the choice of peptide is limited, the frequency of the occurrence of the PTM may need to be considered. The probability of a PTM’s occurrence can be roughly estimated by the number of literature references supporting the modification (as indicated in the UniProtKB database, for instance).
PTMs with supporting experimental evidence are more significant than "potential" modification sites suggested solely on the basis of sequence motifs, such as NXT/S for N-glycosylation. Our tool can be set to avoid any referenced and potential modifications, or to penalize peptides containing these during the score calculation. The tool penalizes peptides containing modifications with referenced evidence more than potential sites based on non-experimental data as indicated in UniProt.
These criteria can be modified and readjusted according to the needs of the experiment. The workflow contains two filtering processors, and these can be modified to include, exclude, or perform additional advanced selection criteria testing.
Scoring
For the purpose of ranking the peptides, we have introduced a simple scoring mechanism that combines all of the selection criteria: the v-score. This scoring system is not meant to be an absolute measure of how suitable the peptide is for an MRM experiment; it is a subjective scoring mechanism that assists researchers by allowing them to see the effect of the selection criteria on the choice of peptides, thereby enabling them to adjust their choices by putting higher emphasis on desirable criteria and reducing unnecessary criteria. The current v-score is specific to the protein: it is not meant to compare peptides across proteins, but only within a single protein. It is a reproducible value and is computed as the sum, over all criteria, of the matrix entry times the corresponding weight:
$$\mathrm{vscore}_P = \sum_i M_{P,i} \times w_i$$
where $\mathrm{vscore}_P$ is the score of the peptide $P$, $M$ is the matrix of all peptides and criteria fulfillment (see Table 1), and $w$ is the weighting (see Table 2).
Testing and validation
We asked a researcher with expertise in selecting peptides to evaluate 55 proteins. The average speed of the peptide selection by the expert was 8 proteins per day.
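As an illustration of the Scoring section above, the following minimal sketch builds one row of the criteria matrix M from a few of the sequence-based acceptance criteria and computes the v-score as a weighted sum. It is only a sketch under stated assumptions: the criterion names, the binary (0/1) encoding, the reading of "P or S strings" as PP or SS, and all weights are hypothetical (the actual weights are those of Table 2), and the example peptides are made up.

```python
# Hypothetical weights for illustration only; the real values are defined in Table 2.
WEIGHTS = {
    "no_M_or_C": 2.0,
    "no_W": 0.5,
    "no_nterm_Q": 1.0,
    "no_DP_DG": 1.0,
    "no_NG_QG": 1.0,
    "no_PP_SS": 1.0,
    "observed_before": 3.0,
}


def criteria_row(peptide, observed_before=False):
    """One row of M: 1 if the criterion is fulfilled, 0 otherwise (assumed encoding)."""
    return {
        "no_M_or_C": int("M" not in peptide and "C" not in peptide),
        "no_W": int("W" not in peptide),
        "no_nterm_Q": int(not peptide.startswith("Q")),
        "no_DP_DG": int("DP" not in peptide and "DG" not in peptide),
        "no_NG_QG": int("NG" not in peptide and "QG" not in peptide),
        "no_PP_SS": int("PP" not in peptide and "SS" not in peptide),
        "observed_before": int(observed_before),
    }


def v_score(row, weights=WEIGHTS):
    """Weighted sum over criteria: sum_i M[P, i] * w[i]."""
    return sum(row[name] * weights[name] for name in weights)


def rank_peptides(peptides, observed):
    """Rank the peptides of a single protein, highest v-score first."""
    scored = [(p, v_score(criteria_row(p, p in observed))) for p in peptides]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    candidates = ["LVNEVTEFAK", "QMDGSSPLK", "AEFAEVSK"]  # made-up peptides
    for peptide, score in rank_peptides(candidates, observed={"LVNEVTEFAK"}):
        print(peptide, score)
```

Because the weights live in a single dictionary, changing the emphasis on a criterion amounts to editing one number, which is the kind of adjustment the v-score is intended to support. As in the text, the resulting scores are only comparable among peptides of the same protein.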
In comparison, our software was able to compile the results for the same 50 proteins within around 1 hour, running on a local workstation with a Xeon E5-1630 processor and 4 GB of memory. The manually selected peptides were almost always a subset of the peptides chosen by the software. Differences can be attributed to human error, to disregarding some selection criteria (for instance, information about natural variants from dbSNP data), to intentional tolerance in choosing peptides (for instance, not considering the existence of peptides in all isoforms), or to the availability of new data that alter a choice made in the past (for instance, the addition of new reviewed proteins to UniProt). The most important difference, however, is the order of the peptides. The software orders the peptides according to the weightings specified in Table 2, with the most appropriate peptide at the top. The manually chosen peptides did not have any ordering with regard to which peptide is more or less favorable.
A key difference is inherent in our method because of its design as a scientific workflow: it can be scaled up and run in parallel on multiple levels. Thus, not only can different inputs (proteins) be run in parallel, but also different tests and data/information retrieval steps of a single input. To test the performance of the tool we ran a list of the 254 proteins encoded by human chromosome 21 from UniProt build June 2013. Our tool reported peptides for 248 proteins out of 254 in 28 hours. Six proteins failed because of errors in the online repositories we depend on (mainly, ExPASy PeptideCutter failed for very short proteins, and some protein annotations in UniProt were incomplete). Approximately 40 proteins were reported as having no matching peptides. For these proteins, a second iteration could have been performed with loosened selection criteria; for instance, peptides containing 6-25 amino acids could have been selected instead of 7-20 amino acids.
We believe that these 28 hours could be reduced by using a faster server and by running more than 20 requests in parallel. However, most of the processing time is spent in retrieving the results of the different queries sent to the external repositories. Lists of proteins for a specific chromosome, disease, or pathway can be processed in the same way. To test the robustness of the tool and the scale-up ability, we ran a list of 1141 proteins encoded by human chromosome 6,
CONCLUSIONS
Here we present a method and a scientific workflow for generating a list of the most appropriate surrogate peptides for target proteins to be analyzed by LC/MRM-MS. Before the development of this software, generating these lists was a cumbersome process, in which experts retrieved information from different online repositories and used their own reasoning to find the most appropriate peptides. Our scientific workflow integrates information from different data sources including UniProt, Global Proteome Machine, NCBI’s dbSNP, and PeptideAtlas, but is not limited to these sources. Our software can be extended to consider additional data repositories and the user's own peptide libraries. The workflow can easily be adjusted to specific laboratory demands, such as choosing shorter peptides, putting emphasis on different selection criteria, or including more (or the laboratory's own instrument-specific) data repositories, which makes the method general and useful for a wide audience. All data retrieved from online repositories are stored locally for later verification or inspection of the reported results if needed. In order to make the software accessible to a wide audience, we have implemented a Web interface where the user can run the software from the browser. The results of the scientific workflow are only as good as the data available in the repositories.
Our tool in the current version does not handle malformed XML files generated by online repositories or erroneous HTML results generated by the Web tools used. In such situations, the tool can still be useful as an indicator of these cases, where advanced expertise is needed to manually select appropriate peptides for the proteins of interest. The scientific workflow and all code used in this work are available under the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC BY-SA) on bioinfomratics/MRMPeptidePicker/. Access to the online implementation of the tool is available at MRMPeptidePicker/.
ASSOCIATED CONTENT
AUTHOR INFORMATION
Corresponding Authors
*Yassene Mohammed: Phone: +31-71-5268745. Fax: +31-71-5266907. E-mail: yassene@ or y.mohammed@lumc.nl.
*Magnus Palmblad: Phone: +31-71-5269526. Fax: +31-71-5266907. E-mail: n.m.palmblad@lumc.nl
*Christoph Borchers: Phone: +1-250-4833221. Fax: +1-250-483-3238. E-mail: christoph@
Competing Financial Interests
The authors declare no competing financial interest.
ACKNOWLEDGMENTS
We are grateful to Genome Canada and Genome BC for providing Science and Technology Innovation Centre funding and support for the UVic-Genome BC Proteomics Centre. This work was also supported by the Dutch Organization for Scientific Research (De Nederlandse Organisatie voor Wetenschappelijk Onderzoek, NWO) grants NRG-2010.06 and VI-917.11.398.
REFERENCES
1. Percy, A. J.; Chambers, A. G.; Yang, J.; Hardie, D.; Borchers, C. H., Advances in Multiplexed MRM-based Protein Biomarker Quantitation Toward Clinical Utility. Biochimica et Biophysica Acta 2013, pii: S1570-9639(13)00239-2.
2. UniProt Consortium, The Universal Protein Resource (UniProt) 2009. Nucleic Acids Research 2009, 37, D169–D174.
3. Sherry, S. T.; Ward, M.; Sirotkin, K., dbSNP-database for single nucleotide polymorphisms and other classes of minor genetic variation. Genome Research 1999, 9, (8), 677-9.
4. ExPASy Bioinformatics Portal, Peptide Cutter. (July 2012).
5. PeptideAtlas. 2010.
6. The Global Proteome Machine Organization, The Global Proteome Machine. (November 19, 2013).
7. Oinn, T.; Addis, M.; Ferris, J.; Marvin, D.; Senger, M.; Greenwood, M.; Carver, T.; Glover, K.; Pocock, M. R.; Wipat, A.; Li, P., Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics 2004, 20, (17), 3045-3054.
8. Goecks, J.; Nekrutenko, A.; Taylor, J., Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol 2010, 11, (8), R86.
9. Maheshwari, K.; Montagnat, J. In Scientific Workflow Development Using Both Visual and Script-Based Representation, Services (SERVICES-1), 2010 6th World Congress on, 5-10 July 2010; pp 328-335.
10. Altintas, I.; Berkley, C.; Jaeger, E.; Jones, M.; Ludascher, B.; Mock, S., Kepler: An Extensible System for Design and Execution of Scientific Workflows. In Proceedings of the 16th International Conference on Scientific and Statistical Database Management, IEEE Computer Society: 2004; p 423.
11. Langley, G. J.; Herniman, J. M.; Davies, N. L.; Brown, R., Simplified sample preparation for the analysis of oligonucleotides by matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry. Rapid Commun. Mass Spectrom. 1999, 13, 1717-1723.
12. Fusaro, V. A.; Mani, D. R.; Mesirov, J. P.; Carr, S. A., Prediction of high-responding peptides for targeted protein assays by mass spectrometry. Nature Biotechnology 2009, 27, (2), 190-8.
13. Mallick, P.; Schirle, M.; Chen, S. S.; Flory, M. R.; Lee, H.; Martin, D.; Ranish, J.; Raught, B.; Schmitt, R.; Werner, T.; Kuster, B.; Aebersold, R., Computational prediction of proteotypic peptides for quantitative proteomics. Nature Biotechnology 2007, 25, (1), 125-31.
14. Sanders, W. S.; Bridges, S. M.; McCarthy, F. M.; Nanduri, B.; Burgess, S. C., Prediction of peptides observable by mass spectrometry applied at the experimental set level. BMC Bioinformatics 2007, 8, (Suppl 7), S23.
15. Boja, E. S.; Rodriguez, H., Mass spectrometry-based targeted quantitative proteomics: Achieving sensitive and reproducible detection of proteins. Proteomics 2012, 12, 1093–1110.
16. Deutsch, E. W.; Lam, H.; Aebersold, R., PeptideAtlas: a resource for target selection for emerging targeted proteomics workflows. EMBO Reports 2008, 9, 429–434.
17. PABST, Peptide Atlas Best SRM Transition tool. (November 19, 2013).
18. Brusniak, M.-Y. K.; Kwok, S.-T.; Christiansen, M.; Campbell, D.; Reiter, L.; Picotti, P.; Kusebauch, U.; Ramos, H.; Deutsch, E. W.; Chen, J.; Moritz, R. L.; Aebersold, R., ATAQS: A computational software tool for high throughput transition optimization and validation for selected reaction monitoring mass spectrometry. BMC Bioinformatics 2011, 12, (78).
19. Mead, J. A.; Bianco, L.; Ottone, V.; Barton, C.; Kay, R. G.; Lilley, K. S.; Bond, N. J.; Bessant, C., MRMaid, the web-based tool for designing multiple reaction monitoring (MRM) transitions. Molecular and Cellular Proteomics 2009, 8, (4), 696-705.
20. Walsh, G. M.; Lin, S.; Evans, D. M.; Khosrovi-Eghbal, A.; Beavis, R. C.; Kast, J., Implementation of a data repository-driven approach for targeted proteomics experiments by multiple reaction monitoring. Journal of Proteomics 2009, 72, 838-852.
21. Institute for Systems Biology, SRMAtlas.
22. TIQAM, TIQAM (Targeted Identification for Quantitative Analysis by MRM). (November 19, 2013).
23. Skyline SRM/MRM Builder, Skyline Targeted Proteomics Environment. (2011).
24. Sherwood, C.; Eastham, A.; Peterson, A.; Eng, J. K.; Shteynberg, D.; Mendoza, L.; Deutsch, E.; Risler, J.; Lee, L. W.; Tasman, N.; Aebersold, R.; Lam, H.; Martin, D. B., MaRiMba: a Software Application for Spectral Library-Based MRM Transition List Assembly. J. Proteome Res. 2009, 8, (10), 4396–4405.
25. Cham (Mead), J. A.; Bianco, L.; Bessant, C., Free computational resources for designing selected reaction monitoring transitions. Proteomics 2010, 10, (6), 1106–1126.
26. Fan, J.; Mohareb, F.; Jones, A. M.; Bessant, C., MRMaid: The SRM Assay Design Tool for Arabidopsis and Other Species. Frontiers in Plant Science 2012, 3, (164).
27. PRIDE, PRIDE (Proteomics Identifications Database).
28. Fan, J.; Mohareb, F.; Bond, N. J.; Lilley, K. S.; Bessant, C., MRMaid 2.0: Mining PRIDE for Evidence-Based SRM Transitions. OMICS: A Journal of Integrative Biology 2012, 16, (9), 483-488.
29. Vizcaíno, J. A.; Foster, J. M.; Martens, L., Proteomics data repositories: providing a safe haven for your data and acting as a springboard for further research. Journal of Proteomics 2010, 73, (11), 2136-46.
30. Bereman, M. S.; MacLean, B.; Tomazela, D. M.; Liebler, D. C.; MacCoss, M. J., The development of selected reaction monitoring methods for targeted proteomics via empirical refinement. Proteomics 2012, 12, (8), 1134-41.
31. Arnold, K.; Gosling, J.; Holmes, D., The Java Programming Language. 4 ed.; Addison-Wesley Professional: 2005.
32. Urbanek, S., Rserve - Binary R server, Rserve/index.html. Rserve/index.html (January 21, 2013).
33. Clark, J.; DeRose, S., XML Path Language (XPath).
34. Magrane, M.; Consortium, U., UniProt Knowledgebase: a hub of integrated protein data. Database: the journal of biological databases and curation 2011, 2011, bar009.
35. PeptideSieve, PeptideSieve. (November 19, 2013).
36. STEPP, SVM Technique for Evaluating Proteotypic Peptides (STEPP). (November 19, 2013).
37. Braisted, J. C.; Kuntumalla, S.; Vogel, C.; Marcotte, E. M.; Rodrigues, A. R.; Wang, R.; Huang, S.-T.; Ferlanti, E. S.; Saeed, A. I.; Fleischmann, R. D.; Peterson, S. N.; Pieper, R., The APEX Quantitative Proteomics Tool: Generating protein quantitation estimates from LC-MS/MS proteomics results. BMC Bioinformatics 2008, 9, 529.
38. Wedge, D. C.; Gaskell, S. J.; Hubbard, S. J.; Kell, D. B.; Lau, K. W.; Eyers, C. In Peptide detectability following ESI mass spectrometry: prediction using genetic programming, 9th annual conference on Genetic and evolutionary computation (GECCO), London, 2007; Thierens, D.; Beyer, H.-G.; Bongard, J.; Branke, J.; Clark, J. A.; Cliff, D.; Congdon, C. B.; Deb, K.; Doerr, B.; Kovacs, T.; Kumar, S.; Miller, J. F.; Moore, J.; Neumann, F.; Pelikan, M.; Poli, R.; Sastry, K.; Stanley, K. O.; Stutzle, T.; Watson, R. A.; Wegener, I., Eds. ACM Press: London, 2007; pp 2219-2225.
39. Keil, B., Specificity of proteolysis. Springer-Verlag: Berlin-Heidelberg-NewYork, 1992; p 335.
40. Lai, M. C.; Topp, E. M., Solid-state chemical stability of proteins and peptides. Journal of Pharmaceutical Science 1999, 88, (5), 489-500.
Table 1: An example of the output of the scientific workflow shown in Figure 1.
Figure Legends
Figure 1: A logical workflow of peptide selection for an MRM experiment as implemented in PeptidePicker. The first step, after retrieving all protein information and in silico digestion, is to filter all peptides according to the obligatory criteria. At the next level, all information about the peptides that is available from PeptideAtlas and Global Proteome Machine gpmDB, and the proteome of the organism as it appears in UniProt and NCBI’s dbSNP, are retrieved. The protein annotations available from UniProt are also reflected onto the peptides at this level, i.e., determining which peptide contains a modification site, whether the peptide is in the signal/chain part of the protein or if it overlaps two sections that will be cleaved, etc. Finally, all of the information is integrated into one table/spreadsheet and the final scoring is generated by penalizing unwanted criteria and rewarding desirable criteria.
Figure 2: A scientific workflow for the automatic generation of the most appropriate surrogate peptides to represent target proteins in LC/MRM-MS experiments. 
The user provides the protein accession number (or a list thereof) as input. Other inputs control the selection if the desired criteria differ from the defaults (see Table 2). Some parameters need to be set only once, such as the output directory or the path to the html2xml executable. Uniprot_RetrieveXML retrieves the full description of the protein as an XML file from UniProt; extract_name reports the protein's recommended name; find_isoforms indicates whether the protein has isoforms and, if so, reports their accession numbers. Both of the latter processors are XPath parsers. The find_mod_pos processor is an R script that extracts the positions of any modifications present in the protein. Peptidecutter uses the ExPASy online tool PeptideCutter to digest the protein (and its isoforms, if required) in silico. The next four processors, html2xml, extract_peptides, extract_lpmm, and extract_lpmm2, extract the information reported by PeptideCutter into a table that includes the peptides and their lengths, masses, positions within the proteins, and calculated digestion efficiencies. The find_peptides processor filters the reported peptides according to the different acceptance criteria (see the Peptide Acceptance Criteria section); mark_mod indicates any possible modifications in the peptides reported by the previous step; intersect_peptides finds those peptides that are present in all isoforms, when applicable. The two processors test_observed_GPM and test_observed_PeptideAtlas test whether the reported peptides were previously observed and whether they have entries in the Global Proteome Machine or in PeptideAtlas. In addition, test_LabPeptideDataset checks whether the peptides are in the laboratory's own dataset. The test_unique processor reports whether each peptide is unique in the (human) proteome.
The final step, produce_report, summarizes all findings in a Comma Separated Values (CSV) report that can easily be imported or further processed by other software.

Figure 3: A screenshot of the Web interface of the tool. (A) shows the input page, where the user only needs to enter the accession number of the protein of interest and choose the organism. The user also controls the selection criteria by switching them on and off, as indicated in the figure and in Table 2. (B) shows the default results page, where the peptides are listed in descending order of their v-score, which integrates all of the information retrieved from the different repositories with the objective of choosing the appropriate peptides. The results page also includes links to the full table of all information, either as an HTML page for viewing in the browser or as a CSV file for download.

Figure 1
Figure 2
Figure 3

Table 2

Criterion | Type | Explanation | Weight
length_"n"_"m"_AA | Mandatory with user input | The length of the peptide in number of amino acids, where n is the lower and m the upper limit. Both are integers; typically useful values are n=(6,7) and m=(20,25), but any values where n<m are allowed. | 1
dig_efficiency | Mandatory | The digestion efficiency is taken to be the product of the digestion probabilities at both sides of the peptide, as obtained from the ExPASy digestion tool PeptideCutter. The tool can be replaced with other online or local tools. | 1/100
no_M | Mandatory | Is true (i.e., 1) if M is not present in the peptide | 1
no_C | Mandatory | Is true (i.e., 1) if C is not present in the peptide | 1
no_PP | User input | Is true (i.e., 1) if no P strings (2 or more) are present in the peptide | 1
no_N.term.Q | Optional | Is true (i.e., 1) if no Q is present at the N-terminus | 3
no_DG | Optional | Is true (i.e., 1) if DG is not present in the peptide | 2
no_DP | Optional | Is true (i.e., 1) if DP is not present in the peptide | 2
no_NG | Optional | Is true (i.e., 1) if NG is not present in the peptide | 2
no_QG | Optional | Is true (i.e., 1) if QG is not present in the peptide | 2
no_SSS | User input | Is true (i.e., 1) if no SSS strings (3 or more) are present in the peptide | 1
no_W | User input | Is true (i.e., 1) if W is not present in the peptide | 1
no_internal_R_K | User input | Is true if the peptide does not include an internal R or K | 1
no_mc_only_P | User input | Is true (i.e., 1) if the peptide does not include R or K, except when it is followed by P | 1
No_mc_Keil | User input | Is true (i.e., 1) if the peptide does not include missed cleavages according to the Keil rules | 1
no_mod | Optional | Is true if the peptide does not include post-translational modifications | 4
no_near_NCterm_mod | Optional | Is true if the peptide does not include post-translational modifications near the N- or C-terminus | 1
in_all_iso | User input | Is true if the peptide is present in all isoforms of the protein | 3
unique | Mandatory | Is true if the peptide is observed only once in the proteome | 5
gpm | Optional | Is true if the peptide was previously observed in MS/MS experiments according to gpmDB | 10
peptideatlas | Optional | Is true if the peptide was previously observed in MS/MS experiments according to PeptideAtlas | 10
prot_in_peptideatlas | Optional | Is true if the protein is in PeptideAtlas | 1
PSS | Optional | The value of the PeptideAtlas PSS score if available; otherwise "NA" (treated as 0) | 5
ESS | Optional | The value of the PeptideAtlas ESS score if available; otherwise "NA" (treated as 0) | 10
PA_n_obs | Optional | The number of observations of the peptide as reported in PeptideAtlas | 10/max(PA_n_obs)
het_score | Optional | The value of the heterozygosity score as reported by dbSNP. Only peptides with a heterozygosity value of less than 0.02 are considered. | -1
het_std_err | Optional | The value of the standard error in measuring the heterozygosity score as reported by dbSNP | -1
mod_by_similarity_potential | Optional | Is true if the reported post-translational modification has no experimental evidence | -1
mod_with_evidence | | Is true if the reported post-translational modification has reported experimental evidence | -4
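
The sequence-only criteria of Table 2 lend themselves to a short illustration. The Python sketch below is not the PeptidePicker implementation (the workflow itself uses R scripts and XPath processors); it only shows how the 0/1 flags for a subset of the criteria could be computed from a peptide sequence and combined with the Table 2 weights. The function names, the placeholder key `length_n_m_AA`, the default limits n=7 and m=20, and the use of a plain weighted sum are all assumptions made for illustration. Mandatory criteria are treated here as ordinary weighted flags rather than as hard filters.

```python
import re

# Weights for the sequence-only criteria, copied from Table 2.
# "length_n_m_AA" is a placeholder key for the length_"n"_"m"_AA row.
WEIGHTS = {
    "length_n_m_AA": 1,
    "no_M": 1,
    "no_C": 1,
    "no_PP": 1,
    "no_N.term.Q": 3,
    "no_DG": 2,
    "no_DP": 2,
    "no_NG": 2,
    "no_QG": 2,
    "no_SSS": 1,
    "no_W": 1,
    "no_internal_R_K": 1,
    "no_mc_only_P": 1,
}


def sequence_criteria(peptide, n=7, m=20):
    """Return a 0/1 flag per criterion for one peptide (hypothetical helper)."""
    flags = {
        "length_n_m_AA": n <= len(peptide) <= m,
        "no_M": "M" not in peptide,
        "no_C": "C" not in peptide,
        "no_PP": "PP" not in peptide,
        "no_N.term.Q": not peptide.startswith("Q"),
        "no_DG": "DG" not in peptide,
        "no_DP": "DP" not in peptide,
        "no_NG": "NG" not in peptide,
        "no_QG": "QG" not in peptide,
        "no_SSS": "SSS" not in peptide,
        "no_W": "W" not in peptide,
        # R or K anywhere except the C-terminal position.
        "no_internal_R_K": re.search(r"[RK](?=.)", peptide) is None,
        # R or K followed by any residue other than P.
        "no_mc_only_P": re.search(r"[RK](?=[^P])", peptide) is None,
    }
    return {name: int(value) for name, value in flags.items()}


def weighted_score(flags, weights):
    """Sum weight * flag over all criteria (assumed scoring form)."""
    return sum(weights.get(name, 0) * value for name, value in flags.items())


if __name__ == "__main__":
    for peptide in ("LVNEVTEFAK", "QMDGPPK"):
        flags = sequence_criteria(peptide)
        print(peptide, weighted_score(flags, WEIGHTS))
```

Criteria that depend on external repositories (uniqueness, gpmDB and PeptideAtlas observations, dbSNP heterozygosity) would enter the same sum with their own weights once their values have been retrieved; negative weights act as penalties.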