Machine learning can differentiate venom toxins from other proteins having non-toxic physiological functions

Ranko Gacesa; David J. Barlow; Paul F. Long

doi:10.7717/peerj-cs.90

Ranko Gacesa, David J. Barlow, Paul F. Long^*

^*Corresponding author for this work

Institute of Pharmaceutical Science

Research output: Contribution to journal › Article › peer-review

Abstract

Ascribing function to sequence in the absence of biological data is an ongoing challenge in bioinformatics. Differentiating the toxins of venomous animals from homologues having other physiological functions is particularly problematic as there are no universally accepted methods by which to attribute toxin function using sequence data alone. Bioinformatics tools that do exist are difficult to implement for researchers with little bioinformatics training. Here we announce a machine learning tool called 'ToxClassifier' that enables simple and consistent discrimination of toxins from non-toxin sequences with >99% accuracy, and compare it to commonly used toxin annotation methods. 'ToxClassifier' also reports the best-hit annotation, allowing placement of a toxin into the most appropriate toxin protein family, or relates it to a non-toxic protein having the closest homology, giving enhanced curation of existing biological databases and new venomics projects. 'ToxClassifier' is available for free, either to download (https://github.com/rgacesa/ToxClassifier) or to use on a web-based server (http://bioserv7.bioinfo.pbf.hr/ToxClassifier/).

Original language	English
Article number	e90
Number of pages	20
Journal	PeerJ
Volume	2016
Issue number	10
DOIs	https://doi.org/10.7717/peerj-cs.90
Publication status	Published - 10 Oct 2016

Keywords

Animal venom
Automatic annotation
Biological function
Functional prediction
Protein sequences

Access to Document

10.7717/peerj-cs.90 (Licence: CC BY)

Machine learning can differentiate_GACESA_Accepted8Sep2016_GOLD VoR: Final published version, 844 KB (Licence: CC BY)

Cite this

@article{283a90dff3a44a72b3f8ec65a26d9fab,
title = "Machine learning can differentiate venom toxins from other proteins having non-toxic physiological functions",
abstract = "Ascribing function to sequence in the absence of biological data is an ongoing challenge in bioinformatics. Differentiating the toxins of venomous animals from homologues having other physiological functions is particularly problematic as there are no universally accepted methods by which to attribute toxin function using sequence data alone. Bioinformatics tools that do exist are difficult to implement for researchers with little bioinformatics training. Here we announce a machine learning tool called 'ToxClassifier' that enables simple and consistent discrimination of toxins from non-toxin sequences with >99% accuracy, and compare it to commonly used toxin annotation methods. 'ToxClassifier' also reports the best-hit annotation, allowing placement of a toxin into the most appropriate toxin protein family, or relates it to a non-toxic protein having the closest homology, giving enhanced curation of existing biological databases and new venomics projects. 'ToxClassifier' is available for free, either to download (https://github.com/rgacesa/ToxClassifier) or to use on a web-based server (http://bioserv7.bioinfo.pbf.hr/ToxClassifier/).",
keywords = "Animal venom, Automatic annotation, Biological function, Functional prediction, Protein sequences",
author = "Gacesa, Ranko and Barlow, {David J.} and Long, {Paul F.}",
year = "2016",
month = oct,
day = "10",
doi = "10.7717/peerj-cs.90",
language = "English",
volume = "2016",
journal = "PeerJ",
issn = "2167-8359",
publisher = "PeerJ",
number = "10",
}

TY - JOUR

T1 - Machine learning can differentiate venom toxins from other proteins having non-toxic physiological functions

AU - Gacesa, Ranko

AU - Barlow, David J.

AU - Long, Paul F.

PY - 2016/10/10

Y1 - 2016/10/10

N2 - Ascribing function to sequence in the absence of biological data is an ongoing challenge in bioinformatics. Differentiating the toxins of venomous animals from homologues having other physiological functions is particularly problematic as there are no universally accepted methods by which to attribute toxin function using sequence data alone. Bioinformatics tools that do exist are difficult to implement for researchers with little bioinformatics training. Here we announce a machine learning tool called 'ToxClassifier' that enables simple and consistent discrimination of toxins from non-toxin sequences with >99% accuracy, and compare it to commonly used toxin annotation methods. 'ToxClassifier' also reports the best-hit annotation, allowing placement of a toxin into the most appropriate toxin protein family, or relates it to a non-toxic protein having the closest homology, giving enhanced curation of existing biological databases and new venomics projects. 'ToxClassifier' is available for free, either to download (https://github.com/rgacesa/ToxClassifier) or to use on a web-based server (http://bioserv7.bioinfo.pbf.hr/ToxClassifier/).

AB - Ascribing function to sequence in the absence of biological data is an ongoing challenge in bioinformatics. Differentiating the toxins of venomous animals from homologues having other physiological functions is particularly problematic as there are no universally accepted methods by which to attribute toxin function using sequence data alone. Bioinformatics tools that do exist are difficult to implement for researchers with little bioinformatics training. Here we announce a machine learning tool called 'ToxClassifier' that enables simple and consistent discrimination of toxins from non-toxin sequences with >99% accuracy, and compare it to commonly used toxin annotation methods. 'ToxClassifier' also reports the best-hit annotation, allowing placement of a toxin into the most appropriate toxin protein family, or relates it to a non-toxic protein having the closest homology, giving enhanced curation of existing biological databases and new venomics projects. 'ToxClassifier' is available for free, either to download (https://github.com/rgacesa/ToxClassifier) or to use on a web-based server (http://bioserv7.bioinfo.pbf.hr/ToxClassifier/).

KW - Animal venom

KW - Automatic annotation

KW - Biological function

KW - Functional prediction

KW - Protein sequences

UR - http://www.scopus.com/inward/record.url?scp=84994589205&partnerID=8YFLogxK

U2 - 10.7717/peerj-cs.90

DO - 10.7717/peerj-cs.90

M3 - Article

SN - 2167-8359

VL - 2016

JO - PeerJ

JF - PeerJ

IS - 10

M1 - e90

ER -
