2021-07-07T07:09:42Z
2021
The massive amount of data generated by genome sequencing yields a large number of newly identified mutations whose pathogenic or non-pathogenic effects need to be evaluated. This has given rise to several mutation predictor tools that, in general, do not consider the specificities of the various protein groups. We aimed to develop a predictor tool dedicated to membrane proteins, under the premise that their specific structural features and environment would give different responses to mutations compared to globular proteins. For this purpose, we created TMSNP, a database that currently contains information on 2624 pathogenic and 196 705 non-pathogenic reported mutations located in the transmembrane region of membrane proteins. By computing various conservation parameters for these mutations and combining them with annotations, we trained a machine-learning model able to classify mutations as pathogenic or non-pathogenic. TMSNP (freely available at http://lmc.uab.es/tmsnp/) considerably improves on the predictive power of commonly used mutation predictors trained on globular proteins.
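The abstract does not say which conservation parameters TMSNP computes, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a mutated position is represented by one column of a multiple sequence alignment and derives three hypothetical features from it: the Shannon entropy of the column, the frequency of the wild-type residue, and the frequency of the mutant residue. All function and feature names are invented for this example.

```python
import math
from collections import Counter


def _residues(column: str) -> list:
    """Return the residues of an alignment column, ignoring gap characters."""
    return [aa for aa in column.upper() if aa != "-"]


def column_entropy(column: str) -> float:
    """Shannon entropy in bits of one alignment column (0.0 = fully conserved)."""
    residues = _residues(column)
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def residue_frequency(column: str, residue: str) -> float:
    """Fraction of non-gap residues in the column equal to `residue`."""
    residues = _residues(column)
    if not residues:
        return 0.0
    return residues.count(residue.upper()) / len(residues)


def mutation_features(column: str, wild_type: str, mutant: str) -> dict:
    """Hypothetical feature vector for a single substitution at this column."""
    return {
        "entropy": column_entropy(column),
        "wt_freq": residue_frequency(column, wild_type),
        "mut_freq": residue_frequency(column, mutant),
    }


if __name__ == "__main__":
    # A mostly conserved leucine position mutated to proline (made-up data).
    print(mutation_features("LLLLLLLV", wild_type="L", mutant="P"))
```

Features of this kind could then be combined with annotations and passed to a classifier. Given the class imbalance reported above (2624 pathogenic versus 196 705 non-pathogenic mutations), any such model would presumably need some form of class weighting or resampling, although the abstract does not state how this was handled.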
Spanish Ministerio de Ciencia, Innovación y Universidades [SAF2015-74627-JIN, SAF2016-77830-R, PI19/00348]. Funding for open access charge: Ministerio de Ciencia e Innovación y Universidades [PI19/00348].
Article
Published version
English
Oxford University Press
NAR Genom Bioinform. 2021;3(1):lqab008
info:eu-repo/grantAgreement/ES/1PE/SAF2015-74627-JIN
info:eu-repo/grantAgreement/ES/1PE/SAF2016-77830-R
© The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
http://creativecommons.org/licenses/by-nc/4.0/