
The predictive performance of short-linear motif features in the prediction of calmodulin-binding proteins.






BMC Bioinformatics. 2018 Nov 20;19(Suppl 14):410

Authors: Li Y, Maleki M, Carruthers NJ, Stemmer PM, Ngom A, Rueda L

Abstract
BACKGROUND: Predicting calmodulin-binding (CaM-binding) proteins is important in biology and biochemistry because calmodulin binds and regulates a multitude of protein targets that affect different cellular processes. Computational methods that can accurately identify CaM-binding proteins and CaM-binding domains would accelerate research in calcium signaling and calmodulin function. Short-linear motifs (SLiMs) have been used effectively as features for analyzing protein-protein interactions, but their properties have not been utilized in the prediction of CaM-binding proteins.
RESULTS: We propose a new method for predicting CaM-binding proteins that uses, as features for the prediction module, both the total and average scores of known and new SLiMs in protein sequences, computed with a new scoring method called sliding window scoring (SWS). A dataset of 194 manually curated human CaM-binding proteins and 193 mitochondrial proteins was obtained and used for testing the proposed model. The motif generation tool Multiple EM for Motif Elucidation (MEME) was used to obtain new motifs from each of the positive and negative datasets individually (the SM approach) and from the combined negative and positive datasets (the CM approach). The wrapper criterion with random forest was then applied for feature selection (FS), followed by classification using different algorithms such as k-nearest neighbors (k-NN), support vector machines (SVM), naive Bayes (NB) and random forest (RF).
CONCLUSIONS: Our proposed method shows very good prediction results and demonstrates that the information contained in SLiMs is highly relevant for predicting CaM-binding proteins. Further, three new CaM-binding motifs were computationally selected and biologically validated in this study; these can be used for predicting CaM-binding proteins.

PMID: 30453876 [PubMed – indexed for MEDLINE]
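
The abstract names sliding window scoring (SWS) but does not define it, so the following is only a minimal sketch of how total and average motif scores might be computed per sequence. It assumes each motif is a position-specific score table and that a window score is the sum of per-position scores. The names `Motif`, `window_scores`, `sws_features` and the toy motif are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical motif representation (an assumption, not the paper's format):
# one dict per motif position, mapping a residue letter to a score.
Motif = List[Dict[str, float]]


def window_scores(sequence: str, motif: Motif, default: float = 0.0) -> List[float]:
    """Score every window of len(motif) residues along the sequence."""
    width = len(motif)
    scores = []
    # range() is empty when the sequence is shorter than the motif.
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Residues absent from a position's table receive the default score.
        scores.append(sum(col.get(res, default) for col, res in zip(motif, window)))
    return scores


def sws_features(sequence: str, motif: Motif) -> Tuple[float, float]:
    """Return (total, average) window score, or (0.0, 0.0) if no window fits."""
    scores = window_scores(sequence, motif)
    if not scores:
        return 0.0, 0.0
    total = sum(scores)
    return total, total / len(scores)


# Toy two-position motif and sequence, invented for illustration only.
toy_motif: Motif = [{"I": 1.0, "L": 0.5}, {"Q": 1.0}]
total, average = sws_features("IQLQA", toy_motif)
```

Under these assumptions, each motif contributes two numbers per protein, and the resulting feature vector could then be passed to feature selection and to classifiers such as those listed in RESULTS.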
