Prediction of bacterial virulent proteins with composition moment vector feature encoding method
Abstract
Prediction of bacterial virulent proteins is critical for vaccine development and for understanding virulence mechanisms in pathogens. To this end, a number of feature encoding methods based on the sequence and evolutionary information of a given protein have been proposed and combined with various classification algorithms. In this paper, we apply the composition moment vector (CMV) encoding, which captures both the composition and the position of amino acids in the protein sequence, to predict bacterial virulent proteins. The approach was evaluated on three independent datasets. Experimental results show that the CMV feature encoding method leads to better classification performance in terms of accuracy, sensitivity, F-measure and Matthews correlation coefficient (MCC) across diverse classifiers.
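To illustrate how an encoding can carry both composition and position, the sketch below computes a composition moment vector under one commonly used definition, assumed here rather than taken from this abstract: for each amino acid, the sum of its 1-based sequence positions raised to the power k, divided by (N-1)(N-2)...(N-k), where N is the sequence length. The function name `composition_moment_vector`, the `order` parameter, and the restriction to the 20 standard residues are hypothetical choices for illustration and may differ from the authors' implementation.

```python
from math import prod

# The 20 standard amino acids (assumption: other symbols are ignored).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition_moment_vector(sequence, order=1):
    """Return a 20-dimensional composition moment vector of the given order.

    Assumed definition: for amino acid i, sum over its 1-based positions p
    of p**order, divided by (N-1)(N-2)...(N-order), N = sequence length.
    With order=0 this reduces to raw residue counts.
    """
    seq = sequence.upper()
    n = len(seq)
    if n <= order:
        raise ValueError("sequence must be longer than the moment order")
    # Normalising constant; the empty product equals 1 when order is 0.
    denom = prod(n - d for d in range(1, order + 1))
    sums = {aa: 0 for aa in AMINO_ACIDS}
    for pos, aa in enumerate(seq, start=1):
        if aa in sums:
            sums[aa] += pos ** order
    return [sums[aa] / denom for aa in AMINO_ACIDS]

if __name__ == "__main__":
    # Toy sequence, not a real protein.
    print(composition_moment_vector("ACA", order=1)[:2])
```

Under this assumed definition, two sequences with identical residue counts but different residue order yield different first-order vectors, which is the positional information that plain composition lacks.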