Hans Journal of Computational Biology

Vol.4 No.1 (March 2014)

Prediction of Four Kinds of Simple Supersecondary Structures in Enzymes Using an SVM-Based Ensemble Classifier

 

Authors:

Gao Sujuan, Hu Xiuzhen: College of Sciences, Inner Mongolia University of Technology, Hohhot

 

Keywords:

Enzyme; Supersecondary Structure; Scoring Function; Support Vector Machine; Ensemble Classifier

 

Abstract:

Enzymes are proteins with catalytic function, and studying the supersecondary structures in enzymes is important for understanding enzyme structure and function. Starting from enzyme sequences, this paper studies four kinds of simple supersecondary structures in enzymes for the first time. With position-specific amino acids and their nearest-neighbor (dipeptide) correlations as parameters, five ways of extracting fixed-length sequence fragments were tested under 7-fold cross-validation. Predictions from the scoring function (matrix scoring) method alone were not satisfactory. When the matrix scores were instead used as input features for a support vector machine (SVM) and the results were fused with weighted factors in an ensemble classifier, performance improved: the overall prediction accuracy reached 72.64% and the Matthews correlation coefficients were above 0.57. The SVM-based ensemble classifier is therefore an effective method for predicting the four kinds of supersecondary structures in enzymes.
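As a rough illustration of the two ideas named in the abstract, weighted fusion of several classifiers and evaluation by overall accuracy and Matthews correlation coefficient, the following minimal Python sketch may help. It is not the authors' implementation: the abstract does not state how the weights were chosen or how the coefficient was computed per class, so the function names (`fuse`, `accuracy`, `mcc_one_vs_rest`), the example weights, and the one-vs-rest treatment of the four classes are all assumptions made for illustration.

```python
from math import sqrt


def fuse(score_sets, weights):
    """Weighted fusion of per-classifier class scores (hypothetical scheme).

    score_sets[k][c] is the score classifier k assigns to class c;
    weights[k] is the weight given to classifier k.
    Returns the index of the class with the highest weighted sum.
    """
    n_classes = len(score_sets[0])
    fused = [
        sum(w * scores[c] for w, scores in zip(weights, score_sets))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=fused.__getitem__)


def accuracy(y_true, y_pred):
    """Overall accuracy: fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mcc_one_vs_rest(y_true, y_pred, cls):
    """Matthews correlation coefficient for one class against all others."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    tn = sum(t != cls and p != cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Three hypothetical classifiers scoring one sample over four classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.2, 0.2, 0.1],
    [0.2, 0.5, 0.2, 0.1],
]
predicted_class = fuse(scores, weights=[0.5, 0.3, 0.2])
```

In a setup like the one the abstract describes, each entry of `score_sets` would presumably come from an SVM trained on one of the five fragment selections, and the metrics would be averaged over the 7 cross-validation folds; both points are assumptions here.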

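Cite this article:

Gao Sujuan, Hu Xiuzhen (2014) Prediction of Four Kinds of Simple Supersecondary Structures in Enzymes Using an SVM-Based Ensemble Classifier. Hans Journal of Computational Biology, 4, 1-11. doi: 10.12677/HJCB.2014.41001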

 
