Prediction of Apoptosis Protein Subcellular Localization Based on Hybrid Feature Parameters

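Authors: Xue Jixian (薛济先), Chen Yingli (陈颖丽)*, Zhai Yuanyuan (翟媛媛): School of Physical Science and Technology, Inner Mongolia University, Hohhot, Inner Mongolia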

Keywords: Apoptosis Protein; mRNA Secondary Structure; Subcellular Localization

Abstract: Studies have shown that the sequence and structural characteristics of mRNA are related to the subcellular localization of the encoded protein. In this work, two kinds of mRNA information were extracted for apoptosis proteins: 3-mer frequency information of the mRNA sequence in three reading frames, and mRNA secondary structure-sequence pattern information. These were combined with amino acid physicochemical properties, amino acid stickiness, and evolutionary information to construct feature vectors representing the mRNA and protein sequences. A support vector machine was then used to predict apoptosis proteins in four subcellular locations. Fusing the mRNA information with the amino acid information improved prediction performance: the overall accuracy reached 82.18% under the jackknife test and 78.26% on an independent test set. These results indicate that mRNA sequence and structural characteristics contribute to the prediction of apoptosis protein subcellular localization.
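
As an illustration of the first mRNA feature, the three-reading-frame 3-mer counts can be sketched as follows. This is a minimal sketch and not the authors' implementation: it assumes that reading frame f consists of non-overlapping triplets starting at offset f, that the 64 triplets are ordered over the alphabet A, C, G, U, and that raw counts (not normalized frequencies) are wanted. The function names `frame_kmer_counts` and `three_frame_features` are hypothetical.

```python
from itertools import product

# Assumed ordering of the 64 possible triplets over the RNA alphabet.
ALPHABET = "ACGU"
KMERS = ["".join(p) for p in product(ALPHABET, repeat=3)]


def frame_kmer_counts(mrna, frame):
    """Count non-overlapping triplets of `mrna` read from offset `frame` (0, 1 or 2)."""
    seq = mrna.upper().replace("T", "U")  # accept cDNA-style input as well
    counts = dict.fromkeys(KMERS, 0)
    for i in range(frame, len(seq) - 2, 3):
        triplet = seq[i:i + 3]
        if triplet in counts:  # skip triplets containing ambiguous bases such as N
            counts[triplet] += 1
    return [counts[k] for k in KMERS]


def three_frame_features(mrna):
    """Concatenate the 64 triplet counts of each reading frame into a 192-dim vector."""
    features = []
    for frame in range(3):
        features.extend(frame_kmer_counts(mrna, frame))
    return features


if __name__ == "__main__":
    vec = three_frame_features("AUGGCCAUG")
    print(len(vec), sum(vec))
```

Under these assumptions each mRNA yields a 192-dimensional vector, which could then be concatenated with the amino acid features before being passed to a classifier.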

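Citation: Xue Jixian, Chen Yingli, Zhai Yuanyuan (2016) Prediction of Apoptosis Protein Subcellular Localization Based on Hybrid Feature Parameters. Hans Journal of Computational Biology (计算生物学), 6, 62-71. doi: 10.12677/HJCB.2016.63008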
