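Prediction of Interacting Gene Pairs for Verticillium Wilt Based on Semi-Supervised Learning
(Published English title: The Prediction of Interaction Gene for Greensickness Based on Semi-Supervised Learning)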

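Authors: Zhang Weijuan, Zhang Hongmei, Chen Feng: College of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan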

Keywords: Bio-IT; Verticillium Wilt (Greensickness); Canonical Correlation Analysis; Semi-Supervised Learning

Abstract:
Verticillium wilt (called "Greensickness" in the published English abstract) is an incurable plant disease that causes huge economic losses every year. Breeding resistant plants requires identifying more interacting gene pairs, but screening candidates one by one through biological experiments is unrealistic. To mine more reliable gene pairs when only a few associated genes are known, this paper applies bioinformatics techniques, chiefly the statistical technique of Canonical Correlation Analysis (CCA) and the data-mining technique of Semi-Supervised Learning (SSL), to learn from the related genes and ultimately predict associated genes. The results can guide the direction of Verticillium wilt resistance research, narrow its scope, and speed it up.
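The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common semi-supervised technique, graph-based label propagation, and not the authors' method. It assumes each candidate gene pair has already been reduced to a short numeric feature vector (for instance by a CCA-style projection, which is not shown). The function names `rbf_similarity` and `propagate_labels`, the toy feature values, and the `gamma` setting are all hypothetical.

```python
import math


def rbf_similarity(a, b, gamma=1.0):
    """Gaussian (RBF) similarity between two feature vectors."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


def propagate_labels(features, labels, gamma=1.0, iterations=100):
    """Score every candidate pair by spreading a few known labels over a graph.

    features: list of feature vectors, one per candidate gene pair (assumed input).
    labels:   dict mapping index -> 1.0 (known interacting) or 0.0 (known non-interacting).
    Returns a list of scores in [0, 1]; higher means more likely to interact.
    """
    n = len(features)
    # Row-normalised similarity matrix without self-loops.
    weights = []
    for i in range(n):
        row = [0.0 if i == j else rbf_similarity(features[i], features[j], gamma)
               for j in range(n)]
        total = sum(row)
        weights.append([w / total for w in row])

    # Unlabelled pairs start at the neutral score 0.5.
    scores = [labels.get(i, 0.5) for i in range(n)]
    for _ in range(iterations):
        updated = [sum(weights[i][j] * scores[j] for j in range(n)) for i in range(n)]
        for i, known in labels.items():
            updated[i] = known  # clamp the known pairs after every step
        scores = updated
    return scores


# Toy data: five candidate pairs, only two of them labelled (hypothetical values).
features = [(0.9, 0.8), (0.1, 0.2), (0.85, 0.75), (0.15, 0.1), (0.5, 0.5)]
labels = {0: 1.0, 1: 0.0}
scores = propagate_labels(features, labels, gamma=10.0)
ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
print(ranked, [round(s, 3) for s in scores])
```

Unlabelled pairs that sit close to a known interacting pair in feature space receive high scores, which is the sense in which a handful of labelled pairs can be used to rank many unlabelled candidates for follow-up experiments.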

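Cite this article: Zhang, W.J., Zhang, H.M. and Chen, F. (2015) The Prediction of Interaction Gene for Greensickness Based on Semi-Supervised Learning. Hans Journal of Computational Biology, 5, 11-16. doi: 10.12677/HJCB.2015.51002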

