High-Throughput Sequencing Methods and Applications of Read Mapping Algorithms
Abstract: Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has become a key tool for profiling DNA-binding proteins and histone modifications. It has also created a need for effective computational methods to map short DNA reads and to compare mapping profiles, both of which are crucial for uncovering biological mechanisms. In this article, we introduce the principles of next-generation sequencing and the corresponding data formats, then give an overview of current methods for read mapping and peak identification. Finally, we demonstrate the application of these methods to histone modification analysis and transcription factor binding site identification.
Cite this article: 李慧丽, 何风, 杨航, 郑焱, 吴晓明 (2011) High-Throughput Sequencing and Applications of Read Mapping Algorithms. Biomedicine (生物医学), 1, 1-5. doi: 10.12677/hjbm.2011.11001
 M. L. Li, W. Wang, and Z. H. Lu. Genomic analysis of DNA-protein interaction by chromatin immunoprecipitation. Hereditas, 2010, 32(3): 219-228.
 C. Chen, H. Wan, and Q. Zhou. The next generation sequencing technology and its application in cancer research. Chinese Journal of Lung Cancer, 2010, 13(2): 154-159.
 Browser UG. UCSC Genome Browser: Wiggle Track Format (WIG) [URL]. http://genome.ucsc.edu/goldenPath/help/wiggle.html, 2011-7-16/2011-7-16.
 Wellcome Trust Sanger Institute, Genome Research Limited. GFF (General Feature Format) Specifications Document - Wellcome Trust Sanger Institute [URL]. http://www.sanger.ac.uk/resources/software/gff/spec.html, 2011-4-19/2011-7-16.
 H. Jiang, W. H. Wong. SeqMap: Mapping massive amount of oligonucleotides to the genome. Bioinformatics, 2008, 24(20): 2395-2396.
 H. Li, J. Ruan, and R. Durbin. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Research, 2008, 18(11): 1851-1858.
 B. Langmead, C. Trapnell, M. Pop, et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol., 2009, 10(3): R25.
 S. M. Rumble, P. Lacroute, A. V. Dalca, et al. SHRiMP: Accurate mapping of short color-space reads. PLoS Comput Biol, 2009, 5(5): Article ID e1000386.
 B. D. Ondov, A. Varadarajan, K. D. Passalacqua, et al. Efficient mapping of Applied Biosystems SOLiD sequence data to a reference genome for functional genomic applications. Bioinformatics, 2008, 24(23): 2776-2777.
 H. Ji, H. Jiang, W. Ma, et al. An integrated software system for analyzing ChIP-Chip and ChIP-Seq data. Nat Biotechnol, 2008, 26(11): 1293-1300.
 D. S. Johnson, A. Mortazavi, R. M. Myers, et al. Genome-wide mapping of in vivo protein-DNA interactions. Science, 2007, 316(5830): 1497-1502.
 Y. Zhang, T. Liu, C. A. Meyer, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biology, 2008, 9(9): R137.
 Z. S. Qin, J. Yu, J. Shen, et al. HPeak: An HMM-based algorithm for defining read-enriched regions in ChIP-Seq data. BMC Bioinformatics, 2010, 11: 369.
 A. P. Fejes, G. Robertson, M. Bilenky, et al. FindPeaks 3.1: A tool for identifying areas of enrichment from massively parallel short-read sequencing technology. Bioinformatics, 2008, 24(15): 1729-1730.
 R. Jothi, S. Cuddapah, A. Barski, et al. Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data. Nucleic Acids Res., 2008, 36(16): 5221-5231.
 A. Barski, S. Cuddapah, K. Cui, et al. High-resolution profiling of histone methylations in the human genome. Cell, 2007, 129(4): 823-837.
 T. S. Mikkelsen, M. Ku, D. B. Jaffe, et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature, 2007, 448(7153): 553-560.
 G. Robertson, M. Hirst, M. Bainbridge, et al. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nature Methods, 2007, 4(8): 651-657.
 J. Eid, A. Fehr, J. Gray, et al. Real-time DNA sequencing from single polymerase molecules. Science, 2009, 323(5910): 133-138.