Image Clustering Method Based on Ensemble Locality Sensitive Clustering
Abstract: To overcome the weaknesses of k-means in image clustering, especially visual image clustering, we propose an Ensemble Locality Sensitive Clustering method. The method first determines the number of clusters in the dataset, then generates multiple clusterings with the Exact Euclidean Locality Sensitive Hashing (E2LSH) algorithm, and finally applies cluster ensemble methods to obtain the final partition. Experiments on a synthetic dataset and an image dataset show that the new method matches the clustering accuracy of k-means combined with a cluster ensemble on the synthetic dataset and is slightly less accurate on the image dataset. Its advantages are that it clusters faster than k-means and that it is suitable for incremental clustering. Ensemble Locality Sensitive Clustering is therefore a promising clustering method for high-dimensional image data.
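The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the number of clusters is determined or which cluster ensemble method is used, so the sketch assumes that k is already known and substitutes a co-association matrix with average-link merging as the consensus step. The hash follows the usual E2LSH form h(x) = floor((a·x + b) / w) with Gaussian a; all function and parameter names (e2lsh_partition, co_association, consensus, n_projections, w) are hypothetical.

```python
import numpy as np

def e2lsh_partition(X, n_projections, w, rng):
    """One base clustering: points that share an E2LSH bucket get the same label."""
    d = X.shape[1]
    A = rng.standard_normal((d, n_projections))   # Gaussian (2-stable) directions
    b = rng.uniform(0.0, w, size=n_projections)   # random offsets in [0, w)
    keys = np.floor((X @ A + b) / w).astype(np.int64)
    _, labels = np.unique(keys, axis=0, return_inverse=True)
    return labels.reshape(-1)

def co_association(partitions):
    """Fraction of base clusterings in which each pair of points is co-clustered."""
    n = len(partitions[0])
    C = np.zeros((n, n))
    for labels in partitions:
        C += labels[:, None] == labels[None, :]
    return C / len(partitions)

def consensus(C, k):
    """Assumed consensus step: average-link merging on C until k clusters remain."""
    clusters = [[i] for i in range(C.shape[0])]
    while len(clusters) > k:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = C[np.ix_(clusters[i], clusters[j])].mean()
                if s > best:
                    best, pair = s, (i, j)
        i, j = pair
        merged = clusters.pop(j)      # j > i, so index i stays valid
        clusters[i].extend(merged)
    labels = np.empty(C.shape[0], dtype=int)
    for c, members in enumerate(clusters):
        labels[members] = c
    return labels

def ensemble_ls_clustering(X, k, n_partitions=15, n_projections=2, w=4.0, seed=0):
    """Generate several E2LSH partitions, then fuse them into k clusters."""
    rng = np.random.default_rng(seed)
    partitions = [e2lsh_partition(X, n_projections, w, rng)
                  for _ in range(n_partitions)]
    return consensus(co_association(partitions), k)
```

Calling ensemble_ls_clustering(X, k) on an (n, d) array returns one integer label per row. The pairwise merging loop is cubic in n and only suits small illustrations. Each base partition is a single hashing pass over the data, which is consistent with the speed and incremental-clustering advantages the abstract claims, though the sketch itself does not demonstrate them.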
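Cite this article: Peng Tianqiang and Gao Haolin (2016) Ensemble Locality Sensitive Clustering Method. Artificial Intelligence and Robotics Research, 5, 23-34. doi: 10.12677/AIRR.2016.52003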