Computer Science and Application (计算机科学与应用)

Vol.6 No.7 (July 2016)

Research on Naive Bayesian Spam SMS Filtering Based on MapReduce

 

Authors:

Caidi Zhao (赵彩迪), Youchan Zhu (朱有产), Jiahui Fu (符佳慧): North China Electric Power University, Baoding, Hebei

 

Keywords:

Spam SMS, SMS Filter, Naive Bayesian, MapReduce

 

Abstract:

Mining and filtering massive volumes of SMS text requires large storage capacity and substantial computing power. To address this, a naive Bayesian spam SMS filtering method based on MapReduce is proposed. Building on an improved naive Bayesian spam SMS classification algorithm, the method uses the parallel processing strengths of the MapReduce model on massive data to train and test on SMS text. Experiments show that filtering massive volumes of spam SMS on a compute cluster improves recall and precision, and that filtering efficiency rises quickly as the cluster grows.
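As a rough illustration of how naive Bayesian training maps onto the MapReduce model, the sketch below counts tokens per class in a map step and sums them in a reduce step, then classifies with those counts. It is a single-process simulation, not the authors' implementation: it uses plain multinomial naive Bayes with Laplace smoothing rather than the improved algorithm the abstract mentions, and the whitespace tokenizer, the `DOC_MARKER` key, the function names, and the toy training messages are all assumptions made for this example. Real Chinese SMS text would need a word segmenter, and a real deployment would run the map and reduce steps on a cluster.

```python
import math
from collections import defaultdict
from itertools import groupby

# Hypothetical marker key for counting documents per label. Tokens are
# lowercased, so this uppercase marker cannot collide with a real token.
DOC_MARKER = "__DOC__"


def map_phase(record):
    """Map: emit ((label, token), 1) per token, plus one document count per message."""
    label, text = record
    yield (label, DOC_MARKER), 1
    for token in text.lower().split():  # assumption: whitespace tokenization
        yield (label, token), 1


def reduce_phase(key, values):
    """Reduce: sum all counts emitted for one (label, token) key."""
    return key, sum(values)


def run_mapreduce(records):
    """Simulate the shuffle in one process: sort mapper output by key, then reduce each group."""
    pairs = sorted(kv for record in records for kv in map_phase(record))
    grouped = groupby(pairs, key=lambda kv: kv[0])
    return dict(reduce_phase(key, (v for _, v in group)) for key, group in grouped)


def build_model(counts):
    """Split the reduced counts into per-label document counts, token counts, and the vocabulary."""
    doc_counts = {}
    token_counts = defaultdict(dict)
    vocab = set()
    for (label, token), n in counts.items():
        if token == DOC_MARKER:
            doc_counts[label] = n
        else:
            token_counts[label][token] = n
            vocab.add(token)
    return doc_counts, token_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior under Laplace smoothing."""
    doc_counts, token_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in doc_counts.items():
        total_tokens = sum(token_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore tokens never seen in training
            count = token_counts[label].get(token, 0)
            score += math.log((count + 1) / (total_tokens + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this example.
training = [
    ("spam", "win cash prize now"),
    ("spam", "free prize claim now"),
    ("ham", "see you at lunch"),
    ("ham", "are you free for lunch"),
]
counts = run_mapreduce(training)
model = build_model(counts)
```

With this toy model, `classify("claim your free prize", model)` picks the spam class and `classify("see you at lunch", model)` picks the ham class. Because each mapper only needs its own messages and each reducer only sums counts for its own keys, the same two functions could in principle be spread across many nodes, which is the property the abstract relies on when it reports efficiency gains from a larger cluster.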

Cite this article:

Caidi Zhao, Youchan Zhu, Jiahui Fu (2016) Research on Naive Bayesian Spam SMS Filtering Based on MapReduce. Computer Science and Application, 6, 443-450. doi: 10.12677/CSA.2016.67054

 
