Computer Science and Application (计算机科学与应用)

Vol.6 No.10 (October 2016)

Minimal Decision Tree Generation for Multi-Label Decision Tables (多值决策表的最小决策树生成)

 

Authors:

Ying Qiao (乔莹), Meiling Xu (许美玲), Farong Zhong (钟发荣), Jing Zeng (曾静), Yuchang Mo (莫毓昌): Zhejiang Normal University, Jinhua, Zhejiang

 

Keywords:

Multi-Label Decision Tables; Decision Trees; Dynamic Programming Algorithm

 

Abstract:

Decision trees are widely used for classification in data mining. They can extract valuable knowledge from ordinary decision tables, in which each row carries a single decision value. Mining multi-label decision tables is harder: each row carries several decision values, which are represented as a set. Existing heuristic algorithms, such as greedy algorithms, are unstable in performance, so the size of the decision trees they produce varies widely and may become very large. This paper proposes a dynamic programming algorithm that minimizes the size of the decision tree for a multi-label decision table. The algorithm decomposes the multi-label decision table into subtables and constructs a minimal decision tree from those subtables, which can then be used to mine useful information from the table.
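The abstract does not give the recurrence, so the following is only a minimal sketch of how a dynamic program over subtables might compute the minimum tree size. It assumes a common formulation that the paper may or may not use: tree size is the number of nodes, a subtable becomes a single leaf when some decision belongs to every one of its rows, and otherwise the cost of splitting on an attribute is one node plus the minimum sizes of the resulting subtables. The function name `minimal_tree_size` and the toy table are hypothetical.

```python
from functools import lru_cache


def minimal_tree_size(table):
    """Minimum node count of a decision tree for a multi-label decision table.

    `table` is a sequence of rows; each row is (attribute_values, decisions),
    where `decisions` is a frozenset of admissible decision values.
    Assumption (not taken from the paper): a subtable is a leaf when its rows
    share at least one decision.
    """
    n_attrs = len(table[0][0])

    @lru_cache(maxsize=None)  # memoize results per subtable (set of row indices)
    def solve(rows):
        # Leaf case: some decision is common to every row of the subtable.
        if frozenset.intersection(*(table[i][1] for i in rows)):
            return 1
        best = None
        for a in range(n_attrs):
            # Split the subtable by the value of attribute a.
            groups = {}
            for i in rows:
                groups.setdefault(table[i][0][a], set()).add(i)
            if len(groups) < 2:
                continue  # attribute is constant here, so it separates nothing
            size = 1 + sum(solve(frozenset(g)) for g in groups.values())
            if best is None or size < best:
                best = size
        if best is None:
            raise ValueError("rows with identical attributes share no decision")
        return best

    return solve(frozenset(range(len(table))))


# Hypothetical toy table with two binary attributes.
TOY_TABLE = (
    ((0, 0), frozenset({1})),
    ((0, 1), frozenset({1, 2})),
    ((1, 0), frozenset({2, 3})),
    ((1, 1), frozenset({3})),
)
print(minimal_tree_size(TOY_TABLE))
```

On this toy table, splitting on the first attribute yields two subtables that each have a common decision, so the smallest tree has one internal node and two leaves. Memoizing on the set of row indices is what keeps each subtable from being solved more than once.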

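Cite this article:

Qiao, Y., Xu, M., Zhong, F., Zeng, J. and Mo, Y. (2016) Minimal Decision Tree Generation for Multi-Label Decision Tables. Computer Science and Application, 6, 617-628. doi: 10.12677/CSA.2016.610076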

 
