# A Synchronous Optimization Algorithm for Increasing Accuracy of SVM Classification

Abstract: The support vector machine (SVM), a popular method for pattern classification, has recently been adopted in a wide range of problems. In the training procedure of an SVM, feature selection and parameter optimization are the two main factors that affect classification accuracy. To improve classification accuracy by optimizing the parameters and choosing a feature subset for the SVM, a new algorithm, termed BA + SVM, is proposed that combines the Bat Algorithm (BA) with SVM. To assess the performance of BA + SVM, 10 public data sets are used to test the classification accuracy rate. Compared with the grid algorithm, a conventional parameter optimization method, BA + SVM achieves higher classification accuracy with fewer input features for support vector classification.

1. Introduction

2. Related Work

2.1. SVM Classifier

$\langle \omega \cdot {x}_{i}\rangle +b=0,\quad i=1,2,\cdots ,m$ (1)

$\begin{array}{l}\underset{1\le i\le m}{\mathrm{min}}\ \frac{1}{2}{\omega }^{\text{T}}\omega +C\sum_{i=1}^{m}{\xi }_{i}\\ \text{subject to: }{y}_{i}\left(\langle \omega \cdot {x}_{i}\rangle +b\right)-1\ge 0\end{array}$ (2)

$f\left(x\right)=\operatorname{sign}\left(\sum_{i=1}^{m}{y}_{i}{\alpha }_{i}^{*}\langle {x}_{i}\cdot x\rangle +{b}^{*}\right)$ (3)

$f\left(x,{\alpha }_{i}^{*},{b}^{*}\right)=\operatorname{sign}\left(\sum_{i=1}^{m}{y}_{i}{\alpha }_{i}^{*}k\left({x}_{i},x\right)+{b}^{*}\right)$ (4)
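To make Eq. (4) concrete, the following is a minimal Python sketch of the kernelized decision function. The Gaussian (RBF) kernel, its `gamma` value, and the function names `rbf_kernel` and `decision` are assumptions chosen for illustration; the text above does not fix a particular kernel. The multipliers and bias are taken as already trained.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel k(u, v) = exp(-gamma * ||u - v||^2).

    The kernel choice and gamma=0.5 are illustrative assumptions.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def decision(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Evaluate Eq. (4): sign(sum_i y_i * alpha_i * k(x_i, x) + b).

    A score of exactly zero is mapped to +1 here (an arbitrary convention).
    """
    score = sum(y * a * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if score >= 0 else -1

# Toy model with one support vector per class (hypothetical values).
svs = [(0.0, 0.0), (2.0, 2.0)]
ys = [-1, 1]
alphas = [1.0, 1.0]
print(decision((2.0, 2.0), svs, ys, alphas, b=0.0))
print(decision((0.0, 0.0), svs, ys, alphas, b=0.0))
```

Passing the kernel as a parameter keeps the same function usable for the linear case of Eq. (3), where the kernel is replaced by a plain inner product.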

2.2. Bat Algorithm

${f}_{i}={f}_{\mathrm{min}}+\left({f}_{\mathrm{max}}-{f}_{\mathrm{min}}\right)\beta$ (5)

${v}_{i}^{t+1}={v}_{i}^{t}+\left({x}_{i}^{t}-{x}_{*}\right){f}_{i}$ (6)

${x}_{i}^{t+1}={x}_{i}^{t}+{v}_{i}^{t+1}$ (7)

${x}_{new}={x}_{old}+\epsilon {A}^{t}$ (8)

${A}_{i}^{t+1}=\alpha {A}_{i}^{t}$ (9)

${r}_{i}^{t+1}={r}_{i}^{0}\left[1-\mathrm{exp}\left(-\gamma t\right)\right]$ (10)
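Equations (5) to (10) can be sketched as small Python helpers, one per update rule. This is an illustration under stated assumptions rather than the implementation used in the experiments: $\beta$ is drawn uniformly from [0, 1] and $\epsilon$ uniformly from [-1, 1] per dimension (as in common formulations of the bat algorithm), and the default values of `f_min`, `f_max`, `alpha`, and `gamma` are hypothetical.

```python
import math
import random

def move_bat(x, v, x_best, f_min=0.0, f_max=2.0, rng=random):
    """Global move of one bat, Eqs. (5)-(7)."""
    beta = rng.random()                       # assumed uniform in [0, 1]
    f = f_min + (f_max - f_min) * beta        # Eq. (5): frequency
    v_new = [vi + (xi - bi) * f               # Eq. (6): velocity
             for vi, xi, bi in zip(v, x, x_best)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]  # Eq. (7): position
    return x_new, v_new

def local_walk(x_old, mean_loudness, rng=random):
    """Local random walk around a solution, Eq. (8).

    epsilon is drawn per dimension from [-1, 1] (an assumption).
    """
    return [xi + rng.uniform(-1.0, 1.0) * mean_loudness for xi in x_old]

def update_loudness(loudness, alpha=0.9):
    """Eq. (9): loudness decays by the factor alpha."""
    return alpha * loudness

def update_pulse_rate(r0, t, gamma=0.9):
    """Eq. (10): pulse rate grows from 0 toward r0 as t increases."""
    return r0 * (1.0 - math.exp(-gamma * t))

x, v = move_bat([0.5, 1.5], [0.0, 0.0], x_best=[1.0, 1.0])
print(x, v)
```

A bat already at the best position with zero velocity does not move under Eqs. (5) to (7), which is a quick sanity check for the sign convention in Eq. (6).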

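2.3. Feature Selection

3. BA-Based Feature Selection and Parameter Optimization for SVM

3.1. Representation of the Bat Position

3.2. Update Criterion for the Bat Position

3.3. Fitness Function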

$\mathit{fit}_{i}={\omega }_{A}\cdot \mathit{acc}_{i}+{\omega }_{F}\cdot \left(1-\frac{\sum_{j=1}^{n}{f}_{j}}{n}\right)$ (11)

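${\omega }_{A}$ is the weight of the SVM classification accuracy and ${\omega }_{F}$ is the weight of the number of selected features; the user can adjust both as needed. If feature $j$ is selected, ${f}_{j}=1$; otherwise ${f}_{j}=0$. $\mathit{acc}_{i}$ denotes the SVM classification accuracy, given by Eq. (12), where $cc$ and $uc$ denote the number of correctly classified samples and the number of incorrectly classified samples, respectively.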

$\mathit{acc}_{i}=\frac{cc}{cc+uc}\times 100\%$ (12)
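The fitness of Eqs. (11) and (12) can be illustrated with a short Python sketch. Several points are assumptions made for the example: the accuracy is kept as a fraction in [0, 1] (multiplying by 100% only changes how it is displayed), $n$ is taken to be the length of the 0/1 feature mask, and the weights 0.8 and 0.2 are hypothetical values, not the ones used in the experiments.

```python
def accuracy(cc, uc):
    """Eq. (12) as a fraction: correctly classified / all classified."""
    return cc / (cc + uc)

def fitness(acc, feature_mask, w_acc=0.8, w_feat=0.2):
    """Eq. (11): weighted accuracy plus a reward for using fewer features.

    feature_mask holds f_j (1 if feature j is selected, else 0);
    n is assumed to be the total number of candidate features.
    """
    n = len(feature_mask)
    selected_fraction = sum(feature_mask) / n
    return w_acc * acc + w_feat * (1.0 - selected_fraction)

acc = accuracy(cc=90, uc=10)
print(fitness(acc, [1, 0, 1, 0]))
```

With equal accuracy, a bat that selects fewer features receives a higher fitness, which is how the second term steers the search toward compact feature subsets.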

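4. BA + SVM Algorithm for Parameter Optimization and Feature Selection

The flow chart of BA + SVM parameter optimization and feature selection is shown in Figure 1. The detailed experimental steps are as follows: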

Table 1. Composition of the position of bat i

Figure 1. Flow chart: BA + SVM parameter optimization and feature selection

5. Experimental Results

5.1. Platform and Data Sets

5.2. Evaluation Method

Table 2. Data sets from the UCI machine learning repository

Table 3. Classification outcomes for two-class problems

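TP and TN denote the rate of correctly classified positive samples and the rate of correctly classified negative samples, respectively. They are two important performance indicators and are computed as follows: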

$\text{TP}=\frac{\#\text{True Positive}}{\#\text{False Negative}+\#\text{True Positive}}$ (13)

$\text{TN}=\frac{\#\text{True Negative}}{\#\text{True Negative}+\#\text{False Positive}}$ (14)

$\text{Average accuracy}=\frac{\#\text{True Positive}+\#\text{True Negative}}{\#\text{Testing Sample}}$ (15)
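As a worked illustration of these indicators, the sketch below computes the positive-class rate, the negative-class rate, and the average accuracy (the share of correctly classified test samples, that is, true positives plus true negatives) from the four confusion-matrix counts. The function name `rates` and the example counts are hypothetical, and the number of testing samples is assumed to equal the sum of the four counts.

```python
def rates(tp, fn, tn, fp):
    """Return (TP rate, TN rate, average accuracy) from raw counts.

    tp, fn: positive samples classified correctly / incorrectly
    tn, fp: negative samples classified correctly / incorrectly
    """
    tp_rate = tp / (tp + fn)              # Eq. (13)
    tn_rate = tn / (tn + fp)              # Eq. (14)
    total = tp + fn + tn + fp             # assumed number of testing samples
    avg_acc = (tp + tn) / total           # Eq. (15)
    return tp_rate, tn_rate, avg_acc

# Hypothetical test set: 50 positive and 50 negative samples.
print(rates(tp=40, fn=10, tn=30, fp=20))
```

Reporting the two class-wise rates alongside the average accuracy shows whether a high overall score hides poor performance on one of the classes.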

5.3. Experimental Results

Table 4. Experimental design

Table 5. Comparison of classification results among BA + SVM, SVM, and PSO + SVM

Table 6. Comparison of BA optimization and grid optimization without feature selection

6. Conclusion

Figure 2. Line chart: Comparison of three experimental results on 4 data sets

