# Linear Discriminant Analysis for Functional Data

Abstract: This paper proposes a functional linear discriminant analysis method for classification problems whose inputs are functional data. A functional norm is introduced to measure the within-class and between-class distances, and an optimization model for functional linear discriminant analysis is built on it. A basis function expansion then reduces the infinite-dimensional function-space problem to a finite-dimensional optimization model that is easy to solve. Because the data are functions, their first and second derivatives can be computed, and classifying the differentiated data can further improve the results. Numerical experiments demonstrate the feasibility and effectiveness of the method.

1. Introduction

2. Related Work

${S}_{w}=\underset{x\in {X}_{1}}{\sum }\left(x-{\mu }_{1}\right){\left(x-{\mu }_{1}\right)}^{T}+\underset{x\in {X}_{2}}{\sum }\left(x-{\mu }_{2}\right){\left(x-{\mu }_{2}\right)}^{T}$ (1)

${S}_{b}=\left({\mu }_{1}-{\mu }_{2}\right){\left({\mu }_{1}-{\mu }_{2}\right)}^{T}$ (2)

$\mathrm{max}J=\frac{{w}^{T}{S}_{b}w}{{w}^{T}{S}_{w}w}$ (3)
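To illustrate Equations (1) to (3), the following is a minimal NumPy sketch of the classical two-class criterion. It relies on the standard fact that, because $S_b$ has rank one, the maximizer of (3) is proportional to $S_w^{-1}(\mu_1-\mu_2)$. The function names and the synthetic Gaussian samples are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def fisher_lda_direction(X1, X2):
    """Unit vector w that maximizes (w^T S_b w) / (w^T S_w w) for two classes."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    D1, D2 = X1 - mu1, X2 - mu2
    S_w = D1.T @ D1 + D2.T @ D2                  # within-class scatter, Eq. (1)
    # S_b = (mu1 - mu2)(mu1 - mu2)^T has rank one, so the maximizer of Eq. (3)
    # is proportional to S_w^{-1} (mu1 - mu2); S_w is assumed invertible here.
    w = np.linalg.solve(S_w, mu1 - mu2)
    return w / np.linalg.norm(w)

def fisher_criterion(w, X1, X2):
    """Value of the objective J in Eq. (3) for a given direction w."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    D1, D2 = X1 - mu1, X2 - mu2
    S_w = D1.T @ D1 + D2.T @ D2
    S_b = np.outer(mu1 - mu2, mu1 - mu2)         # between-class scatter, Eq. (2)
    return (w @ S_b @ w) / (w @ S_w @ w)

# Hypothetical toy data: two Gaussian clouds in the plane.
rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(40, 2))
X2 = rng.normal(loc=[3.0, 1.0], scale=1.0, size=(40, 2))
w = fisher_lda_direction(X1, X2)
print("direction:", w, "criterion:", fisher_criterion(w, X1, X2))
```

Any other direction should give a criterion value no larger than the one obtained for `w`, which is a quick way to sanity-check the sketch.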

3. Functional Linear Discriminant Analysis

$\max J=\frac{\left\|\int w(t)\mu_{1}(t)\,\mathrm{d}t-\int w(t)\mu_{2}(t)\,\mathrm{d}t\right\|_{2}^{2}}{\mathrm{var}\int w(t)x_{1}(t)\,\mathrm{d}t+\mathrm{var}\int w(t)x_{2}(t)\,\mathrm{d}t}$ (4)

$\hat{x}_{j}(t)=\sum_{k=1}^{K}c_{jk}\varphi_{k}(t)$ (5)

$\hat{x}_{j}(t)=C_{j}^{T}\Phi=\Phi^{T}C_{j}$ (6)
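The expansion in Equations (5) and (6) can be sketched numerically as a least-squares fit of the coefficients $c_{jk}$ to sampled curve values. The choice of a Fourier basis on $[0,1]$, the helper names `fourier_basis` and `fit_coefficients`, and the test curve are assumptions made for this illustration; the expansion itself does not depend on a particular basis.

```python
import numpy as np

def fourier_basis(t, K):
    """Evaluate K Fourier basis functions on [0, 1] at the points t.

    Returns an array of shape (len(t), K) whose k-th column is phi_k(t).
    """
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < K:
        cols.append(np.sqrt(2.0) * np.sin(2 * np.pi * k * t))
        if len(cols) < K:
            cols.append(np.sqrt(2.0) * np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)

def fit_coefficients(t, Y, K):
    """Least-squares coefficients, one row C_j per curve, so that Y[j] ~ Phi(t) @ C_j."""
    Phi = fourier_basis(t, K)
    C, *_ = np.linalg.lstsq(Phi, Y.T, rcond=None)   # shape (K, n_curves)
    return C.T

# Hypothetical curve that lies exactly in the span of the first three basis functions.
t = np.linspace(0.0, 1.0, 101)
y = 1.5 + 2.0 * np.sin(2 * np.pi * t) - 0.5 * np.cos(2 * np.pi * t)
C = fit_coefficients(t, y[None, :], K=5)
x_hat = fourier_basis(t, 5) @ C[0]                  # Eq. (6): x_hat(t) = Phi(t)^T C_j
print("max reconstruction error:", np.max(np.abs(x_hat - y)))
```

For noisy observations the same call returns the coefficients of the smoothed curve $\hat{x}_j$; a larger $K$ follows the data more closely at the cost of fitting noise.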

$\begin{array}{rl}\mathrm{var}\int w(t)x_{j1}(t)\,\mathrm{d}t & =N^{-1}\int w(t)x_{j1}(t)\,\mathrm{d}t\int w(t)x_{j1}(t)\,\mathrm{d}t\\ & =N^{-1}\int d^{T}\Phi\Phi^{T}\bar{C}\,\mathrm{d}t\int\left(d^{T}\Phi\Phi^{T}\bar{C}\right)^{T}\mathrm{d}t\\ & =N^{-1}d^{T}\left(\int\Phi\Phi^{T}\,\mathrm{d}t\right)\bar{C}\bar{C}^{T}\left(\int\Phi\Phi^{T}\,\mathrm{d}t\right)d\\ & =d^{T}JV_{0}Jd\end{array}$ (7)

$\begin{array}{rl}\mathrm{var}\int w(t)x_{j2}(t)\,\mathrm{d}t & =N^{-1}\int w(t)x_{j2}(t)\,\mathrm{d}t\int w(t)x_{j2}(t)\,\mathrm{d}t\\ & =N^{-1}\int d^{T}\Phi\Phi^{T}\hat{C}\,\mathrm{d}t\int\left(d^{T}\Phi\Phi^{T}\hat{C}\right)^{T}\mathrm{d}t\\ & =N^{-1}d^{T}\left(\int\Phi\Phi^{T}\,\mathrm{d}t\right)\hat{C}\hat{C}^{T}\left(\int\Phi\Phi^{T}\,\mathrm{d}t\right)d\\ & =d^{T}JV_{1}Jd\end{array}$ (8)

$\begin{array}{rl}\left\|\int w(t)\mu_{1}(t)\,\mathrm{d}t-\int w(t)\mu_{2}(t)\,\mathrm{d}t\right\|_{2}^{2} & =\int w(t)\left(\mu_{1}(t)-\mu_{2}(t)\right)\mathrm{d}t\int w(t)\left(\mu_{1}(t)-\mu_{2}(t)\right)\mathrm{d}t\\ & =\int d^{T}\Phi\Phi^{T}m\,\mathrm{d}t\int\left(d^{T}\Phi\Phi^{T}m\right)^{T}\mathrm{d}t\\ & =d^{T}\left(\int\Phi\Phi^{T}\,\mathrm{d}t\right)mm^{T}\left(\int\Phi\Phi^{T}\,\mathrm{d}t\right)d\\ & =d^{T}JVJd\end{array}$ (9)

$\mathrm{max}J\left(d\right)=\frac{{d}^{T}JVJd}{{d}^{T}J\left({V}_{0}+{V}_{1}\right)Jd}$ (10)

$\begin{array}{cl}\underset{d}{\min} & -d^{T}JVJd\\ \text{s.t.} & d^{T}J\left(V_{0}+V_{1}\right)Jd=1\end{array}$ (11)

$L\left(d,\lambda \right)=-{d}^{T}JVJd+\lambda \left[{d}^{T}J\left({V}_{0}+{V}_{1}\right)Jd-1\right]$ (12)

$\frac{\partial L}{\partial d}=-2JVJd+2\lambda J\left({V}_{0}+{V}_{1}\right)Jd=0$ (13)

$JVJd=\lambda J\left({V}_{0}+{V}_{1}\right)Jd$ (14)

Let $L=J\left(V_{0}+V_{1}\right)J$; the problem then reduces to the ordinary eigenvalue problem $L^{-1}JVJd=\lambda d$.
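The eigenvalue problem above can be sketched in NumPy as follows. The sketch assumes that $L=J(V_0+V_1)J$ is invertible (with fewer curves than basis functions some regularization would be needed, which is not covered here) and takes $V=mm^{T}$ as in Equation (9). The function name `discriminant_coefficients` and the random test matrices are hypothetical and serve only to exercise the code.

```python
import numpy as np

def discriminant_coefficients(J, V0, V1, m):
    """Leading eigenpair of L^{-1} J V J d = lambda d, with V = m m^T.

    The eigenvector is rescaled so that d^T L d = 1, the constraint in Eq. (11).
    """
    L = J @ (V0 + V1) @ J
    M = J @ np.outer(m, m) @ J                    # J V J
    evals, evecs = np.linalg.eig(np.linalg.solve(L, M))
    k = np.argmax(evals.real)                     # largest eigenvalue maximizes Eq. (10)
    d = evecs[:, k].real
    d = d / np.sqrt(d @ L @ d)
    return evals[k].real, d

# Hypothetical inputs: a symmetric positive definite J and two sample covariances.
rng = np.random.default_rng(1)
K = 4
A = rng.normal(size=(K, K))
J = A @ A.T + np.eye(K)
C0 = rng.normal(size=(K, 30))
C1 = rng.normal(size=(K, 30))
V0 = C0 @ C0.T / 30
V1 = C1 @ C1.T / 30
m = rng.normal(size=K)

lam, d = discriminant_coefficients(J, V0, V1, m)

# Because V has rank one, the leading eigenvalue also has the closed form
# lambda = (J m)^T L^{-1} (J m), with eigenvector proportional to L^{-1} J m.
L = J @ (V0 + V1) @ J
lam_closed = (J @ m) @ np.linalg.solve(L, J @ m)
print(lam, lam_closed)
```

In practice the closed form avoids a full eigendecomposition; the general solver is shown because it mirrors Equation (14) directly.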

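1) Choose the basis functions and solve for the coefficients $\bar{C},\hat{C}$ of the sample functions.

2) Compute the covariance matrices $V_{0},V_{1},V$ and the basis matrix $J$.

3) Solve Equation (14) to obtain $\lambda$ and the corresponding eigenvector $d$, which gives $w(t)=d^{T}\Phi(t)$.

4) Substitute $w$ into $f_{j}=\int wx=\int w(t)x_{j}(t)\,\mathrm{d}t$ and compute $\min\left\|f_{j}-\mu_{i}(t)\right\|^{2}$.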
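The four steps can be put together in a short end-to-end sketch. Several choices here are assumptions rather than the paper's setup: the Fourier basis, the toy sine curves, the trapezoid rule used for $J$, the centering of the coefficients before forming $V_0$ and $V_1$, and the reading of step 4 as assigning each curve to the class whose projected mean $\int w(t)\mu_i(t)\,\mathrm{d}t$ is closest to $f_j$.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
K = 5

def fourier_basis(t, K):
    """Columns are K Fourier basis functions on [0, 1] evaluated at t."""
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < K:
        cols.append(np.sqrt(2.0) * np.sin(2 * np.pi * k * t))
        if len(cols) < K:
            cols.append(np.sqrt(2.0) * np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)

def make_curves(n, amp):
    """Hypothetical toy data: noisy sine curves whose amplitude depends on the class."""
    return amp * np.sin(2 * np.pi * t) + 0.3 * rng.normal(size=(n, t.size))

Phi = fourier_basis(t, K)

def coefficients(Y):
    """Step 1: least-squares basis coefficients, one row per curve."""
    return np.linalg.lstsq(Phi, Y.T, rcond=None)[0].T

# Step 2 (basis matrix): J = integral of Phi Phi^T dt, here by the trapezoid rule.
h = t[1] - t[0]
weights = np.full(t.size, h)
weights[[0, -1]] = h / 2
J = Phi.T @ (weights[:, None] * Phi)

def fit(Y0, Y1):
    C0, C1 = coefficients(Y0), coefficients(Y1)
    m0, m1 = C0.mean(axis=0), C1.mean(axis=0)
    # Step 2 (covariances): coefficients are centered per class (an assumption).
    V0 = (C0 - m0).T @ (C0 - m0) / len(C0)
    V1 = (C1 - m1).T @ (C1 - m1) / len(C1)
    m = m0 - m1
    # Step 3: V = m m^T has rank one, so the leading eigenvector of Eq. (14)
    # is proportional to L^{-1} J m.
    L = J @ (V0 + V1) @ J
    d = np.linalg.solve(L, J @ m)
    d /= np.sqrt(d @ L @ d)
    return d, d @ J @ m0, d @ J @ m1          # d and the two projected class means

def predict(Y, d, f0, f1):
    # Step 4: f_j = integral of w(t) x_j(t) dt = d^T J c_j, then nearest projected mean.
    f = coefficients(Y) @ J @ d
    return (np.abs(f - f1) < np.abs(f - f0)).astype(int)

d, f0, f1 = fit(make_curves(25, 1.0), make_curves(25, 1.5))
Y_test = np.vstack([make_curves(25, 1.0), make_curves(25, 1.5)])
labels = np.repeat([0, 1], 25)
accuracy = np.mean(predict(Y_test, d, f0, f1) == labels)
print("test accuracy:", accuracy)
```

To classify derivatives instead of the raw curves, the same pipeline would be run on the differentiated expansions; that variant is not shown here.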

4. Numerical Experiments

4.1. Numerical Experiments on Artificial Data

Figure 1. (a) The 50 curves of different classes in the artificial data set; (b) the curves fitted by the basis function method

Figure 2. Classification of the 50 curves by functional linear discriminant analysis

Figure 3. (a) The 50 curves of different classes in the artificial data set; (b) the curves fitted by the basis function method

Table 1. Data set details

4.2. Spectrometric Data Set

Figure 4. First-order derivatives of the spectrometric data

Figure 5. Second-order derivatives of the spectrometric data

Table 2. Data set details

Figure 6. Classification of 20% of the spectrometric data after first-order differentiation

Figure 7. Classification of 20% of the spectrometric data after second-order differentiation

5. Conclusion

