
# A First-Order Approximated Jackknifed Liu Estimator in Binary Logistic Regression Model

Abstract: To address multicollinearity in the binary logistic regression model, we combine the first-order approximated Liu estimator with the jackknife procedure and propose a new estimator, the first-order approximated jackknifed Liu estimator. We derive necessary and sufficient conditions, or sufficient conditions, under which the new estimator is superior to the first-order approximated maximum likelihood estimator, the first-order approximated Liu estimator and the first-order approximated jackknifed ridge estimator under the bias, mean square error matrix or mean square error criterion. Furthermore, Monte Carlo simulation and an empirical analysis are used to examine the performance of the first-order approximated jackknifed Liu estimator in terms of bias and mean square error.

1. Introduction

${\pi }_{i}=\frac{\mathrm{exp}\left({{x}^{\prime }}_{i}\beta \right)}{1+\mathrm{exp}\left({{x}^{\prime }}_{i}\beta \right)}$, $i=1,\cdots ,n$ (1)

Here ${{x}^{\prime }}_{i}$ is the vector formed by the elements of the i-th row of the n × p data matrix X, $\beta ={\left({\beta }_{1},\cdots ,{\beta }_{p}\right)}^{\prime }$ is the p × 1 coefficient vector, and ${\pi }_{i}=P\left({y}_{i}=1|{x}_{i}\right)$ is the probability that ${y}_{i}=1$ given ${x}_{i}$.

$L\left(\beta \right)=\underset{i=1}{\overset{n}{\sum }}\left[{y}_{i}{{x}^{\prime }}_{i}\beta -\mathrm{ln}\left(1+\mathrm{exp}\left({{x}^{\prime }}_{i}\beta \right)\right)\right]$ (2)

$\frac{\partial L\left(\beta \right)}{\partial \beta }={X}^{\prime }\left(y-\pi \right)=0$ (3)

${\stackrel{^}{\beta }}^{\left(m\right)}={\stackrel{^}{\beta }}^{\left(m-1\right)}+{\left({X}^{\prime }{\stackrel{^}{V}}^{\left(m-1\right)}X\right)}^{-1}{X}^{\prime }\left(y-{\stackrel{^}{\pi }}^{\left(m-1\right)}\right)$ (4)

${\stackrel{^}{\beta }}_{MLE}={\left({X}^{\prime }\stackrel{^}{V}X\right)}^{-1}{X}^{\prime }\stackrel{^}{V}\stackrel{^}{z}$ (5)
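The iteration in (4) and the weighted least-squares form in (5) can be illustrated with a minimal Python sketch. It assumes the standard logistic definitions $\stackrel{^}{V}=\mathrm{diag}\left({\stackrel{^}{\pi }}_{i}\left(1-{\stackrel{^}{\pi }}_{i}\right)\right)$ and $\stackrel{^}{z}=X\stackrel{^}{\beta }+{\stackrel{^}{V}}^{-1}\left(y-\stackrel{^}{\pi }\right)$, which are not restated in this section; the starting value of zero, the function name `logistic_mle` and the toy data are hypothetical choices made only for illustration.

```python
import numpy as np

def logistic_mle(X, y, max_iter=50, tol=1e-10):
    """Newton-Raphson updates of equation (4), started at beta = 0 (an assumption)."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))   # equation (1)
        v = pi * (1.0 - pi)                    # diagonal of V-hat (assumed standard weights)
        step = np.linalg.solve(X.T @ (v[:, None] * X), X.T @ (y - pi))
        beta = beta + step                     # equation (4)
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical toy data, used only to exercise the function.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
beta_true = np.array([0.5, -0.5, 0.25])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

beta_hat = logistic_mle(X, y)
pi_hat = 1.0 / (1.0 + np.exp(-X @ beta_hat))
score = X.T @ (y - pi_hat)                     # equation (3): close to zero at convergence

# Weighted least-squares form of equation (5) with the working response z-hat.
v_hat = pi_hat * (1.0 - pi_hat)
z_hat = X @ beta_hat + (y - pi_hat) / v_hat
beta_wls = np.linalg.solve(X.T @ (v_hat[:, None] * X), X.T @ (v_hat * z_hat))
```

At convergence the score in (3) is numerically zero, and the weighted least-squares expression reproduces the same coefficient vector.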

$\stackrel{^}{\beta }\left(k\right)={\left({X}^{\prime }\stackrel{^}{V}X+kI\right)}^{-1}{X}^{\prime }\stackrel{^}{V}X{\stackrel{^}{\beta }}_{MLE}$, $k>0$ (6)

where k is the ridge parameter.

The Liu estimator (LE) proposed by Månsson et al. [2] is given by:

$\stackrel{^}{\beta }\left(d\right)={\left({X}^{\prime }\stackrel{^}{V}X+I\right)}^{-1}\left({X}^{\prime }\stackrel{^}{V}X+dI\right){\stackrel{^}{\beta }}_{MLE}$, $0<d<1$ (7)

${\stackrel{^}{\beta }}^{\left(1\right)}\left(k\right)={\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+kI\right)}^{-1}{X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X{\stackrel{^}{\beta }}^{\left(1\right)}\left(ML\right)$, $k>0$ (8)

${\stackrel{^}{\beta }}^{\left(1\right)}\left(ML\right)={\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X\right)}^{-1}{X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}{\stackrel{^}{z}}^{\left(0\right)}$

Özkale [4] proposed the first-order approximated Liu estimator (FAL), given by:

${\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)={\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-1}\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+dI\right){\stackrel{^}{\beta }}^{\left(1\right)}\left(ML\right)$, $0<d<1$ (9)

${\stackrel{˜}{\beta }}^{\left(1\right)}\left(k\right)=\left(I-{k}^{2}{\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+kI\right)}^{-2}\right){\stackrel{^}{\beta }}^{\left(1\right)}\left(ML\right)$, $k>0$ (10)
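The first-order approximated estimators in (8), (9) and (10) differ only in how they shrink ${\stackrel{^}{\beta }}^{\left(1\right)}\left(ML\right)$. The following Python sketch computes them side by side. It assumes that ${\stackrel{^}{V}}^{\left(0\right)}$ and ${\stackrel{^}{z}}^{\left(0\right)}$ are evaluated at an initial value `beta0` (zero by default, which is an illustrative choice rather than the source's), and the function name and toy data are hypothetical.

```python
import numpy as np

def first_order_estimators(X, y, k, d, beta0=None):
    """First-order approximated ML, ridge (8), Liu (9) and jackknifed ridge (10)."""
    n, p = X.shape
    # Assumed starting value for the first-order approximation.
    beta0 = np.zeros(p) if beta0 is None else np.asarray(beta0, dtype=float)
    pi0 = 1.0 / (1.0 + np.exp(-X @ beta0))
    v0 = pi0 * (1.0 - pi0)                  # diagonal of V-hat^(0)
    z0 = X @ beta0 + (y - pi0) / v0         # working response z-hat^(0)
    A = X.T @ (v0[:, None] * X)             # X' V^(0) X
    I = np.eye(p)
    ml = np.linalg.solve(A, X.T @ (v0 * z0))
    ridge = np.linalg.solve(A + k * I, A @ ml)
    liu = np.linalg.solve(A + I, (A + d * I) @ ml)
    R = np.linalg.inv(A + k * I)
    jack_ridge = (I - k**2 * (R @ R)) @ ml
    return {"ml": ml, "ridge": ridge, "liu": liu, "jack_ridge": jack_ridge}

# Hypothetical toy data.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 4))
y = (rng.random(100) < 0.5).astype(float)
est = first_order_estimators(X, y, k=0.5, d=0.5)
est_d1 = first_order_estimators(X, y, k=0.5, d=1.0)   # d = 1 gives back the ML form
```

Setting d = 1 in (9) returns ${\stackrel{^}{\beta }}^{\left(1\right)}\left(ML\right)$, while values of d below 1 and any k > 0 shrink the coefficient vector.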

2. The Proposed Estimator

${\stackrel{^}{\beta }}_{-i}^{\left(1\right)}\left(d\right)={\left({{X}^{\prime }}_{-i}{\stackrel{^}{V}}_{-i}^{\left(0\right)}{X}_{-i}+I\right)}^{-1}\left({{X}^{\prime }}_{-i}{\stackrel{^}{V}}_{-i}^{\left(0\right)}{\stackrel{^}{z}}_{-i}^{\left(0\right)}+d{\stackrel{^}{\beta }}_{-i}^{\left(1\right)}\left(ML\right)\right)$ (11)

$\begin{array}{c}{\stackrel{^}{\beta }}_{-i}^{\left(1\right)}\left(d\right)={\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)-\frac{1}{1-{h}_{ii}}{\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-1}{x}_{i}{\stackrel{^}{v}}_{i}^{\left(0\right)}\left({\stackrel{^}{z}}_{i}^{\left(0\right)}-{{x}^{\prime }}_{i}{\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}-d\frac{1}{\left(1-{h}_{ii}\right)\left(1-{h}_{i}\right)}{\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-1}{x}_{i}{\stackrel{^}{v}}_{i}^{\left(0\right)}{{x}^{\prime }}_{i}{\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-1}\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\cdot {\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X\right)}^{-1}{x}_{i}{\stackrel{^}{v}}_{i}^{\left(0\right)}\left({\stackrel{^}{z}}_{i}^{\left(0\right)}-{{x}^{\prime }}_{i}{\stackrel{^}{\beta }}^{\left(1\right)}\left(ML\right)\right)\end{array}$ (12)

${Q}_{i}={\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)+n\left(1-{h}_{ii}\right)\left({\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)-{\stackrel{^}{\beta }}_{-i}^{\left(1\right)}\left(d\right)\right)$ (13)

${\stackrel{˜}{\beta }}^{\left(1\right)}\left(d\right)={n}^{-1}\sum {Q}_{i}$ (14)

$\begin{array}{c}{\stackrel{˜}{\beta }}^{\left(1\right)}\left(d\right)=\left\{\left(I-{\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-1}{X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X\right){\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-1}\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+dI\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-1}{X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X\right\}{\stackrel{^}{\beta }}^{\left(1\right)}\left(ML\right)\end{array}$ (15)
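The closed form (15) can be evaluated directly. The Python sketch below does so and compares it with the algebraically equivalent expression $\left(I-\left(1-d\right){\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-2}\right){\stackrel{^}{\beta }}^{\left(1\right)}\left(ML\right)$, which follows from expanding the braces in (15) and mirrors the structure of (10). The matrix `A` stands in for ${X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X$; it and `b_ml` are filled with arbitrary hypothetical numbers.

```python
import numpy as np

def fajl_from_ml(A, b_ml, d):
    """Equation (15): A plays the role of X' V^(0) X, b_ml of beta^(1)(ML)."""
    I = np.eye(A.shape[0])
    S = np.linalg.solve(A + I, A)           # (A + I)^{-1} A
    F = np.linalg.solve(A + I, A + d * I)   # (A + I)^{-1} (A + dI)
    return ((I - S) @ F + S) @ b_ml

# Hypothetical inputs: any symmetric positive definite A will do.
rng = np.random.default_rng(2)
B = rng.standard_normal((6, 4))
A = B.T @ B + 0.1 * np.eye(4)
b_ml = rng.standard_normal(4)
d = 0.3

b_fajl = fajl_from_ml(A, b_ml, d)

# Equivalent compact form: (I - (1 - d)(A + I)^{-2}) b_ml.
R = np.linalg.inv(A + np.eye(4))
b_alt = b_ml - (1.0 - d) * (R @ R @ b_ml)
```

The compact form makes the shrinkage explicit: each eigen-direction of the ML estimate is multiplied by $1-\left(1-d\right)/{\left(\lambda +1\right)}^{2}$.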

3. Properties of the First-Order Approximated Jackknifed Liu Estimator

$bias\left(\stackrel{^}{\beta }\right)=E\left(\stackrel{^}{\beta }\right)-\beta$ (16)

${‖Bias\left(\stackrel{^}{\beta }\right)‖}^{2}=bias{\left(\stackrel{^}{\beta }\right)}^{\prime }bias\left(\stackrel{^}{\beta }\right)$ (17)

$MSEM\left(\stackrel{^}{\beta }\right)=E\left(\stackrel{^}{\beta }-\beta \right){\left(\stackrel{^}{\beta }-\beta \right)}^{\prime }$ (18)

$MSE\left(\stackrel{^}{\beta }\right)=E{\left(\stackrel{^}{\beta }-\beta \right)}^{\prime }\left(\stackrel{^}{\beta }-\beta \right)$ (19)

$bias\left({\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)\right)=\left(d-1\right){\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-1}{a}^{0}$ (20)

$bias\left({\stackrel{˜}{\beta }}^{\left(1\right)}\left(d\right)\right)=\left(d-1\right){\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-2}{a}^{0}$ (21)

${‖Bias\left({\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)\right)‖}^{2}-{‖Bias\left({\stackrel{˜}{\beta }}^{\left(1\right)}\left(d\right)\right)‖}^{2}={\left({a}^{0}\right)}^{\prime }{G}_{1}{a}^{0}$
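As a numerical illustration of the comparison between (20) and (21), the Python sketch below evaluates both squared bias norms for an arbitrary positive definite matrix standing in for ${X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X$. The vector `a0` is filled with arbitrary hypothetical numbers in place of ${a}^{0}$, so this is a check of the algebra under those assumptions, not a reproduction of the paper's results.

```python
import numpy as np

def squared_bias_norms(A, a0, d):
    """Squared norms of the bias vectors in (20) and (21)."""
    R = np.linalg.inv(A + np.eye(A.shape[0]))   # (X' V^(0) X + I)^{-1}
    bias_fal = (d - 1.0) * (R @ a0)             # equation (20)
    bias_fajl = (d - 1.0) * (R @ R @ a0)        # equation (21)
    return float(bias_fal @ bias_fal), float(bias_fajl @ bias_fajl)

# Hypothetical inputs.
rng = np.random.default_rng(3)
B = rng.standard_normal((30, 4))
A = B.T @ B
a0 = rng.standard_normal(4)

norm_fal, norm_fajl = squared_bias_norms(A, a0, d=0.4)
gap = norm_fal - norm_fajl
```

Because every eigenvalue of ${\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+I\right)}^{-1}$ lies strictly between 0 and 1, the extra factor in (21) can only reduce the norm, so `gap` is non-negative.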

$bias\left({\stackrel{˜}{\beta }}^{\left(1\right)}\left(k\right)\right)=-{k}^{2}{\left({X}^{\prime }{\stackrel{^}{V}}^{\left(0\right)}X+kI\right)}^{-2}{a}^{0}$ (22)

${‖Bias\left({\stackrel{˜}{\beta }}^{\left(1\right)}\left(k\right)\right)‖}^{2}-{‖Bias\left({\stackrel{˜}{\beta }}^{\left(1\right)}\left(d\right)\right)‖}^{2}={\left({a}^{0}\right)}^{\prime }{G}_{2}{a}^{0}$

$\left(1-d\right){\left({a}^{0}\right)}^{\prime }{\left\{2\Lambda +4I+\left(d+1\right){\Lambda }^{-1}\right\}}^{-1}{a}^{0}<1$

${M}_{1}\left(d\right)={M}_{1}-{\left(d-1\right)}^{2}{\left(\Lambda +I\right)}^{-2}{a}^{0}{\left({a}^{0}\right)}^{\prime }{\left(\Lambda +I\right)}^{-2}$

$\begin{array}{c}{M}_{1}={\Lambda }^{-1}-\left\{\left(I-{\left(\Lambda +I\right)}^{-1}\Lambda \right){\left(\Lambda +I\right)}^{-1}\left(\Lambda +dI\right)+{\left(\Lambda +I\right)}^{-1}\Lambda \right\}{\Lambda }^{-1}\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\cdot \left\{\left(\Lambda +dI\right){\left(\Lambda +I\right)}^{-1}\left(I-{\left(\Lambda +I\right)}^{-1}\Lambda \right)+\Lambda {\left(\Lambda +I\right)}^{-1}\right\}\\ =\left(1-d\right){\left(\Lambda +I\right)}^{-2}\left\{2\Lambda +4I+\left(d+1\right){\Lambda }^{-1}\right\}{\left(\Lambda +I\right)}^{-2}\end{array}$
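As a worked check of the last equality in ${M}_{1}$, assume $\Lambda =\mathrm{diag}\left({\lambda }_{1},\cdots ,{\lambda }_{p}\right)$ with ${\lambda }_{i}>0$ (the usual canonical-form convention; it is an assumption here, since the definition of $\Lambda$ is not restated nearby). All matrices involved are then diagonal, and the i-th diagonal entry of the bracketed factor is $\left({\lambda }_{i}^{2}+2{\lambda }_{i}+d\right)/{\left({\lambda }_{i}+1\right)}^{2}$, so that

$\frac{1}{{\lambda }_{i}}-\frac{{\left({\lambda }_{i}^{2}+2{\lambda }_{i}+d\right)}^{2}}{{\lambda }_{i}{\left({\lambda }_{i}+1\right)}^{4}}=\frac{\left[{\left({\lambda }_{i}+1\right)}^{2}-\left({\lambda }_{i}^{2}+2{\lambda }_{i}+d\right)\right]\left[{\left({\lambda }_{i}+1\right)}^{2}+\left({\lambda }_{i}^{2}+2{\lambda }_{i}+d\right)\right]}{{\lambda }_{i}{\left({\lambda }_{i}+1\right)}^{4}}=\frac{\left(1-d\right)\left(2{\lambda }_{i}^{2}+4{\lambda }_{i}+d+1\right)}{{\lambda }_{i}{\left({\lambda }_{i}+1\right)}^{4}}$

Dividing the numerator by ${\lambda }_{i}$ gives $\left(1-d\right)\left\{2{\lambda }_{i}+4+\left(d+1\right)/{\lambda }_{i}\right\}/{\left({\lambda }_{i}+1\right)}^{4}$, which is the i-th diagonal entry of the final expression for ${M}_{1}$.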

$\left(d-1\right){\left({a}^{0}\right)}^{\prime }{\left\{2{\Lambda }^{2}+\left(d+3\right)\Lambda +2dI\right\}}^{-1}{a}^{0}<1$

${M}_{2}\left(d\right)={M}_{2}+{\left(d-1\right)}^{2}{\left(\Lambda +I\right)}^{-1}{a}^{0}{\left({a}^{0}\right)}^{\prime }{\left(\Lambda +I\right)}^{-1}-{\left(d-1\right)}^{2}{\left(\Lambda +I\right)}^{-2}{a}^{0}{\left({a}^{0}\right)}^{\prime }{\left(\Lambda +I\right)}^{-2}$

$\begin{array}{c}{M}_{2}=\left\{{\left(\Lambda +I\right)}^{-1}\left(\Lambda +dI\right)\right\}{\Lambda }^{-1}\left\{\left(\Lambda +dI\right){\left(\Lambda +I\right)}^{-1}\right\}-\left\{\left(I-{\left(\Lambda +I\right)}^{-1}\Lambda \right){\left(\Lambda +I\right)}^{-1}\left(\Lambda +dI\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{\left(\Lambda +I\right)}^{-1}\Lambda \right\}{\Lambda }^{-1}\left\{\left(\Lambda +dI\right){\left(\Lambda +I\right)}^{-1}\left(I-{\left(\Lambda +I\right)}^{-1}\Lambda \right)+\Lambda {\left(\Lambda +I\right)}^{-1}\right\}\\ =\left(d-1\right){\left(\Lambda +I\right)}^{-2}\left\{2{\Lambda }^{2}+\left(d+3\right)\Lambda +2dI\right\}{\left(\Lambda +I\right)}^{-2}\end{array}$
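The closed forms of ${M}_{1}$ and ${M}_{2}$ can also be checked numerically, one eigenvalue at a time. The Python sketch below assumes $\Lambda$ is diagonal with positive entries (so that each matrix identity reduces to a scalar identity in ${\lambda }_{i}$); the function names and the grid of test values are hypothetical.

```python
import math

def m1_entry(lam, d):
    """Diagonal entry of M_1 from its definition."""
    shrink = (lam**2 + 2 * lam + d) / (lam + 1) ** 2
    return 1.0 / lam - shrink**2 / lam

def m1_closed(lam, d):
    """Diagonal entry of the closed form of M_1."""
    return (1 - d) * (2 * lam + 4 + (d + 1) / lam) / (lam + 1) ** 4

def m2_entry(lam, d):
    """Diagonal entry of M_2 from its definition."""
    fal = (lam + d) ** 2 / (lam * (lam + 1) ** 2)
    fajl = (lam**2 + 2 * lam + d) ** 2 / (lam * (lam + 1) ** 4)
    return fal - fajl

def m2_closed(lam, d):
    """Diagonal entry of the closed form of M_2."""
    return (d - 1) * (2 * lam**2 + (d + 3) * lam + 2 * d) / (lam + 1) ** 4

# Hypothetical grid of eigenvalues and Liu parameters.
grid = [(lam, d) for lam in (0.01, 0.5, 1.0, 7.0) for d in (0.1, 0.5, 0.9)]
checks = [
    math.isclose(m1_entry(lam, d), m1_closed(lam, d), rel_tol=1e-9, abs_tol=1e-9)
    and math.isclose(m2_entry(lam, d), m2_closed(lam, d), rel_tol=1e-9, abs_tol=1e-9)
    for lam, d in grid
]
```

For 0 < d < 1 the entries of ${M}_{1}$ are positive and those of ${M}_{2}$ are negative, which is visible from the signs of the factors (1 - d) and (d - 1).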

$MSE\left({\stackrel{˜}{\beta }}^{\left(1\right)}\left(d\right)\right)=\underset{i=1}{\overset{p}{\sum }}\frac{{\left({\lambda }_{i}{}^{2}+2{\lambda }_{i}+d\right)}^{2}+{\left(d-1\right)}^{2}{\left({a}_{i}^{0}\right)}^{2}{\lambda }_{i}}{{\lambda }_{i}{\left({\lambda }_{i}+1\right)}^{4}}$ (23)

$MSE\left({\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)\right)=\underset{i=1}{\overset{p}{\sum }}\frac{{\left({\lambda }_{i}+d\right)}^{2}+{\left(d-1\right)}^{2}{\left({a}_{i}^{0}\right)}^{2}{\lambda }_{i}}{{\lambda }_{i}{\left({\lambda }_{i}+1\right)}^{2}}$ (24)

$\begin{array}{l}MSE\left({\stackrel{˜}{\beta }}^{\left(1\right)}\left(d\right)\right)-MSE\left({\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)\right)\\ =\underset{i=1}{\overset{p}{\sum }}\frac{{\left({\lambda }_{i}^{2}+2{\lambda }_{i}+d\right)}^{2}+{\left(d-1\right)}^{2}{\left({a}_{i}^{0}\right)}^{2}{\lambda }_{i}}{{\lambda }_{i}{\left({\lambda }_{i}+1\right)}^{4}}-\underset{i=1}{\overset{p}{\sum }}\frac{{\left({\lambda }_{i}+d\right)}^{2}+{\left(d-1\right)}^{2}{\left({a}_{i}^{0}\right)}^{2}{\lambda }_{i}}{{\lambda }_{i}{\left({\lambda }_{i}+1\right)}^{2}}\\ =\underset{i=1}{\overset{p}{\sum }}\left\{\frac{{\left({\lambda }_{i}^{2}+2{\lambda }_{i}+d\right)}^{2}-{\left({\lambda }_{i}+1\right)}^{2}{\left({\lambda }_{i}+d\right)}^{2}}{{\lambda }_{i}{\left({\lambda }_{i}+1\right)}^{4}}+\frac{{\left(d-1\right)}^{2}{\left({a}_{i}^{0}\right)}^{2}}{{\left({\lambda }_{i}+1\right)}^{4}}-\frac{{\left(d-1\right)}^{2}{\left({a}_{i}^{0}\right)}^{2}}{{\left({\lambda }_{i}+1\right)}^{2}}\right\}\\ =\left(1-d\right)\underset{i=1}{\overset{p}{\sum }}\frac{2{\lambda }_{i}{}^{2}+\left(d+3\right){\lambda }_{i}+2d}{{\left({\lambda }_{i}+1\right)}^{4}}+{\left(d-1\right)}^{2}\underset{i=1}{\overset{p}{\sum }}\frac{{\left({a}_{i}^{0}\right)}^{2}}{{\left({\lambda }_{i}+1\right)}^{4}}-{\left(d-1\right)}^{2}\underset{i=1}{\overset{p}{\sum }}\frac{{\left({a}_{i}^{0}\right)}^{2}}{{\left({\lambda }_{i}+1\right)}^{2}}\\ =\left(1-d\right)\underset{i=1}{\overset{p}{\sum }}\frac{1}{{\left({\lambda }_{i}+1\right)}^{4}}{f}_{i}\left(d\right)\end{array}$
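The MSE expressions (23) and (24) and the final line of the derivation can be checked against each other with a short Python sketch. The explicit form of ${f}_{i}\left(d\right)$ is not reproduced in this section; the code assumes ${f}_{i}\left(d\right)=2{\lambda }_{i}^{2}+\left(d+3\right){\lambda }_{i}+2d-\left(1-d\right){\left({a}_{i}^{0}\right)}^{2}{\lambda }_{i}\left({\lambda }_{i}+2\right)$, which is what results from collecting the three sums in the derivation over the common denominator ${\left({\lambda }_{i}+1\right)}^{4}$. The eigenvalues and the entries of `a0` are hypothetical.

```python
def mse_fajl(lams, a0, d):
    """Equation (23): MSE of the first-order approximated jackknifed Liu estimator."""
    return sum(
        ((l**2 + 2 * l + d) ** 2 + (d - 1) ** 2 * a**2 * l) / (l * (l + 1) ** 4)
        for l, a in zip(lams, a0)
    )

def mse_fal(lams, a0, d):
    """Equation (24): MSE of the first-order approximated Liu estimator."""
    return sum(
        ((l + d) ** 2 + (d - 1) ** 2 * a**2 * l) / (l * (l + 1) ** 2)
        for l, a in zip(lams, a0)
    )

def f_i(l, a, d):
    """Assumed form of f_i(d), obtained by collecting terms in the derivation."""
    return 2 * l**2 + (d + 3) * l + 2 * d - (1 - d) * a**2 * l * (l + 2)

# Hypothetical eigenvalues (some small, mimicking multicollinearity) and a0 entries.
lams = [3.2, 0.9, 0.05, 0.004]
a0 = [0.8, -1.1, 2.5, 4.0]
d = 0.3

diff = mse_fajl(lams, a0, d) - mse_fal(lams, a0, d)
closed = (1 - d) * sum(f_i(l, a, d) / (l + 1) ** 4 for l, a in zip(lams, a0))
```

The two quantities `diff` and `closed` agree up to floating-point error, so the sign of the MSE difference is governed by the weighted sum of the ${f}_{i}\left(d\right)$ terms.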

When $1<{\left({a}_{i}^{0}\right)}^{2}\left({\lambda }_{i}+2\right)/\left(2{\lambda }_{i}+3\right)$ and $d\le \frac{{\left({a}_{i}^{0}\right)}^{2}-\left(2{\lambda }_{i}+3\right)/\left({\lambda }_{i}+2\right)}{{\left({a}_{i}^{0}\right)}^{2}+1/{\lambda }_{i}}$, we have ${f}_{i}\left(d\right)\le 0$. Since $0<d<1$, it follows that $MSE\left({\stackrel{˜}{\beta }}^{\left(1\right)}\left(d\right)\right)-MSE\left({\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)\right)\le 0$. This completes the proof.

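$0<d\le \frac{{\left({a}_{i}^{0}\right)}^{2}-\left(2{\lambda }_{i}+3\right)/\left({\lambda }_{i}+2\right)}{{\left({a}_{i}^{0}\right)}^{2}+1/{\lambda }_{i}}$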

$MSE\left(\stackrel{˜}{\beta }\left(d\right)\right)\le MSE\left(\stackrel{^}{\beta }\left(d\right)\right)$

$\mathrm{max}\left\{0,\frac{{\left({a}_{i}^{0}\right)}^{2}-\left(2{\lambda }_{i}+3\right)/\left({\lambda }_{i}+2\right)}{{\left({a}_{i}^{0}\right)}^{2}+1/{\lambda }_{i}}\right\}<d<1$

$MSE\left({\stackrel{˜}{\beta }}^{\left(1\right)}\left(d\right)\right)>MSE\left({\stackrel{^}{\beta }}^{\left(1\right)}\left(d\right)\right)$
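The threshold on d that separates the two MSE orderings can be computed per eigenvalue. The Python sketch below assumes the form ${f}_{i}\left(d\right)=2{\lambda }_{i}^{2}+\left(d+3\right){\lambda }_{i}+2d-\left(1-d\right){\left({a}_{i}^{0}\right)}^{2}{\lambda }_{i}\left({\lambda }_{i}+2\right)$ (an assumption, since the explicit definition is not reproduced here) and shows that the bound in the conditions above is the root of ${f}_{i}$. The function names and the values of `lam` and `a` are hypothetical.

```python
def f_i(lam, a, d):
    """Assumed form of f_i(d); it is linear and increasing in d."""
    return 2 * lam**2 + (d + 3) * lam + 2 * d - (1 - d) * a**2 * lam * (lam + 2)

def d_upper(lam, a):
    """Bound on d from the conditions above; positive only if a^2 > (2*lam+3)/(lam+2)."""
    return (a**2 - (2 * lam + 3) / (lam + 2)) / (a**2 + 1 / lam)

# Hypothetical single component with a^2 large enough for a positive bound.
lam, a = 0.5, 3.0
d_star = d_upper(lam, a)

below = f_i(lam, a, d_star - 0.1)   # d below the bound
at = f_i(lam, a, d_star)            # d at the bound
above = f_i(lam, a, d_star + 0.1)   # d above the bound
```

Here `below` is negative, `at` is zero up to rounding, and `above` is positive, matching the two cases: the jackknifed estimator has the smaller MSE contribution for d up to the bound and the larger one beyond it.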

4. Monte Carlo Simulation

${x}_{ij}={\left(1-{\rho }^{2}\right)}^{1/2}{z}_{ij}+\rho {z}_{i,p+1}$, $i=1,\cdots ,n;j=1,\cdots ,p$ (25)
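Equation (25) can be sketched in Python as follows. The code assumes that the ${z}_{ij}$ are independent standard normal pseudo-random numbers, the usual convention for this design but not stated in this section; under that assumption any two columns of X have theoretical correlation ${\rho }^{2}$. The function name `generate_design` and the sample size are hypothetical.

```python
import numpy as np

def generate_design(n, p, rho, rng):
    """Collinear regressors from equation (25); z is assumed standard normal."""
    z = rng.standard_normal((n, p + 1))
    # The shared last column z[:, p] induces correlation rho**2 between regressors.
    return np.sqrt(1.0 - rho**2) * z[:, :p] + rho * z[:, [p]]

rng = np.random.default_rng(4)
X = generate_design(n=20000, p=4, rho=0.9, rng=rng)
corr = np.corrcoef(X, rowvar=False)   # empirical correlation matrix of the columns
```

Larger values of ρ make the columns more nearly collinear, which is what drives the small eigenvalues of ${X}^{\prime }\stackrel{^}{V}X$.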

$MSE\left(\stackrel{^}{\beta }\right)=\frac{1}{2000}\underset{m=1}{\overset{2000}{\sum }}tr\left(MSEM\left({\stackrel{^}{\beta }}_{\left(m\right)}\right)\right)$ (26)
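The simulated MSE in (26) is an average over replications. A minimal Python sketch is given below; it reads $tr\left(MSEM\left({\stackrel{^}{\beta }}_{\left(m\right)}\right)\right)$ as the squared error ${\left({\stackrel{^}{\beta }}_{\left(m\right)}-\beta \right)}^{\prime }\left({\stackrel{^}{\beta }}_{\left(m\right)}-\beta \right)$ of the m-th replication, which is an interpretation rather than something stated explicitly here. The function name and the tiny input array are hypothetical.

```python
import numpy as np

def simulated_mse(estimates, beta_true):
    """Average squared error over replications, as in equation (26).

    estimates: array of shape (replications, p), one estimate per row.
    """
    err = np.asarray(estimates, dtype=float) - np.asarray(beta_true, dtype=float)
    return float(np.mean(np.sum(err**2, axis=1)))

# Two hypothetical replications with p = 2.
estimates = [[1.0, 2.0], [3.0, 4.0]]
beta_true = [1.0, 1.0]
value = simulated_mse(estimates, beta_true)   # squared errors 1 and 13, mean 7
```

In the setting of (26) the array would have 2000 rows, one per simulated data set.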

Table 1. Estimated MSE values of the MLE, FAL and FAJL when p = 4

Table 2. Estimated MSE values of the MLE, FAL and FAJL when p = 6

Table 3. The sum of squares of the bias values of the MLE, FAL and FAJL when p = 4

Table 4. The sum of squares of the bias values of the MLE, FAL and FAJL when p = 6

5. Empirical Analysis

Table 5. Estimated MSE values of the MLE, FAL and FAJL

Table 6. The sum of squares of the bias values of the FAL and FAJL when p = 6

6. Conclusion

[1] Schaefer, R.L., Roi, L.D. and Wolfe, R.A. (1984) A Ridge Logistic Estimator. Communications in Statistics-Theory and Methods, 13, 99-113.
https://doi.org/10.1080/03610928408828664

[2] Månsson, K., Golam Kibria, B.M. and Shukur, G. (2012) On Liu Estimators for the Logit Regression Model. Economic Modelling, 29, 1483-1488.
https://doi.org/10.1016/j.econmod.2011.11.015

[3] Le Cessie, S. and Van Houwelingen, J.C. (1992) Ridge Estimators in Logistic Regression. Applied Statistics, 41, 191-201.
https://doi.org/10.2307/2347628

[4] Revan Özkale, M. (2016) Iterative Algorithms of Biased Estimation Methods in Binary Logistic Regression. Statistical Papers, 57, 991-1016.
https://doi.org/10.1007/s00362-016-0780-9

[5] Quenouille, M.H. (1956) Notes on Bias in Estimation. Biometrika, 43, 353-360.
https://doi.org/10.1093/biomet/43.3-4.353

[6] Tukey, J.W. (1958) Bias and Confidence in Not Quite Large Samples (Abstract). Annals of Mathematical Statistics, 29, 614.
https://doi.org/10.1214/aoms/1177706635

[7] Revan Özkale, M. and Arıcan, E. (2019) A First-Order Approximated Jackknifed Ridge Estimator in Binary Logistic Regression. Computational Statistics, 34, 683-712.
https://doi.org/10.1007/s00180-018-0851-6

[8] Hinkley, D.V. (1977) Jackknifing in Unbalanced Situations. Technometrics, 19, 285-292.
https://doi.org/10.1080/00401706.1977.10489550

[9] Farebrother, R.W. (1976) Further Results on the Mean Square Error of Ridge Regression. Journal of the Royal Statistical Society B, 38, 248-250.
https://doi.org/10.1111/j.2517-6161.1976.tb01588.x

[10] McDonald, G.C. and Galarneau, D.I. (1975) A Monte Carlo Evaluation of Some Ridge-Type Estimators. Journal of the American Statistical Association, 70, 407-416.
https://doi.org/10.1080/01621459.1975.10479882

[11] Kibria, B.M.G. (2003) Performance of Some New Ridge Regression Estimators. Communications in Statistics-Simulation and Computation, 32, 419-435.
https://doi.org/10.1081/SAC-120017499

[12] Agresti, A. (2015) Foundations of Linear and Generalized Linear Models. Wiley, Hoboken.

[13] Heinze, G. and Schemper, M. (2002) A Solution to the Problem of Separation in Logistic Regression. Statistics in Medicine, 21, 2409-2419.
https://doi.org/10.1002/sim.1047
