# Unsupervised Semantic Human Matting Refinement

Abstract: Human matting algorithms that do not use a trimap as prior knowledge suffer from redundant interfering information, rough portrait edge contours, and confusion between objects carried by a person and the background. To address these problems, an unsupervised semantic refinement algorithm for human matting is proposed. The algorithm consists of a portrait border sensing module and an unsupervised semantic refinement module. The portrait border sensing module first uses a pedestrian detection model to identify all portraits and then applies a border sensing algorithm to remove redundant interfering information. The unsupervised semantic refinement module uses an unsupervised semantic segmentation model to extract features and then applies a semantic refinement algorithm to repair the portrait contour. Experiments on the self-made longterm portrait data set, with mainstream portrait matting algorithms as baselines, show that adding the proposed refinement algorithm significantly improves the results: objects carried by a person are identified accurately and the outline is clearer. Results on the portrait data set also improve to some extent, indicating that the algorithm generalizes.

1. Introduction

$I_i = \alpha_i F_i + (1 - \alpha_i) B_i, \quad \alpha_i \in [0, 1]$ (1)
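As an illustration of Eq. (1), the following minimal sketch composites a foreground over a background with a per-pixel alpha matte. The function name `composite` and the array shapes are assumptions made for this example, not part of the original method.

```python
import numpy as np


def composite(alpha, foreground, background):
    """Per-pixel compositing I = alpha * F + (1 - alpha) * B, as in Eq. (1).

    Assumed shapes: alpha is (H, W) with values in [0, 1];
    foreground and background are (H, W, 3).
    """
    alpha = np.clip(np.asarray(alpha, dtype=np.float64), 0.0, 1.0)
    foreground = np.asarray(foreground, dtype=np.float64)
    background = np.asarray(background, dtype=np.float64)
    a = alpha[..., None]  # add a channel axis so alpha broadcasts over RGB
    return a * foreground + (1.0 - a) * background
```

A pixel with $\alpha_i = 1$ is pure foreground, $\alpha_i = 0$ is pure background, and intermediate values blend the two, which is what matting has to recover along hair and edge regions.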

2. Unsupervised Semantic Refinement Algorithm for Human Matting

2.1. Portrait Border Sensing Module

Input: $B = (x_1, x_2, y_1, y_2)$, the portrait border; mask, the alpha (transparency) mask.

Output: $B'$, the corrected portrait border.

1) initialize set $B' \leftarrow \varnothing$

2) for $i \leftarrow x_1$ to $x_2$ do

3) while $\mathit{mask}_{i, y_1} > 0$

4) do $y_1 \leftarrow y_1 + 1$ end

5) while $\mathit{mask}_{i, y_2} > 0$

6) do $y_2 \leftarrow y_2 - 1$ end

7) end for

8) for $i \leftarrow y_2$ to $y_1$ do

9) while $\mathit{mask}_{x_1, i} > 0$

10) do $x_1 \leftarrow x_1 - 1$ end

11) while $\mathit{mask}_{x_2, i} > 0$

12) do $x_2 \leftarrow x_2 + 1$ end

13) end for

14) $B' \leftarrow (x_1, x_2, y_1, y_2)$

Figure 1. The flow chart

(a) Inner border correction algorithm (b) Outer border correction algorithm

Figure 2. Portrait border sensing algorithm flow chart
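The border correction above can be illustrated with a short sketch. This is one possible reading of the inner and outer corrections named in Figure 2, not a line-by-line transcription of the listing. It assumes that a side of the box is pushed outward while foreground lies just outside it, and pulled inward while the side itself contains no foreground. The function name `refine_box`, the `mask[y, x]` indexing, the inclusive box coordinates, and the image-boundary checks are all assumptions made for this example.

```python
import numpy as np


def refine_box(mask, box):
    """Adjust box = (x1, x2, y1, y2), inclusive pixel indices, against a matte.

    Assumed conventions: mask is a 2-D array indexed as mask[y, x],
    and any value > 0 counts as foreground.
    """
    h, w = mask.shape
    x1, x2, y1, y2 = box
    fg = mask > 0

    # Outer correction (assumed): grow a side while the line just outside
    # the box still contains foreground within the box's current span.
    grew = True
    while grew:
        grew = False
        if x1 > 0 and fg[y1:y2 + 1, x1 - 1].any():
            x1, grew = x1 - 1, True
        if x2 < w - 1 and fg[y1:y2 + 1, x2 + 1].any():
            x2, grew = x2 + 1, True
        if y1 > 0 and fg[y1 - 1, x1:x2 + 1].any():
            y1, grew = y1 - 1, True
        if y2 < h - 1 and fg[y2 + 1, x1:x2 + 1].any():
            y2, grew = y2 + 1, True

    # Inner correction (assumed): shrink a side while its own line
    # contains no foreground at all.
    shrank = True
    while shrank:
        shrank = False
        if x1 < x2 and not fg[y1:y2 + 1, x1].any():
            x1, shrank = x1 + 1, True
        if x1 < x2 and not fg[y1:y2 + 1, x2].any():
            x2, shrank = x2 - 1, True
        if y1 < y2 and not fg[y1, x1:x2 + 1].any():
            y1, shrank = y1 + 1, True
        if y1 < y2 and not fg[y2, x1:x2 + 1].any():
            y2, shrank = y2 - 1, True

    return x1, x2, y1, y2
```

Running the growth phase to completion before the shrink phase keeps each phase monotone, so both loops terminate within the image bounds.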

2.2. Unsupervised Semantic Refinement Module

Figure 3. Unsupervised portrait semantic segmentation network

$h_i = \mathrm{ReLU}\left(\mathrm{BN}\left(\mathrm{conv}_{3\times 3}\left(h_{i-1}\right)\right)\right)$ (2)

$G = \mathrm{classification}\left(\mathrm{BN}\left(\mathrm{conv}_{1\times 1}\left(h_M\right)\right)\right)$ (3)

$\mathrm{Loss} = \frac{1}{N}\sum_{i=1}^{N}\left(-\sum_{j=1}^{k} y_{i,j}\log\frac{e^{G_{i,j}}}{\sum_{l=1}^{k} e^{G_{i,l}}}\right)$ (4)
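The loss in Eq. (4) is a softmax cross-entropy averaged over $N$ pixels. The sketch below evaluates it with NumPy, assuming that $G$ is an $N \times k$ array of per-pixel class responses and that $y$ is an $N \times k$ one-hot target array. The function name and the max-subtraction used for numerical stability are choices made for this example, not details taken from the paper.

```python
import numpy as np


def softmax_cross_entropy(G, y):
    """Mean softmax cross-entropy, following Eq. (4).

    Assumed shapes: G is (N, k) class responses, y is (N, k) one-hot targets.
    """
    G = np.asarray(G, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Subtracting the row maximum leaves the softmax unchanged
    # but avoids overflow in exp().
    shifted = G - G.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_pixel = -(y * log_softmax).sum(axis=1)
    return float(per_pixel.mean())
```

With all responses equal, every class has probability $1/k$ and the loss reduces to $\log k$, which gives a quick sanity check for an implementation.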

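After an RGB three-channel image passes through the unsupervised portrait semantic segmentation network, each pixel is assigned to one of q classes. Contiguous pixels of the same class form a region $P_k$, where $k \in \{1, 2, \cdots, q\}$. The core of the portrait refinement algorithm is to count the foreground and background pixels of the mask within each region $P_k$; in this paper, black pixels are treated as background and white pixels as foreground. If the ratio of the number of foreground pixels to the number of background pixels of the mask in region $P_k$ is greater than $\theta$, all values of the mask in $P_k$ are set to the foreground value; if the ratio is less than $(1 - \theta)$, all values of the mask in $P_k$ are set to the background value. Here $\theta$ is a hyperparameter. The algorithm corrects pixels that the mask marks as foreground but that actually belong to the background, and it also refines the edge contours of the portrait and of carried objects.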
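The region-wise voting rule described above can be sketched as follows. The sketch assumes that the regions $P_k$ are already available as an integer label map `regions` with one id per region, that any mask value greater than 0 counts as foreground, and that the "ratio" is read as the foreground fraction within a region, since the thresholds $\theta$ and $1 - \theta$ suggest a value in $[0, 1]$. The function name, the default `theta=0.8`, and the 0/255 mask values are illustrative assumptions.

```python
import numpy as np


def refine_mask(mask, regions, theta=0.8, fg_value=255, bg_value=0):
    """Snap each region of the mask to all-foreground or all-background.

    mask    : (H, W) array, values > 0 are treated as foreground (assumption).
    regions : (H, W) integer array, one id per region P_k (assumption).
    theta   : voting threshold; 0.8 is only an illustrative default.
    """
    refined = mask.copy()
    for region_id in np.unique(regions):
        inside = regions == region_id
        # Fraction of the region that the current mask marks as foreground.
        fg_fraction = np.mean(mask[inside] > 0)
        if fg_fraction > theta:
            refined[inside] = fg_value
        elif fg_fraction < 1.0 - theta:
            refined[inside] = bg_value
        # Otherwise the region is ambiguous and is left unchanged.
    return refined
```

Regions whose foreground fraction falls between $1 - \theta$ and $\theta$ keep their original mask values, so a larger $\theta$ makes the refinement more conservative.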

3. Experimental Design and Result Analysis

3.1. Experimental Preparation

3.2. Experimental Results and Analysis

$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - y_i\right)^2$ (5)

$\mathrm{SAD} = \frac{1}{n}\sum_{i=1}^{n}\left|x_i - y_i\right|$ (6)
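Both metrics are straightforward to compute. The sketch below follows Eqs. (5) and (6) as written, including the $1/n$ normalization in Eq. (6), and assumes that the predicted and ground-truth mattes are arrays of the same shape. The function names are chosen for this example.

```python
import numpy as np


def mse(pred, target):
    """Mean squared error between two mattes, following Eq. (5)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((pred - target) ** 2))


def sad(pred, target):
    """Absolute difference averaged over n pixels, following Eq. (6)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean(np.abs(pred - target)))
```

For both metrics, lower values indicate a predicted matte closer to the ground truth.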

Table 1. Results of two pre-classification algorithms under different values of the hyperparameter θ

Table 2. Comparison of experimental results in the perspective portrait dataset

Table 3. Results of two module ablation experiments

Figure 4. Experimental renderings in the perspective portrait dataset

Table 4. Comparison of experimental results in the data set of half-length portraits

Figure 5. Experimental renderings in the data set of half-length portraits

4. Conclusion

