The Research of Text-Independent Feature Extraction Based on Single Training Sample
Author: Guo Jianmin, School of Physics and Information Technology, Shaanxi Normal University, Xi'an, Shaanxi
Abstract: Existing speaker identification methods are based on Linear Predictive Coding Cepstral (LPCC) coefficients, Mel-Frequency Cepstral Coefficients (MFCC), local normalized cepstral coefficients (LNCC), and the wavelet packet transform (WPT); these features are sensitive to noise and environmental sounds. This paper describes a novel, robust text-independent feature extraction method that uses a single training sample. The features extracted by the proposed method reflect a person’s basic phonation characteristics and distinguish different speakers. The paper applies the four existing methods to the single-training-sample setting and compares them with the proposed method. Experimental results on English and Chinese speech databases demonstrate that the proposed approach can perform feature extraction for speaker identification from a single training sample and yields better performance in that setting.
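Cite this article: Guo Jianmin (2016) The Research of Text-Independent Feature Extraction Based on Single Training Sample. Computer Science and Application, 6, 384-392. doi: 10.12677/CSA.2016.66047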