x-vector training in Kaldi (aishell v1)

Notes for my own reference.

Overview documentation

(official, from the Kaldi team)

https://david-ryan-snyder.github.io/2017/10/04/model_sre16_v2.html

GitHub link

https://github.com/kaldi-asr/kaldi/pull/1896/

The network architecture is based on this 2018 paper:

X-VECTORS: ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION

The system, built for speaker recognition, consists of a TDNN with a statistics pooling layer. It’s trained to classify a list of speakers using a multiclass cross entropy objective. In the future, this will probably be extended to include “same vs different” training. After training, the last few layers of the network are removed, and variable-length utterances are mapped to fixed-dimensional embeddings that are used in a PLDA backend.
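To make the "variable-length utterances are mapped to fixed-dimensional embeddings" step concrete, here is a minimal sketch of statistics pooling in plain Python. It is a toy illustration, not Kaldi's nnet3 implementation: the function name `statistics_pooling`, the example feature values, and the use of the population standard deviation are all assumptions made for this example.

```python
import math
from typing import List, Sequence


def statistics_pooling(frames: Sequence[Sequence[float]]) -> List[float]:
    """Pool T frame-level vectors of size D into one vector of size 2*D.

    The output is the per-dimension mean followed by the per-dimension
    standard deviation, so its length does not depend on T.
    (Hypothetical helper; population std is an assumption.)
    """
    if not frames:
        raise ValueError("need at least one frame")
    num_frames = len(frames)
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all frames must have the same dimension")

    # Per-dimension mean over time.
    means = [sum(f[d] for f in frames) / num_frames for d in range(dim)]
    # Per-dimension population variance over time.
    variances = [
        sum((f[d] - means[d]) ** 2 for f in frames) / num_frames
        for d in range(dim)
    ]
    stds = [math.sqrt(v) for v in variances]
    return means + stds


# Two "utterances" with different numbers of frames (made-up values).
short_utt = [[1.0, 2.0], [3.0, 6.0]]
long_utt = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0], [6.0, 1.0]]

short_vec = statistics_pooling(short_utt)
long_vec = statistics_pooling(long_utt)
print(len(short_vec), len(long_vec))  # both have length 2 * D
```

Both utterances end up as vectors of the same length even though they contain different numbers of frames. In the real system, the layers after the pooling layer turn this vector into the embedding.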

Original link: https://blog.csdn.net/weixin_43056919/article/details/87480205
