Notes for my own reference.
Overview document (official, from the Kaldi team):
https://david-ryan-snyder.github.io/2017/10/04/model_sre16_v2.html
GitHub pull request:
https://github.com/kaldi-asr/kaldi/pull/1896/
The network architecture is based on this 2018 paper:
X-VECTORS: ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION
The system, built for speaker recognition, consists of a TDNN with a
statistics pooling layer. It’s trained to classify a list of speakers
using a multiclass cross entropy objective. In the future, this will
probably be extended to include “same vs different” training. After
training, the last few layers of the network are removed, and
variable-length utterances are mapped to fixed-dimensional embeddings
that are used in a PLDA
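
To make the statistics pooling step concrete, here is a minimal pure-Python sketch of how a variable-length sequence of frame-level vectors can be collapsed into one fixed-dimensional vector. It is only an illustration, not the Kaldi implementation: the function name statistics_pooling is hypothetical, and pooling with the per-dimension mean plus the population standard deviation is an assumption about the statistics used.

```python
import math


def statistics_pooling(frames):
    """Pool T frame vectors of dimension D into one vector of length 2*D.

    frames: non-empty list of equal-length lists of floats.
    Returns the per-dimension means followed by the per-dimension
    population standard deviations (assumed choice of statistics).
    """
    if not frames:
        raise ValueError("need at least one frame")
    num_frames = len(frames)
    dim = len(frames[0])
    # Mean of each dimension over all frames (the time axis).
    means = [sum(f[d] for f in frames) / num_frames for d in range(dim)]
    # Population standard deviation of each dimension over all frames.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / num_frames)
        for d in range(dim)
    ]
    return means + stds


# Two "utterances" of different lengths (2 frames vs. 4 frames), both 2-dim.
short_utt = [[1.0, 2.0], [3.0, 4.0]]
long_utt = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0], [6.0, 1.0]]

# Both pooled vectors have the same length, 2 * D = 4.
print(len(statistics_pooling(short_utt)), len(statistics_pooling(long_utt)))
```

Whatever the number of input frames, the pooled vector has the same size, which is what lets the layers after the pooling layer produce a fixed-dimensional embedding per utterance.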