Montreal Forced Aligner: Trainable text-speech alignment using Kaldi
1 Pith paper cites this work.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Using embeddings to predict spoken word duration and pitch in Mandarin monosyllabic words
Contextualized embeddings predict spoken durations of 7470 Mandarin monosyllabic words above chance at token level and enable back-transformation of normalized f0 contours to empirical millisecond-scale contours.