Comparison of forced-alignment speech recognition and humans for generating reference VAD

Kraljevski, I.; Tan, Z-H; Bissiri, Maria Paola

Files

4422.pdf (242.1 KB)

Date

2015

Authors

Kraljevski, I.

Tan, Z-H

Bissiri, Maria Paola

Citation

Kraljevski, I., Tan, Z. & Bissiri, M. (2015) Comparison of forced-alignment speech recognition and humans for generating reference VAD, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp. 2937-2941.

Abstract

This paper examines whether forced-alignment speech recognition can be used as an alternative to humans in generating reference Voice Activity Detection (VAD) transcriptions. The level of agreement between the automatic and manual VAD transcriptions and the reference transcriptions produced by a human expert was investigated. Statistical analysis was then carried out on the automatically produced and the collected manual transcriptions. The experimental results confirmed that forced-alignment speech recognition can provide accurate and consistent VAD labels.

URI

https://eresearch.qmu.ac.uk/handle/20.500.12289/4422

Collections

CASL
