Browsing by Person "Richmond, Korin"
Now showing 1 - 5 of 5
Item: Continuous speech recognition using articulatory data (2000)
Authors: Wrench, Alan A.; Richmond, Korin
In this paper we show that there is measurable information in the articulatory system which can help to disambiguate the acoustic signal. We measure directly the movement of the lips, tongue, jaw, velum and larynx and parameterise this articulatory feature space using principal components analysis. The parameterisation is developed and evaluated using a speaker dependent phone recognition task on a specially recorded TIMIT corpus of 460 sentences. The results show that there is useful supplementary information contained in the articulatory data which yields a small but significant improvement in phone recognition accuracy of 2%. However, preliminary attempts to estimate the articulatory data from the acoustic signal and use this to supplement the acoustic input have not yielded any significant improvement in phone accuracy.

Item: Predicting Tongue Shapes from a Few Landmark Locations (2008-09)
Authors: Qin, Chao; Carreira-Perpinan, Miguel A.; Richmond, Korin; Wrench, Alan A.; Renals, Steve
We present a method for predicting the midsagittal tongue contour from the locations of a few landmarks (metal pellets) on the tongue surface, as used in articulatory databases such as MOCHA and the Wisconsin XRDB. Our method learns a mapping using ground-truth tongue contours derived from ultrasound data and drastically improves over spline interpolation. We also determine the optimal locations of the landmarks, and the number of landmarks required to achieve a desired prediction error: 3-4 landmarks are enough to achieve 0.3-0.2 mm error per point on the tongue.

Item: Recording speech articulation in dialogue: Evaluating a synchronized double Electromagnetic Articulography setup (Elsevier, 2013-08-28)
Authors: Geng, Christian C.; Turk, Alice; Scobbie, James M.; Macmartin, Cedric; Hoole, Philip; Richmond, Korin; Wrench, Alan A.; Pouplier, Marianne; Bard, Ellen Gurman; Campbell, Ziggy; Dickie, Catherine; Dubourg, Eddie; Hardcastle, William J.; Kainada, Evia; King, Simon; Lickley, Robin; Nakai, Satsuki; Renals, Steve; White, Kevin; Wiegand, Ronny; EPSRC
We demonstrate the workability of an experimental facility that is geared towards the acquisition of articulatory data from a variety of speech styles common in language use, by means of two synchronized electromagnetic articulography (EMA) devices. This approach synthesizes the advantages of real dialogue settings for speech research with a detailed description of the physiological reality of speech production. We describe the facility's method for acquiring synchronized audio streams of two speakers and the system that enables communication among control room technicians, experimenters and participants. Further, we demonstrate the feasibility of the approach by evaluating problems inherent to this specific setup. The first problem is the accuracy of temporal synchronization of the two EMA machines; the second is the severity of electromagnetic interference between the two machines. Our results suggest that the synchronization method used yields an accuracy of approximately 1 ms. Electromagnetic interference was derived from the complex-valued signal amplitudes. This dependent variable was analyzed as a function of the recording status (on/off) of the interfering machine's transmitters, with the intermachine distance varied between 1 m and 8.5 m. Results suggest that a distance of approximately 6.5 m is appropriate to achieve data quality comparable to that of single speaker recordings.

Item: The Edinburgh Speech Production Facility DoubleTalk Corpus (International Speech Communication Association, 2013-08-25)
Authors: Scobbie, James M.; Turk, Alice; Geng, Christian; King, Simon; Lickley, Robin; Richmond, Korin
The DoubleTalk articulatory corpus was collected at the Edinburgh Speech Production Facility (ESPF) using two synchronized Carstens AG500 electromagnetic articulometers. The first release of the corpus comprises orthographic transcriptions aligned at phrasal level to EMA and audio data for each of 6 mixed-dialect speaker pairs. It is available from the ESPF online archive. A variety of tasks were used to elicit a wide range of speech styles, including monologue (a modified Comma Gets a Cure and spontaneous story-telling), structured spontaneous dialogue (Map Task and Diapix), a wordlist task, a memory-recall task, and a shadowing task. In this session we will demo the corpus with various examples.

Item: UltraSuite: A Repository of Ultrasound and Acoustic Data from Child Speech Therapy Sessions (International Speech Communication Association, 2018-06-17)
Authors: Eshky, Aciel; Ribeiro, Manuel Sam; Cleland, Joanne; Richmond, Korin; Roxburgh, Zoe; Scobbie, James M.; Wrench, Alan A.
We introduce UltraSuite, a curated repository of ultrasound and acoustic data, collected from recordings of child speech therapy sessions. This release includes three data collections, one from typically developing children and two from children with speech sound disorders. In addition, it includes a set of annotations, some manual and some automatically produced, and software tools to process, transform and visualise the data.