Esther Klabbers, mailto:klabbers@cslu.ogi.edu, http://www.bme.ogi.edu/~klabbers/
Alexander Kain, mailto:kain@cslu.ogi.edu, http://cslu.bme.ogi.edu/~kain/
There will be several code-writing assignments. Please comment your code and, if possible, provide transcripts of example runs and figures of results. Submit your code and documentation as a single archive file (.tar, .tgz, .bz2, .zip) by email.
Participate in discussions and ask questions.
To enhance your presentation skills, there will be an opportunity to review and present a relevant paper from the field. The presentation should be about 10 minutes long; be prepared for up to 5 minutes of Q&A following your talk.
There will be no midterm or final exams.
| # | Date | Instructor | Topic | Assignment | References | Presentations | Presenter |
| 1 | 01/10 | Esther | Introduction / Class setup / History of TTS / Student experience and expectations |  |  |  |  |
| 2 | 01/12 | Esther | Utterance structure / Review of spectrograms / waveforms / WaveSurfer |  | Taylor & Black |  |  |
| 3 | 01/17 | Esther | Tokenization (addresses, abbreviations, disambiguation) | #1: tokenization | Richard Sproat, Alan Black, Stanley Chen, Shankar Kumar, Mari Ostendorf, and Christopher Richards. "Normalization of non-standard words." Computer Speech and Language, 15(3), 287-333, 2001. |  |  |
| 4 | 01/19 | Esther | Word Pronunciation (dictionary, letter-to-sound methods – WFST / rules / HMM) | #2: letter-to-sound rules | A. Black, K. Lenzo, & V. Pagel, “Issues in building general letter to sound rules”, Proceedings SSW3, Jenolan Caves, Australia |  |  |
| 5 | 01/24 | Esther | Word Syllabification |  | G. Kiraz & B. Moebius, “Multilingual syllabification using weighted finite-state transducers”; T. Borowsky, “Structure preservation and the syllable coda in English” | A. van den Bosch and W. Daelemans – Data oriented methods for grapheme-to-phoneme conversion | Ken Anderson |
| 6 | 01/26 | Esther | Phrase |  | C. Oliveira, L. Moutinho, A. Teixeira – On European Portuguese Automatic Syllabification | G. Kiraz & B. Moebius – Multilingual syllabification using weighted finite-state transducers | Nathan Bodenstab |
| 7 | 01/30 | Esther | Word Emphasis |  | J. Hirschberg & P. Prieto (1994) – Training intonational phrasing rules automatically for English and Spanish text-to-speech, Proc. 2nd ESCA/IEEE Workshop on Speech Synthesis, New Paltz, NY (hardcopy) | A. Black & P. Taylor (1997) – Assigning phrase breaks from part-of-speech sequences, Proc. EUROSPEECH’97, Rhodes, Greece | Tanarat Dityam |
| 8 | 02/02 | Esther | Duration 1 |  |  |  |  |
| 9 | 02/07 | Esther | Duration 2 | #3: duration |  |  |  |
| 10 | 02/09 | Esther | Intonation 1 |  |  | R. Baker, R. Clark, M. White (2004) – Synthesising contextually appropriate intonation in limited domains, 5th ISCA Speech Synthesis Workshop, Pittsburgh, PA | Adam Murakami |
| 11 | 02/14 | Esther | Intonation 2 | #4: intonation |  | A. Raux & A. Black (2003) – A unit selection approach to F0 modeling and its application to emphasis, ASRU 2003, St. Thomas, US Virgin Islands | Chinten Shah |
| 12 | 02/16 | Alex | Text Selection and Recording | #5: text-selection, due 02/28 |  |  |  |
| 13 | 02/21 | Alex | Unit Search |  | Hunt96 | Bernd Möbius, “Rare Events and Closed Domains: Two Delicate Concepts in Speech Synthesis”, International Journal of Speech Technology, vol. 6, no. 1, pp. 57-71, 2003. | Tomek Szegalowski |
| 14 | 02/23 | Alex | Joining Units and Pitch-Synchronous Overlap-Add |  |  |  |  |
| 15 | 02/28 | Alex | Review of Discrete Time Signal Processing | #6: PSOLA implementation, due 03/07 |  |  |  |
| 16 | 03/02 | Alex | Linear Shift-Invariant Filters |  |  | Paul Taylor and Alan W Black (1999). Speech Synthesis by Phonological Structure Matching, in Eurospeech99 | Emily Tucker |
| 17 | 03/07 | Alex | Formant Synthesis |  |  |  |  |
| 18 | 03/09 | Alex | Linear Prediction of Speech |  |  |  |  |
| 19 | 03/14 | Alex | Evaluation |  |  |  |  |
| 20 | 03/16 | Alex | Research Directions |  | van Santen: Synthesis of Prosody using Multi-level Unit Sequences |  | Bruce White |