Getting Started

Festival Speech Synthesis System - http://www.cstr.ed.ac.uk/projects/festival/
building on OS X with do_prompt capabilities - http://linguisticmystic.com/2011/07/15/using-festival-tts-on-os-x/ and http://permalink.gmane.org/gmane.science.tts.festvox/381
class assignment on TTS - http://www.speech.cs.cmu.edu/15-492/assignments/tts/index.html

Learning

Book - http://festvox.org/festvox/book1.html
short tutorial - http://festvox.org/festtut-2.0/
exercises and hints - http://festvox.org/festtut-2.0/exercises/

Training Voice Models

howto (Building a CLUSTERGEN Statistical Parametric Synthesizer) - http://festvox.org/festvox/c3170.html#AEN3172
training text input - http://www.festvox.org/cmu_arctic/cmuarctic.data
useful tips - http://festvox.org/index.html, including a pointer to the EMU speech database system - http://www.shlrc.mq.edu.au/emu/

Building a Unit Selection Cluster Voice

(from here http://festvox.org/festvox/x3082.html)

```
mkdir uw_us_rdt
cd uw_us_rdt
```

uniphone setup:

 $FESTVOXDIR/src/unitsel/setup_clunits uw us rdt uniphone

generate prompts and prompt files:

festival -b festvox/build_clunits.scm '(build_prompts_waves "etc/uniphone.data")'

record the prompts using Audacity. save each recording as 16 kHz, 16-bit mono.
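A recording saved with the wrong sample rate or channel count is easy to miss until later build steps. The following is a minimal sketch, using only Python's standard `wave` module, that checks a file against the 16 kHz, 16-bit, mono format named above. The function name `check_recording` and its parameters are made up for illustration and are not part of festvox.

```python
import wave


def check_recording(path, rate=16000, sample_width_bytes=2, channels=1):
    """Return a list of format problems for a WAV file (empty if it matches)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {rate}"
            )
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(
                f"sample width is {wav.getsampwidth() * 8} bits, "
                f"expected {sample_width_bytes * 8}"
            )
        if wav.getnchannels() != channels:
            problems.append(
                f"{wav.getnchannels()} channels, expected {channels}"
            )
    return problems
```

Calling `check_recording` on each recorded file and printing any non-empty result gives a quick list of files to re-export from Audacity.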
make labels:
```
./bin/make_labs prompt-wav/*.wav
```

build utterance structure:

festival -b festvox/build_clunits.scm '(build_utts "etc/uniphone.data")'

do pitch marking:
```
./bin/make_pm_wave etc/uniphone.data
```
find Mel Frequency Cepstral Coefficients:
```
./bin/make_mcep etc/uniphone.data
```

build cluster unit selection synth:

festival -b festvox/build_clunits.scm '(build_clunits "etc/uniphone.data")'
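The build steps above always run in the same order and differ only in the prompt file, so they can be collected in one place. This is a sketch, not a festvox tool: `clunits_build_commands` is a hypothetical helper that only assembles the command strings shown in this section, and it assumes it is used from inside the voice directory. Recording still happens by hand after the first step.

```python
import shlex

BUILD_SCRIPT = "festvox/build_clunits.scm"


def clunits_build_commands(prompt_file="etc/uniphone.data"):
    """Return the shell commands from this section, in the order they are run."""

    def festival_step(function_name):
        # Scheme expression evaluated by festival in batch mode, for example
        # (build_utts "etc/uniphone.data"), quoted for the shell.
        expression = f'({function_name} "{prompt_file}")'
        return f"festival -b {BUILD_SCRIPT} {shlex.quote(expression)}"

    quoted_prompts = shlex.quote(prompt_file)
    return [
        festival_step("build_prompts_waves"),
        # Recording is done manually between these two steps.
        "./bin/make_labs prompt-wav/*.wav",
        festival_step("build_utts"),
        f"./bin/make_pm_wave {quoted_prompts}",
        f"./bin/make_mcep {quoted_prompts}",
        festival_step("build_clunits"),
    ]


if __name__ == "__main__":
    for command in clunits_build_commands():
        print(command)
```

Each string could be passed to `subprocess.run(command, shell=True, check=True)` so that a failing step stops the build; the shell is needed here only because of the `*.wav` glob.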

Using a Unit Selection Cluster Voice Synth

from uw_us_rdt directory:
```
festival festvox/uw_us_rdt_clunits.scm
```
in Scheme:
```
(voice_uw_us_rdt_clunits)
(SayText "this is a little test.")
```

Building a CLUSTERGEN Statistical Parametric Synthesizer

adapted from http://festvox.org/festvox/c3170.html#AEN3172

mkdir uw_us_rdt_arctic

cd uw_us_rdt_arctic

$FESTVOXDIR/src/clustergen/setup_cg uw us rdt_arctic

copy the prompt text into etc/txt.done.data, using some of the lines from http://www.festvox.org/cmu_arctic/cmuarctic.data
copy the audio files into wav/ with
```
bin/get_wavs
```
which power-normalizes the files and converts them to the proper format.
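When only some of the ARCTIC prompts have been recorded, etc/txt.done.data should list just those. The sketch below keeps the prompt lines whose utterance id has a matching recording. It assumes each line has the form `( arctic_a0001 "text" )` and that each WAV file is named after its utterance id; `select_prompts` is a hypothetical helper, not part of festvox.

```python
import re
from pathlib import Path

# Matches the opening of a prompt line such as: ( arctic_a0001 "Some text." )
ENTRY = re.compile(r'^\(\s*(\S+)\s+"')


def select_prompts(prompt_lines, wav_dir):
    """Keep the prompt lines whose utterance id has a .wav file in wav_dir."""
    recorded = {path.stem for path in Path(wav_dir).glob("*.wav")}
    kept = []
    for line in prompt_lines:
        match = ENTRY.match(line.strip())
        if match and match.group(1) in recorded:
            kept.append(line.strip())
    return kept
```

The returned lines can be written, one per line, to etc/txt.done.data.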

Building a Unit Selection Cluster Voice from TIMIT data

(from here http://festvox.org/festvox/x3082.html)

```
mkdir uw_us_rdt_timit
cd uw_us_rdt_timit
```

timit setup:

 $FESTVOXDIR/src/unitsel/setup_clunits uw us rdt timit

generate prompts and prompt files:

festival -b festvox/build_clunits.scm '(build_prompts_waves "etc/timit.data")'

record the prompts using Audacity. save each recording as 16 kHz, 16-bit mono.
make labels:
```
./bin/make_labs prompt-wav/*.wav
```

build utterance structure:

festival -b festvox/build_clunits.scm '(build_utts "etc/timit.data")'

do pitch marking:
```
./bin/make_pm_wave etc/timit.data
```
find Mel Frequency Cepstral Coefficients:
```
./bin/make_mcep etc/timit.data
```

build cluster unit selection synth:

festival -b festvox/build_clunits.scm '(build_clunits "etc/timit.data")'

Improving Quality

Fix phoneme labeling - http://sourceforge.net/projects/wavesurfer/
tuning a voice - http://www.cstr.ed.ac.uk/emasters/summer_school_2005/tutorial3/tutorial.html

Using Voices

using meghan voice

To run the server:

open terminal:

cd /Users/murmur/Desktop/meghan
festival_server -c meghans_special_sauce.scm

To kill the server:

Control-C

To run the client:

open a 2nd terminal window:

cd /Users/murmur/Desktop/meghan
festival_client myfile.txt --ttw --output client_test.wav

Other stuff (python):

import os
os.popen("/Applications/festival_2.1/festival/src/main/festival_client /Users/murmur/Desktop/meghan/myfile.txt --ttw --output /Users/murmur/Desktop/meghan/client_test78.wav")
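The os.popen call above starts the client but does not wait for it or report failure. As a possible alternative, this sketch builds the same festival_client command as an argument list and runs it with `subprocess.run`, which raises an error if the client exits with a non-zero status. The helper names are made up for illustration; the flags are the ones already used above, and the server is assumed to be running.

```python
import subprocess


def festival_client_command(text_file, output_wav, client="festival_client"):
    """Build the festival_client argument list used in this section."""
    return [client, text_file, "--ttw", "--output", output_wav]


def synthesize(text_file, output_wav, client="festival_client"):
    """Run festival_client and raise CalledProcessError if it fails."""
    subprocess.run(
        festival_client_command(text_file, output_wav, client),
        check=True,
    )
```

For the setup above, `client` would be the full path /Applications/festival_2.1/festival/src/main/festival_client. Passing a list rather than one string avoids shell quoting problems with paths that contain spaces.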

Using new Voices

Modify new voice so festival knows it's there

append to uw_us_rdt_clunits.scm in uw_us_rdt_clunits/festvox:


(proclaim_voice
 'uw_us_rdt_clunits
 '((language english)
   (gender male)
   (dialect american)
   (description
    "This is Robert Twomey trained on CLUNITS, TIMIT database.")))

(provide 'uw_us_rdt_clunits)

Install voice to festival directory

download http://roberttwomey.com/downloads/uw_us_rdt_clunits.tar.gz
extract the archive from the festival root directory; it should install to the correct directory
copy your newly trained voice to festival/lib/voices/english/
the name of your new voice directory (ex: uw_us_rdt_clunits/) needs to match the name of the voice file (ex: uw_us_rdt_clunits/festvox/uw_us_rdt_clunits.scm)
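The naming rule above is easy to break when a voice directory is renamed after training. A small sketch of a check, assuming the layout given in the example (a festvox/ folder holding a .scm file named after the voice directory); `voice_file_matches` is a hypothetical helper.

```python
from pathlib import Path


def voice_file_matches(voice_dir):
    """Return True if voice_dir/festvox/<directory name>.scm exists."""
    voice_dir = Path(voice_dir)
    return (voice_dir / "festvox" / f"{voice_dir.name}.scm").is_file()
```

For example, `voice_file_matches("festival/lib/voices/english/uw_us_rdt_clunits")` looks for festvox/uw_us_rdt_clunits.scm inside that directory.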

Configure festival to use your voice by default

to set your voice as default, add the following to festival/etc/siteinit.scm:

(autoload voice_uw_us_rdt_clunits "/Users/rtwomey/code/tts/festival/lib/voices/english/uw_us_rdt_clunits/festvox/uw_us_rdt_clunits" "American English male uw_us_rdt_clunits")

(set! voice_default 'voice_uw_us_rdt_clunits)

(voice_uw_us_rdt_clunits)
(lex.add.entry '("<break>" n (((pau pau) 0))))

the lex.add.entry line adds a new word, <break>, to the lexicon; it is pronounced as a pause.
cd to the folder containing your festvox files (the trained model)
run festival_server and it will load your new voice by default
reference: http://www.cstr.ed.ac.uk/projects/festival/manual/festival_24.html

Tuning phrasing, prosody, etc with SABLE

http://www.cstr.ed.ac.uk/projects/festival/manual/festival_10.html#SEC31

Festival TTS
