Difference between revisions of "UNTREF Speech Workshop"

From Robert-Depot

Revision as of 07:49, 21 September 2013

Introduction

7448315604_c91457ea3e_z.jpg

Conversing With Machines

A short 1-2 day workshop introducing speech recognition and speech synthesis techniques for the creation of interactive artwork. We use pre-compiled open-source tools (CMU Sphinx ASR, Festival TTS, Processing, Python) and focus on the demonstrable strengths and unexpected limitations of speech technologies as vehicles for creating meaning.

Saturday Sept 21, 2-6pm Centro Cultural de Borges UNTREF.


Background Reading:

Automatic Speech Recognition

ear.gif

Engines

Installing CMU Sphinx

  • Download from sourceforge: http://cmusphinx.sourceforge.net/wiki/download/
  • If using Windows, you need the sphinxbase-0.8-win32.zip and pocketsphinx-0.8-win32.zip files. These are already downloaded for you in the untref_speech folder.

Using sphinx

  • Open a terminal. On Windows: Run -> cmd.
  • Change to the pocketsphinx directory:
    • cd Desktop\untref_speech\pocketsphinx-0.8-win32\bin\Release
  • Run the pocketsphinx command to recognize English:
    • pocketsphinx_continuous.exe -hmm ..\..\model\hmm\en_US\hub4wsj_sc_8k -dict ..\..\model\lm\en_US\cmu07a.dic -lm ..\..\model\lm\en_US\hub4.5000.DMP
  • To recognize Spanish:
    • pocketsphinx_continuous.exe -hmm ..\..\model\hmm\es_MX\hub4_spanish_itesm.cd_cont_2500 -dict ..\..\model\lm\es_MX\h4.dict -lm ..\..\model\lm\es_MX\H4.arpa.Z.DMP
  • Either command should transcribe live from the microphone.
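The same commands can be launched from Python, which is handy once you want to switch languages from a script. This is only a minimal sketch: the helper names build_command and run_recognizer are made up for this example, and the paths are copied from the commands above, so it assumes you start Python from the bin\Release directory.

```python
import subprocess

# Model paths copied from the workshop commands, relative to bin\Release.
MODELS = {
    "en": {
        "hmm": r"..\..\model\hmm\en_US\hub4wsj_sc_8k",
        "dict": r"..\..\model\lm\en_US\cmu07a.dic",
        "lm": r"..\..\model\lm\en_US\hub4.5000.DMP",
    },
    "es": {
        "hmm": r"..\..\model\hmm\es_MX\hub4_spanish_itesm.cd_cont_2500",
        "dict": r"..\..\model\lm\es_MX\h4.dict",
        "lm": r"..\..\model\lm\es_MX\H4.arpa.Z.DMP",
    },
}


def build_command(language):
    """Return the argument list for pocketsphinx_continuous.exe (hypothetical helper)."""
    model = MODELS[language]
    return [
        "pocketsphinx_continuous.exe",
        "-hmm", model["hmm"],    # acoustic model
        "-dict", model["dict"],  # pronunciation dictionary
        "-lm", model["lm"],      # language model
    ]


def run_recognizer(language):
    """Start the recognizer and block until it exits (stop it with Ctrl+C)."""
    return subprocess.call(build_command(language))
```

Calling run_recognizer("es") should behave like typing the Spanish command by hand; build_command("en") on its own just returns the list of arguments, so you can print it and check the paths first.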

Language Models

Acoustic models versus language models.

Grammars versus Statistical Language Models.

Available language models. English, Mandarin, French, Spanish, German, Dutch and more: http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/


Training your own Models

Writing a grammar is trivial.
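To show how little a grammar needs, here is a rough Python sketch that writes a small grammar in JSGF format, listing the only phrases the recognizer should accept. The function name make_jsgf and the example phrases are invented for illustration. Every word used must also appear in the pronunciation dictionary. As far as I know, pocketsphinx_continuous can load such a file with a -jsgf option in place of -lm, but check this against your version.

```python
def make_jsgf(name, phrases):
    """Build a JSGF grammar that accepts exactly one of the given phrases."""
    alternatives = " | ".join(phrases)
    lines = [
        "#JSGF V1.0;",                                   # JSGF header
        "grammar {};".format(name),                      # grammar name
        "public <{}> = {};".format(name, alternatives),  # the single public rule
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example phrases for a greeting piece.
grammar = make_jsgf("greeting", ["hello", "good morning", "goodbye"])
print(grammar)
```

Save the printed text to a file (for example greeting.gram, a name chosen here) to try it with the recognizer.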

For a statistical language model (SLM), you can use online tools, or try the sphinxtrain packages.
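To get a feel for what a statistical language model stores, here is a toy bigram model in Python. It is an illustration only and not what the Sphinx tools do internally: it uses plain unsmoothed counts, while real tools add smoothing and back-off. The corpus sentences and the function names are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def probability(counts, prev, nxt):
    """Unsmoothed estimate of P(nxt | prev); 0.0 if prev was never seen."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0


# Hypothetical training corpus: one sentence per entry.
corpus = ["open the door", "close the door", "open the window"]
counts = train_bigrams(corpus)
print(probability(counts, "the", "door"))
print(probability(counts, "the", "window"))
```

In this corpus "door" follows "the" twice and "window" once, so the model prefers "the door". A recognizer uses this kind of preference to choose between words that sound alike.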

Programming with Speech Recognition

Processing: use Sphinx4, the Java interface.

Python or C++, command line, Android: use pocketsphinx.
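One simple way to use the command-line recognizer from Python is to read its output line by line and react to the recognized text. The sketch below assumes that result lines look like `000000001: hello world` (an utterance number, a colon, then the text) and that everything else is log output. That line format is an assumption, so compare it with what your build actually prints. The names parse_hypothesis and respond are invented for this example.

```python
import re

# Assumed result format: digits, a colon and a space, then the recognized text.
HYP_LINE = re.compile(r"^(\d+): (.*)$")


def parse_hypothesis(line):
    """Return the recognized text, or None if the line is log output."""
    match = HYP_LINE.match(line.strip())
    return match.group(2) if match else None


def respond(text):
    """Toy reaction: answer a greeting, ignore everything else."""
    if "hello" in text.lower().split():
        return "hello to you"
    return None


# Hypothetical sample of recognizer output, mixing log lines and one result.
sample_output = [
    "INFO: continuous.c: ready",
    "READY....",
    "000000001: hello world",
]
for line in sample_output:
    text = parse_hypothesis(line)
    if text is not None:
        print(text, "->", respond(text))
```

In a real piece the lines would come from the running recognizer, for example through the stdout pipe of subprocess.Popen, instead of the sample list.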

Text To Speech Synthesis

voder-2.png

What is it?

Engines

Test them online

Voices

Installing Festival

Tutorial

Making a Voice

  • Portraiture?

Activity: Feedback Loop

saussure.gif

A Conversation.