UNTREF Speech Workshop

From Robert-Depot

Revision as of 06:36, 21 September 2013



How to Talk to Machines

A short 1-2 day workshop introducing speech recognition and speech synthesis techniques for the creation of interactive artwork. We use pre-compiled open-source tools (CMU Sphinx ASR, Festival TTS, Processing, Python) and focus on the demonstrable strengths and unexpected limitations of speech technologies as vehicles for creating meaning.

Saturday, Sept 21, 2-6pm, Centro Cultural de Borges, UNTREF.

Background Reading:

Automatic Speech Recognition


What is it?


Installing CMU Sphinx

  • Download from SourceForge: http://cmusphinx.sourceforge.net/wiki/download/
  • On Windows, you need the sphinxbase-0.8-win32.zip and pocketsphinx-0.8-win32.zip files. They are already downloaded for you, in the untref_speech folder.

Using Sphinx

  • Open a terminal. On Windows: Run->Cmd.
  • Change to the pocketsphinx directory:
    • cd Desktop\untref_speech\pocketsphinx-0.8-win32\bin\Release
  • Run the pocketsphinx command to recognize English:
    • pocketsphinx_continuous.exe -hmm ..\..\model\hmm\en_US\hub4wsj_sc_8k -dict ..\..\model\lm\en_US\cmu07a.dic -lm ..\..\model\lm\en_US\hub4.5000.DMP
  • To recognize Spanish instead:
    • pocketsphinx_continuous.exe -hmm ..\..\model\hmm\es_MX\hub4_spanish_itesm.cd_cont_2500 -dict ..\..\model\lm\es_MX\h4.dict -lm ..\..\model\lm\es_MX\H4.arpa.Z.DMP
  • Either command should transcribe live from the microphone.
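
The two commands above differ only in their model paths, so they could be wrapped in a small Python script that picks the language for you. This is only a sketch: the function names and the "en"/"es" keys are made up for illustration, and the paths are copied from the commands above on the assumption that the folder layout matches.

```python
import os
import subprocess

# Model paths copied from the commands above, relative to bin\Release.
MODELS = {
    "en": {
        "hmm": r"..\..\model\hmm\en_US\hub4wsj_sc_8k",
        "dict": r"..\..\model\lm\en_US\cmu07a.dic",
        "lm": r"..\..\model\lm\en_US\hub4.5000.DMP",
    },
    "es": {
        "hmm": r"..\..\model\hmm\es_MX\hub4_spanish_itesm.cd_cont_2500",
        "dict": r"..\..\model\lm\es_MX\h4.dict",
        "lm": r"..\..\model\lm\es_MX\H4.arpa.Z.DMP",
    },
}


def build_command(language, exe="pocketsphinx_continuous.exe"):
    """Return the argument list for the given language key ("en" or "es")."""
    model = MODELS[language]
    return [exe, "-hmm", model["hmm"], "-dict", model["dict"], "-lm", model["lm"]]


def run_recognizer(language, bin_dir):
    """Start the recognizer from bin_dir and block until it exits.

    bin_dir is the bin\\Release folder; the relative model paths are
    resolved against it because it is used as the working directory.
    """
    command = build_command(language)
    command[0] = os.path.join(bin_dir, command[0])
    return subprocess.call(command, cwd=bin_dir)
```

Calling run_recognizer("es", r"Desktop\untref_speech\pocketsphinx-0.8-win32\bin\Release") from your home folder would then behave like typing the Spanish command by hand.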

Language Models

Acoustic models versus language models.

Grammars versus Statistical Language Models.
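
A grammar lists exactly which phrases are allowed, while a statistical language model estimates from example text how likely each word is to follow another. The toy bigram counter below illustrates that second idea only. It is not how Sphinx stores or smooths its models, and the function names and three-sentence corpus are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows the previous one."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(words, words[1:]):
            counts[prev][word] += 1
    return counts


def probability(counts, prev, word):
    """Unsmoothed estimate of P(word | prev); 0.0 for unseen pairs."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0


corpus = ["open the door", "close the door", "open the window"]
model = train_bigrams(corpus)
print(probability(model, "the", "door"))    # "door" follows "the" in 2 of 3 cases
print(probability(model, "the", "window"))  # "window" follows "the" in 1 of 3 cases
```

A real model adds smoothing so that unseen word pairs are unlikely rather than impossible, which is what lets a recognizer output sentences that never appeared in the training text.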

Available language models. English, Mandarin, French, Spanish, German, Dutch and more: http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/

Training your own Models

Writing a grammar is trivial.
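
As an illustration of how small a grammar can be, the sketch below generates one in JSGF, the grammar format that pocketsphinx reads. The helper name, the grammar name and the phrases are invented for the example. Every word used must also appear in the pronunciation dictionary you pass with -dict.

```python
def make_jsgf(name, rule, phrases):
    """Return JSGF text with one public rule that accepts any of the phrases."""
    alternatives = " | ".join(phrases)
    return (
        "#JSGF V1.0;\n\n"
        "grammar {0};\n\n"
        "public <{1}> = {2};\n"
    ).format(name, rule, alternatives)


grammar_text = make_jsgf("commands", "command",
                         ["hello machine", "open the door", "stop"])
print(grammar_text)

# To try it, save the text to a file, for example:
#     with open("commands.gram", "w") as handle:
#         handle.write(grammar_text)
```

Assuming your pocketsphinx build accepts the -jsgf option, you would then replace the -lm argument in the commands above with -jsgf commands.gram, so the recognizer only ever outputs one of the listed phrases.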

For a statistical language model (SLM), you can use online tools, or try the sphinxtrain packages.

Programming with Speech Recognition

Processing: use Sphinx4, the Java interface.

Python or C++, command line, Android: use pocketsphinx.
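
One low-tech way to use recognition from Python is to start the command-line recognizer as a subprocess and read what it prints, without any language bindings. The sketch below assumes that pocketsphinx_continuous prints each result on its own line as an utterance number, a colon and the recognized text (for example "000000000: hello world"). Check this against the output of your own build, since the format is an assumption here and the function names are invented.

```python
import re
import subprocess

# Assumed result format: digits, a colon, a space, then the recognized text.
HYPOTHESIS_LINE = re.compile(r"^(\d+): (.*)$")


def extract_hypotheses(lines):
    """Yield the recognized text from each result line, skipping everything else."""
    for line in lines:
        match = HYPOTHESIS_LINE.match(line.rstrip("\r\n"))
        if match and match.group(2).strip():
            yield match.group(2).strip()


def listen(command, cwd=None):
    """Run the recognizer command and yield phrases as they are printed."""
    process = subprocess.Popen(command, cwd=cwd, stdout=subprocess.PIPE,
                               universal_newlines=True)
    for text in extract_hypotheses(process.stdout):
        yield text


# Example with canned output instead of a live microphone:
sample = ["READY....\n", "Listening...\n", "000000000: hello world\n"]
print(list(extract_hypotheses(sample)))
```

With a live recognizer you would loop over listen(...) and react to each phrase as it arrives, for instance by changing a drawing or sending the text on to a speech synthesizer. Results may arrive with some delay if the program buffers its output.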

Text To Speech Synthesis

What is it?


Test them online


Installing Festival


Making a Voice

  • Portraiture?

Activity: Feedback Loop


A Conversation.