Solipsist Development
Proposal
Ultimately, I decided it would be good to use this project to begin exploring the project I intend to work on next quarter. I have been working with voice recognition technologies and Mel Bochner's text 'Serial Art, Systems, Solipsism', developing a device for performance and exchange between human and computer. The device consists of a microphone, a speech-to-text system, and a receipt printer. The system transcribes (and validates) whatever it hears only in terms of the words it knows. As is characteristic of voice recognition, face recognition, and other types of "machine perception", this truly is a solipsistic system ('denying the existence of anything outside the confines of its own mind'). The challenge for me next term is to develop the sound component of this piece as performance/installation.
So, for this project the samples I used were recordings of myself reciting the Bochner text as I (or another) would when interacting with the installation. The composition is broken into two types of material--the actual words recited, and the pauses, gaps, and inhalations of breath between those words. My desire with the piece was to emphasize these pre- and non-verbal utterances (the physical mechanics of breath and voice) while erasing, obfuscating, or overwriting the spoken text. While I think I produced some interesting material in the recordings of the silences (especially the first section of the piece), I am dissatisfied with what I accomplished in my use of the spoken-word part of the text. What I would like to accomplish is a kind of redaction (similar to redacted text in classified documents) where it is still clear that some subterranean or obscured content exists, but that content is impossible to decipher or extract. More on that later I suppose... Well, enough of this, it's time to listen to your pieces!
Technically, I am starting this quarter with speech-recognition and receipt-printer control code implemented in Processing and Java--so in a sense I have demonstrated technical feasibility. Details are in the Code section. The most readily apparent (and mundane) technical challenges for this project are implementing sound input and pre-processing in SuperCollider, and interfacing SuperCollider with the speech recognition library and receipt printer. Since part of this project is an investigation of the strengths and limits of speech recognition technology, I wish to delve more deeply into the mechanisms of speech recognition and the Sphinx-4 library over the course of the term. More substantial involvement with that technology is necessary to illuminate its character and embedded values.
I will update the weekly Progress section as the quarter continues.
Equipment
- Hardware:
- receipt printer
- microphone
- MOTU box
- Mac mini
- desk*
- desk lamp*
- sound-proof commercial glazing (windows)*
- Software:
- see Code section below.
*I do not have these items yet.
Open Questions
- Where else do we do this sort of projection and anthropomorphism? (projecting psychology or attributing intention to non-intelligent systems)
Timeline
Week 1 - 2
Introduction to course and project development.
Week 3 - Proposal - 4/12
- Write this.
- Meet with Juan.
- Find desk and desk lamp for "me vs. the computer" staging of microphone/speech recognition system.
Week 4 - 4/19
Work time.
Week 5 - MILESTONE 1 - 4/26
Working model of each of two tracks:
- Participant speaking to computer.
- Computer/printer speaking to itself (feedback loop). Interpreting printer sounds as speech, or transforming them into speech.
Week 6 - 5/3
Realize that the best approach will combine elements of each of the two tracks above.
Week 7 - MILESTONE 2 - 5/10
- Experiments with the characterization of the system:
- Software agent? agency. towards what goals?
- Interruptions.
- Unexpected responses.
- Basic state modeling (emotional states, psychological states).
- Basic drive modeling (for novelty, entertainment, activity, rest, conversation on certain topics).
- Attention to possible choices of text.
I imagine these two go hand in hand--that the choice of particular texts will lend much of the character to the piece.
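As a starting point for the "basic drive modeling" item above, here is a minimal sketch in Java. Everything in it is an assumption made for illustration: the class name DriveModel, the drive names, the 0.0 to 1.0 scale, and the rule that each drive grows by a fixed amount per tick until the system acts on it. It is only meant to show one simple way the system could pick what it "wants" to do next.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical drive model: each drive has a level in [0, 1] that grows every tick. */
class DriveModel {
    private final Map<String, Double> levels = new LinkedHashMap<>();
    private final Map<String, Double> growth = new LinkedHashMap<>();

    /** Registers a drive with a starting level and a per-tick growth amount. */
    void addDrive(String name, double initial, double growthPerTick) {
        levels.put(name, initial);
        growth.put(name, growthPerTick);
    }

    /** Advances time by one step; every drive grows, capped at 1.0. */
    void tick() {
        for (Map.Entry<String, Double> e : levels.entrySet()) {
            double next = e.getValue() + growth.get(e.getKey());
            e.setValue(Math.min(1.0, next));
        }
    }

    /** Returns the drive with the highest level; ties go to the drive added first. */
    String strongest() {
        String best = null;
        double bestLevel = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : levels.entrySet()) {
            if (e.getValue() > bestLevel) {
                best = e.getKey();
                bestLevel = e.getValue();
            }
        }
        return best;
    }

    /** Resets a drive to zero once the system has acted on it. */
    void satisfy(String name) {
        if (!levels.containsKey(name)) {
            throw new IllegalArgumentException("unknown drive: " + name);
        }
        levels.put(name, 0.0);
    }

    /** Current level of a drive. */
    double level(String name) {
        return levels.get(name);
    }
}
```

In use, the installation loop would call tick() at some regular interval, ask strongest() which drive to act on (interrupting the speaker, printing unprompted, going quiet), and call satisfy() afterwards. Emotional or psychological states could be layered on top by letting the current state change the growth amounts.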
Week 8 - 5/17
Have others experience the system, try it out.
Week 9 - MILESTONE 3 - 5/24
Near-final form, near-final realization.
Viewer interaction tests.
Week 10 - 5/31
Final changes, improvements, last minute blitz.
Presentation - 6/7
Progress
Week 2
get system running again
Week 3
- Got the system running again. The hard-to-find OS X driver for my Airlink101 AC-USBS (Serial Adapter) was actually available from Prolific: http://www.prolific.com.tw/eng/downloads.asp?id=31. I think the Airlink device must have the Prolific PL-2303 USB to I/O Port Controller inside. Finding a driver probably wouldn't be difficult if I bought a more recently manufactured USB-to-serial adapter, if people still manufacture those sorts of things. I need a serial adapter because the Epson receipt printer has some old-school serial connectivity on the back.
- Met with Juan.
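For reference, here is a rough sketch of what goes over that serial line. It assumes the Epson printer speaks the ESC/POS command set and has a paper cutter, neither of which is stated above, and it is not the project's actual printer code. The class name EscPos and its methods are made up for illustration. The sketch only builds the bytes; writing them to the serial port is left to whatever serial library is in use.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper that builds ESC/POS byte sequences for a receipt printer. */
class EscPos {
    /** ESC @ : reset the printer to its power-on state. */
    static byte[] initialize() {
        return new byte[] {0x1B, 0x40};
    }

    /** ASCII text followed by LF, which prints the line and feeds the paper. */
    static byte[] line(String text) {
        byte[] body = text.getBytes(StandardCharsets.US_ASCII);
        byte[] out = Arrays.copyOf(body, body.length + 1);
        out[body.length] = 0x0A;
        return out;
    }

    /** GS V 0 : full cut (assumes the printer has a cutter). */
    static byte[] fullCut() {
        return new byte[] {0x1D, 0x56, 0x00};
    }

    /** Concatenates initialize, each line in order, and a final cut. */
    static byte[] receipt(String... lines) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] init = initialize();
        buf.write(init, 0, init.length);
        for (String text : lines) {
            byte[] encoded = line(text);
            buf.write(encoded, 0, encoded.length);
        }
        byte[] cut = fullCut();
        buf.write(cut, 0, cut.length);
        return buf.toByteArray();
    }
}
```

A call such as EscPos.receipt("serial art", "systems", "solipsism") returns one byte array that can be written to the serial port's output stream in a single call. Characters outside ASCII are replaced with '?' by the encoder, which is probably acceptable for recognized English words.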
Code
- CMU Sphinx-4 Automatic Speech Recognition (ASR) library - http://sourceforge.net/projects/cmusphinx/files/
- Sphinx-4 wrapper for processing: http://svn.roberttwomey.com/processing/libraries/sphinx/
- Example code for Grammar-based recognition: http://svn.roberttwomey.com/processing/sphinxBochner/
- Example code for Statistical Language Model (SLM) based recognition: http://svn.roberttwomey.com/processing/sphinxSLMTest/
- I intend to use SuperCollider as the central software for sound input and processing.
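To make the "solipsistic" behavior described in the proposal concrete, here is a toy sketch of closed-vocabulary transcription. It is not Sphinx-4 code and not how Sphinx-4 works internally (a real recognizer scores audio against acoustic models and a grammar or language model). It compares spelled tokens by edit distance purely to illustrate the effect: whatever comes in, only known words come out. The class name ClosedVocabulary and the example vocabulary are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration: every heard token is forced onto the nearest known word. */
class ClosedVocabulary {
    private final List<String> known; // assumed to be lowercase words

    ClosedVocabulary(List<String> known) {
        if (known.isEmpty()) {
            throw new IllegalArgumentException("vocabulary must not be empty");
        }
        this.known = new ArrayList<>(known);
    }

    /** Levenshtein edit distance between two strings (two-row dynamic programming). */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the known word closest to the heard token; ties go to the earlier word. */
    String nearest(String heard) {
        String token = heard.toLowerCase();
        String best = known.get(0);
        int bestDist = editDistance(token, best);
        for (String word : known) {
            int d = editDistance(token, word);
            if (d < bestDist) {
                best = word;
                bestDist = d;
            }
        }
        return best;
    }

    /** Rewrites every whitespace-separated token as a known word. */
    String transcribe(String utterance) {
        StringBuilder out = new StringBuilder();
        for (String token : utterance.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(nearest(token));
        }
        return out.toString();
    }
}
```

With a vocabulary of "serial", "art", "systems", and "solipsism", the input "cereal heart sistem" comes back as "serial art systems". The system never reports that it failed to understand; it validates the input in its own terms.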
References
- Mel Bochner. "Serial Art, Systems, Solipsism."
- Natalie Jeremijenko.
- "If Things Can Talk, What Do They Say? If We Can Talk To Things, What Do We Say?" http://www.electronicbookreview.com/thread/firstperson/voicechip
- "Dialogue With A Monologue: Voice Chips and the Products of Abstract Speech". http://www.topologicalmedialab.net/xinwei/classes/readings/Jeremijenko/VoiceChips.pdf
- Kelly Dobson - Machine Therapy:
- "explorations of what we interact with when we interact with machines... much more than the machine itself... our sense of self, agency in the interpersonal and political world, and our shared psychological, emotional, cultural, and perceptual approaches to the world." (from abstract http://dspace.mit.edu/handle/1721.1/44329)
- Blendie. sing to the blender to make it run. http://web.media.mit.edu/~monster/blendie/
- Machine Therapy sessions. http://web.media.mit.edu/~monster/machinetherapy/
- Machine Therapy. PhD Dissertation. 2007. http://dspace.mit.edu/handle/1721.1/44329
- William Gibson. "Count Zero". The idea of the "loa", voodoo gods / fragments of AIs set loose at the end of Neuromancer.
- Frances White. Valdrada. 1990.
- Redaction Paintings. Jenny Holzer. 2007.
- White Room #4 / Wittgenstein & my Brother Frank. 2005 William Pope.L.