
Metrix Create:Space + EEG Quadrocopter!

November 17, 2010 // Posted in Gaming, iPhone stuff, Security, Software  |  1 Comment

This past Sunday I was working on an EEG-controlled AR Drone quadrocopter with my friend Andrew when the editor for the Metrix Create:Space blog decided to do an article about us.  You can read the article here.  The gentleman wearing the EEG headset in the article is Ian Gallagher (one of the authors of Firesheep), who happened to stop by and check out what we were working on.

You can find out more about the Neurosky Mindset here.

Here is some video of an early AR Drone test flight.

AR.Drone Test Flight at Metrix Create:Space from Andrew Becherer on Vimeo.

Design Outside the Box – Jesse Schell (DICE 2010)

February 28, 2010 // Posted in Culture, Gaming  |  No Comments

I found this presentation on game design to be rather thought-provoking.  Jesse Schell touches on interesting concepts and compelling statistics about the current state of game design.  He also delves into a rather intriguing view of the future where gameplay is used as a method of large-scale behavioural influence.  It moves rather quickly from one idea to another and is worth watching all the way through.

Google Audio: Searching for the right words

July 6, 2009 // Posted in Audio Software, Culture, Gaming, General, iPhone stuff, Software, Uncategorized  |  No Comments

Back in February I wrote about search engines with speech recognition capabilities. Now Google has gotten into the mix with Google Audio Indexing.  Google Audio Indexing is an extension of their Speech Recognition group.

There are two big questions being addressed here. What information can be extracted from millions of hours of audio and how can that information be applied?


Generally speaking, analyzing recorded speech can reveal the following.

  • the topic(s) being discussed
  • the identities of the speaker(s)
  • the genders of the speakers
  • the emotional character of the speech
  • the amount and locations of speech versus non-speech (e.g. background noise or silence)
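As a rough illustration of the last item, here is a minimal Python sketch that marks which stretches of a recording are loud enough to be speech.  It is a toy built on assumptions: the function name `speech_segments`, the frame size and the energy threshold are all made up for this example, and a real system would need far more than loudness to tell speech from background noise.

```python
def speech_segments(samples, frame_size, threshold):
    """Return (start, end) sample-index pairs for runs of frames
    whose mean energy is at or above `threshold`.

    `samples` is a list of floats; `frame_size` and `threshold`
    are illustrative tuning knobs, not values from any real product.
    """
    segments = []
    start = None  # index where the current loud run began, if any
    for i in range(0, len(samples), frame_size):
        frame = samples[i:i + frame_size]
        energy = sum(s * s for s in frame) / len(frame)
        if energy >= threshold:
            if start is None:
                start = i  # a loud run begins here
        elif start is not None:
            segments.append((start, i))  # the loud run just ended
            start = None
    if start is not None:
        segments.append((start, len(samples)))  # recording ended mid-run
    return segments


# Silence, then a burst of sound, then silence again.
recording = [0.0] * 4 + [0.5] * 4 + [0.0] * 4
print(speech_segments(recording, frame_size=2, threshold=0.01))
```

The total length of the returned segments gives the "amount" of probable speech, and the index pairs give the "locations".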

Speech is not the only data that can be extracted from uploaded video.  Music has tempo, key, lyrics, timbre, instrumentation and much more.  The search for sound effect reference material in web video is also currently limited by titles and keywords manually attached to videos.



Imagine that you’re hearing impaired but would like to watch and understand political speeches posted to YouTube.  By using automated speech recognition in combination with closed captions, any spoken-word video posted to YouTube would always be accessible to you.

Imagine that there is a video in a language not familiar to you that you’d like to understand.  Speech recognition combined with a translation service like BabelFish could help to bridge cultures worldwide.  For people who are blind, this could also be combined with a speech synthesis feature for even greater accessibility.

Imagine that you’re a business that wants to track the emotions of people uploading videos mentioning your product.

Imagine that you’re a Karaoke lover who has stage fright and wants to practice at home.  Extracting lyrics from music videos and automatically adding them as closed captions would be a welcome feature to you.  This might not be possible with “Louie Louie” but would be useful for everything else.

Imagine that you’re a DJ looking for the perfect track to mix.  Being able to search for musical content with video according to key and tempo would be hugely valuable to you.  This could be accomplished by combining MixMeister’s functionality with Google Audio.  MixMeister extracts tempo and key from music libraries.

Imagine that you’re a musician looking to compose new music from sampled clips.  Music search tools would be hugely useful to you as well. is a website which presents videos composed in the same key.  Kutiman is a musician who composes new songs from uploaded video clips.  These are the sorts of projects that could be made using tempo and key detection in Google Audio.  See a sample of Kutiman’s work below.

Imagine that you’re a part of the largest search engine company in the world and want to organize the world’s information and make it universally accessible and useful.  What other features would you add?


With user-generated content becoming the cornerstone of interactive media, tools and methods for parsing vast amounts of data will be essential.  Speech recognition will be a huge part of this.  Google is now working very hard at developing and refining their speech algorithms and is releasing multiple products to support this effort.  Here are a few examples.

Google Audio Indexing – Search through the audio content in web video.  Currently Google Audio Indexing is exclusive to the political channel of YouTube.

Google 411 – Google 411 is a voice-activated search engine that can be dialed from any phone.

Google Mobile – Google Mobile has a built-in speech recognition feature which is much faster than typing a search on a tiny touchscreen.

Google Voice – Google Voice is designed to consolidate phone numbers and also includes transcription services for voice mail.

As evidenced by a recent test by the New York Times, Google has a way to go before their algorithms are perfected.  Nonetheless, they are on the right track and have shown a consistent interest in developing speech recognition.

– Adam Smith-Kipnis

What is a Sound Designer?: Revisiting A 2006 interview

May 29, 2009 // Posted in Culture, Gaming, General  |  No Comments

I recently rediscovered an interview that I gave to an aspiring audio engineer in 2006. Three years later I’ve found that some of my views and opinions have changed and others have stayed the same… Read the rest of this entry »

How to talk dirty and influence gamers: Managing Voice Communication

February 28, 2009 // Posted in Audio Software, Culture, Gaming  |  No Comments

One of the biggest complaints I hear about audio in video games is the amount of homophobic and racist language used in voice chat during online game play.  As a Game Audio Designer, I want to fix this issue.  We Audio Designers put our hearts and souls into developing rich audio experiences, so we need to avoid creating situations where gamers feel compelled to turn down the volume.

A recent post in the blog of our local weekly newspaper, The Stranger, suggested that we [game audio professionals] should “…get around to magically filtering the system’s voice chat,” so as to inhibit the sort of trash talking that goes on.

How can this be addressed?

A few options are available: muting offensive players, hosting “no-foul-language” or “foul language” rooms, compartmentalizing voice communication, enabling community moderation, using speech recognition to censor and having verbal abuse as a feature.  Each of these solutions comes with its own challenges.

1. Manually muting offensive players:

Giving each player the ability to selectively ignore other players makes it so that the annoying player only communicates with people that actually want to listen.  Selecting players to ignore could become tedious if there are too many annoying players.  Also, muting inherently reduces the potential amount of useful gamestate information that can be relayed.  For example, if you mute a teammate, you won’t be able to hear him when he tries to warn you about someone creeping up behind you.

2. Foul Language and No Foul Language Rooms:

Hosting a “no-foul-language” room creates a type of exclusivity that keeps gamers from playing with one another.  Creating a “foul language” specific room might be undesirable to game companies wishing to avoid the perception of being complicit in facilitating morally offensive behavior.  Nonetheless, these are useful ways to manage expectations.

3. Compartmentalize players:

Reducing the number of players that can communicate with each other at any given point in time makes it less likely that an offending player will end up in your chat space, although it doesn’t guarantee that you won’t hear any foul language.  Many games compartmentalize by making it so that players may only communicate with their own team.

4. Community moderation:

Community moderation puts the policing of behavior in the hands of the gaming community rather than the publisher.  eBay has buyer and seller reviews, LinkedIn has recommendations, Craigslist allows people to flag postings, anyone may edit Wikipedia and Xbox Live has gamer reviews.

Community moderation can either be active or passive.  Active community moderation has tangible consequences.  These could be temporary removal of voice chat or banishment from rooms that require a certain percentage of positive reviews.  This could result in a lot of complaints from players who feel unfairly restricted, or could lead to abuse of rating systems by groups of people targeting a single individual for the purpose of hindering their gaming experience.
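To make the "certain percentage of positive reviews" idea concrete, here is a small Python sketch of a room gate.  Everything in it is hypothetical: the function names, the 80% requirement and the `min_reviews` floor are assumptions for illustration, not how Xbox Live or any other service actually works.  The floor is one possible guard against a handful of players ganging up on someone with few reviews.

```python
def positive_share(reviews):
    """Fraction of positive reviews; `reviews` is a list of booleans
    where True means a positive review."""
    return sum(reviews) / len(reviews) if reviews else 1.0


def can_enter(reviews, required_share=0.8, min_reviews=5):
    """Decide whether a player may join a moderated room.

    Players with fewer than `min_reviews` reviews are let in, so that
    a few hostile ratings cannot lock out someone new (an assumed
    policy, chosen only for this sketch).
    """
    if len(reviews) < min_reviews:
        return True
    return positive_share(reviews) >= required_share


print(can_enter([True, True, True, True, False]))    # 4 of 5 positive
print(can_enter([True, True, True, False, False]))   # 3 of 5 positive
print(can_enter([False, False]))                      # too few reviews to judge
```

Even with a floor like this, a large enough group can still drag a rating down, which is the abuse scenario described above.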

Passive community moderation is merely a review or ranking in a gamer card to establish reputation.  This would be useful to people who want to research their teammates and decide whether to play with them based on the opinions of others.

5. Speech Rejection: “Magical filtering” using Speech Recognition

Speech recognition and voice communication have been in games for a while. Back in the day, SOCOM on PS2 had speech recognition enabling AI characters to respond to spoken commands. It would be relatively simple to extend that kind of system with a delay, so that phrases can be analyzed before broadcast, and a keyword-based volume control. The problem is that the resulting lag would make voice communication useless.
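The delay-and-filter idea can be sketched in a few lines of Python.  This is a sketch under heavy assumptions: it pretends a recognizer has already turned each spoken phrase into text, and the class name `DelayedChat`, the helper `gain_for` and the placeholder keyword list are all invented for this example.

```python
from collections import deque

BANNED = {"darn", "heck"}  # placeholder keywords, purely illustrative


def gain_for(transcript, banned=BANNED):
    """Return 0.0 (muted) if any banned keyword appears as a whole
    word in the transcript, otherwise 1.0 (full volume)."""
    words = (w.strip(".,!?") for w in transcript.lower().split())
    return 0.0 if any(w in banned for w in words) else 1.0


class DelayedChat:
    """Hold each phrase back by `delay_chunks` phrases so it can be
    analyzed before anyone else hears it."""

    def __init__(self, delay_chunks=2):
        self.delay_chunks = delay_chunks
        self.pending = deque()

    def push(self, transcript, samples):
        """Queue a new phrase.  Returns the audio samples of the phrase
        that is now old enough to broadcast (scaled by its gain), or
        None while the delay buffer is still filling."""
        self.pending.append((transcript, samples))
        if len(self.pending) <= self.delay_chunks:
            return None
        old_text, old_samples = self.pending.popleft()
        gain = gain_for(old_text)
        return [s * gain for s in old_samples]


chat = DelayedChat(delay_chunks=1)
print(chat.push("watch out behind you", [0.5, -0.5]))  # nothing broadcast yet
print(chat.push("oh heck", [0.2, 0.2]))                # the warning, one phrase late
print(chat.push("clear", [0.1]))                       # the muted phrase
```

Note what happens to the warning in the usage lines: it only reaches teammates after the next phrase has been spoken.  That built-in lateness is exactly the lag problem.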

Speech recognition software capable of running on a game console generally relies on the player pronouncing keywords the same way each time.  People would find ways around these roadblocks by changing the pitch or tempo of their voice, or by publishing the banned words and using different ones.

Additionally, every word that sounds like potentially foul language could get banned. The censors would need to ask themselves if words like cockeyed, titmouse and Uranus should be included.  If this happened, the value of voice communication systems would be diminished and we might see Lenny Bruce-style protests and complaints about censorship, freedom of speech, etc…
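To see how quickly such a filter overreaches, here is a small Python sketch.  It assumes the recognizer hands back text and that the censor matches banned fragments anywhere inside a word, which is only a crude stand-in for sound-alike matching; the function name `naive_flag` and the fragment list are hypothetical.

```python
def naive_flag(word, banned_fragments):
    """Flag a word if any banned fragment appears anywhere inside it.
    Deliberately naive: this is the over-eager behavior being
    illustrated, not a recommended filter."""
    lowered = word.lower()
    return any(fragment in lowered for fragment in banned_fragments)


# Fragments a censor might plausibly ban (assumed for this example).
banned_fragments = ["cock", "tit", "anus"]

# Innocent words from the paragraph above, plus one harmless control.
for word in ["cockeyed", "titmouse", "Uranus", "teammate"]:
    print(word, naive_flag(word, banned_fragments))
```

All three innocent words get flagged, while the control word passes.  Tightening the match to whole words would spare them, but it would also make the filter easier to dodge.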

6. Embracing verbal abuse as a regulated feature

According to Wikipedia:

In Monkey Island, Insult Swordfighting consists of a series of Call and Response exchanges, in which an insult must be countered with a witty retort. Should the responder counter with an appropriate retort, they win the right to call the next insult; fail to respond, and the caller gains an advantage. Win enough of these exchanges, and the duel is won.

A well known insult from The Secret of Monkey Island, in which the insults were written by author Orson Scott Card, is “You fight like a dairy farmer!” to which the correct response would be “How appropriate. You fight like a cow!”

While this is a very clever way to encourage more creative and family-friendly banter, selective playback of pre-recorded dialog isn’t a full substitute for real-time voice communication.

In conclusion:

All gamers have a common interest in participating in virtual spaces. While none of these solutions will single-handedly give the best experience to all gamers, combining them in the right ways can make the majority of gamers quite happy.

Processing power and memory budgets are best spent on creating fun rather than limitations.  To achieve success, game developers should focus on building environments that encourage participation by as many people as possible.

For those who are completely intolerant of the language on Xbox Live or in any other gaming chat room, we can always take the headphones off and play with our friends.

– Adam Smith-Kipnis