Text-to-movie: Storyboard of the future?

February 6, 2010 // Posted in Audio Software, General, Software, Visualization  |  No Comments

Text-to-speech systems have been in development for a very long time.  One that I use frequently for creating placeholder dialog in games is the AT&T Labs Natural Voices text-to-speech demo.  It’s a free service that enables the user to type out any phrase and then download an audio file with that phrase spoken by any number of synthesized accents.  For the most part they sound very realistic!  The only caveat is that some words need to be spelled phonetically to translate well.

So it was with great pleasure that I discovered the website Xtranormal.com. Xtranormal has a web-based procedurally generated storyboard application as well as a downloadable version.  I’ve attached my first text-to-video project as an example.  It’s a parody of Stanley Kubrick’s 2001: A Space Odyssey.

Here is my source.  Note the icons that were dragged and dropped into the script to trigger animations, camera positions, facial expressions and even a sound effect!

What I like about this software is that it makes concepting a film very easy and allows for the rapid expression of ideas.  Video is rendered within the browser as well!

There are a few bugs that they still need to work out, but I’d imagine that they will fix them soon.  If you move the mouse too much while editing in the browser, it will reset your project, and there are some issues with formatting video during the publishing process.  Overall, this is a fun and useful set of tools for kids and professionals alike.

What you see is what you hear: Pulp Fiction remixed

November 1, 2009 // Posted in Audio Software, Culture, Software, Visualization  |  No Comments

In the past two years or so, we’ve been seeing more and more videos where people sample video and audio in combination and sequence them in interesting ways.

This most recent video, however, takes a very new approach.  Using Pulp Fiction as source material, the editor of this video created a six-part montage of sound effects and musical samples from the film to build a new experience.  The brilliance of this is that by taking source material that is familiar to us all, the technique becomes accessible.

Considering that people enjoy making these vid-sequences and their popularity is growing, I’d imagine that we will begin to see apps that make the process of creating these videos fun and easy.

[link from Urlesque]

What kind of music do you spin?

July 16, 2009 // Posted in Audio Software, Culture, General, iPhone stuff, Uncategorized |  1 Comment

For DJs of electronic music, one of the biggest challenges is how to describe the music you play.  The Chickenhed Stylemaker machine is a solution to that problem.  It can also be used for inspiration.  Perhaps Chickenhed Stylemaker is the new Oblique Strategies?

Thanks to Jerry Abstract for the tip!

Billie Tweets

July 9, 2009 // Posted in Audio Software, Culture, Visualization |  No Comments

Billie Tweets is a Twitter-based tribute to Michael Jackson by the open source software developers 9Astronauts.  It combines Michael Jackson’s video of Billie Jean with a karaoke-style scroll of the lyrics, highlighted word by word from recent tweets.  Each word in the song is pulled from a new “tweet” and displayed on the website in time with the music.  It’s a very forward-thinking presentation and worth looking at.

http://billietweets.com/
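
For the curious, here is a rough Python sketch of how one might pair each word of a lyric with a tweet that contains it.  This is only my guess at the general idea, not how 9Astronauts actually built the site.  The function names and the sample tweets are made up for illustration, and a real version would pull live tweets from Twitter rather than a hard-coded list.

```python
import re


def words_in(text):
    """Lowercase a string and split it into plain words."""
    return re.findall(r"[a-z']+", text.lower())


def match_lyrics_to_tweets(lyric_line, tweets):
    """Pair each lyric word with an unused tweet that contains it.

    Returns a list of (word, tweet) tuples; tweet is None when no
    remaining tweet contains the word. Hypothetical helper, not the
    site's real code.
    """
    used = set()
    matches = []
    for word in words_in(lyric_line):
        found = None
        for i, tweet in enumerate(tweets):
            if i not in used and word in words_in(tweet):
                found = i
                used.add(i)
                break
        matches.append((word, tweets[found] if found is not None else None))
    return matches


# Made-up sample data standing in for live tweets.
sample_tweets = ["More coffee please", "She said hello", "It was raining"]
for word, tweet in match_lyrics_to_tweets("she was more", sample_tweets):
    print(word, "<-", tweet)
```

Each tweet is used at most once, so every word on screen comes from a different tweet, which matches the behavior described above.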

– Adam Smith-Kipnis

Google Audio: Searching for the right words

July 6, 2009 // Posted in Audio Software, Culture, Gaming, General, iPhone stuff, Software, Uncategorized |  No Comments

Back in February I wrote about search engines with speech recognition capabilities. Now Google has gotten into the mix with Google Audio Indexing.  Google Audio Indexing is an extension of their Speech Recognition group.

There are two big questions being addressed here. What information can be extracted from millions of hours of audio and how can that information be applied?

Background

Generally speaking, the analysis of recorded speech tries to identify the following:

  • the topic(s) being discussed
  • the identities of the speaker(s)
  • the genders of the speakers
  • the emotional character of the speech
  • the amount and locations of speech versus non-speech (e.g. background noise or silence)
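
To give a feel for the simplest item on that list, here is a minimal Python sketch that separates sound from silence by measuring the energy in short frames of audio.  This is a toy illustration under my own assumptions, not Google's method: the frame size and threshold are arbitrary values, the function names are hypothetical, and a real system would use trained models to tell speech apart from background noise rather than just loud from quiet.

```python
import math


def frame_rms(samples, frame_size):
    """Split samples into fixed-size frames and return each frame's RMS energy."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return energies


def label_frames(samples, frame_size=4, threshold=0.1):
    """Label each frame 'sound' if its RMS energy exceeds the threshold, else 'silence'.

    frame_size and threshold are assumed values chosen for this example.
    """
    return [
        "sound" if energy > threshold else "silence"
        for energy in frame_rms(samples, frame_size)
    ]


# Made-up samples: a silent frame, a loud frame, then a very quiet frame.
samples = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.01] * 4
print(label_frames(samples))
```
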

Speech is not the only data that can be extracted from uploaded video.  Music has tempo, key, lyrics, timbre, instrumentation and much more.  The search for sound effect reference material in web video is also currently limited by titles and keywords manually attached to videos.

Imagination

Imagine that you’re hearing impaired but would like to watch and understand political speeches posted to YouTube.  By using automated speech recognition in combination with closed captions, any spoken-word video posted to YouTube would always be accessible to you.

Imagine that there is a video in a language not familiar to you that you’d like to understand.  Speech recognition combined with a translation service like BabelFish could help to bridge cultures worldwide.  For people who are also blind, this could be combined with a speech synthesis feature for even greater accessibility.

Imagine that you’re a business that wants to track the emotions of people uploading videos mentioning your product.

Imagine that you’re a Karaoke lover who has stage fright and wants to practice at home.  Extracting lyrics from music videos and automatically adding them as closed captions would be a welcome feature to you.  This might not be possible with “Louie Louie” but would be useful for everything else.

Imagine that you’re a DJ looking for the perfect track to mix.  Being able to search for musical content with video according to key and tempo would be hugely valuable to you.  This could be accomplished by combining MixMeister’s functionality with Google Audio.  MixMeister extracts tempo and key from music libraries.

Imagine that you’re a musician looking to compose new music from sampled clips.  Music search tools would be hugely useful to you as well. InBFlat.net is a website which presents videos composed in the same key.  Kutiman is a musician who composes new songs from uploaded video clips.  These are the sorts of projects that could be made using tempo and key detection in Google Audio.  See a sample of Kutiman’s work below.

Imagine that you’re a part of the largest search engine company in the world and want to organize the world’s information and make it universally accessible and useful.  What other features would you add?

Realization

With user-generated content becoming the cornerstone of interactive media, tools and methods for parsing vast amounts of data will be essential.  Speech recognition will be a huge part of this.  Google is now working very hard at developing and refining their speech algorithms and is releasing multiple products to support this effort.  Here are a few examples.

Google Audio Indexing – Search through the audio content in web video.  Currently Google Audio Indexing is exclusive to the political channel of YouTube.

Google 411 – Google 411 is a voice-activated search engine that can be dialed from any phone.

Google Mobile – Google Mobile has a built-in speech recognition feature which is much faster than typing a search on a tiny touchscreen.

Google Voice – Google Voice is designed to consolidate phone numbers and also includes transcription services for voice mail.
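
To make the idea of searching through the audio content of video a little more concrete, here is a small Python sketch.  It assumes that speech recognition has already produced a transcript with a timestamp for every word, and it builds a simple lookup from each word to the places where it was spoken.  The data format, video names and function names are all my own invention for illustration; I have no knowledge of how Google Audio Indexing is actually implemented.

```python
from collections import defaultdict


def build_index(transcripts):
    """Map each lowercase word to a list of (video_id, seconds) positions.

    transcripts is assumed to be a dict of video_id -> list of
    (seconds, word) pairs, as a speech recognizer might produce.
    """
    index = defaultdict(list)
    for video_id, timed_words in transcripts.items():
        for seconds, word in timed_words:
            index[word.lower()].append((video_id, seconds))
    return index


def search(index, query):
    """Return every (video_id, seconds) position where the query word was spoken."""
    return index.get(query.lower(), [])


# Made-up transcripts standing in for recognizer output.
transcripts = {
    "speech_a": [(0.0, "Good"), (0.4, "evening"), (12.5, "economy")],
    "speech_b": [(3.2, "Economy")],
}
index = build_index(transcripts)
print(search(index, "economy"))
```

With timestamps in the index, a player could jump straight to the moment a word is spoken instead of just returning the whole video.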

As evidenced by a recent test by the New York Times, Google has a way to go before their algorithms are perfected.  Nonetheless, they are on the right track and have shown a consistent interest in developing speech recognition.

– Adam Smith-Kipnis
