Writing software to tell where sound comes from (directional listening) [closed]

I've been curious about this for some time, so I thought posting here might get me some good answers.

What I know so far:

Humans can use their two ears to perceive not only what sounds "sound like" but also where they are coming from. Pitch is the note we hear, and something like the human voice has various pitches overlaid (it is not a pure tone).

What I'd like to know:

How do I go about writing a program that can determine where a sound is coming from? Theoretically I'd need two microphones; I would record the sound arriving at each microphone and store the audio so that a split second of it can be put into a tuple like [streamA, streamB].
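
For instance, here's a minimal sketch of the capture step as I picture it, assuming the third-party Python library "sounddevice" and a two-channel input device (both assumptions on my part):

    # Capture a synchronized stereo pair; assumes the "sounddevice" library
    # and an input device with at least two channels.
    import sounddevice as sd

    FS = 44100          # sample rate in Hz
    DURATION = 0.05     # a "split second" of audio: 50 ms per snapshot

    # rec() returns a (frames, channels) NumPy array.
    frames = sd.rec(int(FS * DURATION), samplerate=FS, channels=2)
    sd.wait()           # block until the recording is finished

    # Column 0 is one mic, column 1 the other: the [streamA, streamB] pair.
    stream_a, stream_b = frames[:, 0], frames[:, 1]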

I suspect there is a formulaic / mathematical way to calculate, based on the audio, where a sound comes from. I also suspect it's possible to take the stream data and train a learner (give it sample audio and tell it where the audio came from), then have it classify incoming audio that way.
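
For the mathematical route, the simplest thing I can think of is cross-correlating the two channels to estimate the inter-microphone time delay. A self-contained sketch on synthetic data (the signal and the 12-sample delay are made up so it runs standalone):

    # Estimate the delay between two channels by cross-correlation.
    import numpy as np

    FS = 44100                       # sample rate in Hz
    true_delay = 12                  # simulate one mic lagging by 12 samples
    sig = np.random.randn(4096)      # broadband test signal
    stream_a = sig
    stream_b = np.roll(sig, true_delay)

    # The peak of the full cross-correlation gives the lag in samples.
    corr = np.correlate(stream_b, stream_a, mode="full")
    lag = np.argmax(corr) - (len(stream_a) - 1)
    print(f"estimated delay: {lag} samples = {lag / FS * 1e3:.3f} ms")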

What's the best way to go about this, and are there good resources from which I can learn more about the subject?

EDIT:

Example:

          front

left (mic) x ======== x (mic) right

          back

                            x (sound source should return "back" or "right" or "back right")

I want to write a program that can return front/back and left/right for most of the sounds it hears. From what I understand, it should be simple to set up two microphones pointed "forward." Based on that, I'm trying to figure out a way to triangulate the sound and work out where the source is in relation to the mics.
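
For example, here's a toy labeling function for what I mean, under the assumption that one mic pair sits on the left-right axis and (since, as far as I can tell, a single pair can't distinguish front from back, the "cone of confusion") a hypothetical second pair sits on the front-back axis:

    # Hypothetical helper: map two estimated inter-mic delays (seconds) to a
    # coarse label. Sign conventions are assumptions: positive delay_lr means
    # the sound hit the left mic first; positive delay_fb, the front mic first.
    def coarse_direction(delay_lr: float, delay_fb: float,
                         eps: float = 1e-4) -> str:
        parts = []
        if delay_fb > eps:
            parts.append("front")
        elif delay_fb < -eps:
            parts.append("back")
        if delay_lr > eps:
            parts.append("left")
        elif delay_lr < -eps:
            parts.append("right")
        return " ".join(parts) or "center"

    print(coarse_direction(delay_lr=-0.0005, delay_fb=-0.0003))  # "back right"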

Asked Dec 29 '11 by Sam

1 Answer

If you look into research papers on phased microphone arrays, specifically those used for underwater direction finding (a big area of submarine research during the Cold War: where is the motor sound coming from, so we can aim the torpedoes?), you'll find the technology and math required to locate a sound source given two or more microphone inputs.

It's non-trivial, though, and not something that can be covered in full here, so you aren't going to find an easy code snippet or library that does exactly what you need.

The main issue is eliminating echoes and shadows. A simplistic method is to start with a single tone, filter out everything but that tone, and then measure the phase difference of that tone between the two microphones. The phase difference corresponds to a time difference of arrival, which tells you a great deal about the direction of the source.
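
As a sketch of that idea (the numbers are invented, and this assumes a far-field source and a mic spacing under half a wavelength so the phase doesn't wrap ambiguously):

    # Turn a measured phase difference of an isolated tone into a bearing:
    # theta = asin(c * dt / d), with dt = dphi / (2 * pi * f).
    import math

    c = 343.0     # speed of sound in air, m/s
    f = 1000.0    # frequency of the isolated tone, Hz
    d = 0.15      # mic spacing, m (must be under half a wavelength, ~0.17 m here)
    dphi = 0.9    # measured phase difference, radians (example value)

    dt = dphi / (2 * math.pi * f)    # phase difference to time difference
    theta = math.asin(c * dt / d)    # angle off the array's broadside
    print(f"bearing: {math.degrees(theta):.1f} degrees")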

You can then choose whether you want to deal with echo and multipath issues (many of which can be eliminated by removing all but the strongest tone) or move on to correlating sounds that consist of something other than a single tone: a person talking, or glass breaking, for instance. Start small and easy, and expand from there.
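
When you do get to broadband sounds, the usual tool in the literature is generalized cross-correlation with the phase transform (GCC-PHAT), which whitens the spectrum so the correlation peak stays sharp under reverberation. A rough sketch, not production code:

    # GCC-PHAT: whitened cross-correlation for broadband delay estimation.
    import numpy as np

    def gcc_phat(a, b, fs):
        """Estimated delay of `b` relative to `a`, in seconds."""
        n = len(a) + len(b)                 # zero-pad to avoid circular wrap
        A = np.fft.rfft(a, n=n)
        B = np.fft.rfft(b, n=n)
        R = B * np.conj(A)
        R /= np.abs(R) + 1e-15              # PHAT weighting: keep phase only
        cc = np.fft.irfft(R, n=n)
        cc = np.concatenate((cc[-(n // 2):], cc[:n // 2 + 1]))  # center lag 0
        return (np.argmax(np.abs(cc)) - n // 2) / fs

    # Quick self-test with a synthetic 25-sample delay.
    fs = 16000
    sig = np.random.randn(8192)
    print(gcc_phat(sig, np.roll(sig, 25), fs) * fs)   # ~25.0 samples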

Answered Sep 23 '22 by Adam Davis