
Detect tap with pyaudio from live mic

How would I use pyaudio to detect a sudden tapping noise from a live microphone?

asked Nov 11 '10 23:11 by a sandwhich





1 Answer

One way I've done it:

  • read a block of samples at a time, say 0.05 seconds worth
  • compute the RMS amplitude of the block (square root of the average of the squares of the individual samples)
  • if the block's RMS amplitude is greater than a threshold, it's a "noisy block"; otherwise it's a "quiet block"
  • a sudden tap would be a quiet block followed by a small number of noisy blocks followed by a quiet block
  • if you never get a quiet block, your threshold is too low
  • if you never get a noisy block, your threshold is too high
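
The steps above can be tried offline, without a microphone, on synthetic samples. This is only a sketch: the helper names (`block_rms`, `find_taps`), the 0.010 threshold, and the three-block limit are assumptions chosen for illustration, and nothing here is part of PyAudio.

```python
import math

def block_rms(samples):
    """RMS amplitude of a block of samples normalized to +/- 1.0."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def find_taps(blocks, threshold=0.010, max_tap_blocks=3):
    """Return indices of the quiet blocks that end a short noisy run.

    A tap is a run of 1..max_tap_blocks noisy blocks that is preceded
    and followed by a quiet block.
    """
    taps = []
    noisy_run = 0
    seen_quiet = False  # a run only counts if it started after a quiet block
    for i, block in enumerate(blocks):
        if block_rms(block) > threshold:
            noisy_run += 1
        else:
            if seen_quiet and 1 <= noisy_run <= max_tap_blocks:
                taps.append(i)
            noisy_run = 0
            seen_quiet = True
    return taps

quiet = [0.001] * 100
loud = [0.5, -0.5] * 50
# quiet, a one-block tap, quiet, a long noise (not a tap), quiet
blocks = [quiet, loud, quiet] + [loud] * 5 + [quiet]
taps = find_taps(blocks)
print(taps)
```

Only the short noisy run is reported; the five-block run exceeds `max_tap_blocks` and is ignored.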

My application was recording "interesting" noises unattended, so it would record as long as there were noisy blocks. It would multiply the threshold by 1.1 if there was a 15-second noisy period ("covering its ears") and multiply the threshold by 0.9 if there was a 15-minute quiet period ("listening harder"). Your application will have different needs.
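
A minimal sketch of that threshold adaptation, assuming 0.05-second blocks. The class name `AdaptiveThreshold` and its fields are hypothetical, and resetting the counter after each adjustment is a choice made here rather than something the description above specifies.

```python
class AdaptiveThreshold:
    """Raise the threshold after long noise, lower it after long quiet."""

    def __init__(self, threshold=0.010, block_time=0.05,
                 noisy_limit_s=15.0, quiet_limit_s=15.0 * 60):
        self.threshold = threshold
        # Limits expressed as a number of consecutive blocks.
        self.noisy_limit = round(noisy_limit_s / block_time)
        self.quiet_limit = round(quiet_limit_s / block_time)
        self.noisy_count = 0
        self.quiet_count = 0

    def update(self, rms):
        """Feed one block's RMS; return True if the block was noisy."""
        noisy = rms > self.threshold
        if noisy:
            self.quiet_count = 0
            self.noisy_count += 1
            if self.noisy_count >= self.noisy_limit:
                self.threshold *= 1.1   # "covering its ears"
                self.noisy_count = 0
        else:
            self.noisy_count = 0
            self.quiet_count += 1
            if self.quiet_count >= self.quiet_limit:
                self.threshold *= 0.9   # "listening harder"
                self.quiet_count = 0
        return noisy

adapt = AdaptiveThreshold()
for _ in range(300):        # 300 blocks * 0.05 s = 15 s of steady noise
    adapt.update(0.5)
print(adapt.threshold)
```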

Also, I just noticed some comments in my code regarding observed RMS values. On the built-in mic of a MacBook Pro, with audio data normalized to a +/- 1.0 range and input volume set to max, some data points:

  • 0.003-0.006 (-50dB to -44dB) an obnoxiously loud central heating fan in my house
  • 0.010-0.40 (-40dB to -8dB) typing on the same laptop
  • 0.10 (-20dB) snapping fingers softly at a distance of 1 foot
  • 0.60 (-4.4dB) snapping fingers loudly at 1 foot
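
The decibel figures in parentheses follow from the RMS values as 20·log10(rms), relative to a full-scale amplitude of 1.0. A short check, where `rms_to_db` is a hypothetical helper name:

```python
import math

def rms_to_db(rms):
    """Convert an RMS amplitude (full scale = 1.0) to decibels."""
    return 20.0 * math.log10(rms)

# The RMS values quoted in the list above.
for rms in (0.003, 0.006, 0.010, 0.10, 0.40, 0.60):
    print("%.3f -> %.1f dB" % (rms, rms_to_db(rms)))
```

Thinking in dB can make it easier to choose a threshold, since a fixed dB margin above the background noise corresponds to a fixed ratio of RMS values.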

Update: here's a sample to get you started.

```python
#!/usr/bin/python

# open a microphone in pyAudio and listen for taps

import pyaudio
import struct
import math

INITIAL_TAP_THRESHOLD = 0.010
FORMAT = pyaudio.paInt16
SHORT_NORMALIZE = (1.0/32768.0)
CHANNELS = 2
RATE = 44100
INPUT_BLOCK_TIME = 0.05
INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
# if we get this many noisy blocks in a row, increase the threshold
OVERSENSITIVE = 15.0/INPUT_BLOCK_TIME
# if we get this many quiet blocks in a row, decrease the threshold
UNDERSENSITIVE = 120.0/INPUT_BLOCK_TIME
# if the noise was longer than this many blocks, it's not a 'tap'
MAX_TAP_BLOCKS = 0.15/INPUT_BLOCK_TIME

def get_rms( block ):
    # RMS amplitude is defined as the square root of the
    # mean over time of the square of the amplitude.
    # so we need to convert this string of bytes into
    # a string of 16-bit samples...

    # we will get one short out for each
    # two chars in the string.
    # (integer division, so count stays an int on Python 3)
    count = len(block)//2
    format = "%dh"%(count)
    shorts = struct.unpack( format, block )

    # iterate over the block.
    sum_squares = 0.0
    for sample in shorts:
        # sample is a signed short in +/- 32768.
        # normalize it to 1.0
        n = sample * SHORT_NORMALIZE
        sum_squares += n*n

    return math.sqrt( sum_squares / count )

class TapTester(object):
    def __init__(self):
        self.pa = pyaudio.PyAudio()
        self.stream = self.open_mic_stream()
        self.tap_threshold = INITIAL_TAP_THRESHOLD
        self.noisycount = MAX_TAP_BLOCKS+1
        self.quietcount = 0
        self.errorcount = 0

    def stop(self):
        self.stream.close()

    def find_input_device(self):
        device_index = None
        for i in range( self.pa.get_device_count() ):
            devinfo = self.pa.get_device_info_by_index(i)
            print( "Device %d: %s"%(i,devinfo["name"]) )

            for keyword in ["mic","input"]:
                if keyword in devinfo["name"].lower():
                    print( "Found an input: device %d - %s"%(i,devinfo["name"]) )
                    device_index = i
                    return device_index

        if device_index is None:
            print( "No preferred input found; using default input device." )

        return device_index

    def open_mic_stream( self ):
        device_index = self.find_input_device()

        stream = self.pa.open(   format = FORMAT,
                                 channels = CHANNELS,
                                 rate = RATE,
                                 input = True,
                                 input_device_index = device_index,
                                 frames_per_buffer = INPUT_FRAMES_PER_BLOCK)

        return stream

    def tapDetected(self):
        print("Tap!")

    def listen(self):
        try:
            block = self.stream.read(INPUT_FRAMES_PER_BLOCK)
        except IOError as e:
            # dammit.
            self.errorcount += 1
            print( "(%d) Error recording: %s"%(self.errorcount,e) )
            self.noisycount = 1
            return

        amplitude = get_rms( block )
        if amplitude > self.tap_threshold:
            # noisy block
            self.quietcount = 0
            self.noisycount += 1
            if self.noisycount > OVERSENSITIVE:
                # turn down the sensitivity
                self.tap_threshold *= 1.1
        else:
            # quiet block.

            if 1 <= self.noisycount <= MAX_TAP_BLOCKS:
                self.tapDetected()
            self.noisycount = 0
            self.quietcount += 1
            if self.quietcount > UNDERSENSITIVE:
                # turn up the sensitivity
                self.tap_threshold *= 0.9

if __name__ == "__main__":
    tt = TapTester()

    for i in range(1000):
        tt.listen()
```
answered Sep 20 '22 14:09 by Russell Borogove
