get playing wav audio level as output

Tags: python, raspberry-pi, audio

I want to make a speaking mouth that moves or lights up when a playing WAV file produces sound. So I need to detect when the WAV file is speaking and when it is in a silence between words. Currently I'm using a pygame script that I found:

import pygame
pygame.mixer.init()
pygame.mixer.music.load("my_sentence.wav")
pygame.mixer.music.play()
while pygame.mixer.music.get_busy() == True:
    continue

I guess I could add a check inside the while loop that reads the sound's output level and sends it to one of the GPIO outputs, but I don't know how to do that.

Any help would be much appreciated


asked Jun 01 '15 06:06

cor

1 Answer

You'll need to inspect the WAV file to work out when the voice is present. The simplest way to do this is to look for loud and quiet periods. Because sound travels as waves, the values in the WAV file won't change very much when it's quiet, and they'll change a lot when it's loud.

One way of estimating loudness is the variance. By definition, it is E[(X - mu)^2], which can be written as average((X - average(X))^2). Here, X is the value of the signal at a given point (the values stored in the WAV file, called sample in the code). If it's changing a lot, the variance will be large.
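As a minimal sketch of that formula (not part of the original answer), the variance of a short list of samples can be computed directly. The function name `variance` and the sample values below are made up for illustration.

```python
def variance(samples):
    """Return average((X - average(X))^2) for a list of numbers."""
    mean = sum(samples) / float(len(samples))
    return sum((x - mean) ** 2 for x in samples) / float(len(samples))

# Hypothetical 16-bit sample values: a quiet stretch and a loud stretch.
quiet = [3, -2, 1, -4, 2, 0]
loud = [9000, -8500, 7000, -9500, 8800, -7800]

quiet_variance = variance(quiet)
loud_variance = variance(loud)

# The loud stretch swings far from its mean, so its variance is much larger.
print(quiet_variance < loud_variance)
```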

This would let you calculate the loudness of an entire file. However, you want to track how loud the file is at any given time, which means you need a form of moving average. An easy way to get this is with a first-order low-pass filter.
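To illustrate what a first-order low-pass filter does, here is a small sketch that smooths a step input. The name `low_pass` and the value of `alpha` are assumptions chosen for the example; for real audio, alpha would typically be derived from a time constant and the sample rate, roughly 1 / (time_constant * samplerate).

```python
def low_pass(values, alpha):
    """Exponentially weighted moving average of `values`.

    An alpha near 0 reacts slowly but smoothly; an alpha near 1
    follows the input closely.
    """
    smoothed = []
    state = 0.0
    for value in values:
        # Blend the previous state with the new input value.
        state = (1 - alpha) * state + alpha * value
        smoothed.append(state)
    return smoothed

# A step from 0 to 1: the filtered output rises gradually towards 1.
step = [0.0] * 5 + [1.0] * 20
output = low_pass(step, alpha=0.2)
```

Applying the same filter to (sample - mean) ** 2 instead of the raw sample gives a running estimate of the variance.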

I haven't tested the code below so it's extremely unlikely to work, but it should get you started. It loads the WAV file, uses low-pass filters to track the mean and variance, and works out when the variance goes above and below a certain threshold. Then, while playing the WAV file it keeps track of the time since it started playing, and prints out whether the WAV file is loud or quiet.

Here's what you might still need to do:

Fix all my deliberate mistakes in the code
Add something useful to react to the loud/quiet changes
Change the threshold and time_constant to get good results with your audio
Add some hysteresis (a variable threshold) to stop the light flickering

I hope this helps!

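import wave
import struct
import time

import pygame

# The mixer must be initialised before pygame.mixer.music can be used.
pygame.mixer.init()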

def get_loud_times(wav_path, threshold=10000, time_constant=0.1):
    '''Work out which parts of a WAV file are loud.
        - threshold: the variance threshold that is considered loud
        - time_constant: the approximate reaction time in seconds'''

    wav = wave.open(wav_path, 'r')
    length = wav.getnframes()
    samplerate = wav.getframerate()

    assert wav.getnchannels() == 1, 'wav must be mono'
    assert wav.getsampwidth() == 2, 'wav must be 16-bit'

    # Our result will be a list of (time, is_loud) giving the times
    # when the audio switches from loud to quiet and back.
    is_loud = False
    result = [(0., is_loud)]

    # The following values track the mean and variance of the signal.
    # When the variance is large, the audio is loud.
    mean = 0
    variance = 0

    # If alpha is small, mean and variance change slower but are less noisy.
    alpha = 1 / (time_constant * float(samplerate))

    for i in range(length):
        sample_time = float(i) / samplerate
        # struct.unpack returns a tuple, so take its only element
        sample = struct.unpack('<h', wav.readframes(1))[0]

        # mean is the average value of sample
        mean = (1-alpha) * mean + alpha * sample

        # variance is the average value of (sample - mean) ** 2
        variance = (1-alpha) * variance + alpha * (sample - mean) ** 2

        # check if we're loud, and record the time if this changes
        new_is_loud = variance > threshold
        if is_loud != new_is_loud:
            result.append((sample_time, new_is_loud))
        is_loud = new_is_loud

    return result

def play_sentence(wav_path):
    loud_times = get_loud_times(wav_path)
    pygame.mixer.music.load(wav_path)

    start_time = time.time()
    pygame.mixer.music.play()

    for (t, is_loud) in loud_times:
        # wait until the time described by this entry
        sleep_time = start_time + t - time.time()
        if sleep_time > 0:
            time.sleep(sleep_time)

        # do whatever
        print('loud' if is_loud else 'quiet')
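The hysteresis mentioned in the to-do list could be sketched roughly as follows: use a higher threshold to switch to loud and a lower one to switch back to quiet. The function `loud_changes`, both threshold values, and the list of variance values are hypothetical, chosen only to show the effect.

```python
def loud_changes(variances, on_threshold, off_threshold):
    """Return (index, is_loud) pairs for each change of state.

    The state becomes loud above on_threshold and only returns to
    quiet once the variance is no longer above off_threshold.
    """
    is_loud = False
    changes = [(0, is_loud)]
    for i, variance in enumerate(variances):
        if is_loud:
            new_is_loud = variance > off_threshold
        else:
            new_is_loud = variance > on_threshold
        if new_is_loud != is_loud:
            changes.append((i, new_is_loud))
            is_loud = new_is_loud
    return changes

# Made-up variance values that hover around 10000 for a while.
variances = [0, 5000, 12000, 9000, 11000, 8000, 3000, 9000, 500]

single = loud_changes(variances, 10000, 10000)  # one threshold: flickers
double = loud_changes(variances, 10000, 4000)   # two thresholds: steadier
print(len(single), len(double))
```

With a single threshold the state flips four times on this data. With the two thresholds it flips twice, which is the behaviour you would want for a light or a moving mouth.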

answered Oct 16 '22 07:10

Rodrigo Queiro
