Stereo to Mono wav in Python

Tags: python, signal-processing, audio

I am loading a WAV file with SciPy's wavfile.read(), which gives me the sample rate and the audio data.

I know that this audio data, if stereo, is stored as a multi-dimensional array such as

audiodata[[left right]
          [left right]
          ...
          [left right]]

I am then using this function to create a new array of mono audio data by taking (left + right) / 2:

def stereoToMono(audiodata):
    newaudiodata = []

    for i in range(len(audiodata)):
        d = (audiodata[i][0] + audiodata[i][1])/2
        newaudiodata.append(d)

    return np.array(newaudiodata, dtype='int16')

and then I write this to a file using

wavfile.write(newfilename, sr, newaudiodata)

This produces a mono WAV file, but the sound is dirty and has clicks etc. throughout.

What am I doing wrong?

asked May 22 '15 15:05

user2145312

1 Answer

First, what is the datatype of audiodata? I assume it's some fixed-width integer format (wavfile.read() returns int16 for 16-bit PCM files), so the sum left + right overflows and wraps around, which you hear as clicks. If you convert it to a floating point format before processing, it will work fine:

audiodata = audiodata.astype(float)
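
To see the overflow in isolation, here is a minimal sketch. It assumes the samples are int16, and the values 30000 are made up for illustration; they are not taken from your file.

```python
import numpy as np

# Two loud int16 samples, as one stereo frame might contain (hypothetical values).
left = np.array([30000], dtype=np.int16)
right = np.array([30000], dtype=np.int16)

# Adding in int16 wraps around: 60000 does not fit in [-32768, 32767],
# so the sum becomes 60000 - 65536 = -5536 before the division.
wrapped = (left + right) / 2
print(wrapped)   # [-2768.]

# Converting to float first keeps the true sum.
correct = (left.astype(float) + right.astype(float)) / 2
print(correct)   # [30000.]
```

A loud positive peak turning into a negative sample like this is the kind of jump that would be heard as a click.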

Second, don't write your Python code element by element; vectorize it:

d = (audiodata[:,0] + audiodata[:,1]) / 2

or even better

d = audiodata.sum(axis=1) / 2

This will be vastly faster than the element-by-element loop you wrote.
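
Putting both points together, a complete replacement for your function could look like the sketch below. The name stereo_to_mono and the rounding step are my own choices, and it assumes audiodata is an int16 array of shape (n_samples, 2).

```python
import numpy as np

def stereo_to_mono(audiodata):
    """Average the two channels of an (n_samples, 2) int16 array."""
    # Work in float so that left + right cannot overflow the int16 range.
    mono = audiodata.astype(float).sum(axis=1) / 2
    # The mean of two int16 values always fits in int16, so the cast back is safe.
    return np.round(mono).astype(np.int16)

# Small check with made-up samples, including the extremes of the int16 range.
stereo = np.array([[30000, 30000], [-32768, -32768], [100, 200]], dtype=np.int16)
mono = stereo_to_mono(stereo)
print(mono)   # [ 30000 -32768    150]
```

With the names from your question, the result can then be written as before: wavfile.write(newfilename, sr, stereo_to_mono(audiodata)).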

answered Oct 07 '22 00:10

cfh
