
Python change pitch of wav file [closed]

I need a Python library that can change the pitch of a WAV file without my having to process the raw audio data myself. I spent a couple of hours searching, but found only some odd raw-data-processing snippets and a video demonstrating real-time pitch shifting, with no source code.

asked May 14 '17 12:05 by Daniel




2 Answers

I recommend trying Librosa's pitch shift function: https://librosa.github.io/librosa/generated/librosa.effects.pitch_shift.html

import librosa
# y is a NumPy array of samples; sr=16000 resamples the audio to 16 kHz on load
y, sr = librosa.load('your_file.wav', sr=16000)
# Shift up by 4 half steps; sr and n_steps are keyword-only in recent librosa releases
y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=4)
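
To get a feel for what n_steps means: assuming the usual 12 half steps per octave (which, as far as I know, is librosa's default for bins_per_octave), a shift of n half steps multiplies every frequency by 2^(n/12). The helper below is a hypothetical illustration, not part of librosa.

```python
def semitone_ratio(n_steps, steps_per_octave=12):
    """Frequency ratio for a shift of n_steps half steps.

    Hypothetical helper for illustration; not a librosa function.
    """
    return 2.0 ** (n_steps / steps_per_octave)


ratio = semitone_ratio(4)
print(round(ratio, 4))        # 1.2599
print(round(440 * ratio, 1))  # a 440 Hz tone lands near 554.4 Hz
```

A shift of 12 half steps gives a ratio of exactly 2, i.e. one octave up; negative values shift down.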
answered Oct 08 '22 02:10 by Nic Scozzaro


Since a WAV file is basically raw audio data, you won't be able to change the pitch without "raw audio processing".

Here is what you could do. You will need the wave (standard library) and numpy modules.

import wave
import numpy as np

Open the files.

wr = wave.open('input.wav', 'r')
# Set the parameters for the output file.
par = list(wr.getparams())
par[3] = 0  # The number of samples will be set by writeframes.
par = tuple(par)
ww = wave.open('pitch1.wav', 'w')
ww.setparams(par)

The sound should be processed in small fractions of a second, which cuts down on reverb. Try setting fr to 1; you'll hear annoying echoes.

fr = 20
sz = wr.getframerate()//fr  # Read and process 1/fr second at a time.
# A larger number for fr means less reverb.
c = wr.getnframes()//sz  # Number of whole chunks in the file.
shift = 100//fr  # Each FFT bin is about fr Hz wide, so this shifts by 100 Hz.
for num in range(c):

Read the data and split it into left and right channels (assuming a stereo WAV file with 16-bit samples).

    da = np.frombuffer(wr.readframes(sz), dtype=np.int16)  # np.fromstring is deprecated for binary data
    left, right = da[0::2], da[1::2]  # left and right channel

Extract the frequencies using the Fast Fourier Transform built into numpy.

    lf, rf = np.fft.rfft(left), np.fft.rfft(right)

Roll the array to increase the pitch.

    lf, rf = np.roll(lf, shift), np.roll(rf, shift)

The highest frequencies roll over to the lowest ones. That's not what we want, so zero them.

    lf[0:shift], rf[0:shift] = 0, 0

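Now use the inverse Fourier transform to convert the signal back into the time domain (amplitude samples).

    # Pass n so that chunks with an odd number of samples keep their length.
    nl, nr = np.fft.irfft(lf, n=len(left)), np.fft.irfft(rf, n=len(right))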

Combine the two channels.

    ns = np.column_stack((nl, nr)).ravel().astype(np.int16)

Write the output data.

    ww.writeframes(ns.tobytes())  # tostring() was removed in newer NumPy versions

Close the files when all frames are processed.

wr.close()
ww.close()
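
As a quick sanity check of the roll-and-zero steps above, here is a minimal sketch that applies them to a single synthetic mono chunk instead of a file. The 44100 Hz sample rate, the 440 Hz test tone, and the name shift_chunk are assumptions made for illustration only.

```python
import numpy as np


def shift_chunk(samples, shift_bins):
    """Shift the spectrum of one mono chunk up by shift_bins FFT bins.

    Hypothetical helper mirroring the per-channel steps in the loop above.
    """
    spectrum = np.fft.rfft(samples)
    spectrum = np.roll(spectrum, shift_bins)
    spectrum[:shift_bins] = 0  # discard the bins that wrapped around
    # n keeps the output the same length as the input, even for odd lengths
    return np.fft.irfft(spectrum, n=len(samples))


framerate = 44100          # assumed sample rate
fr = 20                    # process 1/20 second at a time
sz = framerate // fr       # 2205 samples per chunk
t = np.arange(sz) / framerate
tone = np.sin(2 * np.pi * 440 * t)   # 440 Hz test tone

shifted = shift_chunk(tone, 100 // fr)

bin_width = framerate / sz           # 20.0 Hz per FFT bin
peak_hz = np.argmax(np.abs(np.fft.rfft(shifted))) * bin_width
print(peak_hz)  # 540.0
```

The peak moves from 440 Hz to 540 Hz. Every frequency is moved by the same 100 Hz offset, rather than being scaled by a common ratio.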
answered Oct 08 '22 01:10 by Roland Smith