Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Programmatically increase the pitch of an array of audio samples

Hello kind people of the audio computing world,

I have an array of samples that respresent a recording. Let us say that it is 5 seconds at 44100Hz. How would I play this back at an increased pitch? And is it possible to increase and decrease the pitch dynamically? Like have the pitch slowly increase to double the speed and then back down.

In other words I want to take a recording and play it back as if it is being 'scratched' by a d.j.

Pseudocode is always welcomed. I will be writing this up in C.

Thanks,


EDIT 1

Allow me to clarify my intentions. I want to keep the playback at 44100Hz and so therefore I need to manipulate the samples before playback. This is also because I would want to mix the audio that has an increased pitch with audio that is running at a normal rate.

Expressed in another way, maybe I need to shrink the audio over the same number of samples somehow? That way when it is played back it will sound faster?


EDIT 2

Also, I would like to do this myself. No libraries please (unless you feel I could pick through the code and find something interesting).


EDIT 3

A sample piece of code written in C that takes 2 arguments (array of samples and pitch factor) and then returns an array of the new audio would be fantastic!


PS I've started a bounty on this not because I don't think the answers already given aren't valid. I just thought it would be good to get more feedback on the subject.



AWARD OF BOUNTY

Honestly I wish I could distribute the bounty over several different answers as they were quite a few that I thought were super helpful. Special shoutout to Daniel for passing me some code and AShelly and Hotpaw2 for putting in such detailed responses.

Ultimately though I used an answer from another SO question referenced by datageist and so the award goes to him.

Thanks again everyone!

like image 764
Eric Brotto Avatar asked Mar 01 '11 15:03

Eric Brotto


People also ask

How do you change the pitch of your voice in Python?

rfft(left), np. fft. rfft(right) 8 9 # Roll the array to increase the pitch.

Does resampling affect pitch?

The Resample function allows you to change the length, tempo and pitch of an event. If you resample to a higher sample rate, the event gets longer and the audio plays back at a slower speed with a lower pitch.


2 Answers

Take a look at the "Elephant" paper in Nosredna's answer to this (very similar) SO question: How do you do bicubic (or other non-linear) interpolation of re-sampled audio data?

Sample implementations are provided starting on page 37, and for reference, AShelly's answer corresponds to linear interpolation (on that same page). With a little tweaking, any of the other formulas in the paper could be plugged into that framework.

For evaluating the quality of a given interpolation method (and understanding the potential problems with using "cheaper" schemes), take a look at this page:

http://www.discodsp.com/highlife/aliasing/

For more theory than you probably want to deal with (with source code), this is a good reference as well:

https://ccrma.stanford.edu/~jos/resample/

like image 70
datageist Avatar answered Sep 16 '22 21:09

datageist


One way is to keep a floating point index into the original wave, and mix interpolated samples into the output wave.

//Simulate scratching of `inwave`: 
// `rate` is the speedup/slowdown factor. 
// result mixed into `outwave`
// "Sample" is a typedef for the raw audio type.
void ScratchMix(Sample* outwave, Sample* inwave, float rate)
{
   float index = 0;
   while (index < inputLen)
   {
      int i = (int)index;          
      float frac = index-i;      //will be between 0 and 1
      Sample s1 = inwave[i];
      Sample s2 = inwave[i+1];
      *outwave++ += s1 + (s2-s1)*frac;   //do clipping here if needed
      index+=rate;
   }

}

If you want to change rate on the fly, you can do that too.

If this creates noisy artifacts when rate > 1, try replacing *outwave++ += s1 + (s2-s1)*frac; with this technique (from this question)

*outwave++ = InterpolateHermite4pt3oX(inwave+i-1,frac);

where

public static float InterpolateHermite4pt3oX(Sample* x, float t)
{
    float c0 = x[1];
    float c1 = .5F * (x[2] - x[0]);
    float c2 = x[0] - (2.5F * x[1]) + (2 * x[2]) - (.5F * x[3]);
    float c3 = (.5F * (x[3] - x[0])) + (1.5F * (x[1] - x[2]));
    return (((((c3 * t) + c2) * t) + c1) * t) + c0;
}

Example of using the linear interpolation technique on "Windows Startup.wav" with a factor of 1.1. The original is on top, the sped-up version is on the bottom:

It may not be mathematically perfect, but it sounds like it should, and ought to work fine for the OP's needs..

like image 42
AShelly Avatar answered Sep 18 '22 21:09

AShelly