Analyze "whistle" sound for pitch/note

3 Answers

As many others have already said, FFT is the way to go here. I've written a little example in Java using FFT code from http://www.cs.princeton.edu/introcs/97data/. In order to run it, you will need the Complex class from that page also (see the source for the exact URL).

The code reads in a file, goes window-wise over it and does an FFT on each window. For each FFT it looks for the maximum coefficient and outputs the corresponding frequency. This does work very well for clean signals like a sine wave, but for an actual whistle sound you probably have to add more. I've tested with a few files with whistling I created myself (using the integrated mic of my laptop computer), the code does get the idea of what's going on, but in order to get actual notes more needs to be done.

1) You might need some more intelligent window technique. What my code uses now is a simple rectangular window. Since the FFT assumes that the input singal can be periodically continued, additional frequencies are detected when the first and the last sample in the window don't match. This is known as spectral leakage ( http://en.wikipedia.org/wiki/Spectral_leakage ), usually one uses a window that down-weights samples at the beginning and the end of the window ( http://en.wikipedia.org/wiki/Window_function ). Although the leakage shouldn't cause the wrong frequency to be detected as the maximum, using a window will increase the detection quality.

2) To match the frequencies to actual notes, you could use an array containing the frequencies (like 440 Hz for a') and then look for the frequency that's closest to the one that has been identified. However, if the whistling is off standard tuning, this won't work any more. Given that the whistling is still correct but only tuned differently (like a guitar or other musical instrument can be tuned differently and still sound "good", as long as the tuning is done consistently for all strings), you could still find notes by looking at the ratios of the identified frequencies. You can read http://en.wikipedia.org/wiki/Pitch_%28music%29 as a starting point on that. This is also interesting: http://en.wikipedia.org/wiki/Piano_key_frequencies

3) Moreover it might be interesting to detect the points in time when each individual tone starts and stops. This could be added as a pre-processing step. You could do an FFT for each individual note then. However, if the whistler doesn't stop but just bends between notes, this would not be that easy.

Definitely have a look at the libraries the others suggested. I don't know any of them, but maybe they contain already functionality for doing what I've described above.

And now to the code. Please let me know what worked for you, I find this topic pretty interesting.

Edit: I updated the code to include overlapping and a simple mapper from frequencies to notes. It works only for "tuned" whistlers though, as mentioned above.

package de.ahans.playground;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

public class FftMaxFrequency {

    // taken from http://www.cs.princeton.edu/introcs/97data/FFT.java.html
    // (first hit in Google for "java fft"
    // needs Complex class from http://www.cs.princeton.edu/introcs/97data/Complex.java
    public static Complex[] fft(Complex[] x) {
        int N = x.length;

        // base case
        if (N == 1) return new Complex[] { x[0] };

        // radix 2 Cooley-Tukey FFT
        if (N % 2 != 0) { throw new RuntimeException("N is not a power of 2"); }

        // fft of even terms
        Complex[] even = new Complex[N/2];
        for (int k = 0; k < N/2; k++) {
            even[k] = x[2*k];
        }
        Complex[] q = fft(even);

        // fft of odd terms
        Complex[] odd  = even;  // reuse the array
        for (int k = 0; k < N/2; k++) {
            odd[k] = x[2*k + 1];
        }
        Complex[] r = fft(odd);

        // combine
        Complex[] y = new Complex[N];
        for (int k = 0; k < N/2; k++) {
            double kth = -2 * k * Math.PI / N;
            Complex wk = new Complex(Math.cos(kth), Math.sin(kth));
            y[k]       = q[k].plus(wk.times(r[k]));
            y[k + N/2] = q[k].minus(wk.times(r[k]));
        }
        return y;
    }   

    static class AudioReader {
        private AudioFormat audioFormat;

        public AudioReader() {}

        public double[] readAudioData(File file) throws UnsupportedAudioFileException, IOException {
            AudioInputStream in = AudioSystem.getAudioInputStream(file);
            audioFormat = in.getFormat();
            int depth = audioFormat.getSampleSizeInBits();
            long length = in.getFrameLength();
            if (audioFormat.isBigEndian()) {
                throw new UnsupportedAudioFileException("big endian not supported");
            }
            if (audioFormat.getChannels() != 1) {
                throw new UnsupportedAudioFileException("only 1 channel supported");
            }

            byte[] tmp = new byte[(int) length];
            byte[] samples = null;      
            int bytesPerSample = depth/8;
            int bytesRead;
            while (-1 != (bytesRead = in.read(tmp))) {
                if (samples == null) {
                    samples = Arrays.copyOf(tmp, bytesRead);
                } else {
                    int oldLen = samples.length;
                    samples = Arrays.copyOf(samples, oldLen + bytesRead);
                    for (int i = 0; i < bytesRead; i++) samples[oldLen+i] = tmp[i];
                }
            }

            double[] data = new double[samples.length/bytesPerSample];

            for (int i = 0; i < samples.length-bytesPerSample; i += bytesPerSample) {
                int sample = 0;
                for (int j = 0; j < bytesPerSample; j++) sample += samples[i+j] << j*8;
                data[i/bytesPerSample] = (double) sample / Math.pow(2, depth);
            }

            return data;
        }

        public AudioFormat getAudioFormat() {
            return audioFormat;
        }
    }

    public class FrequencyNoteMapper {
        private final String[] NOTE_NAMES = new String[] {
                "A", "Bb", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"
            };
        private final double[] FREQUENCIES;
        private final double a = 440;
        private final int TOTAL_OCTAVES = 6;
        private final int START_OCTAVE = -1; // relative to A

        public FrequencyNoteMapper() {
            FREQUENCIES = new double[TOTAL_OCTAVES*12];
            int j = 0;
            for (int octave = START_OCTAVE; octave < START_OCTAVE+TOTAL_OCTAVES; octave++) {
                for (int note = 0; note < 12; note++) {
                    int i = octave*12+note;
                    FREQUENCIES[j++] = a * Math.pow(2, (double)i / 12.0);
                }
            }
        }

        public String findMatch(double frequency) {
            if (frequency == 0)
                return "none";

            double minDistance = Double.MAX_VALUE;
            int bestIdx = -1;

            for (int i = 0; i < FREQUENCIES.length; i++) {
                if (Math.abs(FREQUENCIES[i] - frequency) < minDistance) {
                    minDistance = Math.abs(FREQUENCIES[i] - frequency);
                    bestIdx = i;
                }
            }

            int octave = bestIdx / 12;
            int note = bestIdx % 12;

            return NOTE_NAMES[note] + octave;
        }
    }

    public void run (File file) throws UnsupportedAudioFileException, IOException {
        FrequencyNoteMapper mapper = new FrequencyNoteMapper();

        // size of window for FFT
        int N = 4096;
        int overlap = 1024;
        AudioReader reader = new AudioReader(); 
        double[] data = reader.readAudioData(file);

        // sample rate is needed to calculate actual frequencies
        float rate = reader.getAudioFormat().getSampleRate();

        // go over the samples window-wise
        for (int offset = 0; offset < data.length-N; offset += (N-overlap)) {
            // for each window calculate the FFT
            Complex[] x = new Complex[N];
            for (int i = 0; i < N; i++) x[i] = new Complex(data[offset+i], 0);
            Complex[] result = fft(x);

            // find index of maximum coefficient
            double max = -1;
            int maxIdx = 0;
            for (int i = result.length/2; i >= 0; i--) {
                if (result[i].abs() > max) {
                    max = result[i].abs();
                    maxIdx = i;
                }
            }
            // calculate the frequency of that coefficient
            double peakFrequency = (double)maxIdx*rate/(double)N;
            // and get the time of the start and end position of the current window
            double windowBegin = offset/rate;
            double windowEnd = (offset+(N-overlap))/rate;
            System.out.printf("%f s to %f s:\t%f Hz -- %s\n", windowBegin, windowEnd, peakFrequency, mapper.findMatch(peakFrequency));
        }       
    }

    public static void main(String[] args) throws UnsupportedAudioFileException, IOException {
        new FftMaxFrequency().run(new File("/home/axr/tmp/entchen.wav"));
    }
}

answered Oct 11 '22 16:10

ahans

i think this open-source platform suits you http://code.google.com/p/musicg-sound-api/

answered Oct 11 '22 17:10

Frank Frank

Well, you could always use fftw to perform the Fast Fourier Transform. It's a very well respected framework. Once you've got an FFT of your signal you can analyze the resultant array for peaks. A simple histogram style analysis should give you the frequencies with the greatest volume. Then you just have to compare those frequencies to the frequencies that correspond with different pitches.

answered Oct 11 '22 18:10

Benson

Related questions
                            
                                How to write FLAC files in java
                            
                                iOS: Displaying currently played track info in LockScreen?
                            
                                play sound file in PyQt
                            
                                HTML5 audio won't play on mobile devices
                            
                                Python NumPy - FFT and Inverse FFT?
                            
                                Python audiolab install, unable to install (or find) libsndfile on Mac OSX
                            
                                How can I get DFT/FFT output frequencies in Hertz?
                            
                                Create valid wav file header for streams in memory
                            
                                In Swift, how do I let other apps continue to play audio when my app is open?
                            
                                iPhone App - WAV sound files don't play
                            
                                Handling changed audio device event in c#
                            
                                play two sounds simultaneously in Windows 8 Metro App (C#/XAML)
                            
                                Backwards and cross-browser compatible audio playing
                            
                                MediaRecorder: Record from Multiple Microphones
                            
                                I have a .mp3 file. How can I seperate the human voice from the rest of the sound in C?
                            
                                Lower or fade sound in android
                            
                                Play a sound file from internet when a button is clicked [duplicate]
                            
                                How to set audio equalizer in Android
                            
                                How can I produce real-time audio output from music made with Music21?
                            
                                vuejs with HTML5 audio property

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Analyze "whistle" sound for pitch/note

Tags:

audio

voice

Boris

People also ask

3 Answers

ahans

Frank Frank

Benson

Recent Activity

Donate For Us