Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

uwp AudioGraph audio processing

I am working on a winodws IoT project that controls a led strip based on an audio input. Now do I have some code that gets the audio in and writes it to a buffer with the AudioGraph API, but I don't know how I can process the audio to some usefull data.

my code so far:

private async void MainPage_Loaded(object sender, RoutedEventArgs eventArgs)
{
        try
        {
            // Initialize the led strip
            //await this.pixelStrip.Begin();

            sampleAggregator.FftCalculated += new EventHandler<FftEventArgs>(FftCalculated);
            sampleAggregator.PerformFFT = true;

            // Create graph
            AudioGraphSettings settings = new AudioGraphSettings(AudioRenderCategory.Media);
            settings.DesiredSamplesPerQuantum = fftLength;
            settings.DesiredRenderDeviceAudioProcessing = Windows.Media.AudioProcessing.Default;
            settings.QuantumSizeSelectionMode = QuantumSizeSelectionMode.ClosestToDesired;

            CreateAudioGraphResult result = await AudioGraph.CreateAsync(settings);
            if (result.Status != AudioGraphCreationStatus.Success)
            {
                // Cannot create graph
                return;
            }
            graph = result.Graph;

            // Create a device input node using the default audio input device
            CreateAudioDeviceInputNodeResult deviceInputNodeResult = await graph.CreateDeviceInputNodeAsync(MediaCategory.Other);

            if (deviceInputNodeResult.Status != AudioDeviceNodeCreationStatus.Success)
            {
                return;
            }

            deviceInputNode = deviceInputNodeResult.DeviceInputNode;

            frameOutputNode = graph.CreateFrameOutputNode();
            frameOutputNode.Start();
            graph.QuantumProcessed += AudioGraph_QuantumProcessed;

            // Because we are using lowest latency setting, we need to handle device disconnection errors
            graph.UnrecoverableErrorOccurred += Graph_UnrecoverableErrorOccurred;

            graph.Start();
        }
        catch (Exception e)
        {
            Debug.WriteLine(e.ToString());
        }
    }

    private void AudioGraph_QuantumProcessed(AudioGraph sender, object args)
    {
        AudioFrame frame = frameOutputNode.GetFrame();
        ProcessFrameOutput(frame);
    }

    unsafe private void ProcessFrameOutput(AudioFrame frame)
    {
        using (AudioBuffer buffer = frame.LockBuffer(AudioBufferAccessMode.Write))
        using (IMemoryBufferReference reference = buffer.CreateReference())
        {
            byte* dataInBytes;
            uint capacityInBytes;
            float* dataInFloat;

            // Get the buffer from the AudioFrame
            ((IMemoryBufferByteAccess)reference).GetBuffer(out dataInBytes, out capacityInBytes);

            dataInFloat = (float*)dataInBytes;


        }
    }

So I end with my buffer as a float. But how can i change this to usefull data that makes it possible to create something like a spectrum analyzer?

Edit:

Maybe I have to make this question less specific for the audiograph. I use an API to get my audio input. The data I get from the API is a byte* and I can cast it to a float* How can I change it from the byte* or the float* to some other data that I can use to create some color codes.

I thaught about doing some FFT analysis on the float* to get 164 leds*3(rgb) = 492 bins. And process this data further to get some values between 0 and 255.

So how can I process this float* or byte* to get this usefull data? Or how do I start?

like image 770
A. Bouthoorn Avatar asked Jan 10 '16 14:01

A. Bouthoorn


1 Answers

That data is interleaved IEEE float, so it alternates channel data as you step through the array, and the data range for each sample is from -1 to 1. For example, a mono signal only has one channel, so it won't interleave data at all; but a stereo signal has two channels of audio, and so:

dataInFloat[0]

is the first sample of data from the left channel and

dataInFloat[1]

is the first sample of data from the right channel. Then,

dataInFloat[2]

is the second sample of data from the left channel. and they just keep going back and forth. All the other data you'll end up caring about is in windows.media.mediaproperties.audioencodingproperties

So, just knowing this, you (essentially) can immediately get the overall volume of the signal directly from this data by looking at the absolute value of each sample. You'll definitely want to average it out over some amount of time. You can even just attach EQ effects to different nodes, and make seperate Low, Mids, and High analyzer nodes and never even get into FFT stuff. BUT WHAT FUN IS THAT? (it's actually still fun)

And then, yeah, to get your complex harmonic data and make a truly sweet visualizer, you want to do an FFT on it. People enjoy using AForge for learning scenarios, like yours. See Sources/Imaging/ComplexImage.cs for usage, Sources/Math/FourierTransform.cs for implemenation

Then you can easily get your classic bin data and do the classic music visualizer stuff or get more creative or whatever! technology is awesome!

like image 76
andymule Avatar answered Oct 06 '22 16:10

andymule