So I'm currently trying to take audio from an external microphone (which is on a robot in this case) and stream it into Unity to be played in a scene. I'm fairly certain this audio is encoded in the mp3 format with a sample rate of 16000 Hz and a bitrate of 192 kbps.
I'm able to get this audio as a byte array (that seems to always be little-endian) in Unity, and I'd like to convert it to a float array with each value ranging from -1.0f to +1.0f so that I can use AudioClip.SetData to play it in the Unity scene. My problem is that I'm so far unable to do this.
My first attempt was based on this StackOverflow answer, "create AudioClip from byte[]", which uses the following function for conversion:
private float[] ConvertByteToFloat(byte[] array) {
    float[] floatArr = new float[array.Length / 4];
    for (int i = 0; i < floatArr.Length; i++) {
        if (BitConverter.IsLittleEndian) {
            Array.Reverse(array, i * 4, 4);
        }
        floatArr[i] = BitConverter.ToSingle(array, i * 4) / 0x80000000;
    }
    return floatArr;
}
I then invoked this like so:
scaledAudio = ConvertByteToFloat(audioData);
AudioClip audioClip = AudioClip.Create("RobotAudio", scaledAudio.Length, 1, 16000, false);
audioClip.SetData(scaledAudio, 0);
AudioSource.PlayClipAtPoint(audioClip, robot.transform.position);
But the result was a lot of static, and on logging some outputs, I realized that I was getting a bunch of NaNs...
I read somewhere that mp3 audio could be extracted using the BitConverter.ToInt16() function, so I changed the ConvertByteToFloat function accordingly, like so:
private float[] ConvertByteToFloat16(byte[] array) {
    float[] floatArr = new float[array.Length / 2];
    for (int i = 0; i < floatArr.Length; i++) {
        if (BitConverter.IsLittleEndian) {
            Array.Reverse(array, i * 2, 2);
        }
        floatArr[i] = (float) (BitConverter.ToInt16(array, i * 2) / 32767f);
    }
    return floatArr;
}
[Note: the result is divided by 32767f because I read this is the maximum value that can occur and I want to scale it down to between -1.0f and 1.0f]
The numbers from this look much more promising. They are indeed all between -1.0f and 1.0f. But when I attempt to play the audio with Unity, all I hear is static.
The issue almost definitely seems to be in the conversion of the byte[] to the float[], but I could've made a mistake in setting the data or the player for the AudioClip or the AudioSource.
Any help/suggestions are MUCH appreciated!
[Additional resources: The byte[] that I got into Unity comes from here: https://github.com/ros-drivers/audio_common/blob/master/audio_capture/src/audio_capture.cpp There is a related script that takes the data encoded by this capture program and plays it (https://github.com/ros-drivers/audio_common/blob/master/audio_play/src/audio_play.cpp). This works just fine, so if I could replicate the decoding functionality of the audio_play script in that second link, it seems like I'll be good to go!]
In the file you linked, the setup code shows that the audio is encoded as mp3 by default (line numbers on the left):
21 >> // Need to encoding or publish raw wave data
22 >> ros::param::param<std::string>("~format", _format, "mp3");
This means you have two options.
First, you could change the output format of the C++ node so that it publishes raw wave data instead:
21 >> // Need to encoding or publish raw wave data
22 >> ros::param::param<std::string>("~format", _format, "wave");
Reading through the code, if you change the third argument on line 22 to "wave", the node will publish raw, uncompressed wave data instead of mp3, which will not require decoding in Unity. This requires re-compiling your C++ code, if that is an option for you. (Because the value is read as a ROS parameter and "mp3" is only its default, setting the ~format parameter to "wave" when launching the node should have the same effect without re-compiling.) Please note that the audio data in wave format will be slightly larger in memory than mp3.
See lines 98-109 of the audio_capture.cpp file for where it checks for wave or mp3 formatting.
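Once the node publishes raw wave data, the byte[] to float[] conversion in Unity becomes simple. Here is a minimal sketch, assuming the samples arrive as 16-bit signed little-endian PCM (this sample format is my assumption and is not confirmed above, so check it against the node's configuration). The class and method names are hypothetical. Note that it does not reverse any bytes: the data is already little-endian, so each sample is assembled low byte first.

```csharp
using System;

public static class PcmConverter
{
    // Hypothetical helper: converts 16-bit signed little-endian PCM bytes
    // into floats in the range [-1.0, 1.0), as expected by AudioClip.SetData.
    // ASSUMPTION: the incoming data is S16LE; adjust if the node uses another format.
    public static float[] Pcm16LeToFloat(byte[] pcm)
    {
        if (pcm == null) throw new ArgumentNullException(nameof(pcm));

        int sampleCount = pcm.Length / 2; // a trailing odd byte, if any, is ignored
        var samples = new float[sampleCount];

        for (int i = 0; i < sampleCount; i++)
        {
            // Assemble the sample explicitly as low byte, then high byte,
            // so the result does not depend on the host's endianness.
            short value = unchecked((short)(pcm[2 * i] | (pcm[2 * i + 1] << 8)));

            // Divide by 32768 so that short.MinValue maps exactly to -1.0f.
            samples[i] = value / 32768f;
        }

        return samples;
    }
}
```

In Unity, the returned array could then be passed to AudioClip.Create and AudioClip.SetData as in the question, with the sample rate and channel count matching what the node actually publishes.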
Otherwise, you could try decoding the mp3 data in Unity. This will most likely involve an mp3 library (the first one I found was MP3Sharp). There is also a Unity asset called uAudio that claims to do realtime mp3 compression/decompression; this might be simpler than using a generic mp3 decoder, as it has already been designed for Unity.
I would not recommend writing your own mp3 decoder unless it is just for the sake of a challenge or for learning purposes.
All ideas aside, my first attempt would be to re-compile your C++ library with the argument set to "wave", as stated above!
I hope this helps :)