
How to read Ogg or MP3 audio files in a TensorFlow graph?

I've seen image decoders like tf.image.decode_png in TensorFlow, but how about reading audio files (WAV, Ogg, MP3, etc.)? Is it possible without TFRecord?

E.g. something like this:

filename_queue = tf.train.string_input_producer(['my-audio.ogg'])
reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)
my_audio = tf.audio.decode_ogg(value)
Carl Thomé asked Dec 12 '16 21:12


1 Answer

Yes, the package tensorflow.contrib.ffmpeg provides audio decoders. To use it, you need to install ffmpeg first.

Example:

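import tensorflow as tf
# TensorFlow 1.x only; decodes to a float32 tensor of shape [samples, channels].
audio_binary = tf.read_file('song.mp3')
waveform = tf.contrib.ffmpeg.decode_audio(audio_binary, file_format='mp3', samples_per_second=44100, channel_count=2)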
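On TensorFlow 2.x, where tf.contrib is no longer available, one possible workaround is to decode the bytes in plain Python and wrap that step with tf.py_function. The sketch below is only an illustration of that pattern, not the method from this answer: it handles 16-bit PCM WAV through the standard-library wave module as a stand-in for an MP3 or Ogg decoder, and the names decode_wav_bytes and load_wav_tensor are hypothetical.

```python
import array
import io
import sys
import wave


def decode_wav_bytes(data):
    """Decode 16-bit PCM WAV bytes into (samples, sample_rate).

    samples is a list of frames; each frame holds one float in
    [-1.0, 1.0) per channel.
    """
    with wave.open(io.BytesIO(data), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    pcm = array.array("h")
    pcm.frombytes(raw)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV stores samples as little-endian

    # Group the interleaved samples into frames and scale them to floats.
    samples = [
        [pcm[i + c] / 32768.0 for c in range(channels)]
        for i in range(0, len(pcm), channels)
    ]
    return samples, sample_rate


def load_wav_tensor(path):
    """Hypothetical helper: decode the file at `path` via tf.py_function."""
    import tensorflow as tf  # imported lazily; assumes TensorFlow 2.x

    def _decode(contents):
        # `contents` arrives as an eager string tensor; .numpy() yields bytes.
        samples, _ = decode_wav_bytes(contents.numpy())
        return tf.constant(samples, dtype=tf.float32)

    return tf.py_function(_decode, [tf.io.read_file(path)], tf.float32)
```

To handle MP3 or Ogg, the body of decode_wav_bytes would need to be replaced with a call into an external decoding library; load_wav_tensor could stay the same as long as the decoder still returns frames of floats.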
sygi answered Sep 20 '22 21:09