
Import wav file in Tensorflow 2

Using Python 3.7 and TensorFlow 2.0, I'm having a hard time reading wav files from the UrbanSounds dataset. This question and answer are helpful because they explain that the input has to be a string tensor, but the decoder seems to have trouble getting past the initial metadata encoded in the file and reaching the real data. Do I have to preprocess the string before I can load it as a float32 tensor? I already had to preprocess the data by converting it from 24-bit wav to 16-bit wav, so the data-input pipeline is turning out to be much more cumbersome than I expected. The required bit-depth conversion is particularly frustrating. Here's what I'm trying so far:

import tensorflow as tf  # this is TensorFlow 2.0

path_to_wav_file = '/mnt/d/Code/UrbanSounds/audio/fold1/101415-3-0-2.wav'
# Turn the wav file into a string tensor
input_data = tf.io.read_file(path_to_wav_file)
# Convert the string tensor to a float32 tensor
audio, sampling_rate = tf.audio.decode_wav(input_data)

This is the error I get at the last step:

2019-10-08 20:56:09.124254: W tensorflow/core/framework/op_kernel.cc:1546] OP_REQUIRES failed at decode_wav_op.cc:55 : Invalid argument: Header mismatch: Expected fmt  but found junk
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow/python/ops/gen_audio_ops.py", line 216, in decode_wav
    _six.raise_from(_core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Header mismatch: Expected fmt  but found junk [Op:DecodeWav]

And here is the beginning of that string tensor. I'm no expert on wav files, but I think the part after "fmt" is where the actual audio data starts. Before that I think it's all metadata about the file.

input_data.numpy()[:70]
b'RIFFhb\x05\x00WAVEjunk\x1c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00fmt \x10\x00\x00\x00\x01\x00\x01\x00D\xac\x00\x00\x88X\x01\x00\x02\x00'
Alex asked Oct 09 '19 01:10


1 Answer

It seems like your error comes from TensorFlow expecting the "fmt " chunk to come right after the RIFF/WAVE header, whereas your file has a "junk" chunk in that position.

The TensorFlow code that does this processing can be found here: https://github.com/tensorflow/tensorflow/blob/c9cd1784bf287543d89593ca1432170cdbf694de/tensorflow/core/lib/wav/wav_io.cc#L225

There's also an open issue, still awaiting a response from the TensorFlow team, that covers roughly the same error: https://github.com/tensorflow/tensorflow/issues/32382

Other libraries just skip the junk chunk, so the file loads fine with them.
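If you would rather stay with tf.audio.decode_wav, one possible workaround is to do the skipping yourself before decoding. The following is only a minimal sketch, assuming the file is a well-formed RIFF/WAVE container like the one in the question (a 28-byte "junk" chunk followed by "fmt " and "data"). The helper name strip_extra_chunks is made up for illustration and is not part of TensorFlow.

```python
import struct


def strip_extra_chunks(wav_bytes, keep=(b"fmt ", b"data")):
    """Return a copy of a RIFF/WAVE byte string containing only the chunks in `keep`.

    Hypothetical helper for illustration; assumes little-endian RIFF with
    8-byte chunk headers (4-byte id + 4-byte size).
    """
    if wav_bytes[:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    kept = []
    pos = 12  # first chunk starts after 'RIFF' + size + 'WAVE'
    while pos + 8 <= len(wav_bytes):
        chunk_id = wav_bytes[pos:pos + 4]
        (size,) = struct.unpack("<I", wav_bytes[pos + 4:pos + 8])
        if chunk_id in keep:
            body = wav_bytes[pos + 8:pos + 8 + size]
            pad = b"\x00" * (len(body) & 1)  # chunks are padded to an even length
            kept.append(chunk_id + struct.pack("<I", len(body)) + body + pad)
        pos += 8 + size + (size & 1)  # jump to the next chunk header

    payload = b"WAVE" + b"".join(kept)
    # Rewrite the RIFF size field so it matches the new, shorter payload.
    return b"RIFF" + struct.pack("<I", len(payload)) + payload
```

With the variables from the question, something like tf.audio.decode_wav(strip_extra_chunks(input_data.numpy())) should then see "fmt " right after the header. I have not tested this against the UrbanSounds files themselves, so treat it as a starting point.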

devnull answered Oct 09 '22 13:10