I am trying to do sound analysis on a file in Python. The sound file comes from a high-definition show and is very large (2.39 GB). Whenever I try to open it using the wave module, I get the following error:
wave.Error: unknown format: 65534
I got this file by converting a .ts file into a .wav file. I used the same method on standard definition shows and it worked just fine. I am able to do some analysis using
data = np.memmap(audioclip,dtype='h',mode='r')
However, this does not give accurate results: it thinks the audio clip is 3 hours long when it is only one hour long. Any help would be appreciated. I have found similar issues with different error codes, but those have not been much help with this one. Thank you so much!
Disclaimer: I don't really know that much about Python.
I googled wave.py and found the following link: http://www.opensource.apple.com/source/python/python-3/python/Lib/wave.py
If you look for the function named _read_fmt_chunk you'll see the source of the error message. In short, the wave module only supports WAVE_FORMAT_PCM. Format 65534 is a format called WAVE_FORMAT_EXTENSIBLE, defined by Microsoft and used for multi-channel wave files. It's pretty uncommon.
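To check what the header actually declares, a minimal sketch like the one below reads the 'fmt ' chunk directly with struct. The function name read_fmt_chunk and the returned dictionary keys are illustrative choices, not part of the wave module, and the sketch assumes a plain RIFF/WAVE file whose chunk sizes fit in 32 bits.

```python
import struct

WAVE_FORMAT_PCM = 0x0001
WAVE_FORMAT_IEEE_FLOAT = 0x0003
WAVE_FORMAT_EXTENSIBLE = 0xFFFE  # 65534, the value in the error message


def read_fmt_chunk(path):
    """Return the fields of the 'fmt ' chunk of a RIFF/WAVE file as a dict.

    Hypothetical helper for illustration; not part of the wave module.
    """
    with open(path, "rb") as f:
        riff, _size, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                body = f.read(chunk_size)
                break
            # Skip any other chunk; chunks are padded to an even length.
            f.seek(chunk_size + (chunk_size & 1), 1)

    tag, channels, rate, _byte_rate, block_align, bits = struct.unpack(
        "<HHIIHH", body[:16]
    )
    info = {
        "format_tag": tag,
        "channels": channels,
        "sample_rate": rate,
        "block_align": block_align,
        "bits_per_sample": bits,
    }
    if tag == WAVE_FORMAT_EXTENSIBLE and len(body) >= 40:
        # Extension layout: cbSize (2 bytes), valid bits (2), channel mask (4),
        # then a 16-byte SubFormat GUID whose first two bytes hold the
        # underlying format tag (1 for PCM, 3 for IEEE float).
        info["sub_format_tag"] = struct.unpack("<H", body[24:26])[0]
    return info
```

Calling read_fmt_chunk on the problem file should show a format_tag of 65534, and the sub_format_tag and channels values tell you whether the samples are plain PCM and how many channels are interleaved.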
I think you have a few options:
1. Use a different library that supports WAVE_FORMAT_EXTENSIBLE.
2. Modify wave.py to support WAVE_FORMAT_EXTENSIBLE. Assuming the SubFormat field is PCM or IEEE_FLOAT, that wouldn't be a big deal; from that perspective it just increases the size of the header. If it is another SubFormat, then you'll need to run an appropriate decoder before you can even get to PCM.
3. Convert the WAVE_FORMAT_EXTENSIBLE .wav file to one which is not. sox may be able to handle this.
Regarding the second part of your question: it's not clear how you are determining the duration of the file, but if you make incorrect assumptions about the number of channels, that could be throwing you off.
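To illustrate how a wrong channel count skews the duration, here is a small sketch. The helper names duration_seconds and as_frames are made up for this example, and the figures used below (6 channels, 48000 Hz, 16-bit samples) are assumptions chosen for illustration, not values read from your file.

```python
import numpy as np


def duration_seconds(data_bytes, channels, sample_rate, bytes_per_sample):
    """Duration of interleaved PCM audio, given the size of the sample data."""
    frames = data_bytes // (channels * bytes_per_sample)
    return frames / sample_rate


def as_frames(samples, channels):
    """View a flat array of interleaved samples as rows of one frame each."""
    usable = len(samples) - (len(samples) % channels)  # drop a trailing partial frame
    return samples[:usable].reshape(-1, channels)


# Assumed example: one hour of 16-bit audio with 6 channels at 48000 Hz.
one_hour_bytes = 3600 * 48000 * 6 * 2

print(duration_seconds(one_hour_bytes, 6, 48000, 2))  # correct channel count
print(duration_seconds(one_hour_bytes, 2, 48000, 2))  # treating it as stereo
```

With these assumed numbers, treating 6-channel audio as stereo makes the clip look three times longer than it is. Also note that np.memmap over the whole file includes the header bytes as if they were samples; np.memmap accepts an offset argument to skip them, but the right offset has to come from parsing the file's chunks.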