x is a numpy.float32 array, with values from -200 to 0. These are dB (decibel) values.
When I do (as recommended here):
Image.fromarray(x, mode='F')
I get a greyscale or sometimes nearly black image.
How can I map floats in [-200, 0] to a 24-bit RGB byte array (using a colormap) that the Python module PIL can read with Image.fromarray(x, mode='RGB')?
The required .wav audio file is here, for which we want to plot the spectrogram.
Here is some code to test:
import numpy as np
import scipy.io.wavfile as wavfile
from PIL import Image

def stft(x, fftsize=1024, overlap=4):
    hop = fftsize // overlap  # integer hop size, as range() requires
    w = np.hanning(fftsize + 1)[:-1]  # periodic Hann window
    return np.array([np.fft.rfft(w * x[i:i + fftsize]) for i in range(0, len(x) - fftsize, hop)])

def dB(ratio):
    return 20 * np.log10(ratio + 1e-10)  # small offset avoids log10(0)

def magnitudedB(frame, fftsize=1024):
    w = np.hanning(fftsize + 1)[:-1]
    ref = np.sum(w) / 2
    return dB(np.abs(frame) / ref)

sr, x = wavfile.read('test.wav')
x = np.float32(x) / 2**15  # scale 16-bit samples to [-1, 1)
s = magnitudedB(stft(x)).astype(np.float32).transpose()[::-1,]  # frequency on the vertical axis, low at the bottom
print("Max %.1f dB, Min %.1f dB" % (np.max(s), np.min(s)))
im = Image.fromarray(s + 200, mode='F')
im.show()
Notes:
The colormap is greyscale. How can I get another colormap, like this one?
My only requirement is that the output image can be read into a Tkinter frame / canvas (this works well with PIL's im = Image.fromarray(...) followed by ImageTk.PhotoImage(image=im)) or a wxPython frame / canvas.
But for PIL, the input is RGB, while it's BGR for cv2.
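If the same array has to go to both libraries, the channel order can be swapped with plain NumPy slicing. This is only a sketch: rgb_to_bgr is a hypothetical helper name, and it assumes a 3-channel (H, W, 3) array; with an alpha channel, reversing the last axis would also move alpha to the front.

```python
import numpy as np


def rgb_to_bgr(rgb):
    """Reverse the channel axis of an (H, W, 3) array (hypothetical helper)."""
    rgb = np.asarray(rgb)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("expected an array of shape (H, W, 3)")
    # Reversing the last axis turns RGB into BGR (and BGR back into RGB).
    return np.ascontiguousarray(rgb[..., ::-1])


# A single pure-red pixel in RGB order becomes [0, 0, 255] in BGR order.
red_rgb = np.array([[[255, 0, 0]]], dtype=np.uint8)
red_bgr = rgb_to_bgr(red_rgb)
```

Applying the function twice gives back the original array, so the same helper covers both directions.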
Based on the answer here, you can use matplotlib colormaps to transform the numpy array before converting to an image.
#im = Image.fromarray(s+200, mode='F')
from matplotlib import cm
s = (s + 200) / 200.0  # colormap input should be in the range 0-1
im = Image.fromarray(cm.jet(s, bytes=True))  # bytes=True yields uint8 RGBA values
im.show()
You should probably set the scaling appropriately based on your min/max values.
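A minimal sketch of that scaling, assuming you want the array's own min/max as the colormap limits unless you pass them explicitly. db_to_rgb, vmin and vmax are names made up for this example, and the alpha channel returned by the colormap is dropped to get the 24-bit RGB layout asked for in the question.

```python
import numpy as np
from matplotlib import cm
from PIL import Image


def db_to_rgb(s, vmin=None, vmax=None):
    """Map a 2-D float array to an (H, W, 3) uint8 RGB array with the jet colormap."""
    s = np.asarray(s, dtype=np.float64)
    # Fall back to the data's own range when no limits are given.
    vmin = float(s.min()) if vmin is None else float(vmin)
    vmax = float(s.max()) if vmax is None else float(vmax)
    span = vmax - vmin
    if span <= 0:
        span = 1.0  # constant input: avoid dividing by zero
    norm = np.clip((s - vmin) / span, 0.0, 1.0)  # colormap expects 0-1
    rgba = cm.jet(norm, bytes=True)  # (H, W, 4) uint8, RGBA
    return np.ascontiguousarray(rgba[..., :3])  # drop alpha, keep RGB


# Synthetic stand-in for the spectrogram: a ramp from -200 dB to 0 dB.
demo = np.linspace(-200.0, 0.0, 64 * 256, dtype=np.float32).reshape(64, 256)
rgb = db_to_rgb(demo)
im = Image.fromarray(rgb)  # mode is inferred as 'RGB' from shape and dtype
```

Passing fixed limits such as db_to_rgb(s, vmin=-200, vmax=0) keeps the colors comparable between files, while the default stretches each image over its own range.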