I am planning to build software that can classify a piece of music as good or bad using artificial neural networks. For this, I need to convert the audio into numerical values to feed to the NN as input. To build a training set, I downloaded the Billboard Hot 100 songs (which I believe should count as good music) and some noise audio files (which will count as bad music). I then converted them all to .wav format and split each file into multiple 2-second .wav clips. I was planning to use the fast Fourier transform (FFT) to convert these clips into frequency-amplitude pairs, but the problem is that even for a 2-second clip, the FFT produces an array of about 100,000 such pairs. Doing this for thousands of audio files would produce a dataset that is far too large, with far too many features.
I wanted to know: is there any way to shorten this dataset while keeping the 'essence of the music' in it, so that better predictions can be made? Or should I use some other algorithm or process?
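To make the dimensionality problem concrete, here is a small NumPy sketch of the FFT size for one clip. The 44.1 kHz sample rate is an assumption (CD-quality audio); substitute the actual rate of your .wav files.

```python
import numpy as np

SAMPLE_RATE = 44_100      # assumed CD-quality rate; use your files' actual rate
CLIP_SECONDS = 2

n_samples = SAMPLE_RATE * CLIP_SECONDS     # 88,200 samples per 2-second clip
clip = np.random.randn(n_samples)          # stand-in for a real audio clip

# One-sided FFT of a real signal: n_samples // 2 + 1 frequency bins
spectrum = np.fft.rfft(clip)
print(len(spectrum))                       # 44,101 bins per clip
```

So even the one-sided spectrum of a single 2-second clip has tens of thousands of values, which is why raw FFT output is rarely fed to a network directly.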
One commonly used approach is a CNN (Convolutional Neural Network) plus RNN (Recurrent Neural Network) architecture that uses the CTC loss to demarcate each character of the words in the speech, e.g. Baidu's Deep Speech model.
Convolutional Neural Networks (CNNs) have proven very effective in image classification and show promise for audio.
The input layer of a neural network is composed of artificial input neurons; it is the very beginning of the workflow and brings the initial data into the system for processing by the subsequent layers.
Adding noise to the inputs means the network is less able to memorize training samples, because they change slightly every time they are seen; this results in smaller network weights and a more robust network with lower generalization error.
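A minimal sketch of this kind of input-noise augmentation, assuming your inputs are already NumPy feature arrays; the noise level `sigma` is a hypothetical value you would tune.

```python
import numpy as np

def add_noise(batch, sigma=0.01, rng=None):
    """Return a copy of `batch` with small Gaussian noise added.

    Applied afresh each epoch, so the network never sees the exact
    same training sample twice.
    """
    rng = rng or np.random.default_rng(0)
    return batch + rng.normal(0.0, sigma, size=batch.shape)

features = np.zeros((4, 13))      # e.g. 4 frames of 13 features each
noisy = add_noise(features)       # same shape, slightly perturbed
```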
First, you can extract various audio features, such as:
1) Compactness.
2) Magnitude spectrum.
3) Mel-frequency cepstral coefficients.
4) Pitch.
5) Power Spectrum.
6) RMS.
7) Rhythm.
8) Spectral Centroid.
9) Spectral Flux.
10) Spectral RollOff Point.
11) Spectral Variability.
12) Zero Crossings.
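A few of these features are simple enough to sketch with plain NumPy; this illustrative helper computes RMS (6), zero crossings (12), and the spectral centroid (8) for one audio frame. In practice a library such as librosa provides ready-made implementations of most of the list.

```python
import numpy as np

def frame_features(frame, sr):
    """Compute a few of the listed features for one audio frame.

    frame: 1-D array of audio samples; sr: sample rate in Hz.
    """
    # 6) RMS: root mean square of the samples
    rms = np.sqrt(np.mean(frame ** 2))

    # 12) Zero crossings: count sign changes in the waveform
    zero_crossings = np.sum(np.abs(np.diff(np.sign(frame)))) / 2

    # 8) Spectral centroid: magnitude-weighted mean frequency
    mags = np.abs(np.fft.rfft(frame))            # 2) magnitude spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    centroid = np.sum(freqs * mags) / (np.sum(mags) + 1e-12)

    return rms, zero_crossings, centroid

# Usage: a 1-second 440 Hz sine at 8 kHz should give an RMS near
# 1/sqrt(2), roughly 880 zero crossings, and a centroid near 440 Hz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
rms, zc, centroid = frame_features(tone, sr)
```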
After generating the feature set you have two options:
A) Aggregate each feature over a song's frames by taking the mean [and/or variance], concatenate all the aggregated features into one fixed-length vector per song, then feed that into an Artificial Neural Network and perform the classification task.
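Option A can be sketched in a few lines of NumPy: per-frame features become one fixed-length vector per song by concatenating the per-feature mean and variance, regardless of how many frames the song has.

```python
import numpy as np

def song_vector(frame_feats):
    """Collapse per-frame features into one fixed-length song vector.

    frame_feats: (n_frames, n_features) array.
    Returns a vector of length 2 * n_features: per-feature means
    followed by per-feature variances.
    """
    return np.concatenate([frame_feats.mean(axis=0),
                           frame_feats.var(axis=0)])

feats = np.arange(12, dtype=float).reshape(4, 3)  # 4 frames, 3 features
vec = song_vector(feats)                          # length-6 vector
```

Because the output length depends only on the number of features, songs of any duration map to the same input size, which is what a plain feed-forward ANN requires.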
B) Feed the frame-by-frame feature sequences directly into a Recurrent Neural Network for the classification task.