
How to plot a multi-dimensional data point in python

Some background first:

I want to plot the Mel-Frequency Cepstral Coefficients (MFCCs) of various songs and compare them. I calculate the MFCCs throughout a song and then average them to get one array of 13 coefficients. I want this array to represent one point on a graph.

I'm new to Python and very new to any form of plotting (though I've seen some recommendations to use matplotlib).

I want to be able to visualize this data. Any thoughts on how I might go about doing this?

CatLord asked Jan 13 '15 19:01




1 Answer

First, if you want to represent an array of 13 coefficients as a single point in your graph, you need to reduce the 13 coefficients to the number of dimensions in your graph, as yan king yin pointed out in his comment. To project your data into 2 dimensions, you can either compute indicators yourself (such as the max, min, or standard deviation) or apply a dimensionality-reduction method such as PCA. Whether and how to do so is another topic.
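For the PCA option, a minimal sketch might look like the following. It is only an illustration under assumptions: the `pca_project` helper and the random `mfcc_means` array (5 songs with 13 averaged coefficients each) are made up for this example, and the projection is computed with a plain NumPy SVD rather than a dedicated library.

```python
import numpy as np

def pca_project(points, n_components=2):
    """Project the rows of `points` onto their first principal components.

    Hypothetical helper for illustration, not part of any library.
    """
    points = np.asarray(points, dtype=float)
    # PCA works on centered data: subtract the per-coefficient mean.
    centered = points - points.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Made-up stand-ins for averaged MFCC vectors: 5 songs x 13 coefficients.
rng = np.random.default_rng(0)
mfcc_means = rng.normal(size=(5, 13))

coords = pca_project(mfcc_means)  # shape (5, 2): one (x, y) pair per song
x, y = coords[:, 0], coords[:, 1]
```

The resulting `x` and `y` arrays can be passed to `plt.scatter` in the same way as the max/min indicators in the example further down.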

After that, plotting is straightforward; see the pyplot API documentation: http://matplotlib.org/api/pyplot_api.html

Here is example code for this approach:

import matplotlib.pyplot as plt
import numpy as np

#fake example data
song1 = np.asarray([1, 2, 3, 4, 5, 6, 2, 35, 4, 1])
song2 = song1*2
song3 = song1*1.5

#list of arrays containing all data
data = [song1, song2, song3]

#calculate 2d indicators (one value pair per song)
def indic(data):
    #alternatively you can calculate any other indicators
    #(names chosen so the built-in max/min are not shadowed)
    data_max = np.max(data, axis=1)
    data_min = np.min(data, axis=1)
    return data_max, data_min

x, y = indic(data)
plt.scatter(x, y, marker='x')
plt.show()

The result is a scatter plot with one marker per song: the maximum on the x axis and the minimum on the y axis.

However, I want to suggest another solution to your underlying problem, namely plotting multidimensional data. I recommend using something like a parallel coordinates plot, which can be constructed with the same fake data:

import pandas as pd
pd.DataFrame(data).T.plot()
plt.show()

The result shows the coefficient index along the x axis and the coefficient values along the y axis, with one line per song.
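pandas also ships a dedicated helper, `pandas.plotting.parallel_coordinates`, which draws one vertical axis per column. The sketch below reuses the fake data from above; the column names `c0` to `c9` and the `song` label column are assumptions made up for this example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

# Same fake data as in the example above.
song1 = np.asarray([1, 2, 3, 4, 5, 6, 2, 35, 4, 1])
data = {"song1": song1, "song2": song1 * 2, "song3": song1 * 1.5}

# One row per song, one column per coefficient (hypothetical names c0..c9).
frame = pd.DataFrame.from_dict(data, orient="index")
frame.columns = [f"c{i}" for i in range(frame.shape[1])]
# The helper needs a column that says which class (here: song) a row belongs to.
frame["song"] = frame.index

ax = parallel_coordinates(frame, class_column="song", colormap="viridis")
ax.set_ylabel("coefficient value")
```

Call `plt.show()` afterwards to display the figure. Because every coefficient shares one y scale here, a single large value (such as the 35 in the fake data) will dominate the plot; rescaling each column first may help with real MFCC data.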

UPDATE:

In the meantime I have discovered the Python Image Gallery, which contains two nice examples of high-dimensional visualization with reference code:

  • Radar chart

  • Parallel plot
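A radar chart can also be put together with plain matplotlib polar axes. This is only a rough sketch with the same fake data, not the gallery's reference code; the spoke labels `c0` to `c9` are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Same fake data as in the example above.
song1 = np.asarray([1, 2, 3, 4, 5, 6, 2, 35, 4, 1])
songs = {"song1": song1, "song2": song1 * 2, "song3": song1 * 1.5}

n = len(song1)
# One spoke per coefficient, evenly spaced around the circle.
angles = np.linspace(0, 2 * np.pi, n, endpoint=False)

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for name, values in songs.items():
    # Repeat the first point at the end so each outline closes.
    closed_angles = np.append(angles, angles[0])
    closed_values = np.append(values, values[0])
    ax.plot(closed_angles, closed_values, label=name)

ax.set_xticks(angles)
ax.set_xticklabels([f"c{i}" for i in range(n)])  # hypothetical labels
ax.legend(loc="upper right")
```

Call `plt.show()` afterwards to display the figure. Each song becomes one closed outline, so songs with similar coefficients produce similar shapes.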

Nikolas Rieble answered Sep 19 '22 11:09