I have this kind of data:
ID x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
1 -0.18 5 -0.40 -0.26 0.53 -0.66 0.10 2 -0.20 1
2 -0.58 5 -0.52 -1.66 0.65 -0.15 0.08 3 3.03 -2
3 -0.62 5 -0.09 -0.38 0.65 0.22 0.44 4 1.49 1
4 -0.22 -3 1.64 -1.38 0.08 0.42 1.24 5 -0.34 0
5 0.00 5 1.76 -1.16 0.78 0.46 0.32 5 -0.51 -2
What's the best method for visualizing this data? I'm using matplotlib to visualize it, and I read it from a CSV file using pandas.
Thanks.
With three attributes or dimensions in the data, we can use a pair-wise scatter plot for two of them and introduce color (hue) to separate out the values of a third, categorical dimension. Such a plot lets you check for correlations and patterns and also compare across groups.
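As a minimal sketch of that idea in plain matplotlib, the snippet below plots two numeric columns against each other and colors the points by a third column. The column names follow the layout in the question; the fake data, and the choice to treat x8 as the categorical dimension, are assumptions made purely for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use a GUI window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Fake data: two numeric columns plus one small-integer column used as a category
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x1": rng.normal(size=60),
    "x3": rng.normal(size=60),
    "x8": rng.integers(2, 6, size=60),  # assumed categorical, values 2..5
})

fig, ax = plt.subplots(figsize=(5, 4))
# One scatter call per category, so each group gets its own color and legend entry
for value, group in df.groupby("x8"):
    ax.scatter(group["x1"], group["x3"], label=f"x8 = {value}", alpha=0.7)
ax.set_xlabel("x1")
ax.set_ylabel("x3")
ax.legend(title="category")
fig.tight_layout()
# Call plt.show() or fig.savefig("some_name.png") to view the result
```

Drawing each group with its own scatter call keeps the legend simple; passing a single array to the c= argument would also work but needs a separate colorbar or hand-built legend.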
The main difference between seaborn and matplotlib is that seaborn, which is built on top of matplotlib, provides polished default themes and a smaller set of higher-level plots for data visualisation, while matplotlib is the lower-level library in which datasets are visualised with lines, scatter plots, pie charts, histograms, bar graphs, and so on.
Matplotlib is a multi-platform data visualization library built on NumPy arrays, and designed to work with the broader SciPy stack. It was conceived by John Hunter in 2002, originally as a patch to IPython for enabling interactive MATLAB-style plotting via gnuplot from the IPython command line.
Visualising data in a high-dimensional space is always a difficult problem. One commonly used solution (now available in pandas) is to inspect all of the 1D and 2D projections of the data. It doesn't give you all of the information about the data, but that's impossible to visualise unless you can see in 10D! Here's an example of how to do this with pandas (version 0.7.3 upwards; the pandas.plotting import path used below needs pandas 0.20 or later):
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# first make some fake data with the same layout as yours
data = pd.DataFrame(np.random.randn(100, 10),
                    columns=['x1', 'x2', 'x3', 'x4', 'x5',
                             'x6', 'x7', 'x8', 'x9', 'x10'])

# now plot using pandas
scatter_matrix(data, alpha=0.2, figsize=(6, 6), diagonal='kde')
This generates a plot with all of the 2D projections as scatter plots, and kernel density estimates (KDE) of the 1D projections on the diagonal:
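To run the same kind of plot on the real data instead of fake data, one option is to read the CSV with pandas and drop the ID column before plotting. This is only a sketch: it assumes the file is comma-separated with the header shown in the question, and the inline string stands in for the actual file. With just the five sample rows, histograms are used on the diagonal, since a KDE over so few points says very little.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use a GUI window
import pandas as pd
from pandas.plotting import scatter_matrix

# Stand-in for the CSV file; replace io.StringIO(csv_text) with the real file path
csv_text = """ID,x1,x2,x3,x4,x5,x6,x7,x8,x9,x10
1,-0.18,5,-0.40,-0.26,0.53,-0.66,0.10,2,-0.20,1
2,-0.58,5,-0.52,-1.66,0.65,-0.15,0.08,3,3.03,-2
3,-0.62,5,-0.09,-0.38,0.65,0.22,0.44,4,1.49,1
4,-0.22,-3,1.64,-1.38,0.08,0.42,1.24,5,-0.34,0
5,0.00,5,1.76,-1.16,0.78,0.46,0.32,5,-0.51,-2
"""
data = pd.read_csv(io.StringIO(csv_text))

# ID is a row label, not a measurement, so leave it out of the projections
features = data.drop(columns="ID")

# Returns a 2D array of Axes, one per pair of columns
axes = scatter_matrix(features, alpha=0.8, figsize=(8, 8), diagonal="hist")
```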
I also have a pure matplotlib approach to this on my GitHub page, which produces a very similar type of plot (it is designed for MCMC output, but is also appropriate here). Here's how you'd use it here:
import corner_plot as cp

# DataFrame.as_matrix() was removed in pandas 1.0; .values gives the same NumPy array
cp.corner_plot(data.values, axis_labels=data.columns, nbins=10,
               figsize=(7, 7), scatter=True, fontsize=10, tickfontsize=7)