
Creating Pandas Dataframe between two Numpy arrays, then draw scatter plot

I'm relatively new to NumPy and pandas (I'm an experimental physicist, so I've been using ROOT for years...). A common plot in ROOT is a 2D scatter plot that, given lists of x- and y-values, draws a "heatmap"-style plot of one variable versus the other.

How is this best accomplished with NumPy and pandas? I'm trying to use the DataFrame.plot() function, but I'm struggling to even create the DataFrame.

    import numpy as np
    import pandas as pd
    x = np.random.randn(1,5)
    y = np.sin(x)
    df = pd.DataFrame(d)

First off, this dataframe has shape (1,2), but I would like it to have shape (5,2). If I can get the dataframe the right shape, I'm sure I can figure out the DataFrame.plot() function to draw what I want.

n3utrino asked Apr 29 '15 16:04



2 Answers

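There are a number of ways to create DataFrames. Given 1-dimensional arrays of equal length, you can create a DataFrame by passing a dict whose keys are the column names and whose values are the arrays. Note that np.random.randn(5) returns a 1-dimensional array of length 5, whereas the np.random.randn(1,5) in the question returns a 2-dimensional array of shape (1,5):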

    import numpy as np
    import pandas as pd

    # 1-dimensional arrays of length 5
    x = np.random.randn(5)
    y = np.sin(x)

    # Each dict key becomes a column name, each array a column
    df = pd.DataFrame({'x': x, 'y': y})
    df.plot('x', 'y', kind='scatter')
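If the arrays already exist with shape (1,5), as in the question, one option is to flatten them before building the DataFrame. The following is a minimal sketch, not part of the original answer; it assumes that flattening with ravel() is acceptable for the data at hand.

```python
import numpy as np
import pandas as pd

# Reproduce the question's input: a 2-D array with one row and five columns.
x = np.random.randn(1, 5)
y = np.sin(x)

# ravel() flattens each (1, 5) array to shape (5,), so each becomes one column.
df = pd.DataFrame({"x": x.ravel(), "y": y.ravel()})

print(df.shape)  # (5, 2)
```

The resulting DataFrame has five rows and the two columns x and y, which is the shape the question asks for.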
unutbu answered Oct 21 '22 09:10


To do what you want, I wouldn't use the DataFrame plotting methods. I'm also a former experimental physicist, and based on my experience with ROOT, I think the Python analog you want is best accomplished with matplotlib. matplotlib.pyplot provides a function, hist2d(), that bins the x- and y-values on a 2D grid and colors each bin by its count, which gives you the kind of heat map you're looking for.
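A minimal sketch of the hist2d() approach follows. The sample size, the noise term, the bin count, and the axis labels are illustrative assumptions, not values from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data (assumed): many points, so the binned counts are meaningful.
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
y = np.sin(x) + 0.1 * rng.standard_normal(10_000)

fig, ax = plt.subplots()
# hist2d returns the bin counts, the bin edges along x and y, and the drawn image.
counts, xedges, yedges, image = ax.hist2d(x, y, bins=50)
fig.colorbar(image, ax=ax, label="entries per bin")
ax.set_xlabel("x")
ax.set_ylabel("y")
# To view the result, call fig.savefig(...) or, with an interactive backend, plt.show().
```

With the default range, the bins span the minimum to the maximum of the data, so every point falls into some bin.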

As for creating the dataframe, an easy way to do it is:

    df = pd.DataFrame({'x': x, 'y': y})
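For readers who would rather stay within pandas once the DataFrame exists, DataFrame.plot.hexbin() is one built-in way to draw a binned, heatmap-like plot. This is a hedged alternative sketch, not something this answer recommends; the data and the gridsize value are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import numpy as np
import pandas as pd

# Illustrative data (assumed), built the same way as the DataFrame above.
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
y = np.sin(x) + 0.1 * rng.standard_normal(10_000)
df = pd.DataFrame({"x": x, "y": y})

# hexbin groups the points into hexagonal bins and colors each bin by its count.
ax = df.plot.hexbin(x="x", y="y", gridsize=25)
ax.set_title("y versus x, hexagonal bins")
```

A larger gridsize gives finer bins at the cost of fewer entries per bin.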
RKD314 answered Oct 21 '22 11:10