I want to convert two numpy array to one DataFrame
containing two columns.
The first numpy array 'images' is of shape 102, 1024
.
The second numpy array 'label' is of shape (1020, )
My core code is:
images=np.array(images)
label=np.array(label)
l=np.array([images,label])
dataset=pd.DataFrame(l)
But it turns out to be an error saying that:
ValueError: could not broadcast input array from shape (1020,1024) into shape (1020)
What should I do to convert these two numpy array into two columns in one dataframe?
To convert an array to a dataframe with Python you need to 1) have your NumPy array (e.g., np_array), and 2) use the pd. DataFrame() constructor like this: df = pd. DataFrame(np_array, columns=['Column1', 'Column2']) . Remember, that each column in your NumPy array needs to be named with columns.
We can use np. column_stack() to combine two 1-D arrays X and Y into a 2-D array. Then, we can use pd. DataFrame to change it into a dataframe.
To convert a numpy array to pandas dataframe, we use pandas. DataFrame() function of Python Pandas library.
You can use the numpy. concatenate() function to concat, merge, or join a sequence of two or multiple arrays into a single NumPy array. Concatenation refers to putting the contents of two or more arrays in a single array.
You can't stack them easily, especially if you want them as different columns, because you can't insert a 2D array in one column of a DataFrame, so you need to convert it to something else, for example a list
.
So something like this would work:
import pandas as pd
import numpy as np
images = np.array(images)
label = np.array(label)
dataset = pd.DataFrame({'label': label, 'images': list(images)}, columns=['label', 'images'])
This will create a DataFrame
with 1020 rows and 2 columns, where each item in the second column contains 1D arrays of length 1024.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With