I received a DataFrame from somewhere and want to create another DataFrame with the same number and names of columns and rows (indexes). For example, suppose that the original data frame was created as
import pandas as pd
df1 = pd.DataFrame([[11,12],[21,22]], columns=['c1','c2'], index=['i1','i2'])
I copied the structure by explicitly passing the columns and the index:
df2 = pd.DataFrame(columns=df1.columns, index=df1.index)
I don't want to copy the data; otherwise I could just write df2 = df1.copy(). In other words, after df2 is created it must contain only NaN elements:
In [1]: df1
Out[1]:
    c1  c2
i1  11  12
i2  21  22
In [2]: df2
Out[2]:
     c1   c2
i1  NaN  NaN
i2  NaN  NaN
Is there a more idiomatic way of doing it?
Use: new_df = dataframe.copy(deep=False); new_df.astype(dataframe.dtypes.
To select all columns except one column in a pandas DataFrame, we can use df.loc[:, df.columns != <column name>].
To select a single column, use square brackets [] with the column name of the column of interest.
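As a minimal sketch of the two selection idioms just mentioned (the frame and the column names c1, c2, c3 are made up for illustration):

```python
import pandas as pd

# Illustrative frame; the column names are assumptions, not from the question.
df = pd.DataFrame([[11, 12, 13], [21, 22, 23]], columns=['c1', 'c2', 'c3'])

# A boolean mask over the column labels keeps every column except 'c2'.
without_c2 = df.loc[:, df.columns != 'c2']

# Square brackets with a single label return that column as a Series.
only_c1 = df['c1']

print(list(without_c2.columns))
print(only_c1.tolist())
```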
To copy a pandas DataFrame, use the copy() method. DataFrame.copy() makes a copy of the object's indices and data. It accepts one parameter, deep, and returns a Series or DataFrame matching the caller.
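A short sketch, reusing the df1 from the question, of why copy() is not what is being asked for here: with the default deep=True it duplicates the values as well as the labels.

```python
import pandas as pd

df1 = pd.DataFrame([[11, 12], [21, 22]], columns=['c1', 'c2'], index=['i1', 'i2'])

# deep=True is the default, so df2 gets its own copy of the data.
df2 = df1.copy()

# Writing to the copy leaves the original untouched.
df2.loc['i1', 'c1'] = 0

print(df1.loc['i1', 'c1'])  # 11
print(df2.loc['i1', 'c1'])  # 0
```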
That's a job for reindex_like. Start with the original:
df1 = pd.DataFrame([[11, 12], [21, 22]], columns=['c1', 'c2'], index=['i1', 'i2'])
Construct an empty DataFrame and reindex it like df1:
pd.DataFrame().reindex_like(df1)
Out:
     c1   c2
i1  NaN  NaN
i2  NaN  NaN
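To check that the result really has the same labels and no data, something like the following should work. It is only a sketch; the dtype of the new columns is not checked because it may differ between pandas versions.

```python
import pandas as pd

df1 = pd.DataFrame([[11, 12], [21, 22]], columns=['c1', 'c2'], index=['i1', 'i2'])

# Reindexing an empty frame against df1 takes over only the row and column labels.
df2 = pd.DataFrame().reindex_like(df1)

print(df2.index.equals(df1.index))      # row labels match
print(df2.columns.equals(df1.columns))  # column labels match
print(df2.isna().all().all())           # every cell is missing
```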
In version 0.18 of pandas, the DataFrame constructor has no options for creating a dataframe like another dataframe with NaN instead of the values.
The code you use, df2 = pd.DataFrame(columns=df1.columns, index=df1.index), is the most logical way. The only way to improve on it is to spell out even more clearly what you are doing by adding data=None, so that other coders see directly that you intentionally leave the data out of the new DataFrame you are creating.
TLDR: So my suggestion is:
df2 = pd.DataFrame(data=None, columns=df1.columns, index=df1.index)
Very much like yours, but more spelled out.
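If this is needed in more than one place, it could be wrapped in a small function. The name empty_like below is made up for this sketch and is not a pandas function. Note that the new columns cannot keep df1's integer dtype, since every cell is NaN.

```python
import pandas as pd

def empty_like(df):
    """Hypothetical helper: same index and columns as df, but no data."""
    return pd.DataFrame(data=None, index=df.index, columns=df.columns)

df1 = pd.DataFrame([[11, 12], [21, 22]], columns=['c1', 'c2'], index=['i1', 'i2'])
df2 = empty_like(df1)

print(df2.shape)               # same shape as df1
print(df2.isna().all().all())  # only missing values
```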
My case was creating a copy of the data frame without data and without index. One can achieve this by doing the following. This will maintain the dtypes of the columns.
empty_copy = df.drop(df.index)
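A quick sketch of what this gives you, using a made-up frame with mixed dtypes (the column names n, x and s are only for illustration). The result has zero rows, so unlike the approaches above it does not keep the index labels.

```python
import pandas as pd

# Illustrative frame with integer, float and string columns.
df = pd.DataFrame({'n': [1, 2], 'x': [0.5, 1.5], 's': ['a', 'b']})

# Dropping every row label leaves the columns and their dtypes in place.
empty_copy = df.drop(df.index)

print(len(empty_copy))                         # 0
print(list(empty_copy.columns))                # ['n', 'x', 's']
print((empty_copy.dtypes == df.dtypes).all())  # dtypes are preserved
```

Slicing with df.iloc[:0] should give the same kind of empty frame.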