Is there a way to copy only the structure (not the data) of a Pandas DataFrame?

Tags:

python

pandas

dataframe

I received a DataFrame from somewhere and want to create another DataFrame with the same number and names of columns and rows (indexes). For example, suppose that the original data frame was created as

import pandas as pd
df1 = pd.DataFrame([[11,12],[21,22]], columns=['c1','c2'], index=['i1','i2'])

I copied the structure by explicitly passing the columns and the index:

df2 = pd.DataFrame(columns=df1.columns, index=df1.index)

I don't want to copy the data; otherwise I could just write df2 = df1.copy(). In other words, after df2 is created it must contain only NaN elements:

In [1]: df1
Out[1]: 
    c1  c2
i1  11  12
i2  21  22

In [2]: df2
Out[2]: 
     c1   c2
i1  NaN  NaN
i2  NaN  NaN

Is there a more idiomatic way of doing it?

397

asked Dec 14 '14 08:12

bmello

3 Answers

That's a job for reindex_like. Start with the original:

df1 = pd.DataFrame([[11, 12], [21, 22]], columns=['c1', 'c2'], index=['i1', 'i2'])

Construct an empty DataFrame and reindex it like df1:

pd.DataFrame().reindex_like(df1)
Out: 
    c1  c2
i1 NaN NaN
i2 NaN NaN
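
A quick way to confirm that the result mirrors df1's labels and carries no data is sketched below. The names df2, same_labels and all_missing are illustrative additions and not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame([[11, 12], [21, 22]], columns=['c1', 'c2'], index=['i1', 'i2'])

# Reindex an empty frame so it takes on df1's row and column labels.
df2 = pd.DataFrame().reindex_like(df1)

# The labels on both axes should match the original...
same_labels = df2.index.equals(df1.index) and df2.columns.equals(df1.columns)
# ...and every cell should be missing.
all_missing = bool(df2.isna().all().all())

print(same_labels, all_missing)
```

Because the frame being reindexed is empty, none of df1's values can carry over; only the labels are taken from df1.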

154

answered Oct 08 '22 17:10

ayhan

As of pandas 0.18, the DataFrame constructor has no option for creating a DataFrame shaped like another one with NaN in place of the values.

The code you use, df2 = pd.DataFrame(columns=df1.columns, index=df1.index), is the most logical way. The only improvement is to spell out what you are doing even more by adding data=None, so that other coders see at once that you are intentionally leaving the data out of the new DataFrame.

TLDR: So my suggestion is:

Explicit is better than implicit

df2 = pd.DataFrame(data=None, columns=df1.columns, index=df1.index)

Very much like yours, but more spelled out.
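
If this pattern comes up more than once, it could be wrapped in a small helper. The sketch below assumes a hypothetical function name, empty_like, which is not a pandas API; it only repackages the constructor call above.

```python
import pandas as pd


def empty_like(df):
    """Hypothetical helper: a new frame with df's index and columns, no data."""
    return pd.DataFrame(data=None, columns=df.columns, index=df.index)


df1 = pd.DataFrame([[11, 12], [21, 22]], columns=['c1', 'c2'], index=['i1', 'i2'])
df2 = empty_like(df1)

print(df2)
print(df2.dtypes)  # worth checking: the column dtypes need not match df1's
```

Note that nothing here carries df1's column dtypes over to the new frame, so it is worth inspecting df2.dtypes if that matters for later assignments.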

answered Oct 08 '22 17:10

firelynx

Not exactly answering this question, but a similar one for people coming here via a search engine

My case was creating a copy of the data frame without data and without index. One can achieve this by doing the following. This will maintain the dtypes of the columns.

empty_copy = df.drop(df.index)
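
A minimal sketch of the dtype claim, using a made-up frame with mixed column types (the column names n, x and s are assumptions for illustration only):

```python
import pandas as pd

# Illustrative frame with an integer, a float and a string column.
df = pd.DataFrame({'n': [1, 2], 'x': [0.5, 1.5], 's': ['a', 'b']})

# Dropping every row label leaves the columns and their dtypes in place.
empty_copy = df.drop(df.index)

print(len(empty_copy))                       # number of rows left
print(empty_copy.dtypes.equals(df.dtypes))   # whether the dtypes survived
```

Unlike the approaches above, this keeps no rows at all, so it fits cases where only the column layout is needed.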

answered Oct 08 '22 15:10

Martijn Lentink
