Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas create empty DataFrame with only column names

People also ask

How will you create an empty DataFrame in pandas?

You can create an empty dataframe by importing pandas from the python library. Later, using the pd. DataFrame(), create an empty dataframe without rows and columns as shown in the below example.

How do I create a new data frame in a specific column?

You can create a new DataFrame of a specific column by using DataFrame. assign() method. The assign() method assign new columns to a DataFrame, returning a new object (a copy) with the new columns added to the original ones.

How do you create an empty DataFrame in PySpark with column names?

In order to create an empty PySpark DataFrame manually with schema ( column names & data types) first, Create a schema using StructType and StructField . Now use the empty RDD created above and pass it to createDataFrame() of SparkSession along with the schema for column names & data types.


You can create an empty DataFrame with either column names or an Index:

In [4]: import pandas as pd
In [5]: df = pd.DataFrame(columns=['A','B','C','D','E','F','G'])
In [6]: df
Out[6]:
Empty DataFrame
Columns: [A, B, C, D, E, F, G]
Index: []

Or

In [7]: df = pd.DataFrame(index=range(1,10))
In [8]: df
Out[8]:
Empty DataFrame
Columns: []
Index: [1, 2, 3, 4, 5, 6, 7, 8, 9]

Edit: Even after your amendment with the .to_html, I can't reproduce. This:

df = pd.DataFrame(columns=['A','B','C','D','E','F','G'])
df.to_html('test.html')

Produces:

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>A</th>
      <th>B</th>
      <th>C</th>
      <th>D</th>
      <th>E</th>
      <th>F</th>
      <th>G</th>
    </tr>
  </thead>
  <tbody>
  </tbody>
</table>

Are you looking for something like this?

    COLUMN_NAMES=['A','B','C','D','E','F','G']
    df = pd.DataFrame(columns=COLUMN_NAMES)
    df.columns

   Index(['A', 'B', 'C', 'D', 'E', 'F', 'G'], dtype='object')

Creating colnames with iterating

df = pd.DataFrame(columns=['colname_' + str(i) for i in range(5)])
print(df)

# Empty DataFrame
# Columns: [colname_0, colname_1, colname_2, colname_3, colname_4]
# Index: []

to_html() operations

print(df.to_html())

# <table border="1" class="dataframe">
#   <thead>
#     <tr style="text-align: right;">
#       <th></th>
#       <th>colname_0</th>
#       <th>colname_1</th>
#       <th>colname_2</th>
#       <th>colname_3</th>
#       <th>colname_4</th>
#     </tr>
#   </thead>
#   <tbody>
#   </tbody>
# </table>

this seems working

print(type(df.to_html()))
# <class 'str'>

The problem is caused by

when you create df like this

df = pd.DataFrame(columns=COLUMN_NAMES)

it has 0 rows × n columns, you need to create at least one row index by

df = pd.DataFrame(columns=COLUMN_NAMES, index=[0])

now it has 1 rows × n columns. You are be able to add data. Otherwise its df that only consist colnames object(like a string list).


df.to_html() has a columns parameter.

Just pass the columns into the to_html() method.

df.to_html(columns=['A','B','C','D','E','F','G'])