
Create a Pandas Dataframe by appending one row at a time

I understand that Pandas is designed to load a fully populated DataFrame, but I need to create an empty DataFrame then add rows, one by one. What is the best way to do this?

I successfully created an empty DataFrame with:

res = DataFrame(columns=('lib', 'qty1', 'qty2')) 

Then I can add a new row and fill a field with:

res = res.set_value(len(res), 'qty1', 10.0) 

It works, but it seems very odd :-/ (It fails for adding a string value.)

How can I add a new row to my DataFrame when the columns have different types?

PhE asked May 23 '12 08:05



2 Answers

You can assign to df.loc[i]: the row with index label i is set to whatever you assign, and it is created if it does not exist yet.

>>> import pandas as pd
>>> from numpy.random import randint
>>>
>>> df = pd.DataFrame(columns=['lib', 'qty1', 'qty2'])
>>> for i in range(5):
...     df.loc[i] = ['name' + str(i)] + list(randint(10, size=2))
...
>>> df
     lib qty1 qty2
0  name0    3    3
1  name1    2    4
2  name2    2    8
3  name3    2    1
4  name4    9    6
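The question also asks about rows that mix strings and numbers. A minimal sketch of the same df.loc idea for that case follows; the sample_rows values are hypothetical and only there for illustration. It assumes the frame keeps its default 0, 1, 2, ... index, so that len(df) is always the next unused label.

```python
import pandas as pd

# Start from an empty frame with the question's column names.
df = pd.DataFrame(columns=["lib", "qty1", "qty2"])

# Hypothetical sample rows: one string and two numbers per row.
sample_rows = [("name0", 10.0, 3), ("name1", 2.5, 4), ("name2", 7.0, 8)]

for lib, qty1, qty2 in sample_rows:
    # With a default integer index, len(df) is the next unused label,
    # so this assignment adds a row instead of overwriting one.
    df.loc[len(df)] = [lib, qty1, qty2]

print(df)
```

If rows have been dropped or the index is not the default one, len(df) may collide with an existing label and overwrite that row, so an explicit counter is safer in that situation.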
fred answered Sep 29 '22 21:09


If you can get all the data for the data frame up front, there is a much faster approach than appending to it row by row:

  1. Create a list of dictionaries in which each dictionary corresponds to an input data row.
  2. Create a data frame from this list.

I had a similar task for which appending to a data frame row by row took 30 min, and creating a data frame from a list of dictionaries completed within seconds.

rows_list = []
for row in input_rows:
    dict1 = {}
    # get input row in dictionary format
    # key = col_name
    dict1.update(blah..)

    rows_list.append(dict1)

df = pd.DataFrame(rows_list)
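For readers who want something they can run as is, here is one possible concrete version of the two steps above. The input_rows data and the tuple layout are assumptions made up for this sketch; the conversion step elided in the snippet above is not reproduced here.

```python
import pandas as pd

# Hypothetical input: each tuple is one raw record (lib, qty1, qty2).
input_rows = [("name0", 3, 3.5), ("name1", 2, 4.0), ("name2", 2, 8.25)]

rows_list = []
for lib, qty1, qty2 in input_rows:
    # Step 1: one dictionary per row, keyed by column name.
    rows_list.append({"lib": lib, "qty1": qty1, "qty2": qty2})

# Step 2: build the frame in a single call; pandas infers a dtype per column.
df = pd.DataFrame(rows_list)

print(df)
print(df.dtypes)
```

Because the frame is built once from complete data, each column gets its own inferred dtype (strings, integers, floats) instead of everything being stored as generic objects.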
ShikharDua answered Sep 29 '22 19:09