Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas insert empty column between existing columns

Tags:

python

pandas

I'm trying to make a csv converter with Pandas. Currently it reads a file (export.csv) and writes a new file (google.csv). I've selected the columns from the export.csv file but for the new file I want to insert a few new columns.

I'd like to insert one at the beginning of the dataframe called Date and have the same value entered (11-23-20) for each row, I'd like to add another one between ProductUrl and RetailPrice called Shippinged with no values.

Here's my code:

import pandas as pd

ed = pd.read_csv('export.csv')
df = ed[['ItemID', 'ProductUrl', 'RetailPrice', 'ItemName', 'ItemName', 'ItemPrimaryImageUrl', 'CategoryList', 'Manufacturer', 'GTIN', 'ProductStatus','CategoryList', 'ItemNumber']]
df.columns =['id', 'link', 'price', 'title', 'description','image_link', 'product_type','brand', 'gtin', 'availability', 'google_product_category','mpn'] 
df.to_csv('google.csv')
like image 892
robothead Avatar asked Mar 07 '26 10:03

robothead


1 Answers

To insert a column at the specific location, you can use:

df.insert(loc, column, value)

Let's assume, your dataset is as follows:

>>> df

   ItemID ProductUrl  RetailPrice ItemName
0       1       aUrl          1.0    aItem
1       2       bUrl          2.0    bItem
2       3       cUrl          3.0    cItem

You can add a column to the beginning by using:

df.insert(0, 'Date', pd.Timestamp.now().strftime('%m-%d-%y'))

This will give you:

>>> df.insert(0, 'Date', pd.Timestamp.now().strftime('%m-%d-%y'))
>>> df
       Date  ItemID ProductUrl  RetailPrice ItemName
0  11-23-20       1       aUrl          1.0    aItem
1  11-23-20       2       bUrl          2.0    bItem
2  11-23-20       3       cUrl          3.0    cItem

Then you can add the column to the specific location with a similar insert statement.

You have two options: You can search for the column name where you want to insert, then insert. OR, you can insert to a specific location if you already know the location.

In your case, you want to insert the column between 'ProductUrl' and 'RetailPrice'. The original location for 'RetailPrice' was 3. We added a column to the beginning. So the current location for 'RetailPrice' is 4. So you can insert the new column at position 4. Remember, columns starts from 0. So you need to insert at location 3 (0 = Date, 1 = ItemID, 2 = ProductUrl, 3 = newColumn, 4 = RetailPrice).

>>> df.insert(3, 'Shippinged', '')
>>> df
       Date  ItemID ProductUrl Shippinged  RetailPrice ItemName
0  11-23-20       1       aUrl                     1.0    aItem
1  11-23-20       2       bUrl                     2.0    bItem
2  11-23-20       3       cUrl                     3.0    cItem

Alternatively, you can use columns.get_loc() to get the index of any column.

Since you want to insert the column in between 'ProductUrl' and 'RetailPrice', you can choose to search for the position of 'RetailPrice' and insert at that location.

>>> idx = df.columns.get_loc('RetailPrice')
>>> df.insert(idx, 'Shippinged', '')
>>> df
       Date  ItemID ProductUrl Shippinged  RetailPrice ItemName
0  11-23-20       1       aUrl                     1.0    aItem
1  11-23-20       2       bUrl                     2.0    bItem
2  11-23-20       3       cUrl                     3.0    cItem

This will give you the new dataframe to write to 'google.csv'

like image 103
Joe Ferndz Avatar answered Mar 08 '26 23:03

Joe Ferndz