
Pandas: Adding new column to dataframe which is a copy of the index column

I have a dataframe which I want to plot with matplotlib, but the index column is the time and I cannot plot it.

This is the dataframe (df3):

enter image description here

but when I try the following:

plt.plot(df3['magnetic_mag mean'], df3['YYYY-MO-DD HH-MI-SS_SSS'], label='FDI') 

I'm getting an error obviously:

KeyError: 'YYYY-MO-DD HH-MI-SS_SSS' 

So what I want to do is add an extra column to my dataframe (named 'Time') which is just a copy of the index column.

How can I do it?

This is the entire code:

#Importing the csv file into df
df = pd.read_csv('university2.csv', sep=";", skiprows=1)

#Changing datetime
df['YYYY-MO-DD HH-MI-SS_SSS'] = pd.to_datetime(df['YYYY-MO-DD HH-MI-SS_SSS'],
                                               format='%Y-%m-%d %H:%M:%S:%f')

#Set index from column
df = df.set_index('YYYY-MO-DD HH-MI-SS_SSS')

#Add Magnetic Magnitude Column
df['magnetic_mag'] = np.sqrt(df['MAGNETIC FIELD X (μT)']**2
                             + df['MAGNETIC FIELD Y (μT)']**2
                             + df['MAGNETIC FIELD Z (μT)']**2)

#Subtract Earth's Average Magnetic Field from 'magnetic_mag'
df['magnetic_mag'] = df['magnetic_mag'] - 30

#Copy interesting values
df2 = df[['ATMOSPHERIC PRESSURE (hPa)',
          'TEMPERATURE (C)', 'magnetic_mag']].copy()

#Hourly Average and Standard Deviation for interesting values
df3 = df2.resample('H').agg(['mean','std'])
df3.columns = [' '.join(col) for col in df3.columns]

df3.reset_index()
plt.plot(df3['magnetic_mag mean'], df3['YYYY-MO-DD HH-MI-SS_SSS'], label='FDI')

Thank you !!

ValientProcess asked Apr 29 '16 07:04


People also ask

How do I replace an index column with another column?

You can change the index to a different column by using set_index() after reset_index().

How will you add a new column and new row to a pandas DataFrame?

In pandas you can add a new column to an existing DataFrame with the DataFrame.insert() method, which updates the DataFrame in place. DataFrame.assign() also adds a new column, but it returns a new DataFrame instead of modifying the original.
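The difference between the two methods can be seen in a minimal sketch. The toy data and column names here are hypothetical and are not taken from the question.

```python
import pandas as pd

# Hypothetical toy data, not from the question
df = pd.DataFrame({"a": [1, 2, 3]})

# insert() modifies df in place (and returns None)
df.insert(1, "b", [10, 20, 30])

# assign() leaves df untouched and returns a new DataFrame
df_new = df.assign(c=df["a"] * 2)

insert_columns = list(df.columns)      # columns of the original after insert()
assign_columns = list(df_new.columns)  # columns of the copy returned by assign()
```

After this runs, df has the columns a and b, while only df_new has the additional column c.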

How do I add a column to an index in a DataFrame?

To create an index from a column in a pandas DataFrame, use the set_index() method. For example, to make the column "Year" the index, write df.set_index("Year"). By default set_index() returns the modified DataFrame as a new object, so assign the result if you want to keep it.
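As a small illustration, assuming a hypothetical DataFrame with a "Year" column, the round trip between a column and the index might look like this:

```python
import pandas as pd

# Hypothetical toy data, not from the question
df = pd.DataFrame({"Year": [2014, 2015, 2016], "value": [1.0, 2.0, 3.0]})

# set_index() returns a new DataFrame; df itself is unchanged
indexed = df.set_index("Year")

# reset_index() moves the index back into an ordinary column
restored = indexed.reset_index()
```

Neither call changes its input here, which is why the results are bound to new names.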


1 Answer

I think you need reset_index:

df3 = df3.reset_index() 

Another possible solution, although I think using inplace is not good practice:

df3.reset_index(inplace=True) 

But if you need a new column, use:

df3['new'] = df3.index 
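Both options can be tried on a small stand-in for the question's data. This is only a sketch: the timestamps and values are made up, and the column name 'Time' is the one the question asks for.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: one reading every 15 minutes
idx = pd.date_range("2016-04-29 00:00", periods=8, freq="15min",
                    name="YYYY-MO-DD HH-MI-SS_SSS")
df2 = pd.DataFrame({"magnetic_mag": np.arange(8, dtype=float)}, index=idx)

# Hourly mean and standard deviation, with flattened column names
df3 = df2.resample("60min").agg(["mean", "std"])
df3.columns = [" ".join(col) for col in df3.columns]

# Option 1: keep the DatetimeIndex and add a copy of it as a column
df3["Time"] = df3.index

# Option 2: turn the index into a regular column (the result must be assigned)
df3_flat = df3.reset_index()
```

With either result, the time values are available as a column, for example plt.plot(df3['Time'], df3['magnetic_mag mean'], label='FDI') to put time on the x axis. Passing df3.index directly as the x values should work as well.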

I think you can use read_csv better:

df = pd.read_csv('university2.csv',
                 sep=";",
                 skiprows=1,
                 index_col='YYYY-MO-DD HH-MI-SS_SSS',
                 parse_dates=['YYYY-MO-DD HH-MI-SS_SSS'])  # if this doesn't work, use pd.to_datetime

And then omit:

#Changing datetime
df['YYYY-MO-DD HH-MI-SS_SSS'] = pd.to_datetime(df['YYYY-MO-DD HH-MI-SS_SSS'],
                                               format='%Y-%m-%d %H:%M:%S:%f')

#Set index from column
df = df.set_index('YYYY-MO-DD HH-MI-SS_SSS')
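
The timestamps in the question use a colon before the fractional seconds, which automatic date parsing may not recognise. A sketch of the pd.to_datetime fallback, applied to the index after reading, could look like the following. The CSV text is a made-up sample that only mimics the file layout implied by the question (one skipped line, then a ";"-separated header).

```python
import io

import pandas as pd

# Hypothetical sample standing in for 'university2.csv'
csv_text = (
    "first line, skipped by skiprows=1\n"
    "YYYY-MO-DD HH-MI-SS_SSS;TEMPERATURE (C)\n"
    "2016-04-29 07:00:00:123;21.5\n"
    "2016-04-29 07:00:01:456;21.6\n"
)

# Read the timestamp column straight into the index (still plain strings here)
df = pd.read_csv(io.StringIO(csv_text),
                 sep=";",
                 skiprows=1,
                 index_col="YYYY-MO-DD HH-MI-SS_SSS")

# Convert the index explicitly, using the format from the question
df.index = pd.to_datetime(df.index, format="%Y-%m-%d %H:%M:%S:%f")
```

The index is then a DatetimeIndex, so resample() can be applied without a separate set_index() step.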

jezrael answered Oct 05 '22 22:10