I have a dataframe which I want to plot with matplotlib, but the index column is the time and I cannot plot it.
This is the dataframe (df3):
but when I try the following:
plt.plot(df3['magnetic_mag mean'], df3['YYYY-MO-DD HH-MI-SS_SSS'], label='FDI')
I'm getting an error obviously:
KeyError: 'YYYY-MO-DD HH-MI-SS_SSS'
So what I want to do is add a new column to my dataframe (named 'Time') which is just a copy of the index column.
How can I do it?
This is the entire code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

#Importing the csv file into df
df = pd.read_csv('university2.csv', sep=";", skiprows=1)

#Changing datetime
df['YYYY-MO-DD HH-MI-SS_SSS'] = pd.to_datetime(df['YYYY-MO-DD HH-MI-SS_SSS'], format='%Y-%m-%d %H:%M:%S:%f')

#Set index from column
df = df.set_index('YYYY-MO-DD HH-MI-SS_SSS')

#Add Magnetic Magnitude Column
df['magnetic_mag'] = np.sqrt(df['MAGNETIC FIELD X (μT)']**2 + df['MAGNETIC FIELD Y (μT)']**2 + df['MAGNETIC FIELD Z (μT)']**2)

#Subtract Earth's Average Magnetic Field from 'magnetic_mag'
df['magnetic_mag'] = df['magnetic_mag'] - 30

#Copy interesting values
df2 = df[['ATMOSPHERIC PRESSURE (hPa)', 'TEMPERATURE (C)', 'magnetic_mag']].copy()

#Hourly Average and Standard Deviation for interesting values
df3 = df2.resample('H').agg(['mean','std'])
df3.columns = [' '.join(col) for col in df3.columns]

df3.reset_index()
plt.plot(df3['magnetic_mag mean'], df3['YYYY-MO-DD HH-MI-SS_SSS'], label='FDI')
Thank you !!
You can change the index to a different column by using set_index() after reset_index().
In pandas, you can add a new column to an existing DataFrame with DataFrame.insert(), which modifies the DataFrame in place. DataFrame.assign() also adds a column, but it returns a new DataFrame and leaves the original unchanged.
To make a column the index of a DataFrame, use set_index(). For example, to make the column "Year" the index, write df.set_index("Year"). By default, set_index() returns a new DataFrame and does not modify the original.
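The difference between these methods is easier to see side by side. The following is only a minimal sketch on a made-up frame; the column names "Year", "value" and "Year_copy" are illustrative and do not come from the question.

```python
import pandas as pd

# Hypothetical toy frame; the column names are illustrative only.
df = pd.DataFrame({"Year": [2019, 2020, 2021], "value": [1.0, 2.0, 3.0]})

# set_index returns a new DataFrame; df itself is unchanged.
indexed = df.set_index("Year")

# insert modifies the frame in place (and returns None).
in_place = indexed.copy()
in_place.insert(0, "Year_copy", in_place.index)

# assign leaves the original untouched and returns a new DataFrame.
assigned = indexed.assign(Year_copy=indexed.index)

# reset_index followed by set_index switches the index to another column.
reindexed = indexed.reset_index().set_index("value")
```

Because insert works in place, it is applied to a copy here so that indexed can still be reused for the assign call.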
I think you need reset_index:
df3 = df3.reset_index()
Another possible solution, although I think inplace is not good practice:
df3.reset_index(inplace=True)
But if you need new column, use:
df3['new'] = df3.index
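To see both options on something runnable, here is a small sketch with synthetic data standing in for the question's CSV. The values, the date range and the variable names flat and with_time are assumptions for illustration; only the index name and the 'magnetic_mag' column follow the question.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's data: one reading per minute for 3 hours.
idx = pd.date_range("2021-01-01", periods=180, freq="min",
                    name="YYYY-MO-DD HH-MI-SS_SSS")
df2 = pd.DataFrame({"magnetic_mag": np.linspace(0.0, 1.0, 180)}, index=idx)

# Hourly mean and standard deviation, then flatten the two-level column names.
# (The question uses 'H'; recent pandas versions prefer the lowercase alias.)
df3 = df2.resample("h").agg(["mean", "std"])
df3.columns = [" ".join(col) for col in df3.columns]

# Option 1: turn the index into an ordinary column.
# reset_index returns a new frame, so the result has to be assigned.
flat = df3.reset_index()

# Option 2: keep the index and add a copy of it as a column named 'Time'.
with_time = df3.copy()
with_time["Time"] = with_time.index
```

With either frame the time column can then be passed to matplotlib as the first (x) argument, for example plt.plot(with_time['Time'], with_time['magnetic_mag mean'], label='FDI'). Passing df3.index directly as x also works without adding any column.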
I think you can also use read_csv better:
df = pd.read_csv('university2.csv', sep=";", skiprows=1,
                 index_col='YYYY-MO-DD HH-MI-SS_SSS',
                 parse_dates=['YYYY-MO-DD HH-MI-SS_SSS'])
#if this doesn't work, use pd.to_datetime
And then omit:
#Changing datetime
df['YYYY-MO-DD HH-MI-SS_SSS'] = pd.to_datetime(df['YYYY-MO-DD HH-MI-SS_SSS'], format='%Y-%m-%d %H:%M:%S:%f')

#Set index from column
df = df.set_index('YYYY-MO-DD HH-MI-SS_SSS')
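If the automatic date parsing does not recognise the timestamps (the question's format string has a ':' before the milliseconds, which is not a standard layout), the pd.to_datetime fallback can be applied to the index after loading. This is a sketch under assumptions: the in-memory sample below, including its first line and its values, is invented to mimic the layout the question implies and is not the real university2.csv.

```python
import io
import pandas as pd

# Hypothetical sample in the layout the question implies:
# one line to skip, ';' separators, and ':' before the milliseconds.
raw = io.StringIO(
    "exported by some logger\n"
    "YYYY-MO-DD HH-MI-SS_SSS;TEMPERATURE (C)\n"
    "2021-01-01 00:00:00:000;20.5\n"
    "2021-01-01 00:00:01:500;20.7\n"
)

col = "YYYY-MO-DD HH-MI-SS_SSS"

# Load with the time column as the index, still as plain strings.
df = pd.read_csv(raw, sep=";", skiprows=1, index_col=col)

# Convert the index explicitly, using the format string from the question.
df.index = pd.to_datetime(df.index, format="%Y-%m-%d %H:%M:%S:%f")
```

After this, df has a DatetimeIndex and can be resampled as in the rest of the question's code.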