I'm working on the iris data set given by our instructor. He edited the file and removed the flower types along with that column's name. He wants us to do calculations based on petal length and width. I have been pulling my hair out for 10 hours trying to figure this out, but I run into errors wherever I go.
First, I loaded the file into a dataframe and then declared two variables to hold the length and width values. Next, I tried to multiply those values and store the result in a new dataframe with "size" as the column name. After that, I tried to append the new column to the initial dataframe so I could export it to CSV. This is my attempt:
import os
import csv
import matplotlib.pyplot as plt
import numpy
import pandas as pd
df1 = pd.DataFrame.from_csv('Sample.csv', header = 0, index_col = 0)
pl = df1['Petal.Length']
pw = df1['Petal.Width']
df2 = pd.DataFrame({'size':[pl * pw]})
newdata = df1 + df2
print(newdata)
Apart from the CSV export, I also want to loop through the file and do calculations based on the size. For example, if the size is > 8, write 'Setosa' as the flower name in the same row, in a new column named "Flower type". I have been trying to find a way to loop through a dataframe and create another column based on a calculation, but I have had no luck.
Here is some sample data from what the instructor gave us:
id,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width
1,5.1,3.5,1.4,0.2
2,4.9,3,1.4,0.2
3,4.7,3.2,1.3,0.2
4,4.6,3.1,1.5,0.2
5,5,3.6,1.4,0.2
6,5.4,3.9,1.7,0.4
7,4.6,3.4,1.4,0.3
8,5,3.4,1.5,0.2
Thanks in advance.
You can assign directly to a new column:
df1['size'] = df1['Petal.Length'] * df1['Petal.Width']
print(df1)
Output:
    Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  size
id
1            5.1          3.5           1.4          0.2  0.28
2            4.9          3.0           1.4          0.2  0.28
3            4.7          3.2           1.3          0.2  0.26
4            4.6          3.1           1.5          0.2  0.30
5            5.0          3.6           1.4          0.2  0.28
6            5.4          3.9           1.7          0.4  0.68
7            4.6          3.4           1.4          0.3  0.42
8            5.0          3.4           1.5          0.2  0.30
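The attempt in the question loads the file with pd.DataFrame.from_csv, which was deprecated in pandas 0.21 and removed in pandas 1.0. As a minimal sketch, the same load can be done with pd.read_csv. The data here is the sample from the question, fed through io.StringIO only to keep the example self-contained; with the real file you would pass its path instead.

```python
import io

import pandas as pd

# Sample rows copied from the question; with the real file, pass its path
# (for example 'Sample.csv') to read_csv instead of a StringIO buffer.
sample_csv = """id,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width
1,5.1,3.5,1.4,0.2
2,4.9,3,1.4,0.2
3,4.7,3.2,1.3,0.2
4,4.6,3.1,1.5,0.2
5,5,3.6,1.4,0.2
6,5.4,3.9,1.7,0.4
7,4.6,3.4,1.4,0.3
8,5,3.4,1.5,0.2
"""

# index_col="id" makes the id column the row index.
df1 = pd.read_csv(io.StringIO(sample_csv), index_col="id")

# Element-wise product of two columns, stored as a new column.
df1["size"] = df1["Petal.Length"] * df1["Petal.Width"]
print(df1)
```

The multiplication is vectorised, so no explicit loop over the rows is needed.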
Add Setosa for all sizes greater than 0.3:
df1.loc[df1['size'] > 0.3, 'Flower.Type'] = 'Setosa'
print(df1)
Output:
    Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  size Flower.Type
id
1            5.1          3.5           1.4          0.2  0.28         NaN
2            4.9          3.0           1.4          0.2  0.28         NaN
3            4.7          3.2           1.3          0.2  0.26         NaN
4            4.6          3.1           1.5          0.2  0.30      Setosa
5            5.0          3.6           1.4          0.2  0.28         NaN
6            5.4          3.9           1.7          0.4  0.68      Setosa
7            4.6          3.4           1.4          0.3  0.42      Setosa
8            5.0          3.4           1.5          0.2  0.30      Setosa
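Rows 4 and 8 are labelled even though their size displays as 0.30, because 1.5 * 0.2 is stored as 0.30000000000000004 in floating point and so passes the > 0.3 test. If that is not wanted, one possible approach is to round before comparing. The sketch below also uses numpy.where to give every row a label instead of leaving NaN. The fallback label 'Other' is a hypothetical placeholder, not something from the original data.

```python
import numpy as np
import pandas as pd

# Three rows from the sample data, enough to show the boundary case.
df1 = pd.DataFrame(
    {"Petal.Length": [1.4, 1.5, 1.7], "Petal.Width": [0.2, 0.2, 0.4]},
    index=pd.Index([1, 4, 6], name="id"),
)
df1["size"] = df1["Petal.Length"] * df1["Petal.Width"]

# Round first so that 1.5 * 0.2 (0.30000000000000004) is compared as 0.3.
is_large = df1["size"].round(2) > 0.3

# np.where picks 'Setosa' where the mask is True and the fallback elsewhere.
# 'Other' is a hypothetical placeholder label.
df1["Flower.Type"] = np.where(is_large, "Setosa", "Other")
print(df1)
```

With rounding, only the row with id 6 in this small frame is labelled 'Setosa'.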
You can also combine multiple conditions (the output below assumes the Flower.Type column has not been filled in yet):
df1.loc[(df1['size'] > 0.3) & (df1['size'] < 0.5), 'Flower.Type'] = 'Setosa'
print(df1)
Output:
    Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  size Flower.Type
id
1            5.1          3.5           1.4          0.2  0.28         NaN
2            4.9          3.0           1.4          0.2  0.28         NaN
3            4.7          3.2           1.3          0.2  0.26         NaN
4            4.6          3.1           1.5          0.2  0.30      Setosa
5            5.0          3.6           1.4          0.2  0.28         NaN
6            5.4          3.9           1.7          0.4  0.68         NaN
7            4.6          3.4           1.4          0.3  0.42      Setosa
8            5.0          3.4           1.5          0.2  0.30      Setosa
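The question also mentions exporting the result to CSV. A minimal sketch using DataFrame.to_csv follows. Called without a path it returns the CSV text, which keeps the example self-contained; the file name in the comment is a hypothetical one.

```python
import pandas as pd

# Three rows from the sample data.
df1 = pd.DataFrame(
    {"Petal.Length": [1.4, 1.7, 1.4], "Petal.Width": [0.2, 0.4, 0.3]},
    index=pd.Index([1, 6, 7], name="id"),
)
df1["size"] = df1["Petal.Length"] * df1["Petal.Width"]
df1.loc[(df1["size"] > 0.3) & (df1["size"] < 0.5), "Flower.Type"] = "Setosa"

# Without a path, to_csv returns the CSV as a string.
# float_format limits the floats to two decimals; NaN is written as an empty field.
csv_text = df1.to_csv(float_format="%.2f")
print(csv_text)

# To write a file instead, pass a path (hypothetical file name):
# df1.to_csv("iris_with_size.csv", float_format="%.2f")
```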