
How to store values from loop to a dataframe?

Tags: python, pandas

I have written a loop that generates some values, and I want to store those values in a data frame. For example, once the first iteration completes, its values should go into the first row.

def calculate (allFiles):

    result = pd.DataFrame(columns = ['Date','Mid Ebb Total','Mid Flood Total','Mid Ebb Control','Mid Flood Control'])

    total_Mid_Ebb = 0
    total_Mid_Flood = 0
    total_Mid_EbbControl = 0
    total_Mid_FloodControl = 0

    for file_ in allFiles:
        xls = pd.ExcelFile(file_)
        df = xls.parse('General Impact')
        Mid_Ebb = df[df['Tidal Mode'] == "Mid-Ebb"] #filter 
        Mid_Ebb_control = df[df['Station'].isin(['C1','C2','C3'])] #filter control
        Mid_Flood = df[df['Tidal Mode'] == "Mid-Flood"] #filter
        Mid_Flood_control = df[df['Station'].isin(['C1','C2','C3', 'SR2'])] #filter control
        total_Mid_Ebb += Mid_Ebb.Station.nunique() #count unique stations = sample number
        total_Mid_Flood += Mid_Flood.Station.nunique()
        total_Mid_EbbControl += Mid_Ebb_control.Station.nunique()
        total_Mid_FloodControl += Mid_Flood_control.Station.nunique()

    Mid_Ebb_withoutControl = total_Mid_Ebb - total_Mid_EbbControl
    Mid_Flood_withoutControl = total_Mid_Flood - total_Mid_FloodControl

    print('Ebb Tide: The total number of sample is {}. Number of sample without control station is {}. Number of sample in control station is {}'.format(total_Mid_Ebb, Mid_Ebb_withoutControl, total_Mid_EbbControl))
    print('Flood Tide: The total number of sample is {}. Number of sample without control station is {}. Number of sample in control station is {}'.format(total_Mid_Flood, Mid_Flood_withoutControl, total_Mid_FloodControl))

The dataframe result has a Date column plus four value columns. The date is fixed. I would like to put total_Mid_Ebb, Mid_Ebb_withoutControl and total_Mid_EbbControl into the dataframe.

JOHN asked Jan 18 '18 05:01



1 Answer

I believe you need to append the scalars to a list of tuples inside the loop and then pass that list to the DataFrame constructor. Finally, compute the differences as new columns of the result DataFrame:

import pandas as pd

def calculate(allFiles):

    data = []
    for file_ in allFiles:
        xls = pd.ExcelFile(file_)
        df = xls.parse('General Impact')
        Mid_Ebb = df[df['Tidal Mode'] == "Mid-Ebb"] #filter 
        Mid_Ebb_control = df[df['Station'].isin(['C1','C2','C3'])] #filter control
        Mid_Flood = df[df['Tidal Mode'] == "Mid-Flood"] #filter
        Mid_Flood_control = df[df['Station'].isin(['C1','C2','C3', 'SR2'])] #filter control
        total_Mid_Ebb = Mid_Ebb.Station.nunique() #count unique stations = sample number
        total_Mid_Flood = Mid_Flood.Station.nunique()
        total_Mid_EbbControl = Mid_Ebb_control.Station.nunique()
        total_Mid_FloodControl = Mid_Flood_control.Station.nunique()
        data.append((total_Mid_Ebb, 
                     total_Mid_Flood, 
                     total_Mid_EbbControl, 
                     total_Mid_FloodControl))

    cols=['total_Mid_Ebb','total_Mid_Flood','total_Mid_EbbControl','total_Mid_FloodControl']

    result = pd.DataFrame(data, columns=cols)
    result['Mid_Ebb_withoutControl'] = result.total_Mid_Ebb - result.total_Mid_EbbControl
    result['Mid_Flood_withoutControl']=result.total_Mid_Flood-result.total_Mid_FloodControl

    # optional: print the column totals across all files
    total = result.sum()
    print(total)


    return result
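
To try the same pattern without any Excel files, here is a minimal sketch that collects one tuple per iteration and builds the DataFrame once at the end. The function name `summarize`, the two small in-memory sheets and the date labels are made up for illustration; only the 'Station' and 'Tidal Mode' column names come from the question. It also shows one possible way to carry a Date value into each row, assuming a date is known for every file.

```python
import pandas as pd

def summarize(frames):
    """Build one summary row per input frame.

    `frames` is a hypothetical list of (date_label, DataFrame) pairs that
    stands in for the parsed 'General Impact' sheets.
    """
    rows = []
    for date_label, df in frames:
        ebb = df[df['Tidal Mode'] == 'Mid-Ebb']
        flood = df[df['Tidal Mode'] == 'Mid-Flood']
        # One tuple per iteration; the DataFrame is built once, after the loop.
        rows.append((date_label,
                     ebb['Station'].nunique(),
                     flood['Station'].nunique()))
    return pd.DataFrame(rows, columns=['Date', 'Mid Ebb Total', 'Mid Flood Total'])

# Two small made-up sheets (illustrative data only)
sheet_a = pd.DataFrame({
    'Station': ['C1', 'S1', 'S1', 'C1', 'S2'],
    'Tidal Mode': ['Mid-Ebb', 'Mid-Ebb', 'Mid-Ebb', 'Mid-Flood', 'Mid-Flood'],
})
sheet_b = pd.DataFrame({
    'Station': ['C2', 'S3'],
    'Tidal Mode': ['Mid-Ebb', 'Mid-Flood'],
})

summary = summarize([('2018-01-01', sheet_a), ('2018-01-02', sheet_b)])
print(summary)
```

Building the list first and calling the constructor once is usually preferable to growing a DataFrame row by row inside the loop, because each row-wise append copies the existing data.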
jezrael answered Nov 04 '22 15:11
