I stuck with the problem how to divide a pandas dataframe by row,
I have a similar dataframe with a column where values are separated by \r\n and they are in one cell,
    Color                              Shape  Price
0  Green  Rectangle\r\nTriangle\r\nOctangle     10
1   Blue              Rectangle\r\nTriangle     15 
I need to divide this cell into several cells with the same values as other columns, e.g.
   Color      Shape  Price
0  Green  Rectangle     10
1  Green   Triangle     10
2  Green   Octangle     10
3   Blue  Rectangle     15
4   Blue    Tringle     15
How do I do it well?
How do you split a row in a data frame? Using the iloc() function to split DataFrame in Python We can use the iloc() function to slice DataFrames into smaller DataFrames. The iloc() function allows us to access elements based on the index of rows and columns.
Series and DataFrame methods define a . explode() method that explodes lists into separate rows. See the docs section on Exploding a list-like column. Since you have a list of comma separated strings, split the string on comma to get a list of elements, then call explode on that column.
Slicing Rows and Columns by Index PositionWhen slicing by index position in Pandas, the start index is included in the output, but the stop index is one step beyond the row you want to select. So the slice return row 0 and row 1, but does not return row 2. The second slice [:] indicates that all columns are required.
split() Pandas provide a method to split string around a passed separator/delimiter. After that, the string can be stored as a list in a series or it can also be used to create multiple column data frames from a single separated string.
You can do:
df["Shape"]=df["Shape"].str.split("\r\n")
print(df.explode("Shape").reset_index(drop=True))
Output:
   Color    Shape   Price
0   Green   Rectangle   10
1   Green   Triangle    10
2   Green   Octangle    10
3   Blue    Rectangle   15
4   Blue    Triangle    15
                        This might not be the most efficient way to do it but I can confirm that it works with the sample df:
data = [['Green', 'Rectangle\r\nTriangle\r\nOctangle', 10], ['Blue', 'Rectangle\r\nTriangle', 15]]   
df = pd.DataFrame(data, columns = ['Color', 'Shape', 'Price'])
new_df = pd.DataFrame(columns = ['Color', 'Shape', 'Price'])
for index, row in df.iterrows():
    split = row['Shape'].split('\r\n')
    for shape in split:
        new_df = new_df.append(pd.DataFrame({'Color':[row['Color']], 'Shape':[shape], 'Price':[row['Price']]}))
new_df = new_df.reset_index(drop=True)
print(new_df)
Output:
   Color Price      Shape
0  Green    10  Rectangle
1  Green    10   Triangle
2  Green    10   Octangle
3   Blue    15  Rectangle
4   Blue    15   Triangle
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With