I didn't really know how to give a good descriptive title, but here's my question. Let's consider a DataFrame df
:
col_name
0 Category1
1 item1()
2 item2()
3 Category2
4 item3()
5 item4()
6 item5()
I need to get this:
categories items
0 Category1 item1
1 Category1 item2
2 Category2 item3
3 Category2 item4
4 Category2 item5
But categories
could be continents and items
could be countries.
I know that all the items have ()
with an expression inside, so I can easily provide a boolean mask and then create a list of categories
with:
msk = df[~df['col_name'].str.contains('[^A-Za-z\s]')]['col_name'].tolist()
But now, now I'm stuck. Could you please give me any piece of advice?
Let us do startswith
find the category row and create the other column with ffill
df['category']=df.col_name.mask(df.col_name.str.endwith('Category')).ffill()
#df['category']=df.col_name.mask(df.col_name.str.endswith(')')).ffill()
df=df[df.category!=df.col_name]
df
Out[241]:
col_name category
1 item1() Category1
2 item2() Category1
4 item3() Category2
5 item4() Category2
6 item5() Category2
Here is necessary specify how distinguish non category or category values. In these solution are tested values if (
in data, then replace these values by missing values and forward filling them, then replace ()
and last filter by original mask:
m = df['col_name'].str.contains('(', regex=False)
df['categories'] = df['col_name'].mask(m).ffill()
df['items'] = df.pop('col_name').str.replace('[\(\)]', '')
df = df[m]
print (df)
categories items
1 Category1 item1
2 Category1 item2
4 Category2 item3
5 Category2 item4
6 Category2 item5
With your mask with added digits is solution changed by:
m = df['col_name'].str.contains('[^A-Za-z0-9\s]')
df['categories'] = df['col_name'].mask(m).ffill()
df['items'] = df.pop('col_name').str.replace('[\()]', '')
df = df[m]
print (df)
categories items
1 Category1 item1
2 Category1 item2
4 Category2 item3
5 Category2 item4
6 Category2 item5
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With