I want to concatenate strings that are spread over several rows into one string, according to an ID. I am using the pandas library (Python 3).
val id
Cat 1
Tiger 2
Ball 3
Bat 1
bill 2
dog 1
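l = []
a = 0
lendata = len(df)
while a < lendata:
    if df["id"][a] == 1:
        # An id of 1 starts a new group: write out the previous one first.
        if a != 0:
            df.loc[tmp, "val"] = ' '.join(l)
            l = []
        tmp = a
        l.append(df["val"][a])
    else:
        l.append(df["val"][a])
    a += 1
# Write out the last group, which the loop itself never flushes.
if l:
    df.loc[tmp, "val"] = ' '.join(l)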
This works with loops. I need this result:
val
Cat Tiger Ball
Bat bill
dog
Note that this is not a plain group by on id: a new group starts every time id is 1.
Question: Do you know how to do it with pandas functions? Thanks.
Staying in pandas:
df['group'] = (df['id'] == 1).cumsum()
df.groupby('group')['val'].apply(' '.join).reset_index()
   group             val
0      1  Cat Tiger Ball
1      2        Bat bill
2      3             dog
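The first line defines the groups according to your definition: a new group starts at every row where id is 1, so the cumulative sum of that boolean mask numbers the groups 1, 2, 3. The second line is a standard groupby operation that joins the values of each group with a space.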
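For reference, here is a self-contained sketch of the same idea on the data from the question. The DataFrame is rebuilt by hand, and grouping by a separate Series instead of adding a 'group' column is just one possible variation, not a requirement.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "val": ["Cat", "Tiger", "Ball", "Bat", "bill", "dog"],
    "id": [1, 2, 3, 1, 2, 1],
})

# Every row with id == 1 starts a new group; the running sum of that
# boolean mask gives each row its group number: 1, 1, 1, 2, 2, 3.
group = (df["id"] == 1).cumsum().rename("group")

# Join the strings of each group with a single space.
result = df.groupby(group)["val"].agg(" ".join).reset_index(drop=True)
print(result.tolist())  # ['Cat Tiger Ball', 'Bat bill', 'dog']
```

Grouping by the Series leaves df itself unchanged, which may be convenient if the helper column is not needed afterwards.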
You can also create an array of row positions like so:
a = np.array(range(len(df)))
Then you create a third column equal to your id minus that array. Rows that belong together get the same value in this third column.
df['regroup'] = df['id'].subtract(a)
Out:
   id    val  regroup
0   1    Cat        1
1   2  Tiger        1
2   3   Ball        1
3   1    Bat       -2
You can now use a groupby to get your desired output:
In [1] : df.groupby(['regroup'])['val'].apply(' '.join)
Out[1] : regroup
-2 Bat
1 Cat Tiger Ball
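As a rough sketch, this is how the approach looks on all six rows from the question. It assumes that id counts up by exactly one within each group; if ids can skip values, the difference is no longer constant and the trick breaks. The use of np.arange and sort=False is my own choice here, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "val": ["Cat", "Tiger", "Ball", "Bat", "bill", "dog"],
    "id": [1, 2, 3, 1, 2, 1],
})

# While id increases by 1 per row, id minus the row position stays
# constant, so each run of consecutive ids shares one key.
regroup = (df["id"] - np.arange(len(df))).rename("regroup")
print(regroup.tolist())  # [1, 1, 1, -2, -2, -4]

# sort=False keeps the groups in order of first appearance instead of
# sorting them by key (which would put -4 first).
joined = df.groupby(regroup, sort=False)["val"].agg(" ".join)
print(joined.tolist())  # ['Cat Tiger Ball', 'Bat bill', 'dog']
```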