I have a table in a pandas DataFrame:
bigram frequency
(123,3245) 2
(676,35346) 84
(93,32) 9
and so on, for 50 rows.
What I am looking for is a way to split the bigram column into two separate columns, removing the brackets and the comma, like this:
col1 col2 frequency
123 3245 2
676 35346 84
93 32 9
Is there any way to split it at the comma and remove the brackets?
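If your bigram column is stored as strings, you can use the .str.extract() method with a regex to pull the numbers out of it. The pattern should be a raw string so that \d is not treated as a string escape, and the extracted values are strings, so append .astype(int) if you need integers:
pd.concat([df.bigram.str.extract(r'(?P<col1>\d+),(?P<col2>\d+)'), df.frequency], axis=1)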
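For reference, here is a minimal runnable sketch of the string case. The sample frame is an assumption built from the three rows shown in the question, and the optional \s* in the pattern and the .astype(int) conversion are additions of mine, not part of the one-liner above:

```python
import pandas as pd

# Sample data modelled on the question; bigram values are strings here (assumed).
df = pd.DataFrame({
    "bigram": ["(123,3245)", "(676,35346)", "(93,32)"],
    "frequency": [2, 84, 9],
})

# Named groups become the column names of the extracted frame.
# \s* tolerates an optional space after the comma, e.g. "(123, 3245)".
parts = df["bigram"].str.extract(r"(?P<col1>\d+),\s*(?P<col2>\d+)")

# str.extract returns strings; convert once every row is known to match.
parts = parts.astype(int)

result = pd.concat([parts, df["frequency"]], axis=1)
print(result)
```

If you are not sure which case applies, df["bigram"].map(type).unique() shows whether the column holds str or tuple objects.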

Or, if the bigram column holds actual tuples:
Method 1: use pd.Series to create the columns from each tuple:
pd.concat([df.bigram.apply(lambda x: pd.Series(x, index=['col1', 'col2'])),
df.frequency], axis=1)
Method 2: use .str to get the first and second elements of each tuple:
df['col1'], df['col2'] = df.bigram.str[0], df.bigram.str[1]
df = df.drop('bigram', axis=1)
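A small self-contained sketch of the tuple case, using Method 2 on sample data assumed from the question. The final column reordering is my addition, so that the result matches the col1, col2, frequency layout asked for:

```python
import pandas as pd

# Sample data modelled on the question; bigram values are real tuples here (assumed).
df = pd.DataFrame({
    "bigram": [(123, 3245), (676, 35346), (93, 32)],
    "frequency": [2, 84, 9],
})

# .str[i] indexes into each tuple element-wise, not only into strings.
df["col1"] = df["bigram"].str[0]
df["col2"] = df["bigram"].str[1]

# Drop the original column and put the new columns first.
df = df.drop(columns="bigram")[["col1", "col2", "frequency"]]
print(df)
```

Method 1 should give the same values, but it builds a Series per row, so it is likely to be the slower of the two on larger frames.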