
Read all rows of a column and check for delimiter and if found create column with that data (need of 4 columns) in Python

I have a column of data like the one below (dtype: object):

    Column A 
1324@Hi how are you//where 
are you: I am in London@Cool place@Nice
5649@Hello Christina@Awesome Trip 
@Fantastic 

Expected output:

Col A  Col B                    Col C         Col D
1324   Hi how are you//where    Cool place    Nice
       are you: I am in London
5649   Hello Christina          Awesome Trip  Fantastic

I need to check for the delimiter "@" in all rows. The first four "@"-separated values should become four columns, and the next four values should be appended as the next row of the same four columns, as shown in the table above.

I would be grateful for any possible solution. Thanks in advance.

Rohit asked Nov 23 '25 21:11


2 Answers

A quick way to get your dataframe is to pass expand=True to str.split. This only works if each record sits on its own row. And if you can accept Col 0 instead of Col A, this becomes an easy task.

df['Column A'].str.split('@', expand=True).add_prefix('Col ')
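If the lettered names from the question are needed, one option is to rename the columns after the split. This is only a sketch: the small frame and the name `wide` are made up for illustration, and it assumes there are no more than 26 resulting columns.

```python
import string

import pandas as pd

# Illustrative data: one complete record per row
df = pd.DataFrame({
    "Column A": [
        "1324@Hi how are you@Cool place@Nice",
        "5649@Hello Christina@Awesome Trip@Fantastic",
    ]
})

# Split on '@' into separate columns (integer labels 0, 1, 2, ...)
wide = df["Column A"].str.split("@", expand=True)

# Relabel 0, 1, 2, 3 as Col A, Col B, Col C, Col D
letters = string.ascii_uppercase[: wide.shape[1]]
wide.columns = [f"Col {letter}" for letter in letters]

print(wide)
```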

Full example

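from io import StringIO

import pandas as pd

data = '''\
Column A
1324@Hi how are you//where are you: I am in London@Cool place@Nice
5649@Hello Christina@Awesome Trip@Fantastic'''

# pd.compat.StringIO is not available in current pandas; io.StringIO does the same job
fileobj = StringIO(data)
# '|' never appears in the data, so every line is read as a single column
df = pd.read_csv(fileobj, sep='|')
df2 = df['Column A'].str.split('@', expand=True).add_prefix('Col ')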

print(df2)

Prints:

  Col 0                                          Col 1         Col 2  \
0  1324  Hi how are you//where are you: I am in London    Cool place   
1  5649                                Hello Christina  Awesome Trip   

       Col 3  
0       Nice  
1  Fantastic  
Anton vBR answered Nov 25 '25 11:11


You can use split for this operation:

df['Column A'].str.split('@').tolist()

The output is a list of lists, which can be used to build a new dataframe as required:

[['1324',
  'Hi how are you//where are you: I am in London',
  'Cool place',
  'Nice'],
 ['5649', 'Hello Christina', 'Awesome Trip ', 'Fantastic']]

To create a new dataframe directly from the split values, you can use:

new_df = pd.DataFrame(df['Column A'].str.split('@').tolist(),
                      columns=['a', 'b', 'c', 'd'])

P.S. The number of column names must equal the maximum number of fields in any element of the column you intend to split, i.e. one more than the maximum number of @ delimiters.
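
If the number of fields is not known up front, the column names could be generated from the widest row instead of being hard-coded. A minimal sketch, assuming made-up data in which one row is short; the `col_0`, `col_1`, ... naming is just an illustration:

```python
import pandas as pd

# Illustrative data: the second row has only three fields
df = pd.DataFrame({
    "Column A": [
        "1324@Hi how are you@Cool place@Nice",
        "5649@Hello Christina@Awesome Trip",
    ]
})

parts = df["Column A"].str.split("@").tolist()

# Width of the widest row decides how many column names are needed
width = max(len(p) for p in parts)
names = [f"col_{i}" for i in range(width)]

# Shorter rows are padded with missing values on the right
new_df = pd.DataFrame(parts, columns=names)
print(new_df)
```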

The result is a dataframe with the columns a, b, c and d, holding one split record per row.

Inder answered Nov 25 '25 10:11