I have a column of codes in "dataset.csv":
0020-004241 purple
00532 - Blue
00121 - Yellow
055 - Greem
0025-097 - Orange
Desired Output:
code name_of_code
0020-004241 purple
00532 blue
I want the codes and the words for the codes to be split into two different columns.
I tried:
df = pandas.read_csv("dataset.csv")
df = pandas.concat([df, df.columnname.str.split('/s', expand=True)], 1)
df = pandas.concat([df, df.columnname.str.split('-', expand=True)], 1)
It gave this unexpected output: purple none blue none yellow none green none orange none
How should I split this data correctly?
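Use str.split(" ", n=1) to split on the first space only, then strip the leftover "- " from the name.
Example:
import pandas as pd
# Read the single column; the file has no header row, so name the column here
df = pd.read_csv("dataset.csv", names=['code'])
# Split once on the first space; n is passed by keyword, which recent pandas versions require
df[['code', 'name_of_code']] = df["code"].str.split(" ", n=1, expand=True)
# Strip both the dash and the space that follows it, e.g. "- Blue" -> "Blue"
df["name_of_code"] = df["name_of_code"].str.strip("- ")
print(df)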
Output:
code name_of_code
0 0020-004241 purple
1 00532 Blue
2 00121 Yellow
3 055 Greem
4 0025-097 Orange
You can process this via a couple of split calls:
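import pandas as pd
df = pd.DataFrame({'col': ['0020-004241 purple', '00532 - Blue',
                           '00121 - Yellow', '055 - Greem',
                           '0025-097 - Orange']})
# First split: cut once on the first run of whitespace -> code, remainder
df[['col1', 'col2']] = df['col'].str.split(n=1, expand=True)
# Second split: keep only the last whitespace-separated token of the remainder
df['col2'] = df['col2'].str.split().str[-1]
print(df)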
col col1 col2
0 0020-004241 purple 0020-004241 purple
1 00532 - Blue 00532 Blue
2 00121 - Yellow 00121 Yellow
3 055 - Greem 055 Greem
4 0025-097 - Orange 0025-097 Orange
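As an alternative to two split calls, the same result can probably be obtained in one pass with str.extract and a regular expression. This is only a sketch: the pattern assumes that every code is a leading run of digits and hyphens, that the separator is whitespace optionally followed by "- ", and that the name is whatever remains. The column names code and name_of_code are taken from the desired output in the question, and the lowercasing step is included only because that desired output shows lowercase names.

```python
import pandas as pd

df = pd.DataFrame({'col': ['0020-004241 purple', '00532 - Blue',
                           '00121 - Yellow', '055 - Greem',
                           '0025-097 - Orange']})

# Assumed layout: <digits/hyphens> <whitespace> [optional "- "] <name>
pattern = r'^(?P<code>[\d-]+)\s*(?:-\s+)?(?P<name_of_code>.+)$'

# Named groups become the column names of the resulting DataFrame
out = df['col'].str.extract(pattern)

# Optional: trim stray whitespace and lowercase the names
out['name_of_code'] = out['name_of_code'].str.strip().str.lower()

print(out)
```

Rows that do not match the pattern come back as NaN in both columns, which makes unexpected formats in the file easy to spot.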