I have a string:
    C1   C2                       DATE  C4    C5  C6   C7
0  0.0  W04  2021-01-08 00:00:00+00:00   E   EUE  C1  157
1  0.0  W04  2021-01-08 00:00:00+00:00   E   AEU  C1  157
2  0.0  W04  2021-01-01 00:00:00+00:00   E  SADA  H1  747
3  0.0  W04  2021-01-04 00:00:00+00:00   E  SSEA  H1  747
4  0.0  W04  2021-01-05 00:00:00+00:00   E  GPEA  H1  747
It sure looks like a Pandas DataFrame because it comes from one. I need to convert it into a Pandas DataFrame.
I tried the following:
pd.read_csv(StringIO(string_file),sep=r"\s+")
but it misaligns the columns and splits the DATE column into two columns.
First, recreate the string:
s = """
    C1   C2                       DATE  C4    C5  C6   C7
0  0.0  W04  2021-01-08 00:00:00+00:00   E   EUE  C1  157
1  0.0  W04  2021-01-08 00:00:00+00:00   E   AEU  C1  157
2  0.0  W04  2021-01-01 00:00:00+00:00   E  SADA  H1  747
3  0.0  W04  2021-01-04 00:00:00+00:00   E  SSEA  H1  747
4  0.0  W04  2021-01-05 00:00:00+00:00   E  GPEA  H1  747
"""
Now, you can use pandas.read_csv to import a buffer:
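from io import StringIO
import pandas as pd
# A regex separator needs the Python engine; \s\s+ means "two or more whitespace characters".
df = pd.read_csv(StringIO(s), sep=r"\s\s+", engine="python")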
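From what I can tell, this results in exactly the DataFrame that you are looking for. Because the separator requires at least two whitespace characters, the single space between the date and the time no longer splits the DATE column.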
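As a quick check, here is a minimal self-contained sketch that parses the text and inspects the result. It assumes the columns are separated by at least two spaces, as they are in printed DataFrame output. The data rows have one more field than the header, so pandas should treat the first field of each row as the index.

```python
from io import StringIO

import pandas as pd

# Assumption: columns are separated by two or more spaces,
# as in the printed output of a DataFrame.
s = """\
    C1   C2                       DATE  C4    C5  C6   C7
0  0.0  W04  2021-01-08 00:00:00+00:00   E   EUE  C1  157
1  0.0  W04  2021-01-08 00:00:00+00:00   E   AEU  C1  157
2  0.0  W04  2021-01-01 00:00:00+00:00   E  SADA  H1  747
3  0.0  W04  2021-01-04 00:00:00+00:00   E  SSEA  H1  747
4  0.0  W04  2021-01-05 00:00:00+00:00   E  GPEA  H1  747
"""

# Split only on runs of two or more whitespace characters, so the single
# space inside each DATE value is preserved. Regex separators are handled
# by the Python parsing engine.
df = pd.read_csv(StringIO(s), sep=r"\s\s+", engine="python")

# Inspect the column names and one DATE value to confirm it stayed whole.
print(df.columns.tolist())
print(df["DATE"].iloc[0])
```

If a real value ever contains two consecutive spaces, this separator would split it, so it is worth checking the column count after parsing.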
You may want to convert the DATE column to datetime values as well:
df['DATE'] = pd.to_datetime(df['DATE'])
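For the conversion step, here is a small hedged sketch using pd.to_datetime, which understands the +00:00 offset in the strings. The two-row frame is only a stand-in for the parsed data, and passing utc=True is my own choice to get a timezone-aware column rather than something the question requires.

```python
import pandas as pd

# Stand-in for the parsed frame: DATE holds strings with a UTC offset.
df = pd.DataFrame(
    {"DATE": ["2021-01-08 00:00:00+00:00", "2021-01-01 00:00:00+00:00"]}
)

# utc=True gives a timezone-aware datetime column normalized to UTC.
df["DATE"] = pd.to_datetime(df["DATE"], utc=True)

# The column now supports datetime accessors such as .dt.year.
print(df["DATE"].dtype)
print(df["DATE"].dt.year.tolist())
```

Recent pandas versions reject the unit-less 'datetime64' dtype in astype, which is one reason to prefer pd.to_datetime here.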