I've got a string which looks like:
a1\tb1\tc1\na2\tb2\tc2\na3\tb3\tc3\n...
Is there an efficient and smart way to convert this kind of string into a Pandas DataFrame? StringIO seems not to be correct for this approach.
Thanks in advance!!
StringIO works perfectly.
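import io
import pandas as pd
string = 'a1\tb1\tc1\na2\tb2\tc2\na3\tb3\tc3'
# sep=r'\s+' splits on any run of whitespace; it replaces delim_whitespace=True, which is deprecated in recent pandas
pd.read_csv(io.StringIO(string), sep=r'\s+', header=None)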
0 1 2
0 a1 b1 c1
1 a2 b2 c2
2 a3 b3 c3
You can also use pd.read_table or pd.read_fwf in the same manner:
pd.read_table(io.StringIO(string), header=None)
Or,
pd.read_fwf(io.StringIO(string), header=None)
0 1 2
0 a1 b1 c1
1 a2 b2 c2
2 a3 b3 c3
In these last two examples, no delimiter is passed: pd.read_table splits on tabs by default, and pd.read_fwf infers the column boundaries from the whitespace between fields. Either way, your raw string must keep a consistent structure from row to row.
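If your values can contain spaces, it may be safer to name the tab delimiter explicitly rather than rely on whitespace. Here is a minimal sketch of that idea; the sample string and the column names a, b, c are made up for illustration and are not from the question:

```python
import io
import pandas as pd

# Hypothetical sample: the second field of the first row contains a space.
raw = 'a1\tb 1\tc1\na2\tb2\tc2\n'

# Splitting on tabs only keeps 'b 1' together as a single value.
# names=... labels the columns (assumed names, purely illustrative).
df = pd.read_csv(io.StringIO(raw), sep='\t', header=None, names=['a', 'b', 'c'])
print(df)
```

With a whitespace separator, the first row would instead be split into four fields, so the explicit sep='\t' is the more predictable choice for tab-separated data.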
Finally, you can also use a string splitting approach, splitting on newlines first, and then on whitespace (str.split with no argument splits on any run of whitespace, tabs included):
pd.DataFrame(list(map(str.split, string.splitlines())))
0 1 2
0 a1 b1 c1
1 a2 b2 c2
2 a3 b3 c3
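One caveat with the splitting approach: a row with an empty field would come out shorter than the others. A variant that splits on '\t' explicitly keeps empty fields in place. This is only a sketch; the sample string and column names below are assumptions for illustration:

```python
import pandas as pd

# Hypothetical sample: the middle field of the first row is empty.
raw = 'a1\t\tc1\na2\tb2\tc2'

# Split each line on tabs only, so an empty field becomes '' instead of vanishing.
rows = [line.split('\t') for line in raw.splitlines()]

# Column names are assumed here; drop `columns` to get 0, 1, 2 as before.
df = pd.DataFrame(rows, columns=['a', 'b', 'c'])
print(df)
```

Note that every value stays a plain string with this approach; unlike pd.read_csv, nothing is converted to numbers or NaN for you.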