
Convert a tab- and newline-delimited string to pandas dataframe

I have a string of the following format:

aString = '123\t456\t789\n321\t654\t987 ...'

I would like to convert it to a pandas DataFrame:

frame:
  123 456 789
  321 654 987
  ...

I have tried to convert it to a Python list:

stringList = aString.split('\n')

which results in:

stringList = ['123\t456\t789',
              '321\t654\t987',
              ...
             ]

I have no idea what to do next.

Randy Tang asked Jan 09 '19 03:01



1 Answer

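One option is a list comprehension with str.split, applied to the stringList you already built. Note that the resulting cells are strings, not numbers:

import pandas as pd
pd.DataFrame([x.split('\t') for x in stringList], columns=list('ABC'))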

     A    B    C
0  123  456  789
1  321  654  987
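
Because str.split returns strings, you may want to cast the columns afterwards. Here is a minimal sketch of one way to do that with astype, assuming the sample data from the question (with the elided trailing rows left out) and the column names A, B, C used above:

```python
import pandas as pd

# Sample input from the question, without the elided trailing rows.
aString = '123\t456\t789\n321\t654\t987'

# Split into lines, then split each line into cells on the tab character.
rows = [line.split('\t') for line in aString.split('\n')]
frame = pd.DataFrame(rows, columns=list('ABC'))

# Every cell is still a str at this point; cast the whole frame to int.
frame = frame.astype(int)
print(frame.dtypes)
```

astype(int) raises a ValueError if any cell is not a valid integer, so this only fits data that is known to be clean.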

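Alternatively, you can wrap the original string in StringIO and let pd.read_csv parse it directly. This skips the manual split and infers numeric dtypes:

import pandas as pd
from io import StringIO
pd.read_csv(StringIO(aString), sep='\t', header=None)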

     0    1    2
0  123  456  789
1  321  654  987
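
If you want named columns rather than 0, 1, 2, read_csv accepts a names argument. This is a small sketch, assuming the same two sample rows from the question and the illustrative column names A, B, C:

```python
from io import StringIO

import pandas as pd

# Sample input from the question, without the elided trailing rows.
aString = '123\t456\t789\n321\t654\t987'

# StringIO makes the string look like a file; names= supplies the
# column labels, and header=None says the data has no header row.
frame = pd.read_csv(StringIO(aString), sep='\t', header=None, names=list('ABC'))
print(frame)
```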
It_is_Chris answered Oct 19 '22 10:10