pandas read text file into a dataframe

Tags: python, pandas

I have a .txt file

[7, 9, 20, 30, 50]  [1-8]
[9, 14, 27, 31, 45]  [2-5]
[7, 10, 22, 27, 38]  [1-7]

that I am trying to read into a data frame with two columns using df = pd.read_fwf(readfile, header=None). Instead of two columns, it produces a data frame with three columns, and sometimes it splits the first list of numbers into five columns:

    0              1      2
0   [7, 9, 20, 30, 50]  [1-8]
1   [9, 14, 27, 31, 45] [2-5]
2   [7, 10, 22, 27, 38] [1-7]

I do not understand what I am doing wrong. Could someone please help?

user1478335 asked Feb 24 '26 21:02


1 Answer

You can exploit the two spaces between the lists. Note the raw string, which keeps Python from treating \s as a string escape:

pd.read_csv(readfile, sep=r'\s\s', header=None, engine='python')

Out:

                     0      1
0   [7, 9, 20, 30, 50]  [1-8]
1  [9, 14, 27, 31, 45]  [2-5]
2  [7, 10, 22, 27, 38]  [1-7]

pd.read_fwf without an explicit widths (or colspecs) argument tries to infer the fixed widths from the data. But the length of the first list varies from line to line, so there is no fixed width that separates every line into two columns.
The widths argument is very useful if your data has no delimiter but a fixed number of characters per value. This was a common data format 40 years ago.

# data.txt
20200810ITEM02PRICE30COUNT001
20200811ITEM03PRICE31COUNT012
20200812ITEM12PRICE02COUNT107

The sep argument of pd.read_csv accepts multi-character and regular-expression delimiters (with engine='python'). This is often a more flexible way to split strings into columns.
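
For the data in the question, a self-contained sketch might look like the following. It uses r'\s{2,}' (two or more whitespace characters) rather than exactly two, which is my own variation to tolerate uneven padding. The column names and the ast.literal_eval step that turns the first column into real Python lists are also assumptions, not something the question asked for.

```python
import ast
import io

import pandas as pd

# The three lines from the question, held in memory for a self-contained demo.
raw = """[7, 9, 20, 30, 50]  [1-8]
[9, 14, 27, 31, 45]  [2-5]
[7, 10, 22, 27, 38]  [1-7]
"""

df = pd.read_csv(
    io.StringIO(raw),
    sep=r"\s{2,}",  # regex: split only on runs of two or more whitespace chars
    header=None,
    names=["numbers", "interval"],  # hypothetical column names
    engine="python",  # regex separators need the Python parsing engine
)

# Optional: the first column is still text; parse it into actual lists.
df["numbers"] = df["numbers"].apply(ast.literal_eval)
print(df)
```

The single spaces after the commas inside the brackets are left alone because the pattern only matches two or more whitespace characters in a row.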

Michael Szczesny answered Feb 26 '26 09:02


