I have a .txt file
[7, 9, 20, 30, 50] [1-8]
[9, 14, 27, 31, 45] [2-5]
[7, 10, 22, 27, 38] [1-7]
that I am trying to read into a two-column data frame using df = pd.read_fwf(readfile, header=None)
Instead of two columns, it produces a data frame with three columns, and sometimes it splits the first list of numbers into five columns:
0 1 2
0 [7, 9, 20, 30, 50] [1-8]
1 [9, 14, 27, 31, 45] [2-5]
2 [7, 10, 22, 27, 38] [1-7]
I do not understand what I am doing wrong. Could someone please help?
You can exploit the two spaces between the lists and use them as the delimiter:
pd.read_csv(readfile, sep=r'\s\s', header=None, engine='python')
Out:
0 1
0 [7, 9, 20, 30, 50] [1-8]
1 [9, 14, 27, 31, 45] [2-5]
2 [7, 10, 22, 27, 38] [1-7]
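To try this without a file on disk, here is a minimal self-contained sketch. It assumes the two bracketed groups are separated by two spaces on every line, as described above; the io.StringIO buffer only stands in for the real file. It also uses r"\s{2,}" (two or more whitespace characters) rather than exactly two, which is a small variation and not part of the original answer.

```python
import io

import pandas as pd

# Sample text standing in for the .txt file. Assumption: two spaces
# separate the two bracketed groups on every line.
raw = (
    "[7, 9, 20, 30, 50]  [1-8]\n"
    "[9, 14, 27, 31, 45]  [2-5]\n"
    "[7, 10, 22, 27, 38]  [1-7]\n"
)

# A regex separator requires the python engine. The raw string keeps
# the backslash intact, and the single spaces inside the first list
# do not match the two-or-more pattern.
df = pd.read_csv(io.StringIO(raw), sep=r"\s{2,}", header=None, engine="python")

print(df.shape)
print(df)
```

Both columns come back as plain strings, so turning them into Python lists or ranges would be a separate step.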
pd.read_fwf without an explicit widths argument tries to infer the fixed widths. But the length of the first list varies from line to line, so there is no fixed width that splits every line into the same two columns.
The widths argument is very useful if your data has no delimiter but a fixed number of characters per value. 40 years ago this was a common data format.
# data.txt
20200810ITEM02PRICE30COUNT001
20200811ITEM03PRICE31COUNT012
20200812ITEM12PRICE02COUNT107
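As a rough sketch of how the sample above could be read with explicit widths: each line splits into fields of 8, 6, 7 and 8 characters. The column names are made up for illustration, and io.StringIO stands in for data.txt.

```python
import io

import pandas as pd

# Contents of data.txt from the example above, held in memory.
raw = (
    "20200810ITEM02PRICE30COUNT001\n"
    "20200811ITEM03PRICE31COUNT012\n"
    "20200812ITEM12PRICE02COUNT107\n"
)

# Field widths in characters: date (8), item (6), price (7), count (8).
widths = [8, 6, 7, 8]

# The column names are hypothetical; the file itself has no header row.
df = pd.read_fwf(
    io.StringIO(raw),
    widths=widths,
    header=None,
    names=["date", "item", "price", "count"],
)

print(df)
```

Because every field has the same length on every line, no delimiter is needed here, which is exactly the situation the two-list file does not satisfy.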
The sep argument of pd.read_csv accepts multi-character and regex delimiters. This is often a more flexible way to split strings into columns.