I'm trying to read in a file that looks like this:
1, 2,
3, 4,
I'm using the following line:
l1,l2 = numpy.loadtxt('file.txt',unpack=True,delimiter=', ')
This gives me an error because the end comma in each row is lumped together as the last element (e.g. "2" is read as "2,"). Is there a way to ignore the last comma in each row, with loadtxt or another function?
From the numpy.loadtxt documentation: Load data from a text file. Each row in the text file must have the same number of values.
fname : File, filename, list, or generator to read.
dtype : Data-type of the resulting array; default: float. If this is a structured data-type, the resulting array will be 1-dimensional, and each row will be interpreted as an element of the array.
unpack : When unpack is True, the returned array is transposed, so that arguments may be unpacked using x, y, z = loadtxt(...).
numpy.genfromtxt is a bit more robust.  If you use the default dtype (which is np.float64), it thinks there is a third column with missing values, so it creates a third column containing nan.  If you give it dtype=None (which tells it to figure out the data type from the file), it returns a third column containing all zeros.  Either way, you can ignore the last column by using usecols=[0, 1]:
In [14]: !cat trailing_comma.csv
1, 2,
3, 4,
Important note: I use delimiter=',', not delimiter=', '.
In [15]: np.genfromtxt('trailing_comma.csv', delimiter=',', dtype=None, usecols=[0,1])
Out[15]: 
array([[1, 2],
       [3, 4]])
In [16]: col1, col2 = np.genfromtxt('trailing_comma.csv', delimiter=',', dtype=None, usecols=[0,1], unpack=True)
In [17]: col1
Out[17]: array([1, 3])
In [18]: col2
Out[18]: array([2, 4])
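To see the extra column that genfromtxt creates with the default dtype, here is a minimal sketch that reads the same two rows from an in-memory string instead of trailing_comma.csv (the io.StringIO stand-in and the variable names are assumptions for illustration, not part of the original session):

```python
import io
import numpy as np

# Same two rows as trailing_comma.csv, held in memory
data = "1, 2,\n3, 4,\n"

# Default dtype (float): the empty field after the last comma is read as nan
full = np.genfromtxt(io.StringIO(data), delimiter=',')

# usecols keeps only the two real columns and drops the empty one
trimmed = np.genfromtxt(io.StringIO(data), delimiter=',', usecols=[0, 1])

print(full.shape)     # three columns, the last one all nan
print(trimmed.shape)  # two columns
```

The float case is shown because its result is easy to check with np.isnan; what the empty column contains with dtype=None may depend on the NumPy version.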
usecols also works with loadtxt:
Simulate a file with text split into lines:
In [162]: txt=b"""1, 2,
3,4,"""
In [163]: txt=txt.splitlines()
In [164]: txt
Out[164]: [b'1, 2,', b'3,4,']
In [165]: x,y=np.loadtxt(txt,delimiter=',',usecols=[0,1],unpack=True)
In [166]: x
Out[166]: array([ 1.,  3.])
In [167]: y
Out[167]: array([ 2.,  4.])
loadtxt and genfromtxt don't work well with multicharacter delimiters.
loadtxt and genfromtxt accept any iterable, including a generator.  Thus you could open the file and process the lines one by one, removing the extra character.  
In [180]: def g(txt):
   .....:     t = txt.splitlines()
   .....:     for l in t:
   .....:         yield l[:-1]
In [181]: list(g(txt))
Out[181]: [b'1, 2', b'3,4']
g is a generator that yields the lines one by one, each stripped of its last character; it could be adapted to read a file line by line. Passing it to loadtxt in place of a filename:
In [182]: x,y=np.loadtxt(g(txt),delimiter=',',unpack=True)
In [183]: x,y
Out[183]: (array([ 1.,  3.]), array([ 2.,  4.]))
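A file-based version of the same idea might look like the sketch below. The function name strip_trailing_comma and the temporary sample file are hypothetical, added only so the example runs on its own; rstrip(',') is used instead of l[:-1] so that lines without a trailing comma are left intact.

```python
import os
import tempfile
import numpy as np

def strip_trailing_comma(path):
    """Yield each line of the file at `path` without its trailing comma."""
    with open(path) as f:
        for line in f:
            # Drop the newline and surrounding whitespace, then any trailing comma
            yield line.strip().rstrip(',')

# Write a small sample file (stand-in for file.txt) so the sketch is self-contained
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("1, 2,\n3, 4,\n")

# loadtxt consumes the generator just like it would a list of lines
x, y = np.loadtxt(strip_trailing_comma(tmp.name), delimiter=',', unpack=True)
os.remove(tmp.name)

print(x, y)
```

With a real file, the temporary-file setup goes away and the call reduces to np.loadtxt(strip_trailing_comma('file.txt'), delimiter=',', unpack=True).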