I have a text file and I need to read each column of it, preferably into a dictionary or list. The format is:
N ID REMAIN VERS
2 2343333 bana twelve
3 3549287 moredp twelve
3 9383737 hinsila twelve
3 8272655 hinsila eight
I have tried:
crs = open("file.txt", "r")
for columns in (raw.strip().split() for raw in crs):
    print(columns[0])
Result: an "index out of range" error.
Also tried:
crs = csv.reader(open("file.txt", "r"), delimiter=',', quotechar='|', skipinitialspace=True)
for row in crs:
    for columns in row:
        print(columns[3])
This seems to read each character as a column instead of each 'word'.
I would like to get the four columns, i.e.:
2
2343333
bana
twelve
into separate dictionaries or lists.
Any help is great, thanks!
To read a text file in Python, follow these steps: first, open the file for reading with the open() function. Second, read text from it using the read(), readline(), or readlines() method of the file object. Third, close the file with its close() method.
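The three steps above can be sketched as follows. This is a minimal Python 3 illustration, not part of the original answers; the temporary file and its two sample lines are assumptions added only so the snippet runs on its own.

```python
import os
import tempfile

# Hypothetical sample file, created here only so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as out:
    out.write("N ID REMAIN VERS\n2 2343333 bana twelve\n")

# Step 1: open, step 2: read, step 3: close (even if reading fails).
f = open(path, "r")
try:
    text = f.read()
finally:
    f.close()

# The with statement does the same thing and closes the file automatically.
with open(path, "r") as f:
    same_text = f.read()

first_line = text.splitlines()[0]
```

In practice the with form is usually preferred, since the file is closed even when an exception is raised inside the block.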
Using read_csv(): to read the text file and load it into a pandas DataFrame, all we need to pass to read_csv() is the filename, the separator/delimiter (which in our case is whitespace), and the row containing the column names, which here is the first row.
Python file read() method: read() returns up to the specified number of characters from a file opened in text mode (or bytes in binary mode). The default is -1, which means the whole file.
Method 1: Read a file line by line using readlines(). readlines() reads all the lines in one go and returns them as a list in which each line is a string element. It is suitable for small files, because it reads the whole file content into memory and then splits it into separate lines.
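A small Python 3 sketch of readlines() applied to the file from the question. The temporary file path is an assumption made so the example runs standalone; the data is the sample from the question.

```python
import os
import tempfile

# Sample data from the question, written to a hypothetical temporary file.
sample = (
    "N ID REMAIN VERS\n"
    "2 2343333 bana twelve\n"
    "3 3549287 moredp twelve\n"
    "3 9383737 hinsila twelve\n"
    "3 8272655 hinsila eight\n"
)
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as out:
    out.write(sample)

with open(path) as f:
    lines = f.readlines()  # one string per line, trailing newline included

# Split every line on whitespace to get its fields.
rows = [line.split() for line in lines]
```

For large files, iterating over the file object directly (for line in f) avoids holding every line in memory at once.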
This works fine for me:
>>> crs = open("file.txt", "r")
>>> for columns in (raw.strip().split() for raw in crs):
...     print(columns[0])
...
N
2
3
3
3
If you want to turn the rows into columns (transpose them), use zip.
>>> crs = open("file.txt", "r")
>>> rows = (row.strip().split() for row in crs)
>>> list(zip(*rows))
[('N', '2', '3', '3', '3'),
 ('ID', '2343333', '3549287', '9383737', '8272655'),
 ('REMAIN', 'bana', 'moredp', 'hinsila', 'hinsila'),
 ('VERS', 'twelve', 'twelve', 'twelve', 'eight')]
If you have blank lines, filter them before using zip.
>>> crs = open("file.txt", "r")
>>> rows = (row.strip().split() for row in crs)
>>> list(zip(*(row for row in rows if row)))
[('N', '2', '3', '3', '3'), ('ID', '2343333', '3549287', '9383737', '8272655'), ('REMAIN', 'bana', 'moredp', 'hinsila', 'hinsila'), ('VERS', 'twelve', 'twelve', 'twelve', 'eight')]
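Since the question asks for a dictionary, the zip approach can be taken one step further by pairing each header name with its column. The following is a Python 3 sketch under a few assumptions: the temporary file is hypothetical, a blank line has been added to the sample to show the filtering, and every non-blank row is assumed to have the same number of fields as the header.

```python
import os
import tempfile

# Sample data from the question plus one blank line (an assumption, to
# demonstrate that blank lines are skipped).
sample = (
    "N ID REMAIN VERS\n"
    "2 2343333 bana twelve\n"
    "\n"
    "3 3549287 moredp twelve\n"
    "3 9383737 hinsila twelve\n"
    "3 8272655 hinsila eight\n"
)
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as out:
    out.write(sample)

with open(path) as f:
    # Skip blank lines so that every row has fields to index.
    rows = [line.split() for line in f if line.strip()]

header, *records = rows

# zip(*records) transposes the rows into columns; the outer zip pairs
# each column with its header name.
columns = {name: list(values) for name, values in zip(header, zip(*records))}
```

With this, columns["REMAIN"] gives the third column as a list of strings.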
Using the csv module:
>>> import csv
>>> with open("file.txt") as f:
...     c = csv.reader(f, delimiter=' ', skipinitialspace=True)
...     for line in c:
...         print(line)
...
['N', 'ID', 'REMAIN', 'VERS', '']  # the '' comes from the trailing space after the last column
['2', '2343333', 'bana', 'twelve', '']
['3', '3549287', 'moredp', 'twelve', '']
['3', '9383737', 'hinsila', 'twelve', '']
['3', '8272655', 'hinsila', 'eight', '']
Or, the old-fashioned way:
>>> with open("file.txt") as f:
...     [line.split() for line in f]
...
[['N', 'ID', 'REMAIN', 'VERS'],
['2', '2343333', 'bana', 'twelve'],
['3', '3549287', 'moredp', 'twelve'],
['3', '9383737', 'hinsila', 'twelve'],
['3', '8272655', 'hinsila', 'eight']]
And to get the column values, with l holding the list of rows from above:
>>> l
[['N', 'ID', 'REMAIN', 'VERS'],
['2', '2343333', 'bana', 'twelve'],
['3', '3549287', 'moredp', 'twelve'],
['3', '9383737', 'hinsila', 'twelve'],
['3', '8272655', 'hinsila', 'eight']]
>>> {l[0][i]: [line[i] for line in l[1:]] for i in range(len(l[0]))}
{'ID': ['2343333', '3549287', '9383737', '8272655'],
'N': ['2', '3', '3', '3'],
'REMAIN': ['bana', 'moredp', 'hinsila', 'hinsila'],
'VERS': ['twelve', 'twelve', 'twelve', 'eight']}
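Putting the last two snippets together, here is a self-contained Python 3 sketch. It assumes that l is the list of split lines from the "old-fashioned" snippet (the original does not show the assignment), uses a hypothetical temporary file, and adds an optional integer conversion that is not part of the original answer.

```python
import os
import tempfile

# Sample data from the question, written to a hypothetical temporary file.
sample = (
    "N ID REMAIN VERS\n"
    "2 2343333 bana twelve\n"
    "3 3549287 moredp twelve\n"
    "3 9383737 hinsila twelve\n"
    "3 8272655 hinsila eight\n"
)
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as out:
    out.write(sample)

with open(path) as f:
    # Assumed definition of l: one list of fields per non-blank line.
    l = [line.split() for line in f if line.strip()]

# Same comprehension as above: header name -> list of values in that column.
by_name = {l[0][i]: [line[i] for line in l[1:]] for i in range(len(l[0]))}

ids = by_name["ID"]

# Optional (an addition for illustration): all values are read as strings,
# so numeric columns need an explicit conversion.
n_values = [int(n) for n in by_name["N"]]
```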