Parsing a tab delimited file into separate lists or strings

I am trying to take a tab-delimited file with two columns, Name and Age, which reads in like this:

'Name\tAge\nMark\t32\nMatt\t29\nJohn\t67\nJason\t45\nMatt\t12\nFrank\t11\nFrank\t34\nFrank\t65\nFrank\t78\n'

I simply want to create two lists: one with the names (called names, without the 'Name' heading) and one with the ages (called ages, without the 'Age' heading).

asked Sep 30 '11 at 02:09 by user972297


People also ask

How do you parse a tab-separated in Python?

You can use the csv module to parse tab-separated value files easily:

import csv
with open("tab-separated-values") as tsv:
    for line in csv.reader(tsv, dialect="excel-tab"):  # You can also use delimiter="\t" rather than giving a dialect.
        ...

Here line is a list of the values on the current row for each iteration.

How do I read a tab delimited text file in Python?

To read tab-separated values files with Python, we'll take advantage of the fact that they're similar to CSVs. We'll use Python's csv library and tell it to split things up with tabs instead of commas. Just set the delimiter argument to "\t" . That's it!
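
As a minimal sketch of that approach, assuming Python 3, the sample text from the question can be parsed in memory. The helper name split_columns is hypothetical, and io.StringIO merely stands in for a real file opened with open(path, newline="").

```python
import csv
import io

# Sample text copied from the question.
text = 'Name\tAge\nMark\t32\nMatt\t29\nJohn\t67\nJason\t45\nMatt\t12\nFrank\t11\nFrank\t34\nFrank\t65\nFrank\t78\n'


def split_columns(tsv_text):
    """Return (names, ages) as two lists, leaving out the heading row."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader)  # discard the 'Name'/'Age' heading row
    names, ages = [], []
    for name, age in reader:
        names.append(name)
        ages.append(age)  # still a string; use int(age) if numbers are needed
    return names, ages


names, ages = split_columns(text)
print(names)
print(ages)
```

Note that csv.reader always yields strings, so the ages come back as '32', '29', and so on unless they are converted explicitly.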


2 Answers

Using the csv module, you might do something like this:

import csv

names=[]
ages=[]
with open('data.csv', 'r', newline='') as f:
    next(f)  # skip the heading row
    reader = csv.reader(f, delimiter='\t')
    for name, age in reader:
        names.append(name)
        ages.append(age)

print(names)
# ['Mark', 'Matt', 'John', 'Jason', 'Matt', 'Frank', 'Frank', 'Frank', 'Frank']
print(ages)
# ['32', '29', '67', '45', '12', '11', '34', '65', '78']
answered Sep 29 '22 at 20:09 by unutbu


Tab-delimited data is within the domain of the csv module:

>>> corpus = 'Name\tAge\nMark\t32\nMatt\t29\nJohn\t67\nJason\t45\nMatt\t12\nFrank\t11\nFrank\t34\nFrank\t65\nFrank\t78\n'
>>> import io
>>> infile = io.StringIO(corpus)

Pretend infile was just a regular file...

>>> import csv
>>> dialect = csv.Sniffer().sniff(infile.read(1000))
>>> infile.seek(0)  # rewind so the reader starts at the heading row
0
>>> r = csv.DictReader(infile, dialect=dialect)

You don't even have to tell the csv module about the headings or the delimiter format: DictReader takes the field names from the first row, and the Sniffer guesses the delimiter from the sample it is given.

>>> names, ages = [],[]
>>> for row in r:
...     names.append(row['Name'])
...     ages.append(row['Age'])
... 
>>> names
['Mark', 'Matt', 'John', 'Jason', 'Matt', 'Frank', 'Frank', 'Frank', 'Frank']
>>> ages
['32', '29', '67', '45', '12', '11', '34', '65', '78']
>>> 
answered Sep 29 '22 at 19:09 by SingleNegationElimination
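
Because csv.Sniffer works by heuristics, it can be worth checking what it guessed before relying on it. The following is a rough sketch under the assumption of Python 3. The helper name read_columns and its sample_size parameter are hypothetical, not part of the csv module. If the sniffer cannot work out a delimiter, sniff raises csv.Error.

```python
import csv
import io

# Sample text copied from the question.
corpus = 'Name\tAge\nMark\t32\nMatt\t29\nJohn\t67\nJason\t45\nMatt\t12\nFrank\t11\nFrank\t34\nFrank\t65\nFrank\t78\n'


def read_columns(text, sample_size=1000):
    """Sniff the dialect from the start of text, then read every row as a dict."""
    # sniff() raises csv.Error when it cannot determine the delimiter.
    dialect = csv.Sniffer().sniff(text[:sample_size])
    rows = list(csv.DictReader(io.StringIO(text), dialect=dialect))
    return dialect.delimiter, rows


delimiter, rows = read_columns(corpus)
print(repr(delimiter))  # shows which delimiter the sniffer settled on

# DictReader used the first row as field names, so it is not in rows.
names = [row['Name'] for row in rows]
ages = [row['Age'] for row in rows]
print(names)
print(ages)
```

When the delimiter is already known to be a tab, passing delimiter='\t' directly avoids the guesswork.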