I have a tab-delimited .txt file that I'm trying to import into a matrix (2D array) in Python, keeping the same layout as the text file, which is shown below:
123088 266 248 244 266 244 277
123425 275 244 241 289 248 231
123540 156 654 189 354 156 987
Note that there are many more rows like the ones above (roughly 200) that I want to read into Python, keeping the same layout in the resulting matrix.
The current code that I have for this is:
import csv

d = {}
with open('file name', newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter='\t')
    for row in csv_reader:
        d[row[0]] = row[1:]
This does part of what I need, but not everything. I want to end up with code where I can type print(d[0,3]) and get 248.
First, you are loading the data into a dictionary, which will not give you the list of lists that you want.
It's dead simple to use the CSV module to generate a list of lists like this:
import csv
with open(path, newline="") as f:
    reader = csv.reader(f, delimiter="\t")
    d = list(reader)
print(d[0][2])  # 248
That would give you a list of lists of strings, so if you wanted to get numbers, you'd have to convert to int.
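A minimal sketch of that conversion, assuming every field in the file is an integer. The name `sample` is only a stand-in for the real file (the three rows from the question, wrapped in `io.StringIO` so the snippet runs on its own):

```python
import csv
import io

# Stand-in for the real tab-delimited file (the three rows from the question).
sample = (
    "123088\t266\t248\t244\t266\t244\t277\n"
    "123425\t275\t244\t241\t289\t248\t231\n"
    "123540\t156\t654\t189\t354\t156\t987\n"
)

reader = csv.reader(io.StringIO(sample), delimiter="\t")
# Convert every field of every row from str to int.
d = [[int(field) for field in row] for row in reader]

print(d[0][2])  # 248
```

With a real file, replace `io.StringIO(sample)` with the file object returned by `open(path, newline="")`.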
That said, if you have a large array (or are doing any kind of numeric calculations), you should consider using something like NumPy or pandas. If you wanted to use NumPy, you could do
import numpy as np
d = np.loadtxt(path, delimiter="\t")
print(d[0, 2])  # 248.0 (loadtxt returns floats unless you pass dtype=int)
As a bonus, NumPy arrays allow you to do quick vector/matrix operations. (Also, note that d[0][2] would work with the NumPy array too.)
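As a rough illustration of those vector operations, here is a sketch that assumes NumPy is installed and that all values are integers. `sample` is again a stand-in for the real file, and treating the first column as an ID is an assumption made only for this example:

```python
import io

import numpy as np

# Stand-in for the real tab-delimited file (the three rows from the question).
sample = (
    "123088\t266\t248\t244\t266\t244\t277\n"
    "123425\t275\t244\t241\t289\t248\t231\n"
    "123540\t156\t654\t189\t354\t156\t987\n"
)

# dtype=int keeps the values as integers instead of the default floats.
d = np.loadtxt(io.StringIO(sample), delimiter="\t", dtype=int)

print(d.shape)   # (3, 7)
print(d[0, 2])   # 248

# Whole-column and whole-row operations need no explicit loops.
second_column = d[:, 1]          # the second field of every row
row_sums = d[:, 1:].sum(axis=1)  # per-row sum, skipping the first (ID) column
print(second_column)
print(row_sums)
```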
Try this:
d = []
with open(sourcefile) as source:
    for line in source:
        # Strip the trailing newline so it does not end up in the last field.
        fields = line.rstrip('\n').split('\t')
        d.append(fields)
print(d[0][1])
will print 266.
print(d[0][2])
(remember your arrays are 0-based) will print 248.
To output the data in the same format as your input:
for line in d:
    print("\t".join(line))
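To write the rows back out to a file rather than the console, `csv.writer` can do the joining. The sketch below assumes each row is a list of strings with no trailing newline, and it writes to an in-memory buffer (`out`, a hypothetical stand-in for a real output file) so that the round trip can be checked:

```python
import csv
import io

# Stand-in for the real tab-delimited file (the three rows from the question).
sample = (
    "123088\t266\t248\t244\t266\t244\t277\n"
    "123425\t275\t244\t241\t289\t248\t231\n"
    "123540\t156\t654\t189\t354\t156\t987\n"
)

# Same structure as the list of lists built above: one list of fields per line.
d = [line.split("\t") for line in sample.splitlines()]

out = io.StringIO()  # use open(some_path, "w", newline="") for a real file
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerows(d)

print(out.getvalue() == sample)  # True: the output matches the input text
```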