
Python: Read fields of a CSV file that contains a list of lists

Tags: python, list, csv

I'm wondering how I can read a particular field from a CSV file with the following structure:

40.0070222,116.2968604,2008-10-28,[["route"], ["sublocality","political"]]
39.9759505,116.3272935,2008-10-29,[["route"], ["establishment"], ["sublocality", "political"]]

This is how I usually read CSV files:

with open('routes/stayedStoppoints', 'rb') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='"')

The first three fields are no problem. Inside the loop

for row in spamreader:

I can access row[0], row[1] and row[2] without trouble. The last field is the problem: I guess that csv.reader(csvfile, delimiter=',', quotechar='"') also splits on the commas inside each sub-list.

So when I try to access the last field, it only shows me:

[["route"] 

Does anyone have a solution for handling the last field as a complete list (a list of lists, in fact), such as

[["route"], ["sublocality","political"]]

so that I can access each category?
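The behaviour described above can be reproduced with a minimal sketch, assuming Python 3 and using io.StringIO as a stand-in for the real file (that stand-in is an assumption, not part of the original code):

```python
import csv
import io

line = '40.0070222,116.2968604,2008-10-28,[["route"], ["sublocality","political"]]'

# csv.reader treats every comma outside a quoted field as a delimiter,
# so the commas inside the nested list also start new fields.
row = next(csv.reader(io.StringIO(line), delimiter=',', quotechar='"'))

print(len(row))  # more fields than the four the file logically has
print(row[3])    # only the first fragment of the list
```

The list is not wrapped in quotes as a whole, so the reader has no way to know that it should be kept together as one field.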

Thanks

taonico asked Mar 24 '23 13:03


2 Answers

Your format is close to JSON. You only need to wrap each line in brackets and put quotes around the dates. For each line l, just do:

lst=json.loads(re.sub('([0-9]+-[0-9]+-[0-9]+)',r'"\1"','[%s]'%(l)))

For the first sample line, this results in lst being

[40.0070222, 116.2968604, u'2008-10-28', [[u'route'], [u'sublocality', u'political']]]

You need to import the json and re (regular expression) modules:

import json
import re
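Put together, the one-liner and its imports might look like the following sketch. It assumes Python 3 (so the strings print without the u prefix), and parse_line and DATE_RE are hypothetical names introduced only for illustration:

```python
import json
import re

# Matches date-like tokens such as 2008-10-28 (same pattern as the one-liner).
DATE_RE = re.compile(r'([0-9]+-[0-9]+-[0-9]+)')

def parse_line(line):
    """Quote the date, wrap the line in brackets, and parse it as JSON."""
    quoted = DATE_RE.sub(r'"\1"', line.strip())
    return json.loads('[%s]' % quoted)

sample = '40.0070222,116.2968604,2008-10-28,[["route"], ["sublocality","political"]]'
lst = parse_line(sample)
print(lst)
```

With this approach the coordinates come back as floats, because JSON parses them as numbers, while the date stays a string.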

Edit: you asked how to access the element containing 'route'. The answer is

lst[3][0][0]

'political' is at

lst[3][1][1]

If the category strings ('political' and the others) may contain text that looks like a date, you should go with the solution by @unutbu, because the regular expression would then add quotes inside those strings as well.
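The caveat can be illustrated with a small sketch, assuming Python 3. The input line is made up for the demonstration and does not come from the original data:

```python
import json
import re

# Hypothetical line whose category string contains date-like text.
line = '40.0,116.0,2008-10-28,[["opened 2008-10-28"]]'

wrapped = re.sub('([0-9]+-[0-9]+-[0-9]+)', r'"\1"', '[%s]' % line)
# The date inside the category string gets its own pair of quotes,
# which ends that string early and leaves invalid JSON.

try:
    json.loads(wrapped)
    parsed_ok = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    parsed_ok = False

print(wrapped)
print(parsed_ok)
```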

Johan Lundberg answered Apr 05 '23 23:04


Use line.split(',', 3) to split on just the first 3 commas:

import json
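# Text mode ('r') so that line.split(',') also works on Python 3
with open(filename, 'r') as csvfile:
    for line in csvfile:
        # Split on the first 3 commas only; the rest of the line stays intact
        row = line.split(',', 3)
        # The remainder is valid JSON: a list of lists of strings
        row[3] = json.loads(row[3])
        print(row)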

yields

['40.0070222', '116.2968604', '2008-10-28', [[u'route'], [u'sublocality', u'political']]]
['39.9759505', '116.3272935', '2008-10-29', [[u'route'], [u'establishment'], [u'sublocality', u'political']]]
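As a follow-up sketch of the same split-then-parse idea, assuming Python 3: the version below uses io.StringIO in place of the real file, converts the coordinates to floats, and flattens the categories. The variable names and the float conversion are illustrative additions, not part of the answer above.

```python
import io
import json

# Stand-in for the real file, used only to keep the sketch self-contained.
data = io.StringIO(
    '40.0070222,116.2968604,2008-10-28,[["route"], ["sublocality","political"]]\n'
    '39.9759505,116.3272935,2008-10-29,[["route"], ["establishment"], ["sublocality", "political"]]\n'
)

rows = []
for line in data:
    # maxsplit=3 leaves the nested list untouched in the fourth piece.
    lat, lon, date, categories = line.split(',', 3)
    rows.append((float(lat), float(lon), date, json.loads(categories)))

for lat, lon, date, categories in rows:
    # Flatten the list of lists to visit every category name.
    names = [name for group in categories for name in group]
    print(date, names)
```

Because the split never looks past the third comma, strings inside the list may contain commas or date-like text without affecting the result.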

unutbu answered Apr 05 '23 23:04