
How can I parse a comma-delimited string into a list (caveat)?


I need to be able to take a string like:

'''foo, bar, "one, two", three four''' 

and turn it into:

['foo', 'bar', 'one, two', 'three four'] 

I have a feeling (with hints from #python) that the solution is going to involve the shlex module.

Jeremy Cantrell asked Sep 22 '08 22:09



2 Answers

It depends on how complicated you want to get. Do you want to allow more than one type of quoting? How about escaped quotes?

Your syntax looks very much like the common CSV file format, which is supported by the Python standard library:

import csv

# skipinitialspace=True drops the space that follows each comma
reader = csv.reader(['''foo, bar, "one, two", three four'''], skipinitialspace=True)
for r in reader:
    print(r)

Outputs:

['foo', 'bar', 'one, two', 'three four'] 
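
For Python 3, the same idea can be wrapped in a small helper. This is only a sketch: the name `split_csv_line` is made up for illustration, and it assumes the input is a single line with no embedded newlines. It relies on `csv.reader` accepting any iterable of strings and on the default `doublequote=True` setting, under which a doubled `""` inside a quoted field reads as one literal quote.

```python
import csv

def split_csv_line(line):
    """Split one comma-delimited line into fields, honouring double quotes."""
    # csv.reader wants an iterable of lines, so wrap the string in a list.
    # skipinitialspace=True ignores the space right after each delimiter.
    reader = csv.reader([line], skipinitialspace=True)
    return next(reader)

print(split_csv_line('foo, bar, "one, two", three four'))
# ['foo', 'bar', 'one, two', 'three four']
print(split_csv_line('a, "say ""hi""", b'))
# ['a', 'say "hi"', 'b']
```

Only the double quote is treated as a quote character here; if single quotes or backslash escapes also need to work, the csv defaults are probably not enough.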

HTH!

Dan Lenski answered Jan 20 '23 08:01


The shlex module solution handles escaped quotes, one kind of quote nested inside the other, and the rest of the quoting rules a shell supports.

>>> import shlex
>>> my_splitter = shlex.shlex('''foo, bar, "one, two", three four''', posix=True)
>>> my_splitter.whitespace += ','
>>> my_splitter.whitespace_split = True
>>> print(list(my_splitter))
['foo', 'bar', 'one, two', 'three', 'four']

escaped quotes example:

>>> my_splitter = shlex.shlex('''"test, a",'foo,bar",baz',bar ä baz''', posix=True)
>>> my_splitter.whitespace = ','
>>> my_splitter.whitespace_split = True
>>> print(list(my_splitter))
['test, a', 'foo,bar",baz', 'bar ä baz']
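
To get the output the question asks for, with 'three four' kept as one item, the comma has to be the only separator, and the spaces left around each token then need stripping. A rough Python 3 sketch of that follows. The helper name `split_shell_style` is hypothetical, clearing `commenters` is my own assumption so that `#` is not read as a comment start, and as far as I can tell empty fields such as `a,,b` are skipped, not returned as empty strings.

```python
import shlex

def split_shell_style(text):
    """Split on commas only, using shell-like quoting and escaping rules."""
    lexer = shlex.shlex(text, posix=True)
    lexer.whitespace = ','         # the comma is the only separator
    lexer.whitespace_split = True  # everything else belongs to the token
    lexer.commenters = ''          # assumption: '#' is ordinary text
    # Spaces around the commas end up inside the tokens, so strip them.
    # Note this also strips padding at the edges of a quoted field.
    return [token.strip() for token in lexer]

print(split_shell_style('foo, bar, "one, two", three four'))
# ['foo', 'bar', 'one, two', 'three four']
print(split_shell_style(r'a, "say \"hi\", ok", b'))
# ['a', 'say "hi", ok', 'b']
```

The second call shows a backslash-escaped double quote inside a double-quoted field, which posix mode supports.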

nosklo answered Jan 20 '23 08:01