
Split string at delimiter '\' in python

I have a quick question. I am trying to split the string S = 'greenland.gdb\topology_check\t_buildings' at each '\' using:

     S.split('\')

I expect the output list:

 ['greenland.gdb', 'topology_check', 't_buildings']. 

Instead it raises an error: SyntaxError: EOL while scanning string literal. What is special about the '\' character in Python? It works fine with any other character.

Jio asked May 27 '16 13:05



2 Answers

You need to escape the backslash:

 S.split('\\')
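
The original call fails because the backslash escapes the closing quote, so the string literal never ends. A minimal sketch of that, using the built-in compile() to parse both spellings as source text (the helper name parses is made up for this illustration):

```python
def parses(source):
    """Return True if `source` is a syntactically valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# Raw, double-quoted strings are used here only so that the two
# snippets of source text can be written out verbatim.
broken = r"S.split('\')"    # \' escapes the quote, so the literal is never closed
fixed = r"S.split('\\')"    # \\ is one literal backslash, then the closing quote

print(parses(broken))  # False
print(parses(fixed))   # True
```

compile() only parses the text, so it does not matter that S is undefined here.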

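If the string was written as a normal (non-raw) literal, each \t in it is already a tab character, so there is no backslash left to split on. In Python 2 you can recover the backslashes with the string_escape codec: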

In [10]: s = 'greenland.gdb\topology_check\t_buildings'

In [11]: s.split("\\")
Out[11]: ['greenland.gdb\topology_check\t_buildings']

In [12]: s.encode("string_escape").split("\\")
Out[12]: ['greenland.gdb', 'topology_check', 't_buildings']
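
The string_escape codec exists only in Python 2. On Python 3, a rough equivalent is the unicode_escape codec. This is a sketch that assumes the string is plain ASCII and that every tab in it was really meant to be a backslash followed by t:

```python
# No r prefix: each \t in this literal is a tab character.
s = 'greenland.gdb\topology_check\t_buildings'

# unicode_escape writes each tab back out as the two characters \ and t.
escaped = s.encode('unicode_escape').decode('ascii')

parts = escaped.split('\\')
print(parts)  # ['greenland.gdb', 'topology_check', 't_buildings']
```

This is a repair after the fact: it cannot tell a genuine tab from a mistyped path separator, so writing the literal as a raw string in the first place is the safer option.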

\t is interpreted as a tab character unless you use a raw string:

In [18]: s = 'greenland.gdb\topology_check\t_buildings'

In [19]: print(s)
greenland.gdb   opology_check   _buildings

In [20]: s = r'greenland.gdb\topology_check\t_buildings'

In [21]: print(s)
greenland.gdb\topology_check\t_buildings
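
To make the difference concrete, here is a small sketch comparing the two literals directly (the variable names plain and raw are chosen just for this example):

```python
plain = 'greenland.gdb\topology_check\t_buildings'   # \t becomes a tab
raw = r'greenland.gdb\topology_check\t_buildings'    # backslashes kept as typed

print('\\' in plain)         # False: no backslash survived
print(plain.count('\t'))     # 2 tab characters instead
print(raw.count('\\'))       # 2 backslashes
print(len(plain), len(raw))  # 38 40

# Only the raw version can be split on the backslash.
print(raw.split('\\'))       # ['greenland.gdb', 'topology_check', 't_buildings']
```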

Escape characters

Padraic Cunningham answered Oct 02 '22 15:10



First make the string a raw string by prefixing the literal with r or R, as follows:

In [01]: s = r'greenland.gdb\topology_check\t_buildings'

Then split, remembering to escape the backslash in the separator:

In [02]: s.split('\\')
Out[02]: ['greenland.gdb', 'topology_check', 't_buildings']

Quoting from re — Regular expression operations (Python documentation)

\ Either escapes special characters (permitting you to match characters like '*', '?', and so forth), or signals a special sequence; special sequences are discussed below.

If you’re not using a raw string to express the pattern, remember that Python also uses the backslash as an escape sequence in string literals; if the escape sequence isn’t recognized by Python’s parser, the backslash and subsequent character are included in the resulting string. However, if Python would recognize the resulting sequence, the backslash should be repeated twice. This is complicated and hard to understand, so it’s highly recommended that you use raw strings for all but the simplest expressions.
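
The quoted passage is about regular expression patterns. str.split takes a plain string rather than a pattern, so a single level of escaping is enough there. If re.split is used instead, the pattern itself must contain an escaped backslash. A short sketch of both, assuming s was written as a raw string:

```python
import re

s = r'greenland.gdb\topology_check\t_buildings'

# str.split: the separator is a plain string holding one backslash.
by_str = s.split('\\')

# re.split: the regex needs \\ to match one backslash; a raw string
# literal lets the pattern be written exactly as the regex engine sees it.
by_raw_pattern = re.split(r'\\', s)

# The same pattern without a raw string needs every backslash doubled again.
by_plain_pattern = re.split('\\\\', s)

print(by_str)
print(by_raw_pattern)
print(by_plain_pattern)
```

All three calls produce the same list; the raw pattern is simply the easiest one to read.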

bones answered Oct 02 '22 15:10
