Using Python 2.7.9, I'm trying to split a DOS-style directory listing into separate lines with the built-in str.splitlines method. One of the lines in my triple-quoted string ends with a backslash, and that line is not being split from the one after it:
# DOS-style listing of the directory "B:\"
listing = """Directory of B:\
12/15/2014 02:12 PM 1814814 BIRD.LOG
01/01/2000 12:04 AM <DIR> CONFIG
12/15/2014 02:55 PM 35060 ALLIGATOR.LOG
03/15/2013 02:06 PM <DIR> MONKEY
03/15/2013 02:06 PM <DIR> FROG
03/15/2013 02:06 PM <DIR> BADGER
2 File(s) 1849874 bytes
4 Dir(s) 1674739712 bytes free
"""
# BIRD.LOG is combined with prior line ending in a backslash
print "keepends = False"
for line in listing.splitlines(False): print repr(line)
# Setting keepends=True does not help
print "keepends = True"
for line in listing.splitlines(True): print repr(line)
Here is the output:
keepends = False
'Directory of B: 12/15/2014 02:12 PM 1814814 BIRD.LOG'
' 01/01/2000 12:04 AM <DIR> CONFIG'
' 12/15/2014 02:55 PM 35060 ALLIGATOR.LOG'
' 03/15/2013 02:06 PM <DIR> MONKEY'
' 03/15/2013 02:06 PM <DIR> FROG'
' 03/15/2013 02:06 PM <DIR> BADGER'
' 2 File(s) 1849874 bytes'
' 4 Dir(s) 1674739712 bytes free'
keepends = True
'Directory of B: 12/15/2014 02:12 PM 1814814 BIRD.LOG\n'
' 01/01/2000 12:04 AM <DIR> CONFIG\n'
' 12/15/2014 02:55 PM 35060 ALLIGATOR.LOG\n'
' 03/15/2013 02:06 PM <DIR> MONKEY\n'
' 03/15/2013 02:06 PM <DIR> FROG\n'
' 03/15/2013 02:06 PM <DIR> BADGER\n'
' 2 File(s) 1849874 bytes\n'
' 4 Dir(s) 1674739712 bytes free\n'
Passing keepends=True does not change the result. The Python splitlines documentation does not mention any special handling of backslashes, and neither does the documentation for the universal newlines approach to splitting lines.
My code sample is from a unit test, but in the real world the listing will be retrieved programmatically. I can think of workarounds involving manipulating my input listing or other methods, but I'm wondering why a workaround should be necessary at all. Is it a bug? Any advice would certainly be appreciated!
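The problem is in your unit test, not in splitlines. In an ordinary (non-raw) string literal, a backslash immediately followed by a newline is a line continuation: Python drops both characters when it reads the literal, so the resulting string has no newline after "B:" and splitlines has nothing to split on. Try changing the first line to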
listing = r"""Directory of B:\
From the Python docs:
String literals may optionally be prefixed with a letter 'r' or 'R'; such strings are called raw strings and use different rules for interpreting backslash escape sequences.
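To make the difference concrete, here is a minimal sketch comparing three ways of writing the literal. The variable names (plain, raw, escaped) and the shortened listing are just for illustration; print is called with parentheses so the snippet should run on both Python 2.7 and Python 3.

```python
# Ordinary literal: backslash + newline is a line continuation,
# so both characters vanish and the two lines are joined.
plain = """Directory of B:\
BIRD.LOG
"""

# Raw literal: the backslash and the newline are both kept.
raw = r"""Directory of B:\
BIRD.LOG
"""

# Ordinary literal with the backslash escaped: same content as the raw one.
escaped = """Directory of B:\\
BIRD.LOG
"""

print(plain.splitlines())    # one joined line
print(raw.splitlines())      # two lines, the first ending in a backslash
print(escaped.splitlines())  # same as the raw literal
```

Either the r prefix or doubling the backslash should work for the unit test; the raw string is probably easier to read when the listing contains many paths.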
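In a real-world scenario, where you get the string from command output rather than from a literal in your source code, this should not be a problem: escape sequences are only processed in string literals, not in data read at run time.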
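As a rough way to check this without the real command, the sketch below writes a shortened listing to a temporary file and reads it back, standing in for output retrieved programmatically. The file round trip and the names (text, listing_from_file, runtime_lines) are assumptions made for the example, not part of the original code.

```python
import os
import tempfile

# "\\" in this literal is a single backslash character in the data.
text = "Directory of B:\\\n 12/15/2014  02:12 PM  1814814 BIRD.LOG\n"

# Round-trip the text through a file to simulate data obtained at run time.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    with open(path) as handle:
        listing_from_file = handle.read()
finally:
    os.remove(path)

# The trailing backslash is ordinary data here, so the lines split normally.
runtime_lines = listing_from_file.splitlines()
for line in runtime_lines:
    print(repr(line))
```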