
How to properly add quotes to a string using python?

Tags: python, string

I want to wrap a Python string in (double) quotes if the surrounding quotes are missing, but the string itself can also contain quotes.

The purpose is to quote every command that is not already quoted, because the Windows API requires you to quote the entire command line when you execute a process using _popen().

Here are some strings that should be quoted:

<empty string>
type
"type" /?
"type" "/?"
type "a a" b
type "" b

Here are some that should not be quoted:

"type"
""type" /?"

Please take the time to test all the examples; it is not easy to detect whether a string needs the quotes.

bogdan asked Dec 09 '22 14:12


2 Answers

Your problem is inconsistent.

Consider the two cases

""a" b"

"a" "b"

The former is interpreted as a pre-quoted string with 'nested quotes', but the latter is interpreted as separately-quoted strings. Here are some examples that highlight the issue.

" "a" "b" "

" "a" b"

"a ""b"

How should they be treated?
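To make the inconsistency concrete, here is a minimal sketch showing that a naive check of the first and last characters cannot tell these cases apart. The helper name `looks_quoted` is hypothetical and only serves the illustration; it is not a proposed solution.

```python
def looks_quoted(s):
    """Naive check: does the string start and end with a double quote?"""
    return len(s) >= 2 and s.startswith('"') and s.endswith('"')


# The two cases above, followed by the three ambiguous examples.
cases = ['""a" b"', '"a" "b"', '" "a" "b" "', '" "a" b"', '"a ""b"']

for s in cases:
    # Every one of these starts and ends with a quote, so the naive check
    # reports them all as "already quoted", including '"a" "b"', which the
    # question wants to be wrapped in an extra pair of quotes.
    print(repr(s), looks_quoted(s))
```

Any rule that separates the first case from the second therefore has to look inside the string, and the last three examples show that it is not obvious what that rule should decide.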

Katriel answered Dec 23 '22 21:12


I think this is a difficult question to specify in a precise way, but perhaps this strategy will approximate your goal.

The basic idea is to make a copy of the original string with the internally quoted items removed. An internally quoted item is defined here as a quoted span that contains at least one non-whitespace character.

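Once the internally quoted items have been removed, you check what is left: if the remainder is non-empty and does not both begin and end with a quote, the original string needs surrounding quotes. The empty string is handled as a special case and is always quoted.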

import re

tests = [
    # Test data in original question.
    ( '',                '""'                ),
    ( 'a',               '"a"'               ),
    ( '"a"',             '"a"'               ), # No change.
    ( '""a" b"',         '""a" b"'           ), # No change.
    ( '"a" b',           '""a" b"'           ),
    ( '"a" "b"',         '""a" "b""'         ),
    ( 'a "b" c',         '"a "b" c"'         ),

    # Test data in latest edits.
    ( 'type',            '"type"'         ),    # Quote these.
    ( '"type" /?',       '""type" /?"'    ),
    ( '"type" "/?"',     '""type" "/?""'  ),
    ( 'type "a a" b',    '"type "a a" b"' ),
    ( 'type "" b',       '"type "" b"'    ),
    ( '"type"',          '"type"'         ),    # Don't quote.
    ( '""type" /?"',     '""type" /?"'    ),

    # Some more tests.
    ( '"a b" "c d"',     '""a b" "c d""'     ),
    ( '" a " foo " b "', '"" a " foo " b ""' ),
]

Q = '"'
re_quoted_items = re.compile(r'" \s* [^"\s] [^"]* \"', re.VERBOSE)

for orig, expected in tests:
    # The orig string w/o the internally quoted items.
    woqi = re_quoted_items.sub('', orig)

    if len(orig) == 0:
        orig_quoted = Q + orig + Q
    elif len(woqi) > 0 and not (woqi[0] == Q and woqi[-1] == Q):
        orig_quoted = Q + orig + Q    
    else:
        orig_quoted = orig

    # True when the heuristic produces the expected result.
    print(orig_quoted == expected)
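If you want to call this from other code, the same heuristic can be wrapped in a function. This is only a sketch of the strategy described above, written for Python 3; the names `ensure_quoted` and `RE_QUOTED_ITEMS` are made up for this example, and the ambiguous inputs are still resolved by the heuristic rather than by any precise rule.

```python
import re

Q = '"'

# Same pattern as in the test loop: a quote, optional whitespace, at least
# one character that is neither a quote nor whitespace, then everything up
# to the next quote.
RE_QUOTED_ITEMS = re.compile(r'" \s* [^"\s] [^"]* \"', re.VERBOSE)


def ensure_quoted(command):
    """Wrap command in double quotes unless it already appears to be wrapped."""
    if not command:
        # The empty string is always quoted.
        return Q + Q

    # The command without its internally quoted items.
    remainder = RE_QUOTED_ITEMS.sub('', command)

    if remainder and not (remainder[0] == Q and remainder[-1] == Q):
        return Q + command + Q
    return command


print(ensure_quoted('type "a a" b'))   # "type "a a" b"
print(ensure_quoted('""type" /?"'))    # ""type" /?"   (unchanged)
```

Keeping the compiled pattern at module level means it is built once, and the function can be applied to each command line just before it is passed on.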

FMc answered Dec 23 '22 22:12