
How do you remove backslashes and the word attached to the backslash in Python?

Tags: python, string

I understand that to remove a single backslash we might do something like what is shown in "Removing backslashes from a string in Python".

I've attempted to:

I'd like to know how to remove from the list below all the items like '\ue606':

A = 
['Historical Notes 1996',
'\ue606',
'The Future of farms 2012',
'\ch889',
'\8uuuu',]

to transform it into

['Historical Notes 1996',
'The Future of farms 2012',]

I tried:

A = ['Historical Notes 1996',
'\ue606',
'The Future of farms 2012',
'\ch889',
'\8uuuu',]

for y in A:
      y.replace("\\", "")
A

It returns:

['Historical Notes 1996',
 '\ue606',
 'The Future of farms 2012',
 '\\ch889',
 '\\8uuuu']

I'm not sure how to address the string following the '\' or why it added a new '\' rather than remove it.

Katie Melosto asked Dec 30 '22 14:12


2 Answers

It is somewhat hard to convince Python to just ignore Unicode characters. Here is a somewhat hacky attempt:

l = ['Historical Notes 1996',
'\ue606',
'The Future of farms 2012',
'\ch889',
'\8uuuu',]


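def not_unicode_or_backslash(x):
    # Escape non-ASCII characters and backslashes, so that both kinds of
    # unwanted items end up starting with a literal backslash.
    escaped = x.encode('unicode-escape').decode()
    return not escaped.startswith("\\")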
        

[x for x in l if not_unicode_or_backslash(x)]

# Output: ['Historical Notes 1996', 'The Future of farms 2012']

The problem is that you can't check directly whether the string starts with a backslash, because '\ue606' is not treated as a 6-character string but as a single Unicode character. Because of this, it does not start with a backslash, and for

[x for x in l if not x.startswith("\\")]

you get

['Historical Notes 1996', '\ue606', 'The Future of farms 2012']
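To make the difference concrete, here is a small sketch that prints how Python stores each item. The name `items` is only illustrative, and the backslashes in the last two items are written explicitly as '\\' (an assumption about the intended data: '\c' and '\8' are not recognized escape sequences, so Python keeps the backslash in '\ch889' and '\8uuuu' anyway).

```python
# Illustrative list; '\\ch889' is the same string as the original '\ch889',
# written with an explicit backslash to avoid invalid-escape warnings.
items = ['Historical Notes 1996', '\ue606', 'The Future of farms 2012',
         '\\ch889', '\\8uuuu']

for item in items:
    # unicode-escape turns the single character U+E606 into the text \ue606
    # and doubles any literal backslash.
    escaped = item.encode('unicode-escape').decode()
    print(len(item), repr(item), escaped)

# '\ue606' is one character; '\\ch889' is six characters starting with a backslash.
print(len(items[1]), len(items[3]))
```

This also suggests why the question's output seems to show an "added" backslash: the list display uses repr(), which prints a literal backslash as '\\', while the string itself still contains just one.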
mcsoini answered Jan 02 '23 02:01


You can use this.
Use isprintable() to filter out the Unicode string, and a comparison with '\\' to filter out strings that start with a backslash.

List = ['Historical Notes 1996','\ue606','The Future of farms 2012','\ch889','\8uuuu',]
print([x for x in List if x[0] != '\\' and x.isprintable()])
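As a rough sketch of why this works: '\ue606' lies in a Unicode private-use area, and isprintable() returns False for such characters. The helper name `keep` and the list name `items` are hypothetical, and the backslashes are written explicitly as '\\' on the assumption that this matches the intended data. Using startswith() instead of x[0] is a small variation that also tolerates empty strings.

```python
import unicodedata

# Illustrative list; '\\ch889' is the same string as the original '\ch889'.
items = ['Historical Notes 1996', '\ue606', 'The Future of farms 2012',
         '\\ch889', '\\8uuuu']

def keep(item):
    # Hypothetical helper: drop items that start with a literal backslash
    # or that contain non-printable characters such as U+E606.
    return not item.startswith('\\') and item.isprintable()

filtered = [x for x in items if keep(x)]
print(filtered)

# U+E606 has category 'Co' (private use), which isprintable() rejects.
print(unicodedata.category('\ue606'))
```

Note that an empty string passes this filter, whereas x[0] would raise an IndexError on it.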
angel_dust answered Jan 02 '23 04:01