I understand that to remove a single backslash we might do something like the approach in "Removing backslashes from a string in Python".
I'd like to know how to remove, from the list below, all the entries like '\ue606':
A =
['Historical Notes 1996',
'\ue606',
'The Future of farms 2012',
'\ch889',
'\8uuuu',]
to transform it into
['Historical Notes 1996',
'The Future of farms 2012',]
I tried:
A = ['Historical Notes 1996',
'\ue606',
'The Future of farms 2012',
'\ch889',
'\8uuuu',]
for y in A:
    y.replace("\\", "")
A
It returns:
['Historical Notes 1996',
'\ue606',
'The Future of farms 2012',
'\\ch889',
'\\8uuuu']
I'm not sure how to address the string following the '\', or why it added a new '\' rather than removing it.
It is somewhat hard to convince Python to simply ignore Unicode escapes. Here is a somewhat hacky attempt:
l = ['Historical Notes 1996',
'\ue606',
'The Future of farms 2012',
'\ch889',
'\8uuuu',]
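def not_unicode_or_backslash(x):
    # 'unicode-escape' spells non-ASCII characters and literal backslashes
    # as backslash escapes, so both kinds of string then start with "\".
    return not x.encode('unicode-escape').decode().startswith("\\")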
[x for x in l if not_unicode_or_backslash(x)]
# Output: ['Historical Notes 1996', 'The Future of farms 2012']
The problem is that you can't check directly whether the string starts with a backslash, because '\ue606'
is not treated as a 6-character string but as a single Unicode character. It therefore does not start with a backslash, and for
[x for x in l if not x.startswith("\\")]
you get
['Historical Notes 1996', '\ue606', 'The Future of farms 2012']
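To make this concrete, here is a small sketch that inspects two of the strings from the question. The variable names are illustrative only, and the second literal is written with an explicit '\\' so that Python does not warn about an invalid escape; it has the same value as '\ch889' in the question. It also shows why the list display contains '\\ch889' and why the loop with replace() changed nothing.

```python
# '\ue606' is one code point (U+E606); '\\ch889' really contains a backslash.
pua = '\ue606'
literal = '\\ch889'   # same value as '\ch889' in the question

print(len(pua))                  # 1
print(len(literal))              # 6
print(pua.startswith('\\'))      # False
print(literal.startswith('\\'))  # True

# 'unicode-escape' turns the single character back into its escaped spelling.
escaped = pua.encode('unicode-escape').decode()
print(escaped)                   # \ue606
print(len(escaped))              # 6

# repr() doubles a real backslash, which is why the list shows '\\ch889'.
print(repr(literal))             # '\\ch889'

# str.replace returns a new string; the original stays unchanged
# unless the result is assigned somewhere.
cleaned = literal.replace('\\', '')
print(cleaned)                   # ch889
```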
You can use isprintable() to drop the non-printable Unicode character, and compare the first character with '\\' to drop the strings that start with a backslash.
List = ['Historical Notes 1996','\ue606','The Future of farms 2012','\ch889','\8uuuu',]
print([x for x in List if x[0] != '\\' and x.isprintable()])
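One caveat with x[0] is that it raises IndexError on an empty string. A minimal variant, assuming empty strings may occur in the list, uses startswith() instead. The helper name keep and the extra '' entry are hypothetical additions for illustration, and the backslashes are written explicitly as '\\'.

```python
items = ['Historical Notes 1996', '\ue606', 'The Future of farms 2012',
         '\\ch889', '\\8uuuu', '']

def keep(s):
    # Hypothetical helper: reject strings that start with a literal backslash
    # or contain non-printable characters (U+E606 is a private-use code point,
    # which str.isprintable() treats as non-printable).
    return not s.startswith('\\') and s.isprintable()

result = [s for s in items if keep(s)]
print(result)  # ['Historical Notes 1996', 'The Future of farms 2012', '']
```

Note that the empty string passes both checks and is kept; add a truthiness test to keep() if empty entries should be removed as well.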