I'm really stumped as to why this doesn't work. All I want to do is remove zwsp (u200b), newlines and extra spaces from content read from a file.
Ultimately, I want to write this out to a new file, which I have functional, just not in the desired format yet.
My input (a short test file, which has zwsp / u200b in it) consists of the following:
Australia 1975
Adelaide 2006 23,500
Brisbane (Logan) 2006 29,700
Brisbane II (North Lakes) 2016 29,000
Austria 1977
Graz 1989 26,100
Innsbruck 2000 16,000
Klagenfurt 2008 27,000
My code so far is as follows:
input_file = open('/home/me/python/info.txt', 'r')
file_content = input_file.read()
input_file.close()
output_nospace = file_content.replace('\u200b' or '\n' or ' ', '')
print(output_nospace)
f = open('nospace_u200b.txt', 'w')
f.write(output_nospace)
f.close()
However, this doesn't work as I expect.
Whilst it removes u200b, it does not remove newlines or spaces. I have to test for absence of u200b by checking the output file produced as part of my script.
If I remove one of the operations, e.g. \u200b, like so:
output_nospace = file_content.replace('\n' or ' ', '')
...then sure enough the resulting file is without newlines or spaces, but u200b remains, as expected. If I revert to the original described at the top of this post, it once again fails to remove all three (u200b, newlines and spaces).
Can anyone advise what I'm doing wrong here? Can you chain list operations like this? How can I get this to work?
Thanks.
The result of code like "a or b or c" is just the first thing of a, b, or c that isn't considered false by Python (None, 0, "", [], and False are some false values). In this case the result is the first value, the zwsp character. It doesn't convey to the replace function that you're looking to replace a or b or c with ''; the replace code isn't informed you used 'or' at all. You can chain replacements like this, though: s.replace('a', '').replace('b', '').replace('c', ''). (Also, replace is a string operation, not a list operation, here.)
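To make the point above concrete, here is a small sketch. The sample string and the variable names are made up for illustration; only `str.replace` and the `or` operator come from the question.

```python
# `or` is evaluated before replace() is called, so replace() only ever
# receives the first truthy operand (here the zero-width space).
needle = '\u200b' or '\n' or ' '
assert needle == '\u200b'

# Hypothetical sample containing a zwsp, a space and a newline.
sample = 'Graz\u200b 1989\n26,100'

# What the question's code effectively does: only the zwsp is removed.
only_zwsp_removed = sample.replace('\u200b' or '\n' or ' ', '')

# Chained calls: each replace() works on the result of the previous one.
all_removed = sample.replace('\u200b', '').replace('\n', '').replace(' ', '')

print(repr(only_zwsp_removed))
print(repr(all_removed))
```

The first result still contains the space and the newline, while the chained version strips all three characters.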
Based on this question, I'd suggest a tutorial like learnpython.org. Statements in Python or other programming languages are different from human-language sentences in ways that can confuse you when you're just starting out.
As indicated by @twotwotwo, the following implementation of a .replace chain solves the issue.
output_nospace = \
file_content.replace('\u200b', '').replace('\n', '').replace(' ', '')
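For what it's worth, a possible alternative to the chain is a single pass with `str.translate`. This is a different technique from the one in the answer, and the helper name `strip_unwanted` and the sample text are only illustrative assumptions.

```python
# Build a table that deletes the listed characters; the third argument
# to str.maketrans names the characters to remove.
UNWANTED = str.maketrans('', '', '\u200b\n ')


def strip_unwanted(text):
    """Return text with zero-width spaces, newlines and spaces removed."""
    return text.translate(UNWANTED)


# Hypothetical sample standing in for the file content.
sample = 'Austria\u200b 1977\nGraz 1989 26,100\n'
cleaned = strip_unwanted(sample)
print(repr(cleaned))
```

Adding another character to strip then only means extending the string passed to `str.maketrans`, rather than adding another `.replace` call.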
Thanks so much for pointing me in the right direction. :)