I am using a very simple sed script to remove comments: sed -e 's/--.*$//'
It works great until a comment contains non-ASCII characters, e.g.: -- °
Such a line does not match the regular expression and is not substituted.
Any idea how to get . to really match any character?
Solution:
Since the file command reports the input as ISO-8859 text, the LANG environment variable must be set accordingly before calling sed:
LANG=iso8859 sed -e 's/--.*//' -
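The solution above can be reproduced with a small test file. This is only a sketch under a few assumptions: the sample text and the variable names are made up for illustration, and it uses LC_ALL=C instead of LANG=iso8859. LC_ALL takes precedence over LANG, and in the C locale every byte counts as one character, so the dot also matches the Latin-1 degree sign (byte 0xB0).

```shell
# Build a Latin-1 sample: the degree sign is the single byte 0xB0 (octal 260),
# which is not a valid UTF-8 sequence on its own.
sample=$(mktemp)
printf 'SELECT 1; -- 20 \260C\nSELECT 2;\n' > "$sample"

# In the C locale each byte is one character, so '.' matches 0xB0 as well
# and the whole comment is removed.
result=$(LC_ALL=C sed -e 's/--.*$//' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

If LC_ALL is already exported in your environment, setting LANG alone has no effect, which is one reason to prefer LC_ALL for a one-off command like this.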
Match any specific character in a set
Use square brackets [] to match any one character in a set. Use \w to match any single alphanumeric character or underscore: 0-9, a-z, A-Z, and _. Use \d to match any single digit. Use \s to match any single whitespace character. These shorthands come from Perl-style regex flavors; in sed, \d is not a digit class, so use [0-9] or [[:digit:]] there.
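By default, the . (dot) character in a regular expression matches any single character, whether it is a letter, a digit, or a special character. In a multibyte locale such as UTF-8, a byte sequence that is not valid in that encoding may not be matched by the dot, which is the situation in the question above.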
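As a quick illustration of the dot, here is a minimal sketch with made-up input lines: the pattern ^a.b$ requires exactly one character between a and b, whatever that character is.

```shell
# Print only the lines that consist of 'a', any one character, then 'b'.
# The last input line, 'ab', has nothing between the letters and is skipped.
matches=$(printf 'a-b\na9b\na b\nab\n' | sed -n '/^a.b$/p')
printf '%s\n' "$matches"
```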
[...] (range expression): accepts ANY ONE of the characters in the range, e.g., [0-9] matches any digit; [A-Za-z] matches any uppercase or lowercase letter. [^...]: NOT ONE of the characters, e.g., [^0-9] matches any non-digit. Only these four characters require an escape sequence inside the bracket list: ^, -, ], \.
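The bracket forms described above can be tried directly in sed. The following is a small sketch with invented sample strings; it uses the POSIX class [[:space:]] as the portable counterpart of \s, since shorthand support varies between sed implementations.

```shell
# A range: delete every digit.
no_digits=$(printf 'abc123def\n' | sed 's/[0-9]//g')

# A negated set: delete everything that is not a digit.
only_digits=$(printf 'abc123def\n' | sed 's/[^0-9]//g')

# A POSIX character class: squeeze each run of whitespace into one space.
squeezed=$(printf 'a  \t b\n' | sed 's/[[:space:]][[:space:]]*/ /g')

printf '%s\n' "$no_digits" "$only_digits" "$squeezed"
```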
It works for me. It's probably a character encoding problem.
This might help: