I was wondering how to remove:
Is this possible with a single Regex, with unicode support for international space characters, etc.?
The TrimSpace() function from the strings package removes leading and trailing white-space characters from a given string. White space here is defined by Unicode and includes, for example, \t (the tab character).
To remove duplicate whitespace from a string, you can use the regular expression \s+, which matches one or more whitespace characters, and replace each match with a single space ' '.
To trim whitespace around a string in Go, call the TrimSpace function of the strings package and pass the string as its argument. TrimSpace returns the trimmed string.
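As a minimal sketch of the TrimSpace behaviour described above (the helper name trimEnds is hypothetical and only wraps strings.TrimSpace), the following shows that only the ends of the string are affected, and that Unicode spaces such as U+00A0 and U+3000 are trimmed as well:

```go
package main

import (
	"fmt"
	"strings"
)

// trimEnds is a hypothetical helper name; it only wraps strings.TrimSpace.
func trimEnds(s string) string {
	return strings.TrimSpace(s)
}

func main() {
	// Leading and trailing whitespace is removed; inner runs are left untouched.
	fmt.Printf("%q\n", trimEnds("\t  Hello,   World !\n "))
	// U+00A0 (no-break space) and U+3000 (ideographic space) count as white space too.
	fmt.Printf("%q\n", trimEnds("\u00a0Hello\u3000"))
}
```

Note that TrimSpace alone does not collapse repeated spaces inside the string, so it covers only the first half of the question.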
In Java, we can use regex \\s+ to match whitespace characters, and replaceAll("\\s+", " ") to replace them with a single space.
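The same \s+ replacement could be sketched in Go roughly as follows. The names wsRun and collapseSpaces are made up for illustration. One caveat to keep in mind: in Go's regexp syntax \s is ASCII-only, so this version leaves Unicode spaces such as U+00A0 alone, and it reduces leading and trailing runs to a single space rather than removing them.

```go
package main

import (
	"fmt"
	"regexp"
)

// wsRun matches one or more ASCII whitespace characters.
// In Go's regexp syntax, \s is equivalent to [\t\n\f\r ].
var wsRun = regexp.MustCompile(`\s+`)

// collapseSpaces is a hypothetical helper name used for illustration.
func collapseSpaces(s string) string {
	return wsRun.ReplaceAllString(s, " ")
}

func main() {
	fmt.Printf("%q\n", collapseSpaces("Hello,\t\tWorld  \n !"))
}
```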
You can get quite far just using the strings package, as strings.Fields does most of the work for you:

    package main

    import (
        "fmt"
        "strings"
    )

    // standardizeSpaces splits s on whitespace and rejoins the pieces
    // with single spaces, which also drops leading and trailing whitespace.
    func standardizeSpaces(s string) string {
        return strings.Join(strings.Fields(s), " ")
    }

    func main() {
        tests := []string{" Hello, World ! ", "Hello,\tWorld ! ", " \t\n\t Hello,\tWorld\n!\n\t"}
        for _, test := range tests {
            fmt.Println(standardizeSpaces(test))
        }
    }

    // "Hello, World !"
    // "Hello, World !"
    // "Hello, World !"

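On the Unicode part of the question: strings.Fields splits around white space as defined by unicode.IsSpace, so the same approach should also handle international space characters. A small sketch, reusing the standardizeSpaces name from the answer and an input string invented for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// standardizeSpaces joins the whitespace-separated fields of s with single spaces.
func standardizeSpaces(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	// U+3000 ideographic space, U+00A0 no-break space, U+2003 em space.
	s := "\u3000Hello,\u00a0\u2003World\t!\n"
	fmt.Printf("%q\n", strings.Fields(s))
	fmt.Printf("%q\n", standardizeSpaces(s))
}
```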
It seems that you might want to use both the \s shorthand character class and the \p{Zs} Unicode property to match Unicode spaces. However, both steps cannot be done with one regex replacement, as you need two different replacements, and ReplaceAllStringFunc only passes the whole match string as its argument (I have no idea how to check which group matched).
Thus, I suggest using two regexps:
^[\s\p{Zs}]+|[\s\p{Zs}]+$ - to match all leading/trailing whitespace
[\s\p{Zs}]{2,} - to match 2 or more whitespace symbols inside a string

Sample code:

    package main

    import (
        "fmt"
        "regexp"
    )

    func main() {
        input := " Text More here "
        // Matches whitespace at the very start or very end of the string.
        re_leadclose_whtsp := regexp.MustCompile(`^[\s\p{Zs}]+|[\s\p{Zs}]+$`)
        // Matches runs of two or more whitespace characters.
        re_inside_whtsp := regexp.MustCompile(`[\s\p{Zs}]{2,}`)
        final := re_leadclose_whtsp.ReplaceAllString(input, "")
        final = re_inside_whtsp.ReplaceAllString(final, " ")
        fmt.Println(final)
    }

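If a single pattern is preferred, one possible variation (not the answer's method, just a sketch under stated assumptions) is to collapse every whitespace run to one ASCII space and then trim that space from the ends with strings.Trim. The names anySpaceRun and normalizeSpaces are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// anySpaceRun matches a run of ASCII whitespace or Unicode space separators.
var anySpaceRun = regexp.MustCompile(`[\s\p{Zs}]+`)

// normalizeSpaces is a hypothetical helper, not part of the answer above.
func normalizeSpaces(s string) string {
	// Every run, including leading and trailing ones, becomes one ASCII space.
	collapsed := anySpaceRun.ReplaceAllString(s, " ")
	// At most one ASCII space remains at each end, so trimming " " is enough.
	return strings.Trim(collapsed, " ")
}

func main() {
	fmt.Printf("%q\n", normalizeSpaces("\u3000 Text \u00a0 More\t\there \n"))
}
```

Unlike the {2,} pattern, this variation also turns a single tab or no-break space between words into an ASCII space, which may or may not be what you want.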