
Ignore character accents when sorting strings

I'm writing a Go program that takes a list of strings and sorts them into buckets by the first character of each string. However, I want it to group accented characters with the unaccented character they most resemble. So, if I have a bucket for the letter A, I want strings that start with Á to be included in it.

Does Go have anything built-in for determining this, or is my best bet to just have a large switch statement with all characters and their accented variations?

Nairou asked Jan 02 '14 22:01


1 Answer

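There are add-on packages for this under golang.org/x/text. Here's an example that sorts with a collator using the collate.Loose option, which ignores diacritics (as well as case and width) when comparing strings: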

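package main

import (
	"fmt"

	"golang.org/x/text/collate"
	"golang.org/x/text/language"
)

func main() {
	strs := []string{"abc", "áab", "aaa"}
	// collate.Loose makes the collator ignore diacritics, case and width,
	// so "áab" sorts as if it were "aab".
	cl := collate.New(language.English, collate.Loose)
	cl.SortStrings(strs) // sorts the slice in place
	fmt.Println(strs)
}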

outputs:

[aaa áab abc]

Also, check out the following reference on text normalization: http://blog.golang.org/normalization

Eve Freeman answered Oct 07 '22 02:10