
How to sort UTF-8 lines in Vim?

Tags: vim, unicode, utf-8

I have these lines in Vim:

a
c
b
e
é
f
g

and when I do :%sort, I get this:

a
b
c
e
f
g
é

Obviously, the "é" line should not be at the end; it should come right after the "e" line. Is it possible to get Vim to sort these lines correctly, that is, by the actual characters rather than by their ASCII codes?

I also tried :!sort (to use the GNU sort utility), but I get the same result.

remi asked Apr 21 '10 13:04


1 Answer

:%sort and :%!sort do not necessarily work in the same way. To quote :help sort:

The details about sorting depend on the library function used. There is no guarantee that sorting is "stable" or obeys the current locale. You will have to try it out.

On the other hand, GNU sort sorts according to the current locale. To quote man sort:

* WARNING * The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.

On my system (Ubuntu 9.10 with fr_CA.UTF-8 temporarily set), :%sort sorts as if the C or POSIX locale were set, while :%!sort sorts according to the French locale.
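To make the difference between the two orderings concrete, here is a minimal Python sketch (not Vim or GNU sort themselves) that sorts the lines from the question twice: once by code point, which mirrors C/POSIX-style ordering, and once with the collation rules of a named locale. The helper name locale_order is hypothetical, and the sketch assumes the fr_CA.UTF-8 locale may or may not be installed, so it returns None rather than guessing when it is missing.

```python
import locale

lines = ["a", "c", "b", "e", "é", "f", "g"]

# Code-point order: what a C/POSIX-style comparison produces.
# "é" is U+00E9, above every ASCII letter, so it lands last.
codepoint_order = sorted(lines)
print(codepoint_order)  # ['a', 'b', 'c', 'e', 'f', 'g', 'é']


def locale_order(items, name):
    """Sort items with the collation rules of locale `name`.

    Hypothetical helper: returns None if that locale is not
    installed on this machine.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        return sorted(items, key=locale.strxfrm)
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore


# Result depends on the locales installed on the system.
print(locale_order(lines, "fr_CA.UTF-8"))
```

If the second call prints None, the locale is simply not available, which is itself one reason a locale-aware sort can appear to fall back to byte order.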

My guess is that you initially tried both :%sort and :%!sort under a POSIX-like locale (which yielded the same result), and then continued experimenting with different locales using :%sort only (which always returned POSIX-like order). Can you confirm that?
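One way to check which collation locale a program such as GNU sort would pick up is to look at the environment variables in POSIX precedence order: LC_ALL overrides LC_COLLATE, which overrides LANG. The sketch below assumes that precedence and assumes "C" as the fallback when none is set (the usual default, though strictly implementation-defined); the function names are hypothetical.

```python
import os


def effective_collate_locale(env):
    """Return the locale name that governs collation for `env`.

    Assumes POSIX precedence: LC_ALL, then LC_COLLATE, then LANG.
    Empty values are treated as unset; "C" is the assumed fallback.
    """
    for var in ("LC_ALL", "LC_COLLATE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return "C"


def is_posix_like(name):
    """True for the locales that sort by raw byte values."""
    return name in ("C", "POSIX")


current = effective_collate_locale(os.environ)
print(current, "(byte order)" if is_posix_like(current) else "(locale-aware)")
```

If this reports C or POSIX for the shell that launched Vim, then :%!sort would be expected to give the same byte-order result as :%sort.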

Bolo answered Oct 31 '22 21:10