
Convert non-ASCII characters (umlauts, accents...) to their closest ASCII equivalent (slug creation)

I am looking for a way in JavaScript to convert non-ASCII characters in a string to their closest ASCII equivalent, similar to what the PHP iconv function does. For instance, if the input string is Rånades på Skyttis i Ö-vik, it should be converted to Ranades pa skyttis i o-vik. I had a look at phpjs, but iconv isn't included.

Is it possible to perform such a conversion in JavaScript, and if so, how?

Notes:

  • more generally this process of conversion is called transliteration
  • my use-case is the creation of URL slugs
Max asked Aug 05 '12 11:08


3 Answers

The easiest way I've found:

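var str = "Rånades på Skyttis i Ö-vik";
// Matches the combining diacritical marks that NFKD splits off from base letters
var combining = /[\u0300-\u036F]/g;

// Decompose the string, then strip the marks
console.log(str.normalize('NFKD').replace(combining, '')); // Ranades pa Skyttis i O-vik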

For reference see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize
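
Since the question is about URL slugs, the same idea can be extended one step further. The following is only a sketch: the `slugify` name and the choice to collapse every non-alphanumeric run into a single hyphen are assumptions, not part of the answer above.

```javascript
// Hypothetical helper (not from the original answer): builds a URL slug
// on top of the normalize('NFKD') + strip-combining-marks technique.
function slugify(input) {
  return input
    .normalize('NFKD')                // split letters from their combining marks
    .replace(/[\u0300-\u036F]/g, '')  // drop the combining diacritical marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')      // collapse any other run of characters into one hyphen
    .replace(/^-+|-+$/g, '');         // trim leading and trailing hyphens
}

console.log(slugify('Rånades på Skyttis i Ö-vik')); // ranades-pa-skyttis-i-o-vik
```

Note that characters left over after stripping the marks (anything that is not a-z or 0-9) are replaced by hyphens rather than transliterated.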

Rez answered Oct 21 '22 14:10


That's because iconv is a natively compiled UNIX utility that sits behind most i18n character-map conversion functions.

You won't find it in JavaScript unless you access some browser component.

Encoding is a property of the document, so most JavaScript implementations simply ignore it.

You'll need a pure JS library for unaccenting strings. Ideally, use one built for the specific language you need.

The simplest way is via translation tables or even regex replacements, like here: http://lehelk.com/2011/05/06/script-to-remove-diacritics/

Check this thread too: Replacing diacritics in Javascript
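
A translation table of the kind described here might look like the sketch below. The table is deliberately tiny and purely illustrative, and the `TABLE` and `removeDiacritics` names are assumptions; a real table, such as the one in the linked script, covers far more characters. One reason to keep a table around is that letters such as ø, æ and ß have no Unicode decomposition, so normalization alone leaves them untouched.

```javascript
// Illustrative table only (hypothetical contents): extend it for the
// languages you actually need to support.
const TABLE = new Map([
  ['å', 'a'], ['ä', 'a'], ['ö', 'o'], ['Å', 'A'], ['Ä', 'A'], ['Ö', 'O'],
  ['é', 'e'], ['ü', 'u'], ['Ü', 'U'],
  // These have no Unicode decomposition, so they need explicit entries.
  ['ø', 'o'], ['Ø', 'O'], ['æ', 'ae'], ['Æ', 'AE'], ['ß', 'ss'], ['ł', 'l'], ['Ł', 'L'],
]);

// Hypothetical helper: replaces each non-ASCII character found in TABLE
// and leaves unknown characters as they are.
function removeDiacritics(input) {
  return input
    .normalize('NFC') // make sure accented letters are single precomposed characters
    .replace(/[^\u0000-\u007F]/g, (ch) => (TABLE.has(ch) ? TABLE.get(ch) : ch));
}

console.log(removeDiacritics('Rånades på Skyttis i Ö-vik')); // Ranades pa Skyttis i O-vik
console.log(removeDiacritics('Straße, søster'));             // Strasse, soster
```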

kisp answered Oct 21 '22 13:10


I would recommend the unidecode package; it also maps Greek and Cyrillic letters to their closest ASCII symbols:

unidecode('Lillı Celiné Никита Ödipus');

'Lilli Celine Nikita Odipus'

Adam answered Oct 21 '22 12:10