
Downgrade non-ASCII symbols to closest 7-bit ASCII equivalent (preferably Java)

Is there any simple/lightweight solution to change at least some non-ASCII symbols to their respective ASCII analogs? For example, this string

abc-åäö.txt

should be changed to

abc-aao.txt

A bit of background: zip tools do not reliably support UTF-8, hence the need to downgrade. AFAICR, Google's "download attachments as a single zip file" feature replaces any non-ASCII symbols with the '_' character.

PS: The code may as well be in some other language; if it's more or less understandable, I'll port it to Java. PPS: This is my first question so far, so please don't downvote me into the ground, okay?

Anton Kraievyi asked Jul 28 '10 10:07



3 Answers

Have a look at java.text.Normalizer. It can help you transform equivalent characters, for example by decomposing an accented letter into its base letter plus combining marks: http://en.wikipedia.org/wiki/Unicode_equivalence
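A minimal sketch of how this could look, assuming the goal is the abc-aao.txt result from the question. The class name AsciiFolder and the method toAscii are made up for illustration. Replacing whatever is still non-ASCII with '_' is also an assumption, borrowed from the Google behaviour mentioned in the question; it is not something Normalizer does by itself.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of the JDK.
final class AsciiFolder {

    private AsciiFolder() {}

    static String toAscii(String input) {
        // NFD splits an accented letter into its base letter
        // followed by one or more combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);

        // Drop the combining marks (Unicode general category M).
        String stripped = decomposed.replaceAll("\\p{M}", "");

        // Assumption: anything still outside 7-bit ASCII becomes '_'.
        return stripped.replaceAll("[^\\x00-\\x7F]", "_");
    }

    public static void main(String[] args) {
        // The file name from the question, written with escapes so the
        // source file encoding does not matter.
        System.out.println(toAscii("abc-\u00e5\u00e4\u00f6.txt"));
    }
}
```

Letters that have no canonical decomposition (ß or ø, for instance) are not touched by the normalization step, so in this sketch they end up as '_'. A small lookup table for such characters would be needed if they should map to real letters instead.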

relet answered Oct 09 '22 02:10



Maybe this would do?

Krumelur answered Oct 09 '22 01:10



It looks like the problem is solved here:

[solution][howto] Convert special characters to normal chars (é to e) http://www.ramonfincken.com/permalink/topic192.html

d-live answered Oct 09 '22 02:10
