 

How to determine a string is English or Persian?

Tags: java, android

I have an EditText in a form. When the user types text into it, I want my program to detect which language was entered.

Is there a way to determine whether a string is English or Persian?

I found this code for Arabic:

public static boolean isProbablyArabic(String s) {
    // Advance by code point rather than by char so surrogate pairs are handled.
    for (int i = 0; i < s.length(); ) {
        int c = s.codePointAt(i);
        if (c >= 0x0600 && c <= 0x06E0)
            return true;
        i += Character.charCount(c);
    }
    return false;
}

But how can I change this code for Persian?

Saeed Hashemi asked Apr 13 '14 07:04


2 Answers

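Unicode ranges that cover the Persian alphabet (and also Arabic and Urdu, which use the same script):

  • 0x0600 to 0x06FF (Arabic)

  • 0xFB50 to 0xFDFF (Arabic Presentation Forms-A)

  • 0xFE70 to 0xFEFF (Arabic Presentation Forms-B)

So if you don't want to miss any characters, check all three ranges. Hope this helps.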
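As a rough sketch of checking those three ranges, something like the following should work. The class and method names (ScriptRangeCheck, isArabicScriptCodePoint, containsArabicScript) are made up for illustration. Note that it only tells you the text uses the Arabic script, so it cannot by itself tell Persian apart from Arabic or Urdu.

```java
final class ScriptRangeCheck {
    private ScriptRangeCheck() {}

    // True if the code point falls in one of the three ranges listed above.
    static boolean isArabicScriptCodePoint(int cp) {
        return (cp >= 0x0600 && cp <= 0x06FF)
            || (cp >= 0xFB50 && cp <= 0xFDFF)
            || (cp >= 0xFE70 && cp <= 0xFEFF);
    }

    // True if at least one code point of s is in those ranges.
    static boolean containsArabicScript(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isArabicScriptCodePoint(cp)) {
                return true;
            }
            // Step over the whole code point (1 or 2 chars).
            i += Character.charCount(cp);
        }
        return false;
    }
}
```

On Android you could call ScriptRangeCheck.containsArabicScript(editText.getText().toString()), for example from a TextWatcher.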

Guido Mocha answered Sep 19 '22 17:09


You can tell whether a string is English or Persian by using a regex.

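import java.util.regex.Pattern;

public static final Pattern VALID_NAME_PATTERN_REGEX = Pattern.compile("[a-zA-Z_0-9]+");

public static boolean isEnglishWord(String string) {
    // matches() requires the whole string to consist of these characters.
    return VALID_NAME_PATTERN_REGEX.matcher(string).matches();
}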

This only accepts letters, digits, and underscores. If the string contains a character such as '=' or '+', the function returns false. You can fix that by editing the regex to match what you need.
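One possible way to widen the regex, as suggested, is to accept every printable ASCII character (space through '~') and to add a second pattern for the Arabic-script ranges from the other answer. This is only a sketch: the names RegexLanguageCheck, isEnglishText and containsPersianScript are invented here, and treating "printable ASCII" as "English" is an assumption that may not fit every form.

```java
import java.util.regex.Pattern;

final class RegexLanguageCheck {
    private RegexLanguageCheck() {}

    // Printable ASCII: letters, digits, space, and punctuation such as '=' and '+'.
    private static final Pattern PRINTABLE_ASCII =
            Pattern.compile("[\\x20-\\x7E]+");

    // Any single character from the Arabic-script ranges used for Persian.
    private static final Pattern ARABIC_SCRIPT =
            Pattern.compile("[\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF]");

    // True only if the whole (non-empty) string is printable ASCII.
    static boolean isEnglishText(String s) {
        return PRINTABLE_ASCII.matcher(s).matches();
    }

    // True if the string contains at least one Arabic-script character.
    static boolean containsPersianScript(String s) {
        return ARABIC_SCRIPT.matcher(s).find();
    }
}
```

With this, a string like "a=b+1" counts as English text, while mixed input is reported by containsPersianScript as soon as one Persian letter appears.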

ilaimihar answered Sep 20 '22 17:09