
Detect language of text [duplicate]

Is there a C# library that can detect the language of a given piece of text? For example, for the input "This is a sentence" it should detect the language as "English", and for "Esto es una sentencia" it should detect "Spanish".

I understand that language detection from text is not a deterministic problem. But both Google Translate and Bing Translator have an "Auto detect" option, which best-guesses the input language. Is there something similar available publicly, preferably in C#?

Nikhil asked Sep 23 '09 07:09



1 Answer

Yes indeed, TextCat is very good for language identification, and it has been implemented in many programming languages.

There were no ports for .NET, so I have written one: NTextCat (NuGet, Online Demo).

It is a pure .NET Standard 2.0 DLL plus a command-line interface to it. By default, it uses a profile of 14 languages.
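To give a rough idea of what a "profile" is in TextCat-style identification, here is a minimal sketch of the general technique: rank the character n-grams of the input by frequency and pick the language whose ranked profile is closest by an "out-of-place" distance. This is not NTextCat's code or API. The class `NGramLanguageGuesser`, its methods, the constants, and the two tiny training samples are all made up for illustration, and real profiles are built from far more text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy illustration only; all names and samples here are hypothetical.
public static class NGramLanguageGuesser
{
    private const int MaxNGramLength = 3; // use 1- to 3-character n-grams
    private const int ProfileSize = 300;  // keep only the most frequent n-grams

    // Tiny made-up training samples; a real profile needs much more text.
    public static readonly Dictionary<string, string> Samples = new Dictionary<string, string>
    {
        ["English"] = "the quick brown fox jumps over the lazy dog and this is the house that we have seen with all of them",
        ["Spanish"] = "el rápido zorro marrón salta sobre el perro perezoso y esta es la casa que hemos visto con todos ellos",
    };

    // Maps each n-gram of the text to its frequency rank (0 = most frequent).
    public static Dictionary<string, int> BuildProfile(string text)
    {
        // Lower-case the letters and turn everything else into a separator.
        var cleaned = new string(text.Select(ch => char.IsLetter(ch) ? char.ToLowerInvariant(ch) : ' ').ToArray());
        var counts = new Dictionary<string, int>();
        foreach (var word in cleaned.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            var padded = "_" + word + "_"; // mark word boundaries
            for (int n = 1; n <= MaxNGramLength; n++)
            {
                for (int i = 0; i + n <= padded.Length; i++)
                {
                    var gram = padded.Substring(i, n);
                    counts.TryGetValue(gram, out var seen); // seen is 0 when absent
                    counts[gram] = seen + 1;
                }
            }
        }
        return counts
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal) // deterministic tie-break
            .Take(ProfileSize)
            .Select((kv, rank) => new { kv.Key, Rank = rank })
            .ToDictionary(x => x.Key, x => x.Rank);
    }

    // Out-of-place distance: sum of rank differences, with a fixed
    // penalty for n-grams that the language profile does not contain.
    public static int Distance(Dictionary<string, int> document, Dictionary<string, int> language)
    {
        int total = 0;
        foreach (var kv in document)
        {
            total += language.TryGetValue(kv.Key, out var rank)
                ? Math.Abs(rank - kv.Value)
                : ProfileSize;
        }
        return total;
    }

    // Returns the name of the sample language with the smallest distance.
    public static string Detect(string text)
    {
        var document = BuildProfile(text);
        return Samples
            .OrderBy(s => Distance(document, BuildProfile(s.Value)))
            .First()
            .Key;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(NGramLanguageGuesser.Detect("This is a sentence"));
        Console.WriteLine(NGramLanguageGuesser.Detect("Esto es una sentencia"));
    }
}
```

With training samples this small the guesses are unreliable, especially for short inputs. The point is only to show the shape of the approach: one ranked n-gram profile per language, and a distance comparison against the input's own profile.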

Any feedback is much appreciated! New ideas and feature requests are welcome too :)

Ivan Akcheurov answered Oct 10 '22 15:10