Is there any C# library which can detect the language of a particular piece of text? For example, for the input text "This is a sentence", it should detect the language as "English", and for "Esto es una sentencia" it should detect the language as "Spanish".
I understand that language detection from text is not a deterministic problem. But both Google Translate and Bing Translator have an "Auto detect" option, which best-guesses the input language. Is there something similar available publicly, preferably in C#?
The googletrans Python library uses the Google Translate API to detect the language of text.
Language detection works by examining the characters and words that appear in the text. One basic principle is to look for commonly used words, such as "to" and "of" in English. Python provides various modules for language detection.
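As a rough illustration of the common-word principle, here is a minimal Python sketch. The stop-word lists and the name `guess_language` are made up for this example and are far too small for real use; this is not how googletrans or any particular library works internally.

```python
import re

# Tiny, hand-picked stop-word lists (an assumption for illustration only).
STOP_WORDS = {
    "English": {"the", "is", "a", "of", "to", "and", "in", "this", "that", "it"},
    "Spanish": {"el", "la", "es", "una", "un", "de", "y", "en", "esto", "que"},
}

def guess_language(text):
    """Return the language whose stop words occur most often, or None."""
    # Split into lowercase runs of letters (Unicode-aware, no digits/underscores).
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(1 for word in words if word in stops)
        for lang, stops in STOP_WORDS.items()
    }
    best = max(scores, key=scores.get)
    # No stop word matched at all: refuse to guess.
    return best if scores[best] > 0 else None

if __name__ == "__main__":
    print(guess_language("This is a sentence"))     # English
    print(guess_language("Esto es una sentencia"))  # Spanish
```

With lists this short, many sentences contain no listed word at all, which is why the function returns `None` instead of guessing in that case.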
Yes indeed, TextCat is very good for language identification, and it has implementations in many programming languages.
There were no ports for .NET, so I wrote one: NTextCat (NuGet, Online Demo).
It is a pure .NET Standard 2.0 DLL plus a command-line interface to it. By default, it uses a profile of 14 languages.
Any feedback is much appreciated! New ideas and feature requests are welcome too :)
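For readers curious what a language "profile" might look like: TextCat-style identifiers are commonly described as ranking character n-grams by frequency and comparing those rankings (the Cavnar and Trenkle approach). The Python sketch below is a toy version of that general idea, not NTextCat's code; the function names, the two one-sentence training samples, and the penalty value are all assumptions made for illustration.

```python
from collections import Counter

def ngram_profile(text, max_n=3, top_k=300):
    """Map each of the top_k most frequent character n-grams to its rank."""
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(
        padded[i:i + n]
        for n in range(1, max_n + 1)
        for i in range(len(padded) - n + 1)
    )
    # Most frequent first; ties broken alphabetically so ranks are stable.
    ranked = sorted(counts, key=lambda gram: (-counts[gram], gram))[:top_k]
    return {gram: rank for rank, gram in enumerate(ranked)}

def out_of_place(doc_profile, lang_profile, max_penalty=300):
    """Sum of rank differences; n-grams missing from the language profile
    cost a fixed max_penalty."""
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else max_penalty
        for gram, rank in doc_profile.items()
    )

# Hypothetical one-sentence training samples; real profiles are built
# from much larger corpora.
TRAINING = {
    "English": "this is a short sample of english text and it is used "
               "to build the profile of the language",
    "Spanish": "esto es una muestra corta de texto en español y se usa "
               "para construir el perfil de la lengua",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in TRAINING.items()}

def identify(text):
    """Return the language whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(PROFILES, key=lambda lang: out_of_place(doc, PROFILES[lang]))

if __name__ == "__main__":
    for sentence in ("This is a sentence", "Esto es una sentencia"):
        print(sentence, "->", identify(sentence))
```

With profiles this small the guess for a short input can easily be wrong; the point is only to show the shape of the comparison.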