I am writing a bot that checks thousands of websites to see whether or not they are in English.
I am using Scrapy (a Python 2.7 framework) to crawl the first page of each website.
Can someone suggest the best way to detect a website's language?
Any help would be appreciated.
Since you are using Python, you can try out NLTK. More precisely, you can check for NLTK.detect.
More information and the exact code snippet are here: NLTK and language detection
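As a rough illustration of text-based detection, here is a minimal sketch of one common heuristic: counting how many words on the page are English stopwords. This is not the code from the linked post. The tiny stopword list, the function names and the 0.15 threshold are all assumptions made for the example; a real list (such as a stopword corpus from a library) would be much longer.

```python
import re

# Tiny illustrative stopword list (an assumption; a real list would be far longer).
ENGLISH_STOPWORDS = frozenset([
    "the", "and", "is", "in", "to", "of", "a", "that", "it", "for",
    "on", "with", "as", "this", "are", "was", "be", "by", "or", "not",
])


def english_stopword_ratio(text):
    """Return the fraction of word tokens found in ENGLISH_STOPWORDS."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in ENGLISH_STOPWORDS)
    return float(hits) / len(words)


def looks_english(text, threshold=0.15):
    """Guess whether text is English; the threshold is an arbitrary assumption."""
    return english_stopword_ratio(text) >= threshold
```

You would pass in the visible text of the crawled page, with the HTML tags already stripped. Very short pages give unreliable ratios, so it is probably worth tuning the threshold on a sample of your own sites.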
You can also use the HTTP response headers (for example Content-Language) to find out:
Wikipedia
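A minimal sketch of the header check, assuming the site actually declares its language. It looks at the Content-Language response header first and falls back to the lang attribute on the html tag. The function names are hypothetical, and the headers are assumed to be a plain dict of strings, so values taken from a Scrapy response may need converting first.

```python
import re

# Matches the lang attribute on the opening <html> tag, e.g. <html lang="en-US">.
HTML_LANG_RE = re.compile(
    r"<html\b[^>]*?\blang\s*=\s*[\"']?([A-Za-z-]+)", re.IGNORECASE
)


def declared_language(headers, html):
    """Return the declared language tag in lowercase, or None if absent."""
    # Header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "content-language" and value.strip():
            # The header may list several languages; take the first one.
            return value.split(",")[0].strip().lower()
    match = HTML_LANG_RE.search(html)
    if match:
        return match.group(1).lower()
    return None


def is_declared_english(headers, html):
    """True only if the page explicitly declares an English language tag."""
    lang = declared_language(headers, html)
    return lang is not None and lang.split("-")[0] == "en"
```

Many sites send no Content-Language header, or declare the wrong language, so this is best treated as a cheap first check before falling back to analysing the page text.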