
How to determine the language of a website

I have the URL of a website and need to find out which language the site is written in (Spanish, French, Italian, etc.).

The site's top level domain is .com, and this doesn't help at all. I cannot simply check if the string contains '.de', '.fr', or any other country codes.

I tried reading the lang attribute of the <html> tag, but many websites don't set it. I also found that I can check the language meta tag, which looks like this:

<meta name="language" content="english">

But again, not all websites use this tag.

Do you know any other ways to determine a website's language?

Thanks.

Kirill Polevsky asked Oct 29 '25 15:10


1 Answer

Sadly, many developers don't consider language metadata worth adding to their pages. A page may also contain several languages, which, as far as I know, means the individual elements (a <div>, for example) have to be marked up with the lang attribute or something similar. Here are some pointers that might help you:

  1. Check for the <meta name="language" content="..."> tag
  2. Check whether <div>s (or other elements) carry a lang attribute
  3. Check the menus (if any): they usually contain far less text than the main body of the page
  4. Look for other small chunks of HTML that you can parse easily and that give you more information about the language(s) the page uses
  5. Finally, start heuristically analyzing the big text chunks
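
The steps above could be sketched roughly as follows in Python, using only the standard library. This is a minimal sketch, not a definitive implementation: the names LangHintParser, guess_language and STOPWORDS are made up for illustration, the http-equiv="content-language" check is an extra hint not mentioned above, and the stop-word lists are deliberately tiny. A real heuristic would need much larger word lists or a dedicated language detection library.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stop-word lists (assumption: real lists would be far larger)
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "that", "for"},
    "es": {"el", "la", "los", "y", "de", "que", "en", "es", "para"},
    "fr": {"le", "la", "les", "et", "de", "que", "est", "pour", "dans"},
    "it": {"il", "la", "e", "di", "che", "per", "non", "sono", "con"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "mit", "für"},
}


class LangHintParser(HTMLParser):
    """Collects declared language hints and the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.declared = []  # hints from lang attributes and meta tags
        self.text = []      # text outside <script> and <style>
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ("script", "style"):
            self._skip += 1
        # Steps 1 and 2: a lang attribute on <html>, <div> or any other element
        if a.get("lang"):
            self.declared.append(a["lang"].lower())
        # Step 1: <meta name="language"> or <meta http-equiv="content-language">
        if tag == "meta" and a.get("content"):
            key = (a.get("name") or a.get("http-equiv") or "").lower()
            if key in ("language", "content-language"):
                self.declared.append(a["content"].lower())

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.text.append(data)


def guess_language(html):
    """Return (language, source); source is 'declared', 'heuristic' or 'unknown'."""
    parser = LangHintParser()
    parser.feed(html)
    parser.close()
    if parser.declared:
        # The most frequently declared hint wins; the value is returned as written
        return Counter(parser.declared).most_common(1)[0][0], "declared"
    # Step 5: fall back to counting stop words in the visible text
    words = re.findall(r"[^\W\d_]+", " ".join(parser.text).lower())
    scores = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return (best, "heuristic") if scores[best] else (None, "unknown")
```

Calling guess_language(page_source) on the downloaded HTML returns the declared value as the page wrote it (so it may be "fr", "en-US" or "english"); normalizing those spellings to a single code scheme is left out here.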

It's really sad how things currently are: providing this information is not difficult and takes little extra time, yet the benefits are real, especially for search engines and, most importantly, for accessibility for people with disabilities.

rbaleksandar answered Nov 01 '25 06:11


