
Internationalization and Search Engine Optimization

I'd like to internationalize my site such that it's accessible in many languages. The language setting will be detected in the request data automatically, and can be overridden in the user's settings / stored in the session.

My question is about which URLs the different language versions of the same page should have. Let's say we're just looking at the index page of http://www.example.com/, which defaults to English. If a French speaker loads the index page, should I keep the URL as http://www.example.com/, or should I redirect to http://www.example.com/fr/?

I'm trying to figure out what benefits or consequences this has in terms of SEO. I don't want the French version of the site showing up in google.com if it prevents the English version of the same pages from showing up there, but I would like it to show up in google.fr.

Matt Huggins asked Dec 01 '09 19:12



2 Answers

There are a lot of things to consider from a search standpoint when you start localizing your website into multiple languages. Generally, you want to make sure that you're not being too clever about the user's intentions. Auto-detecting the language and storing it in a cookie can be good in some scenarios, but if that becomes a requirement for your localizations to work correctly, then you can run into issues with search engines (and real people too).

For search engines, you'll want to make sure that they can find and access all of your content in all the different languages without POST requests (no drop-down forms), JavaScript, Flash or cookies, because search engines generally don't use these technologies.
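A minimal sketch of what that means in practice, assuming a sub-folder layout such as /en/ and /fr/. The function name and the language table are hypothetical. Every page renders ordinary GET links to its other language versions, so a crawler can follow them without submitting a form, running a script, or holding a cookie.

```python
from html import escape

# Hypothetical set of supported languages: URL prefix -> label shown to the user.
LANGUAGES = {"en": "English", "fr": "Français"}


def language_links(page_path):
    """Return plain <a> links to every language version of page_path.

    page_path is the language-neutral part of the URL, e.g. "/products/iphone".
    """
    links = []
    for code, label in LANGUAGES.items():
        href = f"/{code}{page_path}"
        links.append(f'<a href="{escape(href)}">{escape(label)}</a>')
    return "\n".join(links)


print(language_links("/products/iphone"))
```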

It turns out that this is often good for real customers as well. If you rely on browser settings or IP detection, then some of your real customers who are borrowing a friend's computer or traveling in a foreign country might get stuck in the wrong language (Microsoft Bing actually had this problem for a while).

Here are some best practices to keep in mind:

  • Each language should be contained under some root in your information architecture. The best option is to acquire the country-code TLD (mysite.fr) for each region your website targets. That sometimes isn't feasible, so the second option is to use a sub-domain (fr.mysite.com), and the third option is to use a sub-folder (mysite.com/fr). Any of these makes it easy for us to look at a set of pages in aggregate and determine their language/region. Don't make it a query parameter (mysite.com/products/iphone?lang=en&region=us); that is the most difficult case for us to detect.

  • We have language classifiers (artificial intelligence nets) that try to determine what language/region a page is written for, so make sure your page has enough clues as to what the language is. E.g. if the page is French, make sure the meta description tag is also in French, as are the <h1> tags and the title, and make sure you have a solid couple of sentences in French. Many sites mix languages and have very little actual French on the page.

  • Telephone numbers, mailing addresses and the name of the geographic location are also great clues for search engines in identifying the region/language of a page. Use these well (and make sure they are actual text on the page, not images).

  • Use Google Webmaster Tools to specify the language and region of your pages. Create an account, verify your site, and then you can specify which region and language different parts of your website are targeted for.
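The three URL layouts from the first bullet can be sketched as follows. The domain mysite.com comes from the examples above. The function name, the layout labels, and the language-to-TLD table are assumptions made for illustration.

```python
# Hypothetical mapping from language code to a country-code TLD.
TLD_FOR_LANGUAGE = {"fr": "fr", "de": "de"}


def localized_url(layout, lang, path):
    """Build the URL of `path` in language `lang` under one of three layouts."""
    if not path.startswith("/"):
        path = "/" + path
    if layout == "tld":          # e.g. mysite.fr/products/iphone
        return f"http://mysite.{TLD_FOR_LANGUAGE[lang]}{path}"
    if layout == "subdomain":    # e.g. fr.mysite.com/products/iphone
        return f"http://{lang}.mysite.com{path}"
    if layout == "subfolder":    # e.g. mysite.com/fr/products/iphone
        return f"http://mysite.com/{lang}{path}"
    raise ValueError(f"unknown layout: {layout}")


for layout in ("tld", "subdomain", "subfolder"):
    print(localized_url(layout, "fr", "/products/iphone"))
```

In all three layouts the language is part of the host name or the leading path segment, so every page of one language shares a common prefix, which is what lets a crawler group them.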

Misinformation: the lang attribute, or any language tags you may have heard about, are currently not used by any search engine. When we (Microsoft Bing) did an analysis of these last year, the most common 'standard' lang tag people were using only showed up on 0.000125% of pages on the web, which is not enough to be useful!

Vanessa Fox (she built Google's Webmaster Central and created the sitemap protocol) recently wrote a particularly good article about how Google thinks about localization, and what that means for site architecture. I recommend checking it out here: http://www.ninebyblue.com/blog/making-geotargeted-content-findable-for-the-right-searchers/

Nathan Buggia answered Oct 25 '22 03:10


This is how I solved the problem on my personal website as an exercise in i18n:

  • When a user arrives at, e.g., brazzy.de/index.php, the site tries to determine the language from a cookie (if present) or from the browser settings (Accept-Language header), defaults to English, and does not redirect.
  • Every page has links to the different language versions of that page (IMO the most important factor for user convenience, and also makes sure search engines can easily find the different versions).
  • These links lead to e.g. brazzy.de/en/index.php, which is in my case rewritten to brazzy.de/index.php?lang=en - this ensures that search engines see distinct URLs for the different language versions.
  • Visiting such a subdirectory sets the language cookie to that language.
  • The pages without a language-specific URL (i.e. where the language depends on client data) use e.g. <link rel="canonical" href="/en/"> to tell the search engine at which language-specific URL that page can be found.
  • Use XML sitemaps to further make sure search engines can find all pages and all different language versions.
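
The resolution order described in the list above might look roughly like this. It is a sketch, not the code behind brazzy.de: the function names and the supported-language set are hypothetical, the Accept-Language parsing is simplified to primary subtags and q-values, and the canonical target is assumed to be the URL of whichever language was actually served.

```python
SUPPORTED = ("en", "de")   # hypothetical set of translated languages
DEFAULT = "en"             # the list above defaults to English


def parse_accept_language(header):
    """Return primary language subtags from an Accept-Language header,
    ordered by descending q-value (simplified parsing)."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag or tag == "*":
            continue
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            weighted.append((-quality, position, tag.split("-")[0]))
    return [tag for _, _, tag in sorted(weighted)]


def resolve_language(path, cookie_lang=None, accept_language=""):
    """Return (language, canonical_href).

    canonical_href is None when the URL already names the language;
    otherwise it is the language-specific URL for a rel="canonical" link.
    """
    first, _, _ = path.lstrip("/").partition("/")
    if first in SUPPORTED:
        # Language-specific URL: it wins, and the caller would also
        # set the language cookie to this value.
        return first, None
    # Language-neutral URL: cookie, then browser settings, then default.
    lang = DEFAULT
    if cookie_lang in SUPPORTED:
        lang = cookie_lang
    else:
        for candidate in parse_accept_language(accept_language):
            if candidate in SUPPORTED:
                lang = candidate
                break
    return lang, f"/{lang}/{path.lstrip('/')}"
```

For example, `resolve_language("/index.php", accept_language="fr-FR,de;q=0.8")` picks German because French is not in the supported set, and reports `/de/index.php` as the canonical URL. No redirect happens at any point; the neutral URL simply serves the resolved language.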

Michael Borgwardt answered Oct 25 '22 03:10