Simple Language Content Negotiation With Override
Summary: This paper details a method for achieving "Accept-Language"-driven content negotiation with Apache, while still permitting the user an override mechanism which persists from page to page, without the need for server-side scripting or a CMS.
At one point, I was doing the website for two summer camps I'm involved with. (That site used to demonstrate these techniques, but I no longer maintain it and they are no longer using this mechanism.) The site was pretty simple in construction, and made up of static HTML pages. Because the camp is bilingual, we had a requirement that the site be available in both English and French. No problem - we made two versions of each page, and I set up content negotiation with a simple .htaccess file. However, sometimes people have mis-configured browsers, or have some other reason for wanting the site in a language other than the one their Accept-Language header is advertising. And, if they make that choice, they want it to persist from page to page.
After much head scratching and web searching, I found an obscure page which detailed, at the end of a very long discussion, how to do it. This is a (hopefully more accessible) summary of the technique. I will use English (en) and French (fr) as my example, but this could be done with any languages.
2. Content Negotiation
The first thing you do is create your content in two versions - an English version and a French version. How to do this is outside the scope of this paper; I use a template plus a hacked-together Perl script which replaces tags with localised text from a strings file. Give the two versions file extensions to match their language - e.g. .html.en for English and .html.fr for French. Then use the standard directives in the .htaccess file in the root directory of your site to enable MultiViews and language-based content negotiation for the languages in question.
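As a rough sketch of what such a root .htaccess might contain (this is an illustration, not the exact file used on the original site, and it assumes the server's AllowOverride setting permits Options and FileInfo directives in .htaccess files):

```apache
# Root .htaccess (illustrative sketch)

# Let Apache pick between index.html.en and index.html.fr
# when the browser asks for index.html
Options +MultiViews

# Map the file extensions to languages
AddLanguage en .en
AddLanguage fr .fr

# Tie-breaker order, and what to serve when nothing in
# Accept-Language matches (the first language listed here)
LanguagePriority en fr
ForceLanguagePriority Prefer Fallback
```

The choice of English as the fallback in LanguagePriority is an assumption made for this example; swap the order if French should win when the browser expresses no usable preference.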
3. Manual Override
You could have a manual override on e.g. each English page by having an "en Français" link pointing specifically to the .fr file; however, this override wouldn't persist - when the user navigated away from that page, Accept-Language negotiation would take over again. Avoiding this would require rewriting all of the links in the page dynamically to have a parameter or extension indicating the language - and that requires complicated server-side scripting support or a CMS.
Instead, you take advantage of the Apache mod_rewrite module, which permits server-side URL rewriting. Create two directories off the root, /en and /fr, each of which contains nothing but a single .htaccess file.
The .htaccess in the /en directory turns on FollowSymLinks and the Rewriting Engine, and then rewrites *.html files to ../*.html.en. So, if you access http://www.interaction-france.org/en/index.html, the URL gets rewritten by the server to http://www.interaction-france.org/index.html.en, and the English version of the content is returned.
However, the URL in the user's browser still remains as http://www.interaction-france.org/en/index.html, and any further navigation is relative to that, so the language choice persists.
The .htaccess also rewrites all other URLs but without appending .en - this allows you to have graphics and other sorts of file which just live in a single place, but are used by both language versions. The .htaccess in the /fr directory does exactly the same thing but for French.
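A minimal sketch of the /en/.htaccess described above might look like the following. It is not the original file: in particular, it assumes the site lives at the web root, so the rewrite targets are written as absolute URL paths (/...) rather than the ../ form mentioned in the text.

```apache
# /en/.htaccess (illustrative sketch)

# mod_rewrite in per-directory context needs FollowSymLinks enabled
Options +FollowSymLinks
RewriteEngine On

# /en/foo.html -> /foo.html.en
# (forces the English version, bypassing Accept-Language negotiation)
RewriteRule ^(.*)\.html$ /$1.html.en [L]

# Anything else (images, stylesheets, ...) -> same path at the site root,
# so shared files only need to exist in one place
RewriteRule ^(.*)$ /$1 [L]
```

The /fr/.htaccess would be identical except that the first rule appends .fr instead of .en. Note that, in this sketch, a request for the bare directory URL (/en/) falls through to the second rule and so is negotiated normally rather than forced to English; linking to /en/index.html explicitly avoids that.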
So, your "en Français" link on each page just refers to the same URL but with /fr inserted at the front.
This technique imposes only one restriction - links within your site must all be relative. Any absolute link will break the persistence of the user's language choice, because it will remove the language directory prefix from the URL. Hey - it's swings and roundabouts.
In this way, with just the addition of a couple of .htaccess files, and with no need for server-side scripting or other complexities, you can achieve both the automatic serving of the negotiated language (correct in 99.9% of cases) and a manual override which persists.