Greek and AJAX

Submitted by dan.littlejohn on Wed, 11/09/2005 - 9:59pm. :: Software Dev.

It’s been quite a struggle lately with language tools. I learned more than I wanted to know about internationalization, or i18n (for PHP, that means gettext). You know, the mumbo jumbo for translating text from one language to another. I was doing OK until I needed to use Unicode, Greek Unicode to be exact. So I looked and looked, and was lucky enough to get help from folks in Sweden (Niklas Larsson) and Greece (Elias Sofronas). This is what I learned.

First of all, there are all kinds of character encodings. This is something I was somewhat familiar with, but had never thought about in a practical manner. Since English is my first language, most of what I work with is encoded in Latin1 or something similar, and I am none the wiser. Once you move beyond the Latin alphabet, how the text is encoded becomes very important, and UTF-8 is where it is at. UTF-8 can represent every Unicode character, so almost every language will work with it.

Now the thing to take away from this is that even though you declare UTF-8 in all of your headers (the HTML header, the .po files, etc.), that only labels the content. If the actual files are in another encoding like Latin1, you still need to convert the files themselves, byte by byte, to UTF-8. (For folks on Linux, iconv is the command to look up for this.)
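To make the byte-by-byte part concrete, here is a minimal sketch, assuming a POSIX shell with iconv and od installed. The file names are made up for the example. It writes one Latin1 line, converts it, and dumps the bytes so you can see that the single-byte é becomes two bytes in UTF-8.

```shell
#!/bin/sh
# Sketch: re-encode a Latin1 file as UTF-8 and compare the bytes.
# sample.latin1.txt and sample.utf8.txt are hypothetical names.

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# "café" in Latin1: the é is the single byte 0xE9 (octal 351).
printf 'caf\351\n' > "$workdir/sample.latin1.txt"

# Convert the file contents, not just the label, to UTF-8.
iconv -f iso-8859-1 -t utf-8 "$workdir/sample.latin1.txt" > "$workdir/sample.utf8.txt"

# Dump both files as hex, stripping od's spacing.
latin1_hex=$(od -An -tx1 < "$workdir/sample.latin1.txt" | tr -d ' \n')
utf8_hex=$(od -An -tx1 < "$workdir/sample.utf8.txt" | tr -d ' \n')

echo "Latin1: $latin1_hex"   # 636166e90a
echo "UTF-8:  $utf8_hex"     # 636166c3a90a
```

The only difference is e9 turning into c3 a9, which is exactly the kind of change a header alone will never make for you.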

Here are the commands I use to build an i18n file or update an existing one. (Notice the conversion of the file to UTF-8, a step that is left out of most tutorials.)

// To create the .po (write your translations to this file)
$ find *.php | xargs xgettext -L PHP -o ari.po --keyword=_

// To create the .mo:
$ iconv -f iso-8859-1 -t utf-8 -o ari.utf-8.po ari.po
$ msgfmt -v ari.utf-8.po -o ari.mo

// To update
$ msgmerge ari.po old.po --output-file=new.po
$ iconv -f iso-8859-1 -t utf-8 -o new.utf-8.po new.po
$ msgfmt -v new.utf-8.po -o ari.mo
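
One related check on the .po headers: the header that xgettext generates carries a placeholder, charset=CHARSET, in its Content-Type line, and iconv only changes the bytes, not that label. Below is a rough sketch, assuming a POSIX shell with sed. po_charset is a helper name I made up, and sample.po stands in for a real catalog.

```shell
#!/bin/sh
# po_charset (hypothetical helper): print the charset declared in a .po header.
po_charset() {
    sed -n 's/^"Content-Type:.*charset=\([^\\" ]*\).*/\1/p' "$1" | head -n 1
}

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# A stand-in for a freshly generated catalog header.
cat > "$workdir/sample.po" <<'EOF'
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"
EOF

before=$(po_charset "$workdir/sample.po")

# Declare UTF-8 in the header so the label matches the converted bytes.
sed 's/charset=CHARSET/charset=UTF-8/' "$workdir/sample.po" > "$workdir/sample.utf-8.po"

after=$(po_charset "$workdir/sample.utf-8.po")
echo "before: $before, after: $after"
```

Running po_charset on your own converted file before msgfmt is a cheap way to confirm the label and the bytes agree.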

The second thing to look for is that you have the correct language and country code. For Greek it is el_GR. I was using gr_GR, and even though I got everything else right, nothing would translate, because there is no gr_GR locale on the computer (the language code for Greek is el, not gr). (On Linux, run "locale -a" to see which locales are available.)
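
A quick way to catch this before losing a week is to ask the system whether the locale exists. Here is a small sketch, assuming a POSIX shell and the standard locale command. has_locale is a name I made up for illustration.

```shell
#!/bin/sh
# has_locale (hypothetical helper): succeed if the named locale is installed.
# Matches the bare name or a name with an encoding/modifier suffix,
# e.g. el_GR, el_GR.utf8, el_GR.UTF-8 or el_GR@euro.
has_locale() {
    locale -a 2>/dev/null | grep -Eqi "^$1([.@]|\$)"
}

for name in el_GR gr_GR; do
    if has_locale "$name"; then
        echo "$name: installed"
    else
        echo "$name: not installed"
    fi
done
```

If el_GR comes back as not installed, the Greek locale needs to be added to the system before gettext can pick it up.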

One last gotcha: if you make a change, you have to restart your webserver (on Linux with Apache, /etc/init.d/httpd restart). Otherwise you get weird translations, apparently because the server keeps using the .mo file it already loaded. Hopefully, that will head off someone else's week of frustration.

Now that you can translate, check out this cool code. It is an AJAX spell checker like the one found in Gmail, and the open source code is included for you to use.

http://www.broken-notebook.com/spell_checker/

Pretty nifty that I was able to get all those Greek references in the title to tie together, eh? Have fun.