Internationalization Lessons from UEFA

UEFA EURO 2012, the European football championship, is close to the end of the group stage as I write this. (That would be “soccer” to U.S. readers.) As usual, I have been delighted to notice that UEFA has done a great job of presenting the names of the various international players as accurately as possible, both on the website and in the mobile apps they’ve published.

UEFA’s good track record with names has been going on for a long time; I first noticed this when I started to follow the UEFA Champions League in 2000, when I was already working with software internationalization, and knew enough to pay attention to these things.

In this post I’ll examine what it takes to pull off a task like this – which you may also need to do.

There are 16 national football teams in the championship, of which only two (England and Ireland) have players whose names can be accurately represented in the traditional ASCII character set. All the others are in some way “difficult”, “problematic” or “challenging”.
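As a quick illustration of what “representable in ASCII” means in practice, here is a minimal Python sketch. The helper name `is_ascii_name` is made up for this example, the sample names are taken from elsewhere in this post, and `str.isascii()` assumes Python 3.7 or later.

```python
def is_ascii_name(name: str) -> bool:
    """Return True if every character in the name is a 7-bit ASCII character."""
    return name.isascii()


# Sample names from this post: one already transliterated, two with diacritics.
sample_names = ["Andriy Shevchenko", "Petr Čech", "Kim Källström"]

for name in sample_names:
    label = "ASCII only" if is_ascii_name(name) else "needs more than ASCII"
    print(f"{name}: {label}")
```

A check like this only tells you that a name needs more than ASCII; it says nothing about how to store or display it correctly.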

The case with Greece, Russia and Ukraine is interesting because these countries use Greek and Cyrillic alphabets, which would be slightly difficult for most Europeans. Therefore the names of the players in those national teams have sensibly been transliterated into the Latin alphabet (quite acceptable in this particular context), so that Андрі́й Шевче́нко becomes Andriy Shevchenko (UKR). (If the Ukrainian is just a set of boxes to you, it is down to device or browser font limitations.)
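To make the idea concrete, here is a small Python sketch of table-driven transliteration. It is only an illustration: the mapping `UKRAINIAN_TO_LATIN` is a hand-picked, deliberately incomplete table that I put together to cover the letters in this one name. It is not an official romanization standard, and I have no knowledge of how UEFA actually produces its spellings.

```python
# Hypothetical, incomplete mapping: only the letters needed for the example name.
UKRAINIAN_TO_LATIN = {
    "А": "A", "Ш": "Sh",
    "в": "v", "д": "d", "е": "e", "й": "y", "і": "i",
    "к": "k", "н": "n", "о": "o", "р": "r", "ч": "ch",
}

# Dictionary-style spellings mark the stressed vowel with a combining acute accent.
COMBINING_ACUTE = "\u0301"


def transliterate(name: str) -> str:
    """Drop stress marks, then map each Cyrillic letter to its Latin counterpart.

    Characters missing from the table (spaces, for instance) pass through unchanged.
    """
    without_stress = name.replace(COMBINING_ACUTE, "")
    return "".join(UKRAINIAN_TO_LATIN.get(ch, ch) for ch in without_stress)


print(transliterate("Андрій Шевченко"))
```

A real system would need a complete table for each source script, plus a decision about which romanization convention to follow for each language.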

Transliteration works better than transcription. Bad examples of the latter have occurred over the past two decades, particularly in the Olympic Games, in ski jumping and cross-country skiing competitions, and in various ice hockey tournaments. (Many Finns still remember, and make fun of, transcriptions of the ‘ä’ character as ‘ae’, resulting in “Marja-Liisa Haemaelaeinen” and “Matti Nykaenen”.)

Others, such as the Czech Republic, Poland and France, require Latin characters which used to be available only in various national character sets (such as the ISO 8859 character set family or Windows code pages). In the dark Middle Ages of computing (circa 1970-1990) there was no single character set that could represent all the characters required by all the names of the players in any such tournament now or then. Now, with the widespread use of Unicode, things are much easier.
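The limitation is easy to demonstrate. The Python sketch below tries to encode a handful of player names from this post with two members of the ISO 8859 family and with UTF-8, and reports which names each encoding cannot represent. The helper names `can_encode` and `unencodable` are made up for this example; the codec names are standard Python codec aliases.

```python
import unicodedata

NAMES = [
    "Petr Čech", "Mario Mandžukić", "Philippe Mexès", "Kim Källström",
    "Mesut Özil", "Cesc Fàbregas", "Przemysław Tytoń",
]


def can_encode(text: str, codec: str) -> bool:
    """Return True if the text can be encoded with the given codec."""
    # Normalize to precomposed form first, since legacy codecs have no combining marks.
    composed = unicodedata.normalize("NFC", text)
    try:
        composed.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


def unencodable(names, codec):
    """List the names that the codec cannot represent."""
    return [name for name in names if not can_encode(name, codec)]


for codec in ("ascii", "iso-8859-1", "iso-8859-2", "utf-8"):
    failures = unencodable(NAMES, codec)
    print(f"{codec}: {len(failures)} of {len(NAMES)} names cannot be encoded")
```

ISO 8859-1 (Western European) has no “Č”, and ISO 8859-2 (Central European) has no “è”, so neither can hold the whole list, while UTF-8 handles every name.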

Here are just some examples of the names of some famous footballers for various national teams playing in UEFA EURO 2012:

Petr Čech (CZE), Mario Mandžukić (CRO), Philippe Mexès (FRA), Kim Källström (SWE), Mesut Özil (GER), Cesc Fàbregas (ESP), Przemysław Tytoń (POL)

The characters you might think of as “difficult” are the ones outside ASCII: Č, ž, ć, è, ä, ö, Ö, à, ł and ń. These are just some examples. (I was contemplating collecting the names of all the players in all the teams and creating a histogram of all the characters appearing, but I think I’ve made the point already.)
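For the curious, the histogram idea takes only a few lines of Python. This sketch counts the non-ASCII characters in the seven example names from this post, not in the full tournament squads, and the function name `non_ascii_histogram` is made up for the example.

```python
import unicodedata
from collections import Counter

EXAMPLE_NAMES = [
    "Petr Čech", "Mario Mandžukić", "Philippe Mexès", "Kim Källström",
    "Mesut Özil", "Cesc Fàbregas", "Przemysław Tytoń",
]


def non_ascii_histogram(names):
    """Count how often each non-ASCII character occurs across the names."""
    counts = Counter()
    for name in names:
        # NFC keeps letters such as "ä" as one code point rather than "a" plus a mark.
        for ch in unicodedata.normalize("NFC", name):
            if not ch.isascii():
                counts[ch] += 1
    return counts


histogram = non_ascii_histogram(EXAMPLE_NAMES)
for ch, count in histogram.most_common():
    print(f"{ch}  U+{ord(ch):04X}  {unicodedata.name(ch)}  x{count}")
```

Running it over complete squad lists would show which characters a font or a legacy encoding would have to support.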

It is encouraging to see that UEFA consistently makes the effort to get people’s names right. You don’t get results like this by accident, so there must be an enlightened individual (or several) working with their information systems, possessing the right mindset. That also fits very nicely into the motto of this year’s European tournament: RESPECT.

Do you know how to take advantage of the amazing power of Unicode? Let us be your guide to it, and to much more that is involved in going global.