Translating a web application is not an easy task (although it might seem so). Or rather, translating it well is not easy.
Using tools like gettext will help you, of course, but beyond the tools, there are a few things that do not seem to be well understood by web developers with little experience in foreign languages.
In this series of articles, I'll try to give one example at a time of how to make a perfect translation.
String identifiers
In this first lesson, we'll talk about the string identifiers, or the name of the translated item.
For example, let's say you want to translate the string "Title" to many languages, so that when your users come to your page, they will see "Title" if they use your English version, and "Titre" if they use your French version.
Of course, you want to make sure most of the translators understand, from the identifier of the string itself, what this string refers to. For example, you could name it $title. Pretty clear, pretty obvious.
Conventions
To avoid having lots of different ways to represent translatable strings, you should define a clear convention from the start: UpperCamelCaseToRepresentYourString, lower_snake_case_to_represent_your_string, or even 'a pure text identifier always considered as an array index'. I once read that, if you have to choose between UpperCamelCase and lower_snake_case, non-native English speakers have more difficulty understanding UpperCamelCase because the words are harder to split visually.
This being said, read the following before establishing your conventions...
Clashing with local variables
Oh but... wait... if you use "$title" kind-of-identifiers, what are you going to do when you need a string identifier that is already the name of a variable used by the logic of your script? Surely, there must be something to do about it...
Well yes, you have 2 options here. Both rely on the notion of a namespace, that is, on creating identifiers that are easily recognized as belonging to the same group:
* use one (or several) array(s), of which each index is the identifier (like $t['title'] = 'Title';)
* use prefixes to your variables (like $translate_title or $t_title = 'Title' to make it short)
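As a minimal sketch, here is how the two options might look side by side. The names $t and $t_title come from the list above; the article title stored in $title is made up for the example.

```php
<?php
// A variable the script already uses for its own logic
// (hypothetical value, for illustration only).
$title = 'My first article';

// Option 1: an array is the namespace, each index is a string identifier.
$t = array();
$t['title'] = 'Title';

// Option 2: a prefix is the namespace.
$t_title = 'Title';

// Neither translated string clashes with $title.
echo $t['title'] . ': ' . $title . "\n"; // Title: My first article
echo $t_title . ': ' . $title . "\n";    // Title: My first article
```

The array keeps all the translated strings in one structure, which makes them easy to load from one file per language; the prefix is shorter to type but leaves every string as a separate variable.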
Uppercase letters
Now whether you use namespaces or not, and whether you use upper camel case or not, you will have a problem when it comes to differentiating 'Title' from 'title'. You know, it so happens that sometimes you have to put a specific text element in parentheses, and in this case you will need to put it in lowercase.
A quick solution to this is to use the strtoupper() and strtolower() functions in PHP, which convert a string to upper case or to lower case. There's even a function, ucfirst(), to uppercase only the first letter. But that will not help once you start to support more than European languages, or when you end up in a tricky situation.
In terms of translations, it is generally accepted that you should define the different cases as different terms, and not try to programmatically convert strings to what they are not because, in some languages, it will not be possible or logical to do so.
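Two small cases illustrate the problem. This is only a sketch: the German and French words are chosen for the example, and the second result assumes the string is UTF-8 and that PHP runs with its usual default locale settings (the "\u{...}" escapes need PHP 7 or later).

```php
<?php
// German nouns are always capitalized, so lowercasing "Titel"
// produces a spelling mistake rather than a valid variant.
$german = 'Titel';
echo '(' . strtolower($german) . ")\n"; // (titel)

// ucfirst() looks at the first byte only. In UTF-8, "é" takes two
// bytes, so the word is expected to come back unchanged instead of
// starting with "É".
$french = "\u{00E9}l\u{00E9}ment"; // "élément"
echo (ucfirst($french) === $french ? 'unchanged' : 'changed') . "\n";
```

In both cases the function does exactly what it is documented to do; it is the assumption that case can be derived mechanically that fails.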
As such, try to be specific about your strings. If you really need to make a difference (for the meaning or correctness of the term), then add a suffix to your identifier, stating that it should be lowercase, uppercase or capitalized (in the last case, only the first letter is uppercase).
For example: $t_TitleItem_lower. This doesn't break your convention, because the term is still identified by 'TitleItem', but you are adding some precision.
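A minimal sketch of this suffix convention, using the array form of the namespace. The $strings layout, the language codes and the translate() helper are assumptions made for the example, not part of any particular library.

```php
<?php
// Hypothetical per-language tables; in a real application each
// language would probably live in its own file.
$strings = array(
    'en' => array(
        'TitleItem'       => 'Title',
        'TitleItem_lower' => 'title',
    ),
    'fr' => array(
        'TitleItem'       => 'Titre',
        'TitleItem_lower' => 'titre',
    ),
    'de' => array(
        'TitleItem'       => 'Titel',
        // The translator decides: a German noun keeps its capital.
        'TitleItem_lower' => 'Titel',
    ),
);

// Hypothetical lookup helper: returns the identifier itself when the
// string is missing, so untranslated items stay visible.
function translate(array $strings, $lang, $id)
{
    return isset($strings[$lang][$id]) ? $strings[$lang][$id] : $id;
}

echo '(' . translate($strings, 'en', 'TitleItem_lower') . ")\n"; // (title)
echo '(' . translate($strings, 'fr', 'TitleItem_lower') . ")\n"; // (titre)
echo '(' . translate($strings, 'de', 'TitleItem_lower') . ")\n"; // (Titel)
```

The code never changes the case itself: it only asks for the variant it needs, and each translator supplies whatever is correct in their language.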
Check out lesson 2 on including punctuation inside web application translations.