UTF-8 and resistance to change

Wouldn't you know... While we (European people at least) are trying to get more and more open-source products to use UTF-8 by default, so that we can (finally) share the same encoding with people whose character sets are very different from ours (mostly from the Middle East to the Far East), resistance to change seems to be a significant problem in Japan as well (as everywhere else). On the php-i18n mailing list (see post here), Dietrich Bollmann took the trouble to post a list of comments he gets from time to time from web developers and web authors about using UTF-8 in websites and e-mails. Let's give him some space for a long quote, and then I'll share a few personal views on all this (note that "cellars" actually means "cellulars/mobile phones"):

Here a list with some of the answers I got (as I got them):

- Most cellars don't work with UTF-8.

...this is the one most important answer I got as lots of
people in Japan use the time they spend in the subway to
read and write their email with their cellar.

Only some cellars work with UTF-8 , most don't. And I often
was told by friends that my email program (I normally use UTF-8 )
"doesn't work correctly" : )  More often I just didn't get any
answer at all...

Based on this experience it is just natural that people don't
switch to UTF-8.  And even if more and more of the newer programs
also work with UTF-8 , probably it will still take a while until
this "tradition" in the Japanese software developer community
will change.

continuing with the answers:

1. I don't like UTF-8.  It is too new, everybody is used to JIS and
  when using UTF-8 there are always lots of problems.
  With JIS things work well out of the box.

2. The file size becomes bigger as there are so many different Characters
  which have to be encoded and Japanese characters are encoded with
  three bytes in UTF-8.

3. There are too many different versions of UTF-8 which create problems.
  There is only one version of JIS which and therefor no version
  problems arise.

4. I only use UTF-8 if absolutely necessary, for example when Chinese
  and Japanese texts are on the same page.

5. when using UTF-8 the characters do not look nice.

6. There is no need for UTF-8 : Japanese and Ascii is all we need in
  normal circumstances, why bother about other languages?

7. Similar Characters are grouped together
  and differences between similar Japanese Characters get lost.

8. Doesn't look good.

9. When only Chinese or only Japanese it looks good, when mixing
  languages the Characters the page gets ugly.

So, what do you think of that? I would not have easily imagined that reaction. In fact, it made me think of the North American reaction to charsets other than ASCII. It's a strong resistance to change, expressed in many different ways (particularly strongly in point 6). This list is particularly useful for understanding the problems end users are facing, and how things could be improved at the technical level to give them a better experience with UTF-8.

There is a difference between coding systems and charsets (and I recommend following the link in the side-menu to understand that a bit better), but in short, I think the charset might influence the fonts used. Every computer has a long list of charsets available and, for each of these charsets, a set of fonts that some people actually drew (some of them usable with several charsets, some not). This means that if the glyphs drawn for "ttf-mikachan" (for UTF-8) are not as good as those of the "xfonts-jisx0213" fonts (JIS only?), then people looking at UTF-8 text will think that it looks worse because it's UTF-8, although they could just change fonts and it would look nice too. Changing default fonts, however, is not the easiest thing to do on a computer (and it might simply be impossible on mobile devices).

So, if you, out there, are looking into implementing UTF-8 on your website or in your e-mail client to communicate with people in East Asia, please try to take into account the list of reasons why they don't like it, and hopefully we'll all end up with better applications and full adoption of UTF-8.
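Point 2 of the list (file size) is the easiest one to check for yourself. The following is only a minimal sketch in Python, not something taken from the mailing list post: the sample strings and the `encoded_sizes` helper are made up for illustration, and I'm assuming that "JIS" in the comments refers to one of the usual Japanese encodings (Shift_JIS, EUC-JP or ISO-2022-JP). It simply encodes the same text in each encoding and counts the bytes.

```python
# Compare how many bytes the same text needs in several encodings.
# The sample strings and the helper name are illustrative assumptions.

SAMPLES = {
    "japanese": "日本語のテキスト",  # 8 characters (kanji, hiragana, katakana)
    "ascii": "plain ASCII text",    # 16 characters
}

# Codec names as known to Python's standard library.
ENCODINGS = ["utf-8", "shift_jis", "euc_jp", "iso2022_jp"]


def encoded_sizes(text):
    """Return a dict mapping each encoding name to the encoded byte length."""
    return {enc: len(text.encode(enc)) for enc in ENCODINGS}


if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(label, "-", len(text), "characters")
        for enc, size in encoded_sizes(text).items():
            print("   ", enc, size, "bytes")
```

For text like this sample, each Japanese character takes three bytes in UTF-8 against two in Shift_JIS or EUC-JP, so the complaint is technically correct for pure Japanese text, while the ASCII part of a page (markup, for instance) takes the same one byte per character in all of them. How much this matters for a real page depends on how much of it is markup and how much is Japanese text.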
