Sorting out web encoding problems

This article was first written in June 2005 for the BeezNest technical
website (http://glasnost.beeznest.org/articles/270).

Introduction

This article is about sorting out web form encoding problems. In particular, it looks into an encoding problem found when serving a UTF-8 form and expecting the completed form to come back in UTF-8, but actually receiving it in ISO-8859-1 or ISO-8859-15. It is intended as a solution article, explaining step by step how to reverse-engineer such a problem to find a reasonable solution.
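
As a rough illustration of the symptom, the following Python sketch tries to decode a submitted form value as UTF-8 and falls back to ISO-8859-1 when the bytes are not valid UTF-8. The function name decode_form_value is hypothetical and not part of the original article, and the fallback is only a heuristic: some ISO-8859-1 byte sequences happen to be valid UTF-8 as well, so a successful UTF-8 decode does not prove the browser sent UTF-8.

```python
from typing import Tuple


def decode_form_value(raw: bytes) -> Tuple[str, str]:
    """Decode a raw form value; return (text, name of the encoding used).

    Hypothetical helper for illustration: strict UTF-8 first, then
    ISO-8859-1 as a fallback.
    """
    try:
        # Strict decoding raises UnicodeDecodeError on invalid UTF-8.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # ISO-8859-1 assigns a character to every byte value,
        # so this decode cannot fail.
        return raw.decode("iso-8859-1"), "iso-8859-1"


if __name__ == "__main__":
    # The same character as each kind of browser might submit it.
    print(decode_form_value("é".encode("utf-8")))
    print(decode_form_value("é".encode("iso-8859-1")))
```

Logging the second element of the result for each request is one simple way to see which encoding a given browser actually sends back.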

HTML character encoding

This article was first written in July 2004 for the BeezNest technical
website (http://glasnost.beeznest.org/articles/139).

To display all characters correctly in an HTML page, even when it mixes different languages (English, Japanese, Russian, …), a good approach is to encode everything in Unicode, using the UTF-8 encoding.
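
To make the point concrete, the short Python sketch below checks whether a piece of text survives an encode/decode round trip in a given encoding. The helper name round_trips and the sample strings are assumptions chosen for illustration; they do not come from the original article.

```python
def round_trips(text: str, encoding: str) -> bool:
    """Return True if text can be encoded in `encoding` and decoded back unchanged."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # The encoding has no representation for at least one character.
        return False


if __name__ == "__main__":
    samples = {"English": "hello", "Japanese": "日本語", "Russian": "привет"}
    for language, text in samples.items():
        print(
            language,
            "utf-8:", round_trips(text, "utf-8"),
            "iso-8859-1:", round_trips(text, "iso-8859-1"),
        )
```

Under these assumptions, UTF-8 handles all three samples, while ISO-8859-1 only handles the English one.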

Server & client config

In the Apache config file httpd.conf, one of the following must be defined:

    #AddDefaultCharset on
    AddDefaultCharset off
    AddDefaultCharset utf-8

More info: