Vincent Gable’s Blog

July 3, 2009

When In Doubt, UTF8

Filed under: Accessibility, Programming
― Vincent Gable on July 3, 2009
/* If you are uncertain of the correct encoding, you should use UTF-8, */
/* which is the encoding designated by RFC 2396 as the correct encoding */
/* for use in URLs.… */

CFURL.h

This echoes my experience: when in doubt, choose UTF8 for the web. UTF8 is backwards compatible with 7-bit ASCII (e.g. ‘A’ is 0x41 in both ASCII and UTF8), so any pure-ASCII text is already valid UTF8.

But know that UTF8 is a variable-length encoding: every non-ASCII character takes two to four bytes. As a general rule with Unicode, I do not expect a single char or wchar_t to always map to one character in a string. Encoding details can be messy, e.g. “É” might be represented as one precomposed character (U+00C9), or as two code points: “E” followed by a combining acute accent (U+0301). It never hurts to brush up on Unicode.
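
To make the byte-versus-character distinction concrete, here is a minimal C sketch. It assumes valid, NUL-terminated UTF8 input and does no validation; utf8_code_points and the two string constants are names I made up for illustration, not part of CFURL.h or any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count Unicode code points in a NUL-terminated
 * UTF-8 string by skipping continuation bytes (bit pattern 10xxxxxx).
 * Assumes the input is already valid UTF-8. */
size_t utf8_code_points(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++) {
        /* Every byte that is not a continuation byte starts a code point. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* "É" as one precomposed code point, U+00C9: bytes C3 89. */
static const char E_ACUTE_PRECOMPOSED[] = "\xC3\x89";

/* "É" as 'E' (U+0045) plus combining acute accent (U+0301): bytes 45 CC 81. */
static const char E_ACUTE_DECOMPOSED[] = "E\xCC\x81";
```

With these definitions, strlen(E_ACUTE_PRECOMPOSED) is 2 while utf8_code_points(E_ACUTE_PRECOMPOSED) is 1, and strlen(E_ACUTE_DECOMPOSED) is 3 while utf8_code_points(E_ACUTE_DECOMPOSED) is 2. Both strings display as the same “É”, yet strcmp treats them as different, which is why comparing Unicode text usually calls for normalizing it first.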
