Web/database development and more...
« Data Modelling ToolsSurvival guide to i18n »

Charset conversions (i18n)

17.04.04

Charset conversions (i18n)

  14:51:53, by fplanque   , Categories: Media Web, Internationalisation

Yesterday, I came accross this interesting table which lets me know what conversions I need to do when I paste text from Word into a textarea and further want to use this text on the web...

To be accurate, this table is useful for conversion from the default windows charset (windows-1252 aka CP1252) to the default web charset (ISO-8859-1 aka Latin-1). Nethertheless, this allowed me to check the conversion in my b2evolution software and I noticed that it was missing one conversion (in a total of 27).

Anyway, the world actually extends way beyond cp1252 and Latin-1, so how would one deal with other languages? :?:

For example, how do I convert Latvian from Windows-1257 to iso-8859-13 (close match) ? Or Russian from Koi8-r to iso-8859-5 (funky match) ? Check out this awesome character set database provided by the Institute of the Estonian Language. (Wouldn't it make sense if unicode.org provided this? :crazy:)

By the way, how do I know what charsets are to be used for a particular language? Here's a page by the W3C, but it's a little sparse... Another one.

1 comment

Comment from: J.o.sue [Visitor]
J.o.sue

how do I convert Hebreu

2006-03-03 @ 18:24