Charset conversions (i18n)

Yesterday, I came across this interesting table, which tells me what conversions I need to do when I paste text from Word into a textarea and then want to use that text on the web...

To be accurate, this table is useful for conversion from the default Windows charset (windows-1252 aka CP1252) to the default web charset (ISO-8859-1 aka Latin-1). Nevertheless, it allowed me to check the conversion in my b2evolution software, and I noticed that it was missing one conversion (out of a total of 27).
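To make that count of 27 concrete, here is a minimal Python sketch. It is not the code b2evolution uses; the function names `cp1252_extras` and `to_latin1_safe` are made up for illustration. It relies on Python's built-in `cp1252` codec, which leaves five of the 32 bytes in the 0x80-0x9F range undefined, and it assumes numeric character references are an acceptable replacement on a Latin-1 page.

```python
def cp1252_extras():
    """Map each CP1252 byte in 0x80-0x9F to the character it stands for.

    Latin-1 reserves this range for control codes, so these are the
    characters that need converting. Bytes CP1252 leaves undefined are skipped.
    """
    extras = {}
    for byte in range(0x80, 0xA0):
        try:
            extras[byte] = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            continue  # undefined in CP1252: 0x81, 0x8D, 0x8F, 0x90, 0x9D
    return extras


def to_latin1_safe(text):
    """Encode text as Latin-1, replacing anything Latin-1 lacks
    (curly quotes, the euro sign, ...) with a numeric character reference."""
    return text.encode("latin-1", errors="xmlcharrefreplace")


if __name__ == "__main__":
    print(len(cp1252_extras()))  # number of characters to convert
    print(to_latin1_safe("\u201cquoted\u201d"))
```

Going through Unicode first and letting the codec do the lookup avoids maintaining the table by hand, which is exactly where a conversion can go missing.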

Anyway, the world actually extends way beyond cp1252 and Latin-1, so how would one deal with other languages? :?:

For example, how do I convert Latvian from Windows-1257 to ISO-8859-13 (close match)? Or Russian from KOI8-R to ISO-8859-5 (funky match)? Check out this awesome character set database provided by the Institute of the Estonian Language. (Wouldn't it make sense if unicode.org provided this? :crazy:)
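For what it's worth, one way to do such a conversion without a hand-made table is to go through Unicode. The sketch below is only an illustration: the helper name `transcode` is made up, and it assumes Python's standard codecs (`cp1257`, `iso8859_13`, `koi8_r`, `iso8859_5`) match the mappings you need. Characters the target charset lacks are replaced with `?`, which is a lossy choice you may not want.

```python
def transcode(data, source, target):
    """Re-encode bytes from one legacy charset to another via Unicode.

    Characters that do not exist in the target charset become '?'.
    """
    return data.decode(source).encode(target, errors="replace")


if __name__ == "__main__":
    # Latvian: Windows-1257 -> ISO-8859-13
    latvian = "Rīga".encode("cp1257")
    print(transcode(latvian, "cp1257", "iso8859_13").decode("iso8859_13"))

    # Russian: KOI8-R -> ISO-8859-5 (the byte values change completely)
    russian = "Привет".encode("koi8_r")
    print(transcode(russian, "koi8_r", "iso8859_5").decode("iso8859_5"))
```

The text survives both trips unchanged, but only because every letter used here exists in the target charset. KOI8-R's box-drawing characters, for instance, have no slot in ISO-8859-5 and would come out as `?`.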

By the way, how do I know which charsets to use for a particular language? Here's a page by the W3C, but it's a little sparse... Here's another one.

Survival guide to i18n

This guide has an interesting conversion table from win-1252 to Unicode.

Quote of the day

"The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in; we're computer professionals. We cause accidents."

-Nathaniel Borenstein

Food for thought! ;)

Atom frustration

Today I thought it was time for me to catch up on Atom and add support for this format to b2evolution.

Okay, done. Here's my feed. It validates too.

But I had to leave out a link to comments as well as my categorization! There's no support for these in the spec yet. What a shame! :no:

And finally, the biggest frustration: I checked out my feed in the popular SharpReader aggregator... and it turns out it doesn't support "multipart/alternative" content. Bleh! >:XX So I had to leave that out too...

So at least we have b2evolution supporting Atom in concrete terms now... but if you ask me, RSS 2.0 is still the most useful syndication format! :roll: