Goodbye vCard, hello jCard!

The vCard format has served us well for encoding and exchanging contact information, but there is a better alternative – jCard. In this post I’ll describe why I think jCard can be better than vCard, and should be adopted by every software vendor who deals with contacts. Both vCard and jCard are text-based formats, but whereas vCard has a unique legacy grammar, jCard is based on JSON. They both represent essentially the same data model that describes contact information: name, phone number, address, e-mail, and so on. [Read More]
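To make the comparison concrete, here is a minimal sketch of the same contact in both formats, with the jCard version read using nothing but Python's standard json module. The jCard layout follows RFC 7095 (a "vcard" tag followed by a list of properties, each of the form name, parameters, value type, value). The contact "Jane Doe" and the helper name jcard_properties are made up for illustration.

```python
import json

# The same hypothetical contact in both formats.
VCARD = "BEGIN:VCARD\r\nVERSION:4.0\r\nFN:Jane Doe\r\nEND:VCARD\r\n"
JCARD = '["vcard", [["version", {}, "text", "4.0"], ["fn", {}, "text", "Jane Doe"]]]'


def jcard_properties(text):
    """Return (name, value_type, values) for each property in a jCard string."""
    tag, props = json.loads(text)
    if tag != "vcard":
        raise ValueError("not a jCard document")
    # Each property is [name, parameters, value_type, value, ...].
    return [(name, value_type, values)
            for name, _params, value_type, *values in props]


for name, value_type, values in jcard_properties(JCARD):
    print(name, value_type, values)
```

Reading the vCard string would need a dedicated parser for its line-based grammar, while the jCard string goes through any JSON parser.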

Unicode character dump in Python

Sometimes you just need to see what characters are lurking inside a Unicode-encoded text file. Your garden-variety dump utility (like the venerable od on UNIX systems, or the standard Windows hex dump, though I don’t think there is one) only shows you the plain bytes, so you have to head over to unicode.org to find out what they mean. But first you need to decode UTF-8 to get the actual code points, or grok UTF-16 LE or BE, and so on. [Read More]
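As a rough idea of what such a dump involves, here is a minimal sketch that decodes bytes and lists each code point with its Unicode name, using only Python's standard library. The function name dump_code_points and the sample string are assumptions for illustration, not the utility described in the full post.

```python
import unicodedata


def dump_code_points(data, encoding="utf-8"):
    """Decode bytes and return (code point, character name) pairs."""
    text = data.decode(encoding)
    return [
        # Control characters have no name, so fall back to a placeholder.
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


sample = "né".encode("utf-8")  # three bytes, two characters
for code_point, name in dump_code_points(sample):
    print(code_point, name)
```

Passing encoding="utf-16-le" or "utf-16-be" handles the UTF-16 variants in the same way, since the decoding step is the only part that depends on the encoding.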

Unicode is in your now

If this blog entry had been written 10–15 years ago, the title would have been “Unicode is in Your Future”. Luckily, the Unicode standard has been widely adopted during the last decade, so much so that it has almost become a part of the process and not something that you need to expend very much extra effort on. It is here now, and has been for some time. However, Unicode still isn’t quite as widely understood as it needs to be, and it is often adopted as a black box that nobody can really fix when something goes wrong. [Read More]