Thinking of Learning Python? Start here!

Python is one of the friendliest general-purpose programming languages out there. It is free to use, well supported, and used by many big companies. It may not have taken the world by storm since its introduction in 1991, but it has won a large share of programmers’ interest. As of this writing (November 2014), Python is number 8 on the TIOBE Index. Recently I have been studying bioinformatics, and in the course of my studies I have met many people who are learning to program for the first time and doing it with Python. [Read More]

Unicode character dump in Python

Sometimes you just need to see what characters are lurking inside a Unicode-encoded text file. Your garden-variety dump utility (like the venerable od in UNIX systems, or the Windows standard hex dump, though I don’t think there is one) only shows you the plain bytes, so you have to look the values up in a Unicode reference to find out what they mean. But first you need to decode UTF-8 to get the actual code points, or grok UTF-16 LE or BE, and so on. It’s fun, but it’s not for everyone.

The udump utility shows you a nice list of character names, together with their offsets in the file. Currently it only handles UTF-8, so the offset is calculated based on the UTF-8 length of the character.
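To make the idea concrete, here is a minimal sketch of how such a dump could work in Python 3. It is not the actual udump source: the function name dump_utf8, the "<unnamed>" placeholder, and the output format are assumptions made for illustration. It decodes the bytes as UTF-8, looks up each character's name with the standard unicodedata module, and advances the offset by the number of bytes the character takes up in UTF-8.

```python
import unicodedata


def dump_utf8(data):
    """Return (byte offset, code point, character name) for each character.

    Hypothetical helper for illustration; not taken from udump itself.
    """
    rows = []
    offset = 0
    for ch in data.decode("utf-8"):
        # Control characters and some others have no name, so supply a
        # placeholder instead of letting unicodedata.name() raise ValueError.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((offset, ord(ch), name))
        # Advance by the number of bytes this character occupies in UTF-8.
        offset += len(ch.encode("utf-8"))
    return rows


if __name__ == "__main__":
    # "n", "e with acute" and the euro sign take 1, 2 and 3 bytes in UTF-8.
    sample = "n\u00e9\u20ac".encode("utf-8")
    for offset, code_point, name in dump_utf8(sample):
        print(f"{offset:08d}: U+{code_point:06X} {name}")
```

Because the offset is derived from each character's encoded length, the three sample characters land at byte offsets 0, 1 and 3 rather than at character positions 0, 1 and 2.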

[Read More]