Colorful source code in Terminal

Often it is quicker to take a look at a source code file in Terminal using the cat or less commands, instead of starting up an editor, especially if you don’t need to make changes. However, most developers, myself included, are used to syntax highlighting, that is, source code presented in various colours. It makes the different elements in the source code stand out, and helps with comprehension. I find that the older I become, the more I need syntax highlighting, and I think back to a time when it was not so common, amazed that I could make sense of anything. (Of course younger brains have more cycles to burn.)

An easy way to get syntax highlighting in your macOS Terminal is to install Pygments, a Python-based syntax highlighting library with a command-line interface.

First, if necessary, install Python 3 with Homebrew, make sure your pip3 tool is up to date, and then install Pygments:

brew install python3
pip3 install --upgrade pip setuptools
pip3 install pygments

Now you should have the pygmentize command in your system, and

pygmentize -h

will give you an overview.

Pygments supports many programming languages and has several built-in styles. Since I work with Xcode a lot, I like to see similar syntax highlighting in the Terminal, so a typical command for me would use the ‘xcode’ style like this:

pygmentize -f terminal256 -O style=xcode -g

Pygments tries to infer the correct lexer from the file name, and the -g flag tells it to guess from the contents of the file as well, passing the text through as plain text if it cannot decide. The -f terminal256 option directs Pygments to output 256-color ANSI escape sequences.

These options are a little too much to type every time I need syntax highlighting, so I’ve defined an alias in my ~/.bash_profile file:

alias pcat='pygmentize -f terminal256 -O style=xcode -g'

so that I can just say pcat followed by a file name.

If you need paging, you’ll probably use the less utility, but you need to use the -R option to interpret the ANSI codes emitted by Pygments:

pcat | less -R
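
The alias and the pager can also be folded into one small shell function. What follows is only a sketch: the name hl is made up for illustration, and falling back to plain cat when the output is not a terminal (or when pygmentize is not installed) is an assumption about what is convenient, not something Pygments requires.

```shell
# hl: hypothetical helper name, not part of Pygments.
# Highlights and pages its arguments when writing to a terminal,
# otherwise passes them through unchanged.
hl() {
    if [ -t 1 ] && command -v pygmentize >/dev/null 2>&1; then
        # Same options as the pcat alias; less -R keeps the ANSI colours
        pygmentize -f terminal256 -O style=xcode -g "$@" | less -R
    else
        # Piped or redirected output, or no Pygments: no escape codes
        cat "$@"
    fi
}
```

Put the function in ~/.bash_profile next to the alias, and call it with one or more file names.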

If this doesn’t make your Terminal colourful enough, you can always install lolcat!

Time signals with the Raspberry Pi

Time signals have been broadcast by various radio stations for almost 100 years, usually “every hour, on the hour”, or every 60 minutes. The tradition was started by the BBC, but has been adopted by many national broadcasting companies and other broadcasters as a way of informing their listeners about the passing of time. The history of the Greenwich time signal, or “the BBC pips” is detailed in Mike Todd’s article.

This article gives you details on how to generate your own time signals using the Raspberry Pi. It assumes you have a RasPi up and running. It doesn’t really matter which hardware version, as this method should work on all of them (but do let me know if there are limitations – I’ve tested this on Raspberry Pi 3 Model B).

Originally I got the idea of replicating the broadcaster’s time signal from an article about the Finnish Broadcasting Company YLE’s version. Only later did I realise that the Finnish version was adopted straight from the original BBC time signal. If you’re Finnish, you should read the article “Yleisradion aikamerkki on radioklassikko” about the origins of the Finnish tradition.

First, install SoX

SoX, or Sound eXchange, is the Swiss Army knife of audio manipulation in UNIX-like environments. You should install SoX on your Raspberry Pi by issuing the command

sudo apt-get install sox

This command installs the basic commands used in this article. Wait until the command is finished, and then read the manual page with the command man sox.

Most of this will work as is on macOS, so if you’re on the Mac, use Homebrew to install SoX.

Shell script to signal time

Using SoX, it is trivial to generate the basic ingredients of the time signal. From sources we know that it consists of five short 0.1 second beeps, each followed by 0.9 seconds of silence (so that each pip lasts for exactly one second), followed by a final beep of 0.5 seconds. Each beep is a sine tone with a frequency of 1000 Hz.

A tone like this can be generated and played using the SoX play command:

play -n synth 0.1 sin 1000

The use of a ‘null file’ with the -n parameter is sort of implicitly documented, but once you know that the name ‘synth’ generates audio, you can find the parameters easily in the man page. And, as you would expect, the synthesiser can do a lot more than generate sine wave beeps.

Using the play command and the UNIX sleep command together makes it trivial to write a small shell script to replicate the whole time signal. Open a programming editor and save the following as ‘’:


#!/bin/bash
# file:

# Five short pips: 0.1 s of 1000 Hz sine tone, then 0.9 s of silence
play -n synth 0.1 sin 1000
sleep 0.9
play -n synth 0.1 sin 1000
sleep 0.9
play -n synth 0.1 sin 1000
sleep 0.9
play -n synth 0.1 sin 1000
sleep 0.9
play -n synth 0.1 sin 1000
sleep 0.9

# Final long pip of 0.5 s
play -n synth 0.5 sin 1000
Now you can run this in your terminal with the command:


Or, you can give it executable rights with:

chmod u+x

And run it with just:


If you have connected the audio output of your Raspberry Pi to a speaker, you should hear the time signal beeps. I’m using a Tivoli Audio PAL with the RasPi connected to its AUX IN with a standard audio cable. I also selected Audio from the Raspberry Pi graphical user interface. You should also be able to get HDMI audio; refer to the Raspberry Pi audio configuration page for details.
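
The script repeats the same pair of commands five times, so it can also be written as a loop. This is a minimal sketch rather than a replacement for the script above; the run helper and the DRY_RUN variable are hypothetical additions that print each command instead of executing it, which makes it possible to check the sequence without a speaker.

```shell
# Sketch only: run() and DRY_RUN are hypothetical helpers for illustration.
# With DRY_RUN set to a non-empty value, commands are printed, not executed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Five short pips followed by one long pip, all at 1000 Hz.
time_signal() {
    i=0
    while [ "$i" -lt 5 ]; do
        run play -n synth 0.1 sin 1000   # short pip, 0.1 s
        run sleep 0.9                    # 0.9 s of silence
        i=$((i + 1))
    done
    run play -n synth 0.5 sin 1000       # long final pip, 0.5 s
}
```

Add a call to time_signal as the last line of the script to play the signal, or set DRY_RUN=1 first to list the commands.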

Timing the time signal

You could use the at command on the Raspberry Pi to run a command (like a shell script), but that would run it only once. We need a way to run the script repeatedly.

In UNIX-like environments you can schedule a command using cron. It is not a command, but a system daemon that consults its own table called crontab to determine what commands to run and how frequently. For a basic overview, read Scheduling tasks with cron on the Raspberry Pi website.

Note that cron works on a per-user basis, so be sure to edit the right user’s crontab file. Typically, when you’re logged in and you want to edit your own crontab entries, you just say

crontab -e

If and when you are prompted to select an editor, I recommend that you select nano. If you’ve never used nano before, you only really need to know two commands for now, Ctrl + O to save the file, and Ctrl + X to exit (both are shown for you at the bottom of the screen).

After you have edited the crontab file, you can check your scheduled jobs with crontab -l; effectively it just dumps your crontab file on the screen.
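
If you would rather not open an editor at all, crontab can also read a complete table from standard input. The sketch below is one way to append an entry safely; the function name add_entry is made up for this example, and it assumes your crontab command accepts - as the file name for standard input, which the common implementations do.

```shell
# add_entry LINE: copy a crontab from stdin to stdout, appending LINE
# unless an identical line is already present.
# The function name is hypothetical; it only filters text.
add_entry() {
    awk -v line="$1" '
        { print }                       # keep every existing line
        $0 == line { found = 1 }        # remember an exact duplicate
        END { if (!found) print line }  # append the new entry once
    '
}
```

A typical use would be crontab -l 2>/dev/null | add_entry '0 * * * * /path/to/script' | crontab - where /path/to/script is a placeholder. Running it twice leaves only one copy of the entry.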

When I was testing this solution, I didn’t want to wait for an hour to find out if my crontab entry worked, so I used the “every minute” option. My crontab entry thus looks like this:

* * * * * ~/Projects/TimeSignal/

To actually run hourly, use 0 for the minutes. Also, if you want to restrict the time signal to office hours (say from 8 a.m. to 4 p.m.), specify the hours also, like this:

0 8-16 * * * ~/Projects/TimeSignal/

Consult the crontab manual page (with the command man 5 crontab) for more details on the entry format, and use the cron sandbox to test your entries.

There is one problem, though: cron does not deal in seconds, but the time signal should start five seconds before the hour, so that the final long beep begins exactly on the hour. Currently I don’t have a solution to this, but if you do come up with one, let me know.
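
A commonly used workaround for cron’s one-minute resolution is to schedule the job for minute 59 and let the command sleep for the remaining seconds before it starts the script. The sketch below computes that delay. The function name seconds_to_wait and its arguments are made up for this illustration, and it ignores the small delays that cron and play themselves add, so treat it as a starting point rather than a precise clock.

```shell
# seconds_to_wait LEAD [NOW_SECOND]
# Prints how many seconds to sleep so that a command starts LEAD seconds
# before the next full minute. NOW_SECOND defaults to the current second.
# Hypothetical helper, not part of cron or SoX.
seconds_to_wait() {
    lead=$1
    now=${2:-$(date +%S)}
    now=$(( ${now#0} + 0 ))   # drop a leading zero so 08 and 09 are not read as octal
    echo $(( (120 - lead - now) % 60 ))
}
```

With a lead of, say, five seconds (so that the sixth, long pip starts on the hour), seconds_to_wait 5 0 prints 55. A crontab entry with 59 in the minutes field could therefore run sleep 55 first and then the time signal script.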


When I started to write this post, I realised it has been a year since the last one. Wow. Let this topic be a signal to mark the occasion. Note to self: must blog more often.

Also, coincidentally, yesterday was Pi Day (because in some cultures the date is expressed as 3.14 or 3/14, which is a poor substitute for the estimated value of pi, but I digress). Besides, Stephen Hawking passed away yesterday. May he rest in peace.



Book review: Data Science at the Command Line

No matter how handy graphical user interfaces are, the good old command line remains a useful tool for performing various low-level data manipulation and system administration tasks. It is the fallback when you need to do something that has no way of graphical control. Being much more expressive and open-ended than a predefined set of controls, the command shell is the ultimate control environment for your computer.

Data science has become one of the most intensely practised computer applications, so it is no wonder that it also benefits greatly from the hands-on control approach of the command line shell. Data scientist Jeroen Janssens has had the foresight to combine the fundamental operations of data science and the most suitable command line tools into a book that collects many useful practices, tips and tricks for processing and preparing data, called “Data Science at the Command Line” (O’Reilly, 2014).

At its highest abstraction levels, data science involves using models and machine learning to extract patterns from data and extrapolate results from data sets that are often much larger than fits in memory at any one time. At a lower level, it involves multiple file formats and just plain hard work to get the data in a fit shape to be analysed, and this is where the command line comes in.

There is only so much you can do with canned tools like text editors, but a world of possibilities opens for you when you have the power to chain simple commands together, forming pipelines of data where one command’s output becomes another one’s input. You can also redirect input from a file to a command, and output from a command to a file.

Linux and macOS installations offer various command shells besides the defaults, but Janssens shows you how to use a set of tools called the Data Science Toolbox, which uses VirtualBox or Vagrant to set up a self-contained GNU/Linux environment with Python, R and various other tools of the trade on your local machine, without disturbing the host operating system too much.

With real-life examples, Janssens shows you how to use classic Linux command line tools like cut, grep, tr, uniq and sort to your advantage. You will also learn how to get data from the Internet, from databases and even Microsoft Excel spreadsheets, where most of the world’s operational data lies hidden from plain sight.
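
To give a flavour of the pipeline approach, here is a small sketch that counts the most frequent words in a text using only classic tools. It is not taken from the book; the function name top_words is made up for this example.

```shell
# top_words [N]: read text on stdin, print the N most frequent words
# (default 10) with their counts. Hypothetical example function.
top_words() {
    tr -cs '[:alpha:]' '\n' |          # split into one word per line
        grep -v '^$' |                 # drop any empty line
        tr '[:upper:]' '[:lower:]' |   # fold to lower case
        sort |                         # group identical words together
        uniq -c |                      # count each group
        sort -rn |                     # most frequent first
        head -n "${1:-10}"
}
```

Each command does one simple job, and the pipe passes its output to the next one. Redirection connects the ends of the pipeline to files, as in top_words 5 < notes.txt > counts.txt, where both file names are placeholders.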

From this book I learned completely new and interesting ways to work with CSV (Comma Separated Value) files, and it introduced me to the excellent csvkit, with its collection of power tools to cut, merge and reorder columns in CSV files, perform SQL-style queries on the lines, and grep through them.

Among other things you get information on Drake, described as “make for data” – which, if you’re familiar with the classic software development tool make (and of course you are) should whet your appetite. There is also a chapter about how to make your data pipelines run faster by parallelising them and running commands on remote machines.

Scrubbing the data is less than half the fun, but usually more than half of the work in data science. You will learn to write executable scripts in Python and R with their comprehensive data science and statistics libraries, and learn to explore your data using visualisations that consist of statistical diagrams like bar charts and box plots. So the command line is not just text; even though the images are generated using commands, they are obviously shown in a GUI window.

Finally, there is a chapter on modelling data using both supervised and unsupervised learning methods, which serves as a cursory introduction to machine learning, although you are referred to more comprehensive texts on the algorithms involved.

At the back of the book there is a handy reference for all the commands discussed in the book, which include many of the old UNIX stalwarts found in Linux, but also newer tools like jq for processing JSON.

If you need to do data preparation for a data science project, you owe it to yourself to become good friends with the command line, and utilise the many tools described in Janssens’ book in your daily work. Even if you don’t “automate all the things”, you will benefit from the pipeline approach to data processing.

Buy the e-book at the O’Reilly web shop:
Data Science at the Command Line

The book also has a website, where you can preview some of its content.

For the history and philosophy of the command line, you should read Neal Stephenson’s In the Beginning Was the Command Line.