Functional programming without feeling stupid, part 4: Logic

In the previous parts of “Functional programming without feeling stupid” we have slowly been building ucdump, a utility program for listing the Unicode codepoints and character names of characters in a string. In actual use, the string will be read from a UTF-8 encoded text file.

We don’t know yet how to read a text file in Clojure (well, you may know, but I only have a foggy idea), so we have been working with a single string. This is what we have so far:

(def test-str 
  "Na\u00EFve r\u00E9sum\u00E9s... for 0 \u20AC? Not bad!")
(def test-ch { :offset 0 :character \u20ac })
(def short-test-str "Na\u00EFve")

(defn character-name [x]
  (java.lang.Character/getName (int x)))

(defn character-line [pair]
  (let [ch (:character pair)]
    (format "%08d: U+%06X %s"
      (:offset pair) (int ch)
      (character-name ch))))
    
(defn character-lines [s]
  (let [offsets (repeat (count s) 0)
        pairs (map #(into {} {:offset %1 :character %2}) 
          offsets s)]
    (map character-line pairs)))

I’ve reformatted the code a bit to keep the lines short. You can copy and paste all of that in the Clojure REPL, and start looking at some strings in a new way:

user=> (character-lines "résumé")
("00000000: U+000072 LATIN SMALL LETTER R" 
"00000000: U+0000E9 LATIN SMALL LETTER E WITH ACUTE" 
"00000000: U+000073 LATIN SMALL LETTER S" 
"00000000: U+000075 LATIN SMALL LETTER U" 
"00000000: U+00006D LATIN SMALL LETTER M" 
"00000000: U+0000E9 LATIN SMALL LETTER E WITH ACUTE")

But we are still missing the actual offsets. Let’s fix that now.
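Here is a quick sketch of one direction we could take, which is not necessarily where we will end up: pair each character with its index by mapping over an endless range alongside the string. I am assuming here that “offset” simply means the position of the character in the string, not its byte offset in a UTF-8 file. The name indexed-pairs is made up for this illustration.

```clojure
;; Sketch only. Assumption: the offset is the character's index in the
;; string, not a byte offset in a UTF-8 encoded file.
;; indexed-pairs is a hypothetical helper, not part of ucdump so far.
(defn indexed-pairs [s]
  ;; map stops when the shortest collection runs out, so the
  ;; infinite (range) is cut off at the length of the string.
  (map (fn [offset ch] {:offset offset :character ch})
       (range)
       s))

;; Try it on the short test string from above:
(indexed-pairs "Na\u00EFve")
```

The maps it produces have the same shape as the ones character-line already expects, so the rest of the code would not need to change.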


Functional programming without feeling stupid, part 3: Higher-order functions

Welcome to the third installment of “Functional programming without feeling stupid”! I originally started to describe my own learnings about FP in general, and Clojure in particular, and soon found myself writing a kind of Clojure tutorial or introduction. It may not be as comprehensive as others out there, and I still don’t think of it as a tutorial — it’s more like a description of a process, and the documented evolution of a tool.

I wanted to use Clojure “in anger”, and found out that I was learning new and interesting stuff quickly. I wanted to share what I’ve learned in the hope that others may find it useful.

Some of the stuff I have done and described here might not be optimal, but I see nothing obviously wrong with my approach. Maybe you do; if that is the case, tell me about it in the comments, or contact me some other way. But please be nice and constructive, because…

…in Part 0 I wrote about how some people may feel put off by the air of “smarter than thou” that sometimes floats around functional programming. I’m hoping to present the subject in a friendly way, because many of the techniques are not obvious to someone (like me) conditioned by a couple of decades of imperative, object-oriented programming. It will not be nearly as funny as Learn You a Haskell For Great Good, nor as zany as Clojure for the Brave and True; just friendly, and hopefully lucid.

xkcd 1270: Functional. Licensed under Creative Commons Attribution-Non-Commercial License. This is a company blog, so it is kind of commercial by definition. Is that a problem?

In Part 1 we played around with the Clojure REPL, and in Part 2 we started making definitions and actually got some useful results. In this third part we’re going to take a look at Clojure functions and how to use them, and create our own — because that’s what functional programming is all about.


Functional programming without feeling stupid, part 2: Definitions

In this installment of “Functional programming without feeling stupid” I would like to show you how to define things in Clojure. Values and function applications are all well and good, but if we can’t give them symbolic names, we need to keep repeating them over and over.

Before we start naming things, let’s have a look at how Clojure integrates with Java. I’m assuming you are still in the REPL, or have started it again with lein repl.

Track 1: “If anyone should ask / We are mated”

As it happens, Clojure’s core library is lean and focused on manipulating the data structures of the language, so many things are deferred to the underlying Java machinery as a rule. For example, mathematical computation is typically done using the static methods in the java.lang.Math class:

user=> (java.lang.Math/sqrt 5.0)
2.23606797749979

As you can see, this is a function application like we have already seen in Part 1, but this time the function we are using is the sqrt static method in the java.lang.Math class.
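Other static methods follow the same ClassName/methodName pattern. Here are a few more calls to try, using only standard JDK classes. The classes in java.lang are available without the package prefix, so Math/sqrt works just as well as java.lang.Math/sqrt.

```clojure
;; Static method calls look like ordinary function applications.
(Math/sqrt 5.0)            ; same as (java.lang.Math/sqrt 5.0)
(Math/pow 2 10)            ; => 1024.0
(Math/abs -7)              ; => 7
(Integer/parseInt "42")    ; => 42
(Character/isLetter \a)    ; => true
```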

Java 7 acquired Unicode character names, and they are accessed through the getName method in the java.lang.Character class. This is no mean feat, since there are over 110,000 characters in the Unicode standard, and each of them has a name (although some of them are algorithmically generated). To find out the canonical character name of a Unicode character, such as the euro currency symbol, you would use the getName static method:

user=> (java.lang.Character/getName \u20ac)
ClassCastException java.lang.Character cannot be cast to java.lang.Number user/eval703 (NO_SOURCE_FILE:1)

Hey, what’s wrong? Well, if you look up the documentation of java.lang.Character.getName, you will find out that it takes an int value as an argument, not a character. You can actually do this check from inside the REPL:

user=> (javadoc java.lang.Character)
true

The REPL doesn’t seem to do much, but you should now have a new web browser window or tab open, with the JavaDoc of the java.lang.Character class loaded up. That’s what the REPL meant when it said

Javadoc: (javadoc java-object-or-class-here)

when it started up. The getName method does need an int value, so let’s try something else:

user=> (java.lang.Character/getName (int \u20ac))
"EURO SIGN"

All right! How about another one:

user=> (java.lang.Character/getName 67)
"LATIN CAPITAL LETTER C"

Well, some say that Clojure is the new C.


Functional programming without feeling stupid, part 1: The Clojure REPL

In my recent post Functional Programming Without Feeling Stupid I took a quick look at how functional programming can be a little off-putting for the uninitiated. I promised to provide some examples of my own first steps with FP, and now I would like to present some to you.

Advocates of functional programming often refer to increased programmer productivity. At least some of that can be attributed to the REPL, or the Read-Evaluate-Print Loop. We are basically talking about an environment which accepts and parses any code you type in, and gives you a place to experiment and see results quickly. Before interpreted or semi-interpreted languages like Python, Java and JavaScript became mainstream, the typical repeating cycle in software development was Compile-Link-Execute, and debugging meant observing special output on the console. In the 1990s integrated debuggers with watches and breakpoints became the norm, but long before that Lisp-like languages already had a REPL, and Python also acquired one.

If you are thinking about getting intimate with Clojure, you will need to get to know the REPL. It is your playground, and will always be, even if you later start packaging and organizing your code.

Clojure depends on the Java Virtual Machine (JVM) and is actually distributed as a normal JAR file (Java ARchive), like most Java libraries are. You can start Clojure from the JAR, but you will save yourself some trouble and prepare for the future if you install Leiningen, the dependency management tool for Clojure. It is simple to install and run, and I will assume that you will follow the instructions on the Leiningen web site sooner or later. Now would be a good time.

When you’re done with the installation, you only need to say

lein repl

to start a Clojure REPL. I’m using OS X, so what I describe here was done from Terminal. You don’t need to create a project with Leiningen if you just want to play around in the REPL.

Of course, if you don’t have Java installed, you need to get it first. Refer to the Java web site of Oracle for details as necessary. Furthermore, some of the things I will describe require Java version 7 or later.


Functional programming without feeling stupid

If you follow software design trends (yes, they exist), you may have noticed an increasing amount of buzz about functional programming, and particularly the Clojure language. While functional programming is hard to define, almost everyone mentions pure functions, the lack of side effects and state, and easy parallelisation. As for Clojure, it is all about (a kind of) Lisp running on the Java Virtual Machine (and .NET, and transformed to JavaScript).

I’m somewhat convinced that functional programming is at least worth knowing about and trying out, even if you don’t expect to fully convert. It has been said that learning about the functional paradigm makes you a better programmer in your current imperative language. Functional languages reduce accidental complexity, and that helps you focus.

“Whoop de doo, what does it all mean, Basil?”

If you have a background in imperative languages, you will have an interesting time if and when you start digging into functional programming, because whatever else it is, it’s different. And I’m not talking about syntax only, but most of what you do. If you need to add an item to a list, you construct a new list with the new item appended to the previous list (no, it is not as inefficient as it sounds, because there is great stuff under the hood to handle that). This is because immutability is one of the cornerstones of functional programming. If you can’t change something after it is created, there is no state to mess up. You program with values, not stateful objects.
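To make that concrete, here is a minimal sketch in Clojure. I’m using a vector rather than a list, because conj adds to the end of a vector (on a list it would add to the front). The names original and extended are only for illustration.

```clojure
;; "Adding" an item gives you a new value; the old one is untouched.
(def original [1 2 3])
(def extended (conj original 4))

original   ; => [1 2 3]
extended   ; => [1 2 3 4]
```

Nothing that holds on to original can be surprised by the change, because as far as original is concerned there was no change.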

I see I’m getting myself tricked into presenting a definition of functional programming, when that has been done better elsewhere. For pointers, see Michael Fogus’ 10 Technical Papers Every Programmer Should Read (At Least Twice), including the classic “Why Functional Programming Matters” by John Hughes. But I actually wanted to talk about something else.


Thinking of Learning Python? Start here!

Python is one of the friendliest general-purpose programming languages out there. It is free to use, well supported and used by many big companies. Since its introduction in 1991, it may not have taken the world by storm, but it has gained a huge share of programmers’ interest. As of this writing (November 2014), Python is number 8 on the TIOBE Index.

Recently I have been studying bioinformatics, and in the course of my studies I have met many people who are learning to program for the first time, and doing it with Python. Others have a little bit of programming experience, but not in Python. Luckily Python is an excellent language for both groups, because it is clean and easy to learn, but it can still be powerful and expressive.

Beginners, step this way

Learning programming is not easy, but some of the things you need to understand are the same no matter what programming language you study. That is why I recommend Think Python by Allen Downey to all beginners. I’ve been programming for close to 30 years now, and I think that this book is one of the most accessible introductions to programming in general, and Python in particular. The subtitle of the book is “How to think like a computer scientist”, which essentially means “problem solving”. You need to be able to take apart what you are trying to achieve, and then find ways to make the computer do what you mean.


Think Python is free to download from Green Tea Press in PDF format. However, if you want a printed book, you can buy one from O’Reilly.

Seasoned experts, check this out

I first learned Python in the early 2000s, when the language was still relatively unknown, but already had a lot of users. Since I learn best from a good book, I spent some time looking for one about Python, and quickly found Learning Python by Mark Lutz. At the time it was not a lean book anymore: the 2nd edition, which covers Python 2.3, already came up to almost 600 pages. Still, it is an easygoing book which has only gotten better with time.


In recent years I’ve gone strictly e-book only, because I don’t have the shelf space for all the books I want or need, and e-books are also a lot cheaper. My whole programming library fits on my iPad, so it is with me wherever I go. New editions of a popular book like Learning Python typically accumulate more material over the years; the latest, 5th edition covers both Python 2.7 and 3.3, and comes to (count ’em) 1540 pages. That might already be a little too much for a “learning” book, but there you have it.

To each their own

As a summary:

  • Absolute beginners in programming who want or need to learn Python, get Think Python by Allen Downey.
  • Those who already know a little bit about programming, and want to learn Python, get Learning Python by Mark Lutz.

This post contains links to the O’Reilly webstore. If you follow the links and buy a book, I will get a minuscule commission. However, I was using both of these books professionally before I became an O’Reilly affiliate, and I want people to know about them and benefit from them.



HipStyles End Of Life

On August 15, 2014, HipStyles will be retired from the App Store.

I have decided to end the development of HipStyles, for good. In practice it was on hold for a long time due to other commitments, but at long last, version 2.0 was released in April 2014. Now it’s time to finally close the curtain on this act.

Since November 2012, HipStyles has tracked the new gear in Hipstamatic, and provided an easy way to find shots, or HipstaPrints, taken with a specific combination of lens, film, and flash. Some iPhoneographers have found it useful, since it provides a function Hipstamatic itself doesn’t.

Here’s a summary of why I think HipStyles failed to make a dent in the universe:

  • Lack of initial validation of the product idea
  • Premium pricing probably alienated many potential customers
  • The execution failed by some accounts: too many or wrong features, with bugs


The FIFA World Cup is not internationalized

At the time of this writing, the FIFA World Cup is almost at its final stage. Throughout the whole month of the tournament I have been amazed to find out that while UEFA has consistently made an effort to have the players’ names written as authentically as possible, FIFA hasn’t. Information conveyed to viewers in televised matches is transliterated, making many players’ names appear irritatingly different than their national, conventional spellings.

It can be argued that FIFA has a tougher job with various languages and characters. After all, the 2014 World Cup had Japan, South Korea, Iran, Russia, and Greece, among others. These countries and their languages alone represent half a dozen character sets. There is a point in trying to achieve some sort of baseline transliteration, so that all names are expressed in the Latin character set. In practice this means that Russian, Greek, and other names are transliterated by default. However, names which can be perfectly written in extended Latin should not be transliterated. Now it seems like everything has been folded down to basic ASCII.

There is no technical reason to limit the displays on television screens to what is essentially a 7-bit character set, but this is what the result amounts to. For example, one of the finalists in this World Cup is Germany, and many of the players have characters with umlauts in their names. Here are a few examples:

Oezil should be Özil
Mueller should be Müller
Goetze should be Götze
Hoewedes should be Höwedes

EDIT: Also, the German coach is Joachim Löw, not Loew (or Low, as the content on the official FIFA apps would have us believe).

The players’ jerseys have the truth anyway, and it’s in direct contrast with what you see when somebody scores a goal, gets booked or gets sent off.

So please, FIFA, take a leaf from UEFA’s book and find out how to use modern broadcasting technology to the advantage of all football fans around the world (even if they might call it soccer). You might also start with a good book like “Unicode Explained” by Jukka K. Korpela.

BusMonTRE and the Tampere region public transport reform

A major reform of public transport in the Tampere region takes effect on June 30, 2014. The service area will be divided into fare zones, and the route network and its numbering will change. You can read more about the reform in the Sinisten bussien matkassa magazine delivered to households in the public transport area, in the new summer 2014 timetable book, and on the Tampere public transport web site.

The reform also affects BusMonTRE, the bus stop assistant app available for iPhone, Android, and Windows Phone. The latest version, 0.3, added information about which routes pass through each stop. Like all the material BusMonTRE uses, this information is based on the files and interfaces that Tampere public transport publishes as open data. With the reform the route numbering changes, so BusMonTRE needs to be updated in that respect as well. Of course, the stop data also changes from time to time.

Information about which routes pass through which stop is currently not available directly in the Tampere public transport data sets; it has been collected through the interface that provides real-time stop information. Presumably this interface will be updated on June 30, 2014 to return data that reflects the new situation.

For BusMonTRE this is a small problem: once the new data arrives, updating the app takes a while, but getting the updates to users takes even longer. This particularly affects iPhone users, who make up the majority of BusMonTRE users. Getting an update through the review process of Apple’s App Store usually takes about a week or two. Updates to the Android and Windows Phone versions reach users considerably faster.

Since it is already June 16 as of this writing, up-to-date route information might not reach all users by the time the reform takes effect, even if the data became available right away. So BusMonTRE will most likely show incorrect route numbers for stops until the new data has been collected and built into the app, and the updates have reached the app stores on each of the three smartphone platforms.

Because BusMonTRE is still under development, the stop and route data is “baked into” the app. The technically minded reader may wonder why BusMonTRE doesn’t use a server of its own, from which the data would be read directly, so that we would not depend on updates from Tampere public transport. In this case even that would not help, because the required data is not available before the reform. Otherwise it would have been just as easy to add the data to the app earlier, and switch to the fresh data once June 30, 2014 arrives according to the smartphone’s clock.

We aim to keep the period during which BusMonTRE shows outdated route information as short as possible. Hopefully this will not cause anyone too much inconvenience.

If you have questions or comments, get in touch by email: busmontre (at) coniferproductions (dot) com

 

Git with the program – use version control

If you are programming, and you are still not using any form of version control, you really have no excuse. There are many benefits to being able to keep track of your code and try out various branches, even if you are the only programmer in the project. If you are collaborating with someone, dealing with the various versions and changes without version control soon becomes nearly impossible (or at least very time-consuming).

Of all the version control systems I’ve tried over the years (CVS, Subversion, a little bit of Mercurial, and Git) it seems that Git has “won” in a sense. There is a sizable open-source community born around GitHub (and Bitbucket) for which Git works very well indeed. Many programming tools also have built-in or plug-in support for Git, so you don’t even have to use command-line tools for managing your source code repositories if you don’t want to.

For open-source development, GitHub is the obvious choice. If you’re doing closed source, or you think your code isn’t ready for public scrutiny, Bitbucket gives you unlimited private repositories. I’m currently using GitHub to collaborate on some private repositories, which you can get with a paid plan, and Bitbucket for my closed-source app projects.

In a spirited attempt to really learn to use the tools of my trade, I wanted to take some time to better learn Git for version control (and also dive deeper into Xcode, but that is another story).

I have occasionally used the fine tome Version Control with Git, 2nd Edition* by Jon Loeliger and Matthew McCullough to learn the basics, but I wanted to really dive in. I’ve already mastered the very basics, and have also used remote repositories with both GitHub and Bitbucket, but there is a lot more to learn to be able to really take advantage of Git.


* Disclaimer: I’m an O’Reilly affiliate, and the links above take you to the O’Reilly online bookstore, in the hope that you purchase something, so that I will get a small commission.
