Goodbye vCard, hello jCard!

The vCard format has served us well for encoding and exchanging contact information, but there is a better alternative – jCard. In this post I’ll describe why I think jCard can be better than vCard, and should be adopted by every software vendor who deals with contacts.

Both vCard and jCard are text-based formats, but whereas vCard has a unique legacy grammar, jCard is based on JSON. They both represent essentially the same data model that describes contact information: name, phone number, address, e-mail, and so on.

While vCard is widely supported, it is difficult to deal with. Being JSON-based, jCard has the advantage of being ready for use as soon as you parse it. You can even pluck values straight from the JSON object tree you get from the parser. Like vCard versions 3.0 and 4.0, jCard is an IETF proposed standard, RFC 7095.

Both of them represent a very complex data model, and both are semantically quite complicated. Still, I maintain that it is time to leave vCard behind and move on, because… well, read on.

vCard – hard to parse, hard to generate, almost easy to read

On the surface, vCard seems simple enough. It is a line-based format, where each line has a delimited format. Here is a simple sample of a vCard from randomuser.me, converted to vCard 3.0 by hand:

BEGIN:VCARD
VERSION:3.0
FN:Marilou Lam
N:Lam;Marilou;;Ms;
ADR:;;3892 Duke St;Oakville;NJ;79279;U.S.A.
EMAIL:marilou.lam@example.com
BDAY:1967-06-09T06:59:48-05:00
END:VCARD

Doesn’t seem to bad, does it? Well, I’ve written my share of ad hoc vCard parsers and generators in the past (in Objective-C, Java, and C#), and never went all the way, because when you start looking into the corner cases and the special handling required for many fields, you want to run away and hide.

First of all, the lines are not just lines. They could continue on to the next, and the process of continuation is complex, but not as complex as joining them back together, so most naïve vCard handlers just ignore continuation. By the way, according to the specification the line delimiter is a CR-LF pair.

The properties are separated from the values with a colon. Some properties may have multiple fields, so those need to be separated with a semicolon. Some properties may have multiple values, so those need to be separated with a comma. The fun begins if you need to have any of those separators in a value (it could happen, you know). They need to be escaped with a backslash. If you need to have a backslash, it’s time for another backslash. And newlines in values need to be expressed just like in many programming languages, with \n. It’s complicated.

So vCard may be quite human-readable, but that is neither here nor there, because vCards are not meant for eyeballing. They are meant for information exchange between computer systems. That also brings up the not insignificant issue of character encodings.

Y U NO I18N?!

There are several versions of vCard out there. The latest version, 4.0 (which jCard is also based on), sensibly mandates UTF-8 as the character encoding of the vCard. Unfortunately nobody seems to support vCard 4.0.

Version 3.0 of the vCard specification says the character encoding must be specified in the Content-Type MIME header field, using the charset parameter. That is fine, unless you don’t use HTTP to transmit the vCard, at which point you either agree between parties or guess.

Poor old version 2.1 from 1996, still in active duty, has the ASCII character encoding as the default, unless you specify something else with the CHARSET parameter. For every property. As you can imagine, nobody does that, and then we get stuff like “Käpyaho” for my last name (and I really resent that).

The default character encoding for JSON is UTF-8, which is really nice. If you don’t trust your endpoints, you can always \uXXXX escape everything beyond ASCII in your JSON string values.

jCard – easy to parse, fairly easy to generate, hard to read (except when pretty-printed)

Consider the jCard equivalent of the vCard above. Because it is JSON, it has a uniform syntax, so that you can feed it into any JSON parser and reap the rewards. It looks something like this:

["vcard",
  [
    ["version", {}, "text", "4.0"],
    ["fn", {}, "text", "Marilou Lam"],
    ["n",
      {},
      "text",
      ["Lam", "Marilou", "", "", "Ms"]
    ],
    ["adr", {}, "text",
      ["", "", "3892 Duke St", "Oakville", "NJ", "79279", "U.S.A."]
    ],
    ["email", { }, "text", "marilou.lam@example.com"],
    ["bday", {}, "date-and-or-time", "1967-06-09T06:59:48-05:00"]
  ]
]

(Sorry if the formatting is a mess, I’m trying to figure out how to present it. If you see code tags, they are not part of the jCard.)

As you can see, the jCard has a standard JSON syntax, but the content itself is not typical JSON. It is a polymorphic array, containing string and arrays. The top-level object is an array, so that you can have multiple contacts in one payload.

All the vCard properties are arrays where the first item is a string identifying the property. The rest are objects, simple values or arrays, based on the property. The jCard specification goes into great detail explaining how they are constructed.

The JSON example above is verbose, because it has been pretty-printed to be eyeballed by a human. If you were to condense it for online transmission, it would be like this:

["vcard",[["version",{},"text","4.0"],["fn",{},"text","Marilou Lam"],
["n",{},"text",["Lam","Marilou","","","Ms"]],["adr",{},"text",
["","","3892 Duke St","Oakville","NJ","79279","U.S.A."]],
["email",{},"text","marilou.lam@example.com"],["bday",{},
"date-and-or-time","1967-06-09T06:59:48-05:00"]]]

I only cheated a little by inserting a couple of newlines, but otherwise it’s tight, no-nonsense JSON.

The devil is in the details

So, vCard is difficult to parse, but jCard is not, at least for a JSON parser. However, semantically jCard is still just as complicated.

There are obviously inherent difficulties in representing contact information, and while jCard sidesteps the obvious issues of parsing and generating, the semantics are still there to be dealt with. Some might argue that it would have been better to create a whole new data model for jCard, instead of defining a JSON-based equivalent, but I don’t think it would have made much difference, going by the low adoption rate of jCard by major vendors.

Depending on the programming language, it is relatively easy or quite painful to deal with the results from a parsed jCard. If you have dynamic typing, you can just skate away. But if you have a strongly typed language like Java or Swift, you will need to be creative.

I want/need to parse jCards in an iOS application that I’m in the process of converting to Swift from Objective-C. Last time I got all the way to Swift 2.2, but then the work on the app stalled for a while, and in the meantime, Swift 3 happened. Earlier I had written some helpers to get from jCard to the iOS Contacts framework, but they did some naughty things which are not allowed in Swift 3 anymore.

So I decided to bite the bullet and create a proper library for dealing with jCards in Swift, called ContactCard. It is open sourced under the MIT License, and is up on GitHub if you would like to have a look or contribute. It’s very early alpha, so I expect it to improve a lot in a short time as I prepare it for the app. I intend to eventually publish it on CocoaPods too, and I hope to write some more about how I intend to solve the problems of polymorphic values in jCard properties.

 

LCD-like banners in Python

Back in 1998 or so, I wrote a CD player application for Microsoft Windows in Borland Delphi. It was for a magazine tutorial article, and I wanted a cool LCD-like display to show track elapsed and remaining time. There was a good one available for Delphi, called LCDLabel, written by Peter Czidlina (if you’re reading this, thanks once more for your cooperation).

I’ve been thinking about doing a modern version of the LCD display component for several times over the years, and I even got pretty far with one for OS X in 2010, but then abandoned it because of other projects. A few years ago I did some experiments with the LCD font file and wrote a small Python app to test it.

My most recent idea involving simulated LCD displays is to create a custom component for iOS and OS X in Swift. For that, I dug up the most recent Python project and tried to nail down the LCD font file format, so that I could later use it in Swift. I decided to use JSON.

The LCD font consists of character matrices, typically 5 columns by 7 rows, each describing a character on the LCD panel. The value of a matrix cell is one if the dot should be on, and zero if it should be off. I decided to store each cell value as an integer, even if it is a bit wasteful – but it is easy to maintain, and if you squint a bit, you can see the shape of the LCD character.

So the digit zero would be represented as a 2-dimensional matrix like this:

[
[0,1,1,1,0],
[1,0,0,0,1],
[1,0,0,1,1],
[1,0,1,0,1],
[1,1,0,0,1],
[1,0,0,0,1],
[0,1,1,1,0]
]

The font consists of as many characters as you like, but you need to identify them somehow. In JSON, you can do this with one-character strings, where the sole character is the Unicode code point of the character. So, with some additional useful information, a font with just the numeric digits 0, 1, and 2 would be represented in JSON like this:

{
"name": "Hitachi",
"columncount": 5,
"rowcount": 7,
"characters": {
"\u0030": [
[0,1,1,1,0],
[1,0,0,0,1],
[1,0,0,1,1],
[1,0,1,0,1],
[1,1,0,0,1],
[1,0,0,0,1],
[0,1,1,1,0]
],
"\u0031": [
[0,0,1,0,0],
[0,1,1,0,0],
[0,0,1,0,0],
[0,0,1,0,0],
[0,0,1,0,0],
[0,0,1,0,0],
[0,1,1,1,0]
],
"\u0032": [
[0,1,1,1,0],
[1,0,0,0,1],
[0,0,0,0,1],
[0,0,0,1,0],
[0,0,1,0,0],
[0,1,0,0,0],
[1,1,1,1,1]
]
}

With the font coming along nicely, I wrote a Python script to exercise it, by printing a banner-like message:

import json

def banner(message):
    mats = []
    for ch in message:
        mats.append(characters[ch])

    output = ''
    num_rows = len(mats[0])
    num_cols = len(mats[0][0])

    for r in range(0, num_rows):
        for m in mats:
            for c in range(0, num_cols):
                if m[r] == 1:
                    output += 'X'
                else:
                    output += '.'
            output += ' '
        output += '\n'
        
    return output

font_data = None
with open('lcd-font-hitachi.json') as json_file:
  font_data = json.load(json_file)
  characters = font_data['characters']

print(banner('012'))

Running this Python script would print out a banner like this one:

.XXX. ..X.. .XXX.
X...X .XX.. X...X
X..XX ..X.. ....X
X.X.X ..X.. ...X.
XX..X ..X.. ..X..
X...X ..X.. .X...
.XXX. .XXX. XXXXX

By adding characters to the JSON font file it becomes possible to print text messages instead of just numbers:

X...X ..... .XX.. .XX.. ..... ..X..
X...X ..... ..X.. ..X.. ..... ..X..
X...X .XXX. ..X.. ..X.. .XXX. ..X..
XXXXX X...X ..X.. ..X.. X...X ..X..
X...X XXXXX ..X.. ..X.. X...X ..X..
X...X X.... ..X.. ..X.. X...X .....
X...X .XXX. .XXX. .XXX. .XXX. ..X..

But I think that a custom control for iOS in Swift would see most use in games or applications displaying numeric parameters like volume level, geographical coordinates or RPM.

If you want to learn Python, here is a good book:

Learning Python
Learning Python
by Mark Lutz

Learning Clojure

About one year ago I wrote a multi-part tutorial on Clojure programming, describing how I wrote a small utility called ucdump (available on GitHub).

Here are links to all the parts:

However, Carin Meier’s Living Clojure is excellent in many ways. Get it from O’Reilly (we’re an affiliate):
Living Clojure

My little tutorial started with part zero, in which I lamented how functional programming is made to appear unlearnable by mere mortals, and it kind of snowballed from there. Hope you like it and/or find it useful!

 

Functional programming without feeling stupid, part 5: Project

In the last four installments of Functional programming without feeling stupid I’ve slowly built up a small utility called ucdump with Clojure. Experimentiing and developing with the Clojure REPL is fun, but now it’s time to give some structure to the utility. I’ll package it up as a Leiningen project and create a standalone JAR for executing with the Java runtime.

Creating a new project with Leiningen

You can use Leiningen to create a skeleton project quickly. In my project’s root directory, I’ll say:

lein new app ucdump

Leiningen will respond with:

Generating a project called ucdump based on the 'app' template.

The result is a directory called ucdump, which contains:

.gitignore   README.md    project.clj  src/
LICENSE      doc/         resources/   test/

For now I’m are most interested in the project file, project.clj, which is actually a Clojure source file, and the src directory, which is intended for the app’s actual source files.

Leiningen creates a directory called src/ucdump and seeds it with a core.clj file, but that’s not what actually what I want, for two reasons:

  • I want ucdump to be a good Clojure citizen, so I’m going to put it in a namespace
    called com.coniferproductions.ucdump.
  • My Git repository for ucdump also contains the original Python version of the application, which is in <project-root>/python, and I want the Clojure version to live in <project-root>/clojure.

Continue reading

Functional programming without feeling stupid, part 4: Logic

In the previous parts of “Functional programming without feeling stupid” we have slowly been building ucdump, a utility program for listing the Unicode codepoints and character names of characters in a string. In actual use, the string will be read from a UTF-8 encoded text file.

We don’t know yet how to read a text file in Clojure (well, you may know, but I only have a foggy idea), so we have been working with a single string. This is what we have so far:

(def test-str 
  "Na\u00EFve r\u00E9sum\u00E9s... for 0 \u20AC? Not bad!")
(def test-ch { :offset 0 :character \u20ac })
(def short-test-str "Na\u00EFve")

(defn character-name [x]
  (java.lang.Character/getName (int x)))

(defn character-line [pair]
  (let [ch (:character pair)]
    (format "%08d: U+%06X %s"
      (:offset pair) (int ch)
      (character-name ch))))
    
(defn character-lines [s]
  (let [offsets (repeat (count s) 0)
        pairs (map #(into {} {:offset %1 :character %2}) 
          offsets s)]
    (map character-line pairs)))

I’ve reformatted the code a bit to keep the lines short. You can copy and paste all of that in the Clojure REPL, and start looking at some strings in a new way:

user=> (character-lines "résumé")
("00000000: U+000072 LATIN SMALL LETTER R" 
"00000000: U+0000E9 LATIN SMALL LETTER E WITH ACUTE" 
"00000000: U+000073 LATIN SMALL LETTER S" 
"00000000: U+000075 LATIN SMALL LETTER U" 
"00000000: U+00006D LATIN SMALL LETTER M" 
"00000000: U+0000E9 LATIN SMALL LETTER E WITH ACUTE")

But we are still missing the actual offsets. Let’s fix that now.

Continue reading

Functional programming without feeling stupid, part 3: Higher-order functions

Welcome to the third installment of “Functional programming without feeling stupid”! I originally started to describe my own learnings about FP in general, and Clojure in particular, and soon found myself writing a kind of Clojure tutorial or introduction. It may not be as comprehensive as others out there, and I still don’t think of it as a tutorial — it’s more like a description of a process, and the documented evolution of a tool.

I wanted to use Clojure “in anger”, and found out that I was learning new and interesting stuff quickly. I wanted to share what I’ve learned in the hope that others may find it useful.

Some of the stuff I have done and described here might not be the most optimal, but I see nothing obviously wrong with my approach. Maybe you do; if that is the case, tell me about it in the comments, or contact me otherwise. But please be nice and constructive, because…

…in Part 0 I wrote about how some people may feel put off by the air of “smarter than thou” that sometimes floats around functional programming. I’m hoping to present the subject in a friendly way, because much of the techniques are not obvious to someone (like me) conditioned with a couple of decades of imperative, object-oriented programming. Not nearly as funny as Learn You a Haskell For Great Good, and not as zany as Clojure for the Brave and True — just friendly, and hopefully lucid.

xkcd 1270: Functional
xkcd 1270: Functional. Licensed under Creative Commons Attribution-Non-Commercial License. This is a company blog, so it is kind of commercial by definition. Is that a problem?

In Part 1 we played around with the Clojure REPL, and in Part 2 we started making definitions and actually got some useful results. In this third part we’re going to take a look at Clojure functions and how to use them, and create our own — because that’s what functional programming is all about.

Continue reading

Functional programming without feeling stupid, part 1: The Clojure REPL

In my recent post Functional Programming Without Feeling Stupid I took a quick look at how functional programming can be a little off-putting for the non-initiated. I promised to provide some examples of my own first steps with FP, and now I would like to present some to you.

Advocates of functional programming often refer to increased programmer productivity. At least some of that can be attributed to the REPL, or the Read-Evaluate-Print Loop. We are basically talking about an environment which accepts and parses any code you type in, and gives you a place to experiment and see results quickly. Before interpreted or semi-interpreted languages like Python, Java and JavaScript became mainstream, the typical repeating cycle in software development was Compile-Link-Execute, and debugging meant observing special output on the console. In the 1990s integrated debuggers with watches and breakpoints became the norm, but long before that Lisp-like languages already had a REPL, and Python also acquired one.

If you are thinking about getting intimate with Clojure, you will need to get to know the REPL. It is your playground, and will always be, even if you later start packaging and organizing your code.

Clojure depends on the Java Virtual Machine (JVM) and is actually distributed as a normal JAR file (Java ARchive), like most Java libraries are. You can start Clojure from the JAR, but you will save yourself some trouble and prepare for the future if you install Leiningen, the dependency management tool for Clojure. It is simple to install and run, and I will assume that you will follow the instructions on the Leiningen web site sooner or later. Now would be a good time.

When you’re done with the installation, you only need to say

lein repl

to start a Clojure REPL. I’m using OS X, so what I describe here was done from Terminal. You don’t need to create a project with Leiningen if you just want to play around in the REPL.

Of course, if you don’t have Java installed, you need to get it first. Refer to the Java web site of Oracle for details as necessary. Furthermore, some of the things I will describe require Java version 7 or later.

Continue reading

The FIFA World Cup is not internationalized

At the time of this writing, the FIFA World Cup is almost at its final stage. Throughout the whole month of the tournament I have been amazed to find out that while UEFA has consistently made an effort to have the players’ names written as authentically as possible, FIFA hasn’t. Information conveyed to viewers in televised matches is transliterated, making many players’ names appear irritatingly different than their national, conventional spellings.

It can be argued that FIFA has a tougher job with various languages and characters. After all, the 2014 World Cup had Japan, South Korea, Iran, Russia, and Greece, among others. These countries and their languages alone represent half a dozen character sets. There is a point in trying to achieve some sort of baseline transliteration, so that all names are expressed in the Latin character set. In practice this means that Russian, Greek, and other names are transliterated by default. However, names which can be perfectly written in extended Latin should not be transliterated. Now it seems like everything has been folded down to basic ASCII.

There is no technical reason to limit the displays on television screens to what is essentially a 7-bit character set, but this is what the result amounts to. For example, one of the finalists in this World Cup is Germany, and many of the players have characters with umlauts in their names. Here are a few examples:

Oezil should be Özil
Mueller should be Müller
Goetze should be Götze
Hoewedes should be Höwedes

EDIT: Also, the German coach is Joachim Löw, not Loew (or Low, as the content on the official FIFA apps would have us believe).

The players’ jerseys have the truth anyway, and it’s in direct contrast with what you see when somebody scores a goal, gets booked or gets sent off.

So please, FIFA, take a leaf from UEFA’s book and find out how to use modern broadcasting technology to the advantage of all football fans around the world (even if they might call it soccer). You might also start with a good book like “Unicode Explained” by Jukka K. Korpela.

Ohjelmistot monilla kielillä

Ohjelmistoyrittäjä, tiedetäänkö sinun yrityksessäsi miten tehdä sovelluksia, jotka on helppo sovittaa eri kielille ja kulttuureille?

Lokalisointi on ohjelman käyttöliittymässä esiintyvien tekstien ja kuvien sovittamista eri kielille. Sen tekevät yleensä käännöstoimistot eri toimeksiannosta kun ohjelma alkaa olla valmis julkaisuun.

Pelkkä tekstien käännättäminen ei kuitenkaan riitä, vaan ohjelmisto pitäisi laatia alusta pitäen sellaiseksi, että käännösten hallinnointi on luonteva ja toimiva osa ohjelmistoprosessia. Lisäksi ohjelmiston sovittaminen moniin eri kulttuureihin vaatii muutakin kuin sisällön kääntämistä: esimerkiksi päiväykset ja kellonajat sekä luvut ja rahasummat pitää esittää kohdemaan tavalla. Myös aineiston lajittelu tehdään eri kielialueilla hyvin vaihtelevin tavoin. Monilla ohjelmistoilla on edelleen haasteita siinä miten muita kuin tuttuja latinalaisia aakkosia syötetään ohjelmaan ja käsitellään.

Internationalisointi eli ohjelmiston kansainvälistäminen ei ole vaikeaa, mutta se pitää tehdä jo varhaisessa vaiheessa ohjelmistoprojektia, jotta vältytään tekemästä isoja osia uudelleen. Ei taloakaan tehdessä aleta pystyttää seiniä ennen kuin pohja on kunnossa.

Conifer Productions Oy:n ohjelmiston kansainvälistämisarvio auttaa löytämään piilevät ongelmat, jotka tulevat esiin kun on tarpeen saada ohjelmisto lokalisoitua. Teemme arvion siitä mitä pitää korjata, ja tarvittaessa myös opastamme ja koulutamme.

Tunnemme niin mobiiliohjelmistot kuin perinteiset pöytäkoneiden käyttöjärjestelmät ja Internet-sovelluksetkin. Hallitsemme käyttöliittymät ja asiakasohjelmistot, mutta tiedon liike päätelaitteen ja pilven välillä on meille myös hyvin tuttua.

Lue lisää Services-sivulta, ja ota yhteyttä!

 

HipStyles Season Sale is ON! 50% OFF!

Season’s Greetings!

HipStyles is the ultimate companion app for Hipstamatic and Instagram enthusiasts–mobile photographers with an iPhone.

Now you can get HipStyles for 50% OFF on the App Store! This offer lasts until January 6, 2013, but you should really get it right now, and tell all your friends to do so too! And don’t forget to rate it or review it. You can also let us know what you think on Twitter, or become a Facebook fan, or add our Google+ page to your circles.

HipStyles Season Sale 2012

HipStyles Season Sale 2012 is ON! 50% OFF!

Thank you, and have a good festive time of year! It’s an ON/OFF thing.

The HipStyles Team @ Conifer Productions