A few days ago, I needed to encode numeric identifiers in a short and url-safe format. Something similar to what url shorteners use (e.g. SlvUp7 in http://bit.ly/SlvUp7). Encoding the ids in base 64 would work if an alternative alphabet is provided for the non url-safe symbols. But since I wanted to have only alpha-numeric characters, I chose to use base 62 instead.

While working on this problem, I had the idea of the coding exercise I’m sharing here: a utility for converting back and forth base 10 numerals to strings of base X equivalent –just like itoa and atoi in C (except that atoi doesn’t take a base parameter). In order to make things a bit more interesting, I decided to learn myself some Scala and translate the code to functional style. Both implementations are detailed here and available online under the terms of the Apache version 2 license.

At the end, I didn’t resist the temptation to do a C# vs. Scala performance benchmark. I also coded and added a Java implementation to the tests in order to distinguish the performance change due to switching from the CLR to the JVM, and the one introduced by the translation to functional style and to the Scala runtime.

First, Iterative style

I started by implementing the C# version. It takes less than 30 lines of code:

public static int Decode(string str, int baze)
{
    int result = 0;
    int place = 1;
    int length = str.Length;
    for (int i = 0; i < length; ++i)
    {
        result += Value(str[length - 1 - i]) * place;
        place *= baze;
    }
    return result;
}

public static string Encode(int val, int baze)
{
    var buffer = new char[32];
    int place = 0;
    int q = val;
    do
    {
        buffer[place++] = Symbol(q % baze);
        q /= baze;
    }
    while (q > 0);
    Array.Reverse(buffer, 0, place);
    return new string(buffer, 0, place);
}

Symbol and Value methods convert respectively a numeral to its symbol and vice versa. They are omitted here for the sake of clarity. An example implementation that handles bases from base 2 to base 64 is provided in the C# online sample.

If you can afford having unsafe code in your project you can replace line 16 with something like:

char* buffer = stackalloc char[32];

The string buffer is then allocated on the stack instead of the heap. This is meant to boost performance as it avoids garbage collection for the buffer. The performance of the unsafe version is detailed in the benchmark results below.

I’m not going to list the Java code here since it is very similar to the C# one and it is also available in the online gist.

Enter Scala

I must say that I have little knowledge of functional programming at the time of this writing. Still, having some Java and Ruby background, getting started with Scala was easy and fun. After grasping the bare minimum of the Scala basics, I got back to my C# code and started translating it to the functional language. After all, all I had to do is to translate two methods consisting basically of one plain old iterative loop each, to shiny tail-recursive functions, right? :)

A list based implementation

Most functional languages have built-in immutable singly linked lists as fundamental data-structure. Prepending, getting the first element and getting the tail of the list (i.e. a sublist with all elements but head) are O(1) operations. In addition, since the list is immutable, it can be shared, manipulated and have multiple pointers at different positions without worrying about data-races. Scala makes no exception and provides the class List as a built-in immutable structure with the :: operator and head and tail functions (obviously, there is no append operation since elements are not meant to be added at the end of the list).

Based on this finding, my first attempt to translate the code to Scala uses lists as underlying data structures for both functions.

def encode(i: Int, baze: Int): String = {
  def process(q: Int, baze: Int, list: List[Char]): List[Char] = {
    if (q > 0) process(q / baze, baze, symbol(q % baze) :: list)
    else list
  }
  if (i == 0) symbol(0).toString
  else process(i, baze, Nil).mkString
}

def decode(str: String, baze: Int): Int = {
  def process(acc: Int, place: Int, lst: List[Char]): Int = {
    if (!lst.isEmpty) process(acc + value(lst.head) * place, place * baze, lst.tail) 
    else acc
  }
  process(0, 1, str.reverse.toList)
}

In the encode function, a list is used as buffer where symbols are prepended. The list is converted to a string object when all symbols are computed. In the decode function, the input string is reversed and mapped to a List where symbols are popped one by one until the result is computed. Note that the string has to be reversed first since we cannot traverse the list backward from tail to head (remember, it’s a singly linked list).

The translation was simple to do and I find the resulting Scala code concise and quite elegant. However, the performance of this initial implementation –discussed in more details below– was too slow.

A more pragmatic implementation

Lessons learned from the first version, I refactored my Scala code as follows:

def encode(i: Int, baze: Int): String = {
  def process(q: Int, place: Int, baze: Int, builder: StringBuilder): StringBuilder = {
    if (q > 0) process(q / baze, place + 1, baze, builder += symbol(q % baze))
    else builder
  }
  if (i == 0) symbol(0).toString
  else process(i, 0, baze, new StringBuilder(32)).reverse.toString
}

def decode(str: String, baze: Int): Int = {
  def process(acc: Int, place: Int, str: String, index: Int): Int = {
    if (index >= 0) process(acc + value(str.charAt(index)) * place, place * baze, str, index - 1)
    else acc
  }
  process(0, 1, str, str.length - 1)
}

The decode function now processes directly the input string, without converting it to a list. An additional index variable is introduced in order to keep track of the current symbol. This is less elegant than the list based version, but way more efficient.

The encode function is basically the same as the list based one, except that the list was replaced by a StringBuilder.

Stopwatch in hand, I was finally ready to start benchmarking!

Lies, Damn Lies and Benchmarks

The results presented here are not meant to compare the performance of Scala and C# in general. They are specific to the context of the stated problem and the provided implementations. This benchmark, like most benchmarks, have to be taken with a grain of salt. Here are some reasons why:

  • I only measured raw speed. Raw speed is nice to have but it does not imply scalability.
  • The benchmarked applications are stateless and purely sequential. Results may vary considerably with more complex and/or multi-threaded code.
  • I’ve basically started learning Scala a few hours before writing this benchmark. Albeit simple, the Scala code probably needs some obvious performance tuning that I’m not aware of.
  • I didn’t have a decent server at hand to run the benchmark on. I performed this benchmark on my laptop: an Intel Core i7-720QM Processor (6M Cache, 1.60 GHz) with 4Gb RAM running a 64bit Windows 7.
  • The benchmark consists in calling the encode and decode operations 10M times each with input values 232 -1 and 23527 (a random value I picked for the test) using bases 2, 10 and 64. A more rigorous benchmark should have a wider range of input values.

Despite these reservations, I still find the results of this benchmark worth considering and sharing. They at least show how C#, Java and Scala behave in this kind of contexts.

Results

No need to keep you more in suspense, here are the results:

Execution times are expressed in nanoseconds. The percentages denote the relative performance to the C# safe version taken as a reference. You can see that on average, compared to the C# implementation:

  • The C# unsafe encode is 21% faster.
  • The Scala list-based implementation is terribly slow: 101% slower for encode, 414% slower for decode.
  • Scala code is 18% faster for encode and 46% faster for decode.
  • Java is the fastest: 31% faster for encode and 51% faster for decode.

Discussion

It is quite surprising to see Java and Scala largely outperforming the C# –both safe and unsafe versions– on Windows. The JVM is clearly outperforming the CLR on optimizing and JIT-ting code on the fly. I must confess that I didn’t expect this one :).

Using lists in the Scala code is an overkill, especially for the decode function. The delay introduced by reversing the input string and mapping it to a list is not necessary for processing it. However, the cause of the overhead in the list-based encode is a bit less obvious to me. I didn’t expect that using a buffer can outperform the singly linked list to that extent. It’s probably the buffer’s faster appending operation and/or its better locality of reference that made the difference. I’m curious to dig further and see what went wrong here. But this will be probably the topic of another blog post.

You have an opinion or comment about this article? Please drop a line here. Your feedback is much appreciated!