Rob's Scala Blog: December 2012

[2013-01-07 Update: Results, Discussion and Conclusion updated in light of new benchmarks obtained using a different set-up.]

This post presents a method that does not resort to using arrays, as the built-in methods do, but instead uses scala.collection.immutable.List itself.

Motivation

Finding myself writing Java code making heavy use of arrays, one minute, and then Scala code in which everything should be immutable, the next, I decided to try to rewrite some of the former in the manner of the latter.

The code in question examines a set of 2-d points, and finds all those points which lie on hidden straight lines (with at least four points per line); the object of the exercise was to demonstrate a clever algorithm which employs sorting to reduce the time taken, so that it becomes proportional to N*log(N) rather than N*N (for N points).

Having eventually succeeded in rewriting this in Scala, using immutable lists instead of arrays, I began to wonder how the sort method I was blithely calling was actually implemented.

The answer is that the list is first copied to an array, which is then sorted (using one of Java's built-in sort methods), before being copied to a new immutable list.
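To make the copying concrete, here is a minimal sketch of that copy-sort-copy pattern, specialised to Int for brevity. It is not the library's actual source; the names ArraySortSketch and sortViaArray are made up for illustration.

```scala
object ArraySortSketch {
  // Hypothetical helper: mimics the copy / sort / copy pattern for List[Int].
  def sortViaArray(xs: List[Int]): List[Int] = {
    val arr: Array[Int] = xs.toArray // first copy: list -> mutable array
    java.util.Arrays.sort(arr)       // in-place sort using Java's built-in method
    arr.toList                       // second copy: array -> new immutable list
  }
}
```

Both copies are linear in the length of the list, so they add to the running time without changing its order of growth.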

Well, well!

No wonder the immutable list version was taking longer than the array one, with all that copying going on.

Why not do the sorting by creating new immutable lists directly?

So I set out to find out.

All been done before?

Searching the web, the best algorithm I could find was one for sorting mutable linked lists, with a sample implementation written in C. Therefore, using this one as a starting point, I began to devise one of my own, using Scala.

This could easily have been one of the FPPinS assignments, but without the skeleton solution!

After a great deal of perseverance, writing out lists with pencil and paper, and an exercise in divining recursive functions from sets of similar ones, the following algorithm and implementation were arrived at.

Algorithm for sorting an immutable list

Sorting is done by first recognising that the number of elements in the list can be written as a sum of decreasing powers of 2; e.g. 15 = 8 + 4 + 2 + 1. The list to be sorted is then considered to be the concatenation of subsequences of these lengths, and is sorted in two stages:

1. each subsequence, starting with the longest, is sorted and prepended to a list;
2. each element of the tail of that list is then merged with the head.

Each subsequence in stage 1. is sorted by repeatedly considering it to be the concatenation of pairs of sorted subsequences, initially of length 1, which are then merged.

In this way, it was found possible to use only constant-time list-operations (head, tail, ::) throughout.
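The decomposition of the list length into decreasing powers of 2 can be sketched as follows. This is only an illustration of the arithmetic, not the code from the project; the names PowerOfTwoSplit and lengths are assumptions, and n is assumed to be non-negative.

```scala
object PowerOfTwoSplit {
  // Lengths of the subsequences for a list of n elements (n >= 0), longest first.
  // These are simply the set bits of n, from most to least significant.
  def lengths(n: Int): List[Int] = {
    @annotation.tailrec
    def loop(remaining: Int, acc: List[Int]): List[Int] =
      if (remaining == 0) acc.reverse // acc is tiny: at most 31 entries
      else {
        val top = Integer.highestOneBit(remaining) // largest power of 2 <= remaining
        loop(remaining - top, top :: acc)
      }
    loop(n, Nil)
  }
}
```

For example, PowerOfTwoSplit.lengths(15) gives List(8, 4, 2, 1), matching the 15 = 8 + 4 + 2 + 1 example above.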

Cases

Each time a pair of subsequences is merged, heads must be prepended to the result first, so that the order of the result is the reverse of those of the subsequences being merged. Therefore, it is important to keep track of the order of each subsequence, and ensure that the order of the result of the very first merge is chosen so that the very last one will be from low to high. This was achieved by recognising that there are three cases which can arise, based on the number of stage 1. subsequences, each one with its own sequence of orders (where, in the following, '<' indicates low-to-high, '>', high-to-low):
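The reversal is easy to see in a small sketch of a merge that uses only isEmpty, head, tail and ::. It is a simplified stand-in specialised to Int, not the project's implementation; the names ReverseMerge and mergeAscToDesc are hypothetical.

```scala
object ReverseMerge {
  // Merge two low-to-high lists into one high-to-low list.
  // The smallest remaining head is prepended each time, so the
  // accumulator ends up in the reverse order of the inputs.
  @annotation.tailrec
  def mergeAscToDesc(a: List[Int], b: List[Int], acc: List[Int] = Nil): List[Int] =
    if (a.isEmpty && b.isEmpty) acc
    else if (b.isEmpty || (a.nonEmpty && a.head <= b.head))
      mergeAscToDesc(a.tail, b, a.head :: acc)
    else
      mergeAscToDesc(a, b.tail, b.head :: acc)
}
```

Merging List(1, 4, 6) with List(2, 3, 7) this way yields List(7, 6, 4, 3, 2, 1): two '<' inputs produce a '>' result, which is why the orders have to be tracked from one merge to the next.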

Single subsequence: <
Even number of subsequences: >(<>...<>)>
Odd number of subsequences (> 1): >(<>...<>)<<

Note that the number of subsequences may be derived from the number of elements using Java's Integer.bitCount method.

The cases can be neatly implemented using case-classes, of a trait supplying a method that returns the order as a function of both the position of the subsequence and the previous order.
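As a rough sketch of how the three cases might be selected with Integer.bitCount, consider the following. The names here (SubsequenceCases, Single, EvenCount, OddCount, classify) are invented for illustration, and the method returning the order of each subsequence is deliberately left out, since its exact form is not given here. The list length n is assumed to be positive.

```scala
object SubsequenceCases {
  // Hypothetical names for the three cases described above.
  sealed trait Case
  case object Single extends Case    // exactly one subsequence
  case object EvenCount extends Case // an even number of subsequences
  case object OddCount extends Case  // an odd number (> 1) of subsequences

  // Number of stage 1 subsequences = number of set bits in n (n > 0 assumed).
  def classify(n: Int): Case = {
    val count = Integer.bitCount(n)
    if (count == 1) Single
    else if (count % 2 == 0) EvenCount
    else OddCount
  }
}
```

For instance, a list of 15 elements has four subsequences (8 + 4 + 2 + 1) and so falls into the even case, while a list of 7 elements (4 + 2 + 1) falls into the odd case.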

Algorithm analysis

The number (strictly, order of growth) of comparisons of elements (for a list of length N) appears to be optimal for sorting:

stage 1. N*log(N)
stage 2. N

Stability

If possible, the order of two elements considered equal by the sort criterion should be preserved (in case it might be significant), in which case the sorting algorithm is termed 'stable'. Here, stability was achieved by being more careful when selecting the particular inequality to test for when comparing elements.
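The effect of the chosen inequality can be illustrated with elements that carry a label alongside their sort key. This is a sketch under assumptions, not the project's code: the names StableMergeSketch, Item and merge are hypothetical, and the merge produces its result in reverse order, as described in the Cases section.

```scala
object StableMergeSketch {
  type Item = (Int, String) // (sort key, label recording the original position)

  // Merge two lists that are low-to-high by key into one high-to-low list.
  // Using `<=` rather than `<` means that, on equal keys, the element from
  // `left` is taken first, so reading the result backwards keeps left-before-right.
  @annotation.tailrec
  def merge(left: List[Item], right: List[Item], acc: List[Item] = Nil): List[Item] =
    if (left.isEmpty && right.isEmpty) acc
    else if (right.isEmpty || (left.nonEmpty && left.head._1 <= right.head._1))
      merge(left.tail, right, left.head :: acc)
    else
      merge(left, right.tail, right.head :: acc)
}
```

With `<` in place of `<=`, the element from right would be taken first on a tie, and equal keys would swap their relative order.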

Results

The implementation, complete with tests of performance as well as for correctness, is to be found in a GitHub project called list-sort.

Performance

Performance testing was carried out using the rather useful ScalaMeter framework (which I learned of here).

As shown in the figure, the answer to the above question soon became apparent - it takes about 7 times longer to sort an immutable list of 30,000 integers, compared with using the built-in sorted method (which first copies the elements to an array, and sorts that), falling to about 4 times longer for 150,000 integers.

[2013-01-07 The above results were obtained on a dual-core MacBook Pro, using--as pointed out by Eugene Platonov (see comments)--input arrays/lists which were already sorted. So, new results have been obtained, this time using input arrays/lists in reverse-sorted order (thereby representing the worst case), and on two different set-ups (thanks to results submitted by Eugene in the comments).

Note that the 'AMD PC' is a quad-core set-up, running Java 6, versus Java 7 on the MacBook Pro (see comments). 2013-01-07]

Lines of code

It is possible to sort a list in just a few lines of code if the restriction to constant-time list operations is removed. By comparison, at about 100 lines of code (excluding doc-comments), this particular algorithm and implementation are relatively complex and lengthy.
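For comparison, a textbook merge sort along these lines might look like the sketch below. It relies on length and splitAt, which are linear-time on a List, and its inner merge is not tail-recursive, so it is meant only to show the brevity, not as a practical alternative. The names ShortMergeSort and msort are illustrative.

```scala
object ShortMergeSort {
  def msort(xs: List[Int]): List[Int] = {
    // Plain recursive merge of two sorted lists; `<=` keeps equal elements in order.
    def merge(a: List[Int], b: List[Int]): List[Int] = (a, b) match {
      case (Nil, _) => b
      case (_, Nil) => a
      case (x :: as, y :: bs) =>
        if (x <= y) x :: merge(as, b) else y :: merge(a, bs)
    }
    val half = xs.length / 2 // linear-time on a List
    if (half == 0) xs        // zero or one element: already sorted
    else {
      val (front, back) = xs.splitAt(half) // also linear-time
      merge(msort(front), msort(back))
    }
  }
}
```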

Discussion

One observation which goes some way towards accounting for the relatively poor performance is that the recursive function to sort the first (largest) power-of-2 elements is not tail-recursive.

Note, however, that this should not (in itself) result in excessive use of memory, since only log2(len) frames will accumulate (which is only 17 in the largest test case).
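The figure of 17 can be checked with a one-line calculation, taking the largest test case to be the 150,000-element list mentioned in the Performance section. The names RecursionDepth and log2Floor are made up for this sketch.

```scala
object RecursionDepth {
  // floor(log2(n)) for n > 0, read off from the position of the highest set bit.
  def log2Floor(n: Int): Int = 31 - Integer.numberOfLeadingZeros(n)
}
```

Since 2^17 = 131,072 is the largest power of 2 not exceeding 150,000, RecursionDepth.log2Floor(150000) evaluates to 17.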

[2013-01-07 The new results show a significant difference in performance characteristics between the two set-ups.

With Java 7 on a dual-core (Intel) Mac, the built-in sorting methods perform much better than the current algorithm, and this does not appear to be affected by whether or not the input is already sorted.

With Java 6 on a PC with a quad-core AMD CPU, the current algorithm's performance is on par with that of the built-in method for sorting an array, and exceeds that of the built-in method for sorting a list. 2013-01-07]

Conclusion

A stable algorithm for sorting an immutable list, using only constant-time list-operations, was devised and implemented in Scala.

One of the reasons for going to this trouble was to try to find out if it might be more efficient to sort such a list using functional programming, rather than by first copying it to an array. The above performance testing suggests that it definitely is not, and would appear to be in keeping with advice from Martin Odersky (skip to 1:07:30) about when to 'drop down' to using imperative arrays:

where functional programming really matters is [programming] 'in the large'
okay provided access is controlled and interface is purely functional.

[2013-01-07 The contributed results suggest that, on the contrary, using functional programming to sort a list can indeed be more efficient than using imperative arrays, depending on the particular set-up.

They also suggest that the built-in sorting methods of Java 7, as available on (Intel) Macs, are highly optimised in comparison to those of Java 6, as available on AMD hardware. 2013-01-07]

Comments welcome.


Saturday, 15 December 2012

Sorting an immutable list (in Scala)