In my last post I showed one way to group consecutive integers in a list. When I realized I needed such a function I first went googling for it but all I found was this Haskell code:

-- | Group consecutive integers: -- -- > groupConsec [1,2,3,5,6,8,9] == [[1,2,3],[5,6],[8,9]] groupConsec :: [Int] -> [[Int]] groupConsec = groupConsec' [] where groupConsec' x [] = x groupConsec' [] (y:ys) = groupConsec' [[y]] ys groupConsec' xs (y:ys) = if y - head (last xs) == length (last xs) then groupConsec' (init xs ++ [last xs ++ [y]]) ys else groupConsec' ( xs ++ [ [y]]) ys

As I was on a C# project it was easier to start from scratch than trying to port this. So I wrote what I showed on the last post and moved on.

But I thought it would be interesting to port it to F#, to see how much different it would be.

First, let's translate it literally, by making as little changes as possible to the original code:

let groupConsec = let head = List.head let last l = List.nth l (l.Length-1) let length = List.length let init l = List.ofSeq (Seq.take ((List.length l)-1) l) let rec groupConsec' a b = match a,b with | x, [] -> x | [], y::ys -> groupConsec' [[y]] ys | xs, y::ys -> if y - head (last xs) = length (last xs) then groupConsec' (init xs @ [last xs @ [y]]) ys else groupConsec' ( xs @ [ [y]]) ys groupConsec' []

At the beginning there's a couple of functions that aren't built into F#, but the rest of it is syntactically pretty much the same.

BUT, syntactic similarity does not imply that they're *semantically* the same. There's a fundamental difference from the original code: Haskell is a lazy language, while F#, coming from the ML-family, does eager evaluation. This doesn't necessarily mean that lazy is always better than eager or eager better than lazy. Lazy languages seem to have their own issues.

So let's just replace all lists in the code with LazyLists and see what happens:

let lazyGroupConsec = let head = LazyList.head /// From Haskell prelude: http://www.haskell.org/onlinereport/standard-prelude.html let rec last = function | LazyList.Nil -> failwith "last: empty list" | LazyList.Cons(x,xs) -> match xs with | LazyList.Nil -> x | _ -> last xs let length = LazyList.length let rec init = function | LazyList.Nil -> failwith "init: empty list" | LazyList.Cons(x,xs) -> match xs with | LazyList.Nil -> LazyList.empty() | _ -> LazyList.cons x (init xs) let (@) x y = LazyList.append x y let lazyList x = LazyList.cons x (LazyList.empty()) let rec groupConsec' a b = match a,b with | x, LazyList.Nil -> x | LazyList.Nil, LazyList.Cons(y,ys) -> groupConsec' (lazyList (LazyList.ofList [y])) ys | xs, LazyList.Cons(y,ys) -> if y - head (last xs) = length (last xs) then groupConsec' (init xs @ lazyList (last xs @ (LazyList.ofList [y]))) ys else groupConsec' ( xs @ lazyList ( (LazyList.ofList [y]))) ys groupConsec' (LazyList.empty())

Ok, now the function is lazy, but the code took a serious hit in readability...

Let's take now a whole different approach: porting from the C# code, but using recursion instead of the loop to avoid the mutable list:

let lazyGroupConsec2 : int seq -> int list seq = let lastOf x = List.nth x ((List.length x) - 1) let rec groupConsec' group groups input = if Seq.isEmpty input then Seq.append groups [group] else let hd = Seq.head input let tl = Seq.skip 1 input if List.length group = 0 || hd - (lastOf group) <= 1 then groupConsec' (group @ [hd]) groups tl else groupConsec' [hd] (Seq.append groups [group]) tl groupConsec' [] []

This one looks pretty neat, right? Too bad it needs a type annotation or it gets a value restriction error.

Now I'm too lazy (no pun intended) to properly analyze the complexity of each of these functions, but at least let's run a simple benchmark over a range of integers with some holes: `let range = {1..2000} |> Seq.filter (fun i -> i % 5 <> 0)`

Function | Time |
---|---|

groupConsec | 00:00.1383291 |

lazyGroupConsec | 00:00.9355807 |

lazyGroupConsec2 | 02:03.2433208 |

In case it's not clear, the first, non-lazy one took about **138ms** and the last one took over** 2** **minutes. **The reason is that seq pretty much sucks when used recursively, as Brian explains in detail in this stackoverflow answer.

Let's try using a LazyList instead for the inner recursive function:

let lazyGroupConsec3 l = let lastOf x = List.nth x ((List.length x) - 1) let (+) x y = LazyList.append x (LazyList.ofList [y]) let rec groupConsec' group groups input = if LazyList.isEmpty input then groups + group else let hd = LazyList.head input let tl = LazyList.tail input if List.length group = 0 || hd - (lastOf group) <= 1 then groupConsec' (group @ [hd]) groups tl else groupConsec' [hd] (groups + group) tl groupConsec' [] (LazyList.empty()) (LazyList.ofSeq l)

This one took **95ms** to process the same range of integers.

**Conclusions:**

- Porting code from Haskell to F# might be tempting but beware of their fundamental differences.
- Because of these differences, the best solution in Haskell might not be the best solution in F#.
- You really need to know how seq, list and lazylist work or you'll have a big performance problem.
- LazyLists are so cool I wish they were more tightly integrated to the language.

Homework:

- Properly analyze big-O complexity, or at least measure it empirically.
- Check that all recursive functions here get properly optimized for tail calls.

## 6 comments:

I don't like the look of that haskell code. It certainly could be simpler, and the complexity makes it hard to be sure it's efficient (and I presume it's not because init, last and multiple ++s rarely are).

Presumably this is the correct and efficient way to do it, but it's more general and less explicit so let me write a version closer to the one you started with (how do I format this? sorry):

groupConsec [] = []

groupConsec [a] = [[a]]

groupConsec (a:b:cs) | a+1==b = (a:xs):xss

| otherwise = [a] : xs : xss

where (xs:xss) = groupConsec (b:cs)

This problem cries for a control break to solve it. I've written about this technique to solve an essentially equivalent problem in OCaml here and here.

Thanks for your comments Greg, Matías! I'm still a bit inexperienced in FP in general, and a total newbie in Haskell. I just copied here the first solution I found. However, it was not the point of the article to come up with the most efficient solution for this concrete problem, but rather to show some common pitfalls when translating Haskell to F# and doing recursion over lists in F#.

The archetypal Haskell solution looks like this:

import Data.List (groupBy)

gs = map (map snd) . groupBy (\(x1,y1) (x2,y2)->x1-x2==y1-y2) . zip [1..]

xs = [1,2,3,5,6,8,9]

main = print $ gs xs

@Anonymous, Greg: there's an even sweeter solution at reddit

Yes, that's exactly the library whose implementation I linked to. :)

Be aware that HT is not a standard Haskell library though.

Post a Comment