Wednesday, August 25, 2010

Enumerable.Skip vs Seq.skip

If you're an F# newbie like me(*) you'll eventually try to use Seq.skip and Seq.take to implement pagination, just like you used Enumerable.Skip and Enumerable.Take (or a IQueryable implementation of them) in C#.

And more sooner than later you find out that they don't behave quite the same. If you haven't realized this yet, read on.

Load up fsi.

> let a = seq { 1..100 };;

An F# seq is a System.IEnumerable, it's just a convenient alias. In C# this would be expressed as:

var a = Enumerable.Range(1, 100);

Now let's paginate that. Assuming a page size of 40 elements, in C# we would do something like this:

var firstPage = a.Skip(0).Take(40).ToArray();
var lastPage = a.Skip(80).Take(40).ToArray();

Now in F# :

> let firstPage = a |> Seq.skip 0 |> Seq.take 40 |> Seq.toArray;;
> let lastPage = a |> Seq.skip 80 |> Seq.take 40 |> Seq.toArray;;

Uh-oh. The last expression throws a System.InvalidOperationException: The input sequence has an insufficient number of elements.

The thing is, Seq.skip and Seq.take are more strict and actually do bounds checking, whereas Enumerable.Skip and Enumerable.Take are more "tolerant" and may process and return fewer items than requested.

So how can we get Enumerable.Skip's behavior? The simplest option would be using it as in C#, e.g.:

> open System.Linq;;
> let lastPage = a.Skip(80).Take(40).ToArray();;

This works, but extension methods are generally not idiomatic in F#. We prefer function piping, currying and composition. So let's wrap them in curried forms, which is trivial:

> let eSkip n sequence = Enumerable.Skip(sequence, n);;
> let eTake n sequence = Enumerable.Take(sequence, n);;

And now we can operate as usual with pipes:

> let lastPage = a |> eSkip 80 |> eTake 40 |> Seq.toArray;;

By the way, F# already defines Seq.truncate which works like Enumerable.Take, so we can drop eTake and just write:

> let lastPage = a |> eSkip 80 |> Seq.truncate 40 |> Seq.toArray;;

(*) I've been learning F# and functional programming for about two years now, yet I still consider myself somewhat of a newbie about it... at least until I learn Haskell :-)


Serge Slipchenko said...

I think the better alternative is to use Seq.windowed

mausch said...

@Serge: if you mean something like "a |> Seq.windowed page_size |> Seq.nth start_index", it will also throw when out of bounds.