Thursday, August 18, 2011

Refactoring to monadic C# - applicative functors

In the last part of this series, we "upgraded" the Maybe monad to an Either monad in order to add error messages when parsing two integers. To recap, the code using the Either monad via LINQ looked like this:

var somethingOrError =
    from userID in FSharpOption.TryParseInt(req_userID).ToFSharpChoice("Invalid User ID")
    from id in FSharpOption.TryParseInt(req_otherID).ToFSharpChoice("Invalid ID")
    select doSomething(userID, id);

somethingOrError.Match(Console.WriteLine, setError);

That code, however, only dealt with setting one error message for the whole computation. When it got the first error, it aborted the rest of the computation.

Sometimes we do want this shortcut evaluation, but more generally, when validating data, we want to validate the whole thing, and then display a list of errors, not abort at the first error we find.

To do this, we need to step down from monads into a weaker abstraction: applicative functors. I've already blogged about applicative functors in F# and C#, mostly in the context of formlets. In a nutshell:

  • Applicative functors are defined by two functions: pure and apply.
  • These functions must satisfy some laws, which I won't mention here.
  • Pure is basically a Func<T, M<T>> for some M, i.e. it puts some value into M
  • Apply is basically a Func<M<Func<A,B>>,M<A>,M<B>>, i.e. it applies a function within M
  • All monads are applicative functors (but not necessarily the other way around).
  • All applicative functors are functors (but not necessarily the other way around), which means you can Select() over them with LINQ.

But I don't want to get into much detail here. What's interesting here is that we won't use the "default" implementation of the Either applicative functor (i.e. the one derived from the monad) because it would only keep the first error message, so it would be pretty useless in this case! The Either applicative functor we'll use will of course collect all validation messages in a list.

But first, for comparison, here's some simple imperative code to parse two numbers and then do something with them or display all errors:

var errors = new List<string>();

int userID;
var userID_ok = int.TryParse(req_userID, out userID);
if (!userID_ok)
    errors.Add("Invalid user ID");

int id;
var id_ok = int.TryParse(req_otherID, out id);
if (!id_ok)
    errors.Add("Invalid ID");

if (errors.Count > 0)
    Console.WriteLine(doSomething(userID, id));

Just like with CsFormlets, this Either applicative functor can be expressed either "directly", or using LINQ as explained by Tomas Petricek here. In this post I'll skip the direct syntax and use LINQ instead. And using LINQ, the above imperative code is translated into this:

var userID = FSharpOption.TryParseInt(req_userID)
    .ToFSharpChoice(FSharpList.New("Invalid User ID"));
var id = FSharpOption.TryParseInt(req_otherID)
    .ToFSharpChoice(FSharpList.New("Invalid ID"));

var result =
    from a in userID
    join b in id on 1 equals 1
    select doSomething(a, b);

result.Match(Console.WriteLine, setErrors);

Note that it's very similar to the monadic code we were first using, only instead of from...from... (which compiles to SelectMany(), i.e. monadic bind) we do from...join... .
"on 1 equals 1" is the syntactic tax we have to pay for bending LINQ like this.

result is of type FSharpChoice<int, FSharpList<string>> , that is it contains either the result of doSomething() or a list of errors.

This was a trivial example of using applicative functors for validation; just an introduction. It doesn't show one of its biggest strengths: composability. In the next post we'll look into validating a more complex domain.

Code is here.


Gauthier Segay said...

Thanks, with this serie I start to grasp few things better.

The definitions you give about applicative functors sound important, but it goes beyond understanding without examples.

Func[T, M[T]] I grasp it could be "wrap an integer into a nullable integer" kind of thing, but Func[M[Func[A,B]],M[A],M[B]] goes away from my current abilities to grasp what's the concept there.

"on 1 equals 1" I happen to use that (rarely) in sql, I don't like from clause with , (comma) in it.

Mauricio Scheffer said...

@Gauthier: thanks for your comments. With this series of posts I intentionally wanted to show things with real-world examples first, and explaining as little as possible. So yeah, there will be holes. But I think most people see functional programming as an academic curiosity that only math and financial people use but has no application beyond that. And that's what I intend to disprove :)
The "wrap into nullable integer" analogy is accurate.
As for the signature of apply... well, function signatures don't really look very intuitive in C#. They're much more readable in ML languages like F#. But I'll post more resources to learn about applicative functors in the next post.

jackfoxy said...

Sorry about the choice of "academic" in a previous question. It only made my question obtuse. I always look forward to your new posts.

Mauricio Scheffer said...

@jackfoxy no need to apologize... I used to think the same way until I started seeing the benefits of FP only a couple of years ago :)
This is just my two cents to help people realize that FP really is useful for everyday programming tasks.

jackfoxy said...

Yeah, my day job keeps me out of hands-on engineering now, but I am definitely excited by FP. It just looks to me like FP is easier to follow and learn from in native languages like F#, rather than C#. Although LINQ does provide a helper asset within C#.

Robin said...

@Gauthier, @Mauritio: About Func[M[Func[A,B]],M[A],M[B]] being hard to imagine - you are right, and it is hard to do without a subtle hint from the original paper.

Let's consider M[Func[A,B]]. What is the sense of this? Since most often we have a Func[A,B], and want to use that on M[A]. Why can't we just pass Func[A,B] to an applicative functor?

The need arises when wanting to apply a multi-arg function, Func[A, B, C] to args M[A] and M[B].

Curry Func[A,B,C] to Func[A, Func[B, C]], then what we would naively do is first apply M[A] for Func[A, Func[B, C]] (possibly through lifting the function with pure), yielding M[Func[B, C]]. Observe that this is exactly what we can do with a (normal) Functor!

But now we have a M[Func[B, C]]. With a Functor, we would be stuck here. But with an Applicative, we can continue applying M[B] to this function, to finally get M[C].

Bottom line: When you use an Applicative, you probably have a Func[A, B, ... C] and M[A], M[B] ... ars. But for the Applicative to work you need the apply signature to be as it is.