Wednesday, August 28, 2013

Objects and functional programming

In a recent question on Stackoverflow, someone asked “when to use interfaces and when to use higher-order functions?”. To summarize, it’s about deciding whether to design for an interface or a function to be passed. It’s expressed in F#, but the same question and arguments can be applied to similar languages like C#, VB.NET or Java.

Functional programming is about programming with mathematical functions, which means no side-effects. Some people say that “functional programming” as a paradigm or concept isn’t useful, and all that really matters is being able to reason about your code. The best way I know to reason about my code is to avoid side-effects or isolate them as much as possible.

In any case, none of this says anything about objects, classes or interfaces. You can represent functions however you like. You can write pure code with objects or without them. OOP is effectively orthogonal to functional programming. In this post I'll use the terms 'objects', 'classes', 'interfaces' somewhat interchangeably, the differences don't matter in this context. Hopefully my point still gets across.

Higher-order functions are of course a very useful tool to raise the level of abstraction. However, many perhaps don’t realize that any function receiving some object as argument is effectively a higher-order function. To quote William Cook in “On Understanding Data Abstraction, Revisited”:

Object interfaces are essentially higher-order types, in the same sense that passing functions as values is higher-order. Any time an object is passed as a value, or returned as a value, the object-oriented program is passing functions as values and returning functions as values. The fact that the functions are collected into records and called methods is irrelevant. As a result, the typical object-oriented program makes far more use of higher-order values than many functional programs.

So in principle there is little difference between passing an interface and passing a function. The only difference here is that an interface is named and has named functions, while a function is anonymous. The cost of the interface is the additional boilerplate, which also means having to keep track of one more thing.

Even more, since objects typically have many functions, you could say that you’re not just passing functions as values, but passing modules as values. To put it clearly: objects are first-class modules.

As an aside, the term “first-class value” doesn’t have a precise definition, but I find it useful to wield it with the definition given in MSDN or the Wikipedia.

Objects are also parametrizable modules because constructors can take parameters. If a constructor takes some other object as parameter, then you could say that you’re parameterizing a module by another module.

In contrast to this, F# modules (more generally, static classes in .NET) are not first-class modules. They can’t be passed as arguments, you can’t do something like creating a list of modules. And they can’t be parameterized either.

So why do we even bother with modules if they’re not first-class? Because it’s easier to pick just one function out of a module to use or to compose. Object composition is more coarse-grained. As Joe Armstrong famously said: “You wanted a banana but you got a gorilla holding the banana”.

Back to the Stackoverflow question, what’s the difference between:

module UserService =
   let getAll memoize f =
       memoize(fun _ -> f)

   let tryGetByID id f memoize =
       memoize(fun _ -> f id)

   let add evict f name keyToEvict  =
       let result = f name
       evict keyToEvict
       result

and

type UserService(cacheProvider: ICacheProvider, db: ITable<User>) =
   member x.GetAll() = 
       cacheProvider.memoize(fun _ -> db |> List.ofSeq)

   member x.TryGetByID id = 
       cacheProvider.memoize(fun _ -> db |> Query.tryFirst <@ fun z -> z.ID = ID @>)

   member x.Add name = 
       let result = db.Add name
       cacheProvider.evict <@ x.GetAll() @> []
       result

The first one has some more parameters passed in from the caller, but you can imagine what it would look like. I probably wouldn’t arrange things like either of them, but for one, both lack side-effects. To the first one, you can pass pure functions. To the second one, you can pass implementations of ICacheProvider and ITable with pure functions.

However if you take a good look at the second one, you’ll see that every method uses both cacheProvider and db. So in this case it’s not so bad to pass a couple of gorillas. And it gives the reader a lot more information about what’s being composed, as opposed to a signature like

add : evict:('a -> unit) -> f:('b -> 'c) -> name:'b -> keyToEvict:'a -> 'c

To summarize: The beauty of functional programming lies in being able to reason about your code. One of the easiest ways to achieve this is to write code without side-effects. Classes, interfaces, objects are not opposed to this. In object-capable languages, objects can be a useful tool. Here I talked about objects as modules, but they can model other things too, like records or algebraic data types. They can be easily overused though, especially by programmers new to functional programming. Consider carefully if you want to be juggling gorillas rather than bananas!

3 comments:

Colin Bull said...
This comment has been removed by the author.
Colin Bull said...

Nice post, although I think you missed a Key point in that the module implementation is actually completely generic it says nothing about what it is getting these are provided by the function (f) that are past in. So in this case the re-use is alot higher.

Mauricio Scheffer said...

Hi Colin,

Both implementations really have the same "genericity". About reusability, it's the "gorilla" factor, i.e. instead of passing one function you have to pass a module with n functions. However, reusability with loose functions has its limit in the lack of contextual meaning. I see this analogous to having bind and return as functions independent of Monad. These functions need to belong to Monad as it's what provides the laws that give them proper meaning. In fact, there was a discussion about Pointed/Copointed in Scalaz not long ago, they decided to remove it because of this: https://groups.google.com/forum/#!topic/scalaz/7OE_Nsreqq0