Friday, July 15, 2011

Composability in HTML generation

I'm bored... I know, let's write a web page with Wing Beats to show Seth MacFarlane's (creator of Family Guy) tweets! We'll use this nice jQuery plugin to pull the actual tweets.

open WingBeats.Xml
open WingBeats.Xhtml

let e = XhtmlElement()
let s = e.Shortcut

let page = 
    [
        e.DocTypeHTML5
        e.Html [
            e.Head [
                e.Title [ &"Tweets!" ]
            ]
            e.Body [
                e.P [
                    s.JavascriptFile ""
                    s.JavascriptFile "jquery.tweet.js"
                    e.Div ["id", "tweet"] []
                    e.Script [
                        &(@"$(function(){
                              $('#tweet').tweet({
                                username: 'sethmacfarlane',
                                avatar_size: 32,
                                count: 4,
                                loading_text: 'searching twitter...'
                              });
                            });")
                    ]
                ]
            ]
        ]
    ]

We can render this to an HTML string and print the result just by saying:

printfn "%s" (Renderer.RenderToString page)

So far so good.

Now, I'm also a big fan of Julius Sharpe (a Family Guy writer), so I want to include his tweets too. To avoid repeating ourselves, we'll create a 'twitter' function, parameterized by user name, and then we'll just call this function in the layout:

// A self-contained tweet widget: script references, target div and init script
let twitter username rnd = 
    let divId = sprintf "tweet-%d" rnd
    [
        s.JavascriptFile ""
        s.JavascriptFile "jquery.tweet.js"
        e.Div ["id", divId] []
        e.Script [
            &(@"$(function(){
                  $('#"+ divId + @"').tweet({
                    username: '"+ username + @"',
                    avatar_size: 32,
                    count: 4,
                    loading_text: 'searching twitter...'
                  });
                });")
        ]
    ]

// The layout; the seed drives the generated div IDs
let page seed = 
    let rnd = Random(seed)
    [
        e.DocTypeHTML5
        e.Html [
            e.Head [
                e.Title [ &"Tweets!" ]
            ]
            e.Body [
                e.P [
                    yield! twitter "juliussharpe" (rnd.Next())
                    yield! twitter "sethmacfarlane" (rnd.Next())
                ]
            ]
        ]
    ]

Note how I now also pass a seed to generate the div IDs, which keeps the code pure. But we have another, bigger problem: when rendering this page we end up with two references to jQuery and two references to jquery.tweet.js! We could move those references out of the 'twitter' function and put them in the layout, but that would pretty much defeat the purpose of this abstraction: it wouldn't be self-contained anymore, it wouldn't be reusable, it wouldn't be composable.

We've all been here. Some people just give up (or don't think much of it) and put the scripts outside the component. Others write a helper or an asset manager (basically, calling a function that keeps track of assets). This leads to a loss of purity, and with it, composability, unless you're willing to thread along this helper's state, which is not a nice prospect unless your whole code is monadic.

Speaking of monads, that's what Yesod (a Haskell web framework, obviously) does to properly encapsulate CSS and JS in reusable, composable widgets.

Another approach to the problem is to just remove the duplicate <script> tags. This is what Lift (a Scala web framework) does: you can insert <head> elements anywhere you want, and before rendering, Lift will move the contents of all inner <head>s to the one and only <head> that should be in an HTML document, deduplicating elements in the process.

Implementing something similar with Wing Beats is quite easy. Instead of scanning for <head> elements as Lift does, we'll just scan for <script> elements. Also, we won't move these elements, we'll just remove the duplicates, i.e. leave the first occurrence of each script. Here's some code that does this:

let isSrc = fst >> (fun n -> n.Name = "src")
let tryGetSrc attr = attr |> List.tryFind isSrc |> Option.map snd

// state is the set of script sources seen so far
let rec deduplicateScripts state =
    function
    | TagPairNode(name, attr, children) ->
        let state, children = deduplicateScriptsForest state children
        let node = TagPairNode(name, attr, children)
        if name.Name <> "script"
            then state, node
            else
                match tryGetSrc attr with
                | Some src -> 
                    if Set.contains src state
                        then state, NoNode
                        else
                            let state = Set.add src state
                            state, node
                | _ -> state, node
    | x -> state,x
and deduplicateScriptsForest state nodes =
    let folder (state,nodes) n =
        let state, node = deduplicateScripts state n
        state, node::nodes
    let state, nodes = Seq.fold folder (state,[]) nodes
    state, List.rev nodes

Here's how to use these deduplication functions:

let pp = deduplicateScriptsForest Set.empty (page Environment.TickCount) |> snd
printfn "%s" (Renderer.RenderToString pp)

And the result:

<!DOCTYPE html >
        <script type="text/javascript" src=""></script>
        <script type="text/javascript" src="jquery.tweet.js"></script>
        <div id="tweet-1035493420">
            $(function () {
                    username: 'juliussharpe',
                    avatar_size: 32,
                    count: 4,
                    loading_text: 'searching twitter...'
        <div id="tweet-1634829813">
            $(function () {
                    username: 'sethmacfarlane',
                    avatar_size: 32,
                    count: 4,
                    loading_text: 'searching twitter...'

Now, this deduplication function isn't particularly efficient: no tail calls, List.rev... it will blow the stack if given a sufficiently deeply nested structure (a few tests indicate that it dies at around 2600 nested elements, which is not a bad number nevertheless). More generally, we'd want to define a generic catamorphism over the Wing Beats tree (check out Brian McNamara's excellent series on catamorphisms) and then write deduplication (or any other kind of tree processing) using that fold.
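To give an idea of what that fold could look like, here's a minimal sketch. It assumes a simplified stand-in tree (the Node type below, with only Text and Tag cases) rather than the real Wing Beats types, and the names cata, countNodes and dedupe are made up for illustration. It's still not tail-recursive, so it doesn't solve the stack problem; it only shows how deduplication can sit on top of a generic fold:

```fsharp
// Simplified stand-in for an HTML tree; NOT the actual Wing Beats type.
type Node =
    | Text of string
    | Tag of string * (string * string) list * Node list

// Catamorphism: each constructor is replaced by a function,
// and children are folded before their parent.
let rec cata fText fTag node =
    match node with
    | Text s -> fText s
    | Tag (name, attrs, children) ->
        fTag name attrs (List.map (cata fText fTag) children)

// A trivial use of the fold: count all nodes in the tree.
let countNodes = cata (fun _ -> 1) (fun _ _ counts -> 1 + List.sum counts)

// Deduplication: every node folds into a function that takes the set of
// script sources seen so far and returns the new set plus the surviving node.
let dedupe node =
    let fText s = fun seen -> seen, Some (Text s)
    let fTag name attrs children =
        fun seen ->
            let seen, kids =
                children
                |> List.fold (fun (seen, acc) child ->
                    let seen, n = child seen
                    seen, (match n with Some n -> n :: acc | None -> acc)) (seen, [])
            let node = Tag (name, attrs, List.rev kids)
            match name, List.tryFind (fst >> (=) "src") attrs with
            | "script", Some (_, src) when Set.contains src seen -> seen, None
            | "script", Some (_, src) -> Set.add src seen, Some node
            | _ -> seen, Some node
    cata fText fTag node Set.empty |> snd

// Example: the second reference to a.js is dropped.
let sample =
    Tag ("p", [], [
        Tag ("script", ["src", "a.js"], [])
        Tag ("script", ["src", "a.js"], [])
        Text "x"
    ])
let deduped = dedupe sample
```

The nice part is that countNodes and dedupe share the same traversal; only the functions plugged into the fold change.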

The point is, as you can see it was pretty easy to manipulate and abstract HTML fragments to simple, reusable, pure functions... because Wing Beats makes HTML elements truly first-class citizens. It models HTML as a tree, and not just as a string.

Erik Meijer already showed 11 years ago that view engines that don't make HTML fragments first-class citizens don't compose. Back to 2011, lots of view engines still suffer from this (Razor included).

Another example: if you want to test a view, you have to either use a string assert on the rendered output, or render and then parse back the output in order to test it like structured HTML. It doesn't make sense. Why not just create structured HTML from the start?
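As a rough illustration of testing against structure instead of strings, here's a sketch that again uses a simplified stand-in tree (the Node type, view and descendants are hypothetical names, not Wing Beats API). The check looks directly for the title element in the tree, with no rendering or parsing involved:

```fsharp
// Simplified stand-in for an HTML tree; NOT the actual Wing Beats type.
type Node =
    | Text of string
    | Tag of string * (string * string) list * Node list

// Lazily enumerate a node and everything below it.
let rec descendants node =
    seq {
        yield node
        match node with
        | Tag (_, _, children) ->
            for child in children do
                yield! descendants child
        | Text _ -> ()
    }

// A tiny "view": a function from a title to a tree.
let view title =
    Tag ("html", [], [
        Tag ("head", [], [ Tag ("title", [], [ Text title ]) ])
        Tag ("body", [], [ Tag ("h1", [], [ Text title ]) ])
    ])

// The test queries the tree directly for <title> elements.
let titles =
    descendants (view "Tweets!")
    |> Seq.choose (function
        | Tag ("title", _, [ Text t ]) -> Some t
        | _ -> None)
    |> List.ofSeq
```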

Wing Beats is not the only HTML DSL in .NET: WebSharper includes one, and in C# there's SharpDOM and CityLizard.

You may be thinking that HTML DSLs are ugly and not designer-friendly... but you can also do this kind of thing with XML literals in Scala, Nemerle or VB.NET (which is mostly the brainchild of Erik Meijer, not by coincidence). There's even an ASP.NET MVC view engine that uses VB.NET's XML literals.

Bottom line: if you're templating unstructured text, then by all means use a generic text template engine. But if you're writing a web application and dealing with HTML, treating HTML as first-class values instead of unstructured text buys you composability: you can use the full power of the host language and you can handle HTML directly as a tree.

In a future post about this subject, I'll try to categorize the different approaches to HTML generation and analyze them from the point of view of composability.

Wednesday, July 6, 2011

Content negotiation with Figment and FsConneg

A couple of posts ago I introduced FsConneg, a stand-alone HTTP content negotiation library written in F#. One of my goals in making it stand-alone is that it could be reused across projects, maybe eventually getting integrated into Frank or MonoRail3. Here I will show some potential integrations with Figment.

Let's start with a trivial function:

let connegAction _ = "hello world"

We want to bind this to a URL (i.e. GETting the URL would return "hello world", obviously) and negotiate the response media type with the user agent. The server will support XML and JSON.

We can build a dispatch table with the supported media types and corresponding actions:

let writers = [ 
                ["text/xml"; "application/xml"], Result.xml 
                ["application/json"], Result.json 
              ]

Just as a reminder, Result.xml and Result.json are of type 'a -> FAction, that is, they take some value and return an action where the value is serialized as XML or JSON respectively.

Wrapping actions

Now with FsConneg and this table, we write a generic function that wraps the action with content negotiation (this is all framework-level code):

let internal accepted (ctx: ControllerContext) = 
    ctx.HttpContext.Request.Headers.["Accept"]

let negotiateActionMediaType writers action = 
    let servedMedia = List.collect fst writers 
    let bestOf = accepted >> FsConneg.bestMediaType servedMedia >> fst 
    let findWriterFor mediaType = List.find (fst >> List.exists ((=)mediaType)) >> snd 
    fun ctx -> 
        let a = 
            match bestOf ctx with 
            | Some mediaType -> 
                let writer = writers |> findWriterFor mediaType 
                action >>= writer >>. vary "Accept" 
            | _ -> Result.notAcceptable 
        a ctx

Briefly, this function takes a table of acceptable media types and associated writers (just like the table we created above) and a "partial" action, and returns an action where the media type is negotiated with the user agent.

Armed with this function, let's bind the negotiated action to a URL:

get "conneg1" (negotiateActionMediaType writers connegAction)

For a second URL we'd also like to offer text/html. Here's a simple parameterized Wing Beats page:

let wbpage title =
    [e.Html [ 
        e.Head [ 
            e.Title [ &title ] 
        ] 
        e.Body [ 
            e.H1 [ &title ] 
        ] 
    ]]

We want to make this an action:

let html = wbpage >> wbview

I defined wbview in a previous article; it's not instrumental to this post. What's important is that html is a function string -> FAction, so we can now add it to our dispatch table:

let conneg2writers = (["text/html"], html)::writers

and bind it to a URL:

get "conneg2" (negotiateActionMediaType conneg2writers connegAction)

Using routing

An entirely different approach is to use routing to select the appropriate 'writer' (or serializer, or formatter, whatever you want to call it):

let ifConneg3Get = ifMethodIsGet &&. ifPathIs "conneg3" 
action (ifConneg3Get &&. ifAcceptsAny ["application/xml"; "text/xml"]) (connegAction >>= Result.xml)

I think this snippet is pretty intuitive, even if you're not familiar with Figment or functional programming. I'll explain anyway:

ifMethodIsGet and ifPathIs are routing functions built into Figment. The &&. operator composes these routing functions as expected, i.e. the resulting routing function must satisfy both conditions. This is explained in more detail in my introduction to Figment.
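If you've never seen this style, here's a tiny stand-alone sketch of how such a combinator can work. The Request record and the Route alias are assumptions made up for this example (Figment's real routing functions take a richer context), but the idea is the same: a route is just a predicate, and &&. builds a new predicate out of two others:

```fsharp
// Hypothetical, minimal request shape; Figment's real context is richer.
type Request = { Method: string; Path: string }

// A routing function is just a predicate over the request.
type Route = Request -> bool

// Both routes must match for the combined route to match.
let (&&.) (a: Route) (b: Route) : Route =
    fun req -> a req && b req

let ifMethodIsGet : Route = fun req -> req.Method = "GET"
let ifPathIs (path: string) : Route = fun req -> req.Path = path

// Same shape as the route defined in the post.
let ifConneg3Get = ifMethodIsGet &&. ifPathIs "conneg3"

let matchesGet = ifConneg3Get { Method = "GET"; Path = "conneg3" }
let matchesPost = ifConneg3Get { Method = "POST"; Path = "conneg3" }
```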

The >>= operator is a monadic bind. The function

connegAction >>= Result.xml

is equivalent to:

result { 
    let! result = connegAction 
    return! Result.xml result }

which in turn is equivalent to:

fun ctx -> 
    let result = connegAction ctx 
    Result.xml result ctx

Except the first one is evidently more concise. I explained this in more detail in my last post.
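If you want to play with this outside Figment, here's a rough sketch of the same idea with stand-in types. Context, Action and asXml are hypothetical names for this example only; the assumption is simply that an action is a function from some context to a value, which is what the expanded form above suggests:

```fsharp
// Hypothetical stand-ins: an action is a function from a context to a value.
type Context = { Accept: string }
type Action<'a> = Context -> 'a

// Run the first action, feed its result to f, and run the action f returns,
// passing the same context to both.
let bind (f: 'a -> Action<'b>) (m: Action<'a>) : Action<'b> =
    fun ctx -> f (m ctx) ctx

let (>>=) m f = bind f m

let connegAction : Action<string> = fun _ -> "hello world"

// Toy writer standing in for Result.xml.
let asXml (value: string) : Action<string> =
    fun _ -> sprintf "<string>%s</string>" value

// Point-free composition...
let composed = connegAction >>= asXml

// ...and the same thing written out by hand.
let expanded : Action<string> =
    fun ctx ->
        let result = connegAction ctx
        asXml result ctx
```

Both composed and expanded produce the same value for any context; the operator just hides the plumbing of ctx.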

ifAcceptsAny uses FsConneg to determine if any of the media types in the list is acceptable to the client. Its definition is quite simple:

let ifAcceptsAny media = 
    fun (ctx: HttpContextBase, _) -> 
        let acceptable = FsConneg.negotiateMediaType media ctx.Request.Headers.["Accept"] 
        acceptable.Length > 0

Similarly, let's add JSON and HTML support:

action (ifConneg3Get &&. ifAcceptsAny ["application/json"]) (connegAction >>= Result.json) 
action (ifConneg3Get &&. ifAcceptsAny ["text/html"]) (connegAction >>= html)

We close by stating that all other media types are not acceptable:

action ifConneg3Get Result.notAcceptable

The HTTP RFC says it's ok to respond with some non-acceptable media type, so you could also use this to define a default media type instead of a "not acceptable".

The important thing to notice about this last example is that using routing like this doesn't yield proper content negotiation. If a user-agent requests with "Accept: application/json, application/xml;q=0.8" (i.e. prefers application/json), the code above will respond with application/xml, disregarding the client's preferences, simply because the action for application/xml was defined before application/json.
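To make the difference concrete, here's a small stand-alone sketch. It is not FsConneg's implementation: parseAccept and bestOf are hypothetical helpers written for this example, and they ignore wildcards, media type parameters other than q, and specificity rules. It only contrasts "first route that matches" with "highest q wins":

```fsharp
open System
open System.Globalization

// Parse an Accept header into (mediaType, q) pairs.
// A missing or malformed q defaults to 1.0 (a simplification).
let parseAccept (header: string) =
    header.Split([| ',' |], StringSplitOptions.RemoveEmptyEntries)
    |> Array.toList
    |> List.map (fun item ->
        let parts = item.Split(';') |> Array.map (fun p -> p.Trim())
        let q =
            parts
            |> Array.tryPick (fun p ->
                if p.StartsWith("q=") then
                    match Double.TryParse(p.Substring(2), NumberStyles.Float, CultureInfo.InvariantCulture) with
                    | true, v -> Some v
                    | _ -> None
                else None)
        parts.[0], defaultArg q 1.0)

// Pick the served media type the client prefers the most.
let bestOf (served: string list) (header: string) =
    let candidates =
        parseAccept header
        |> List.filter (fun (m, q) -> q > 0.0 && List.exists ((=) m) served)
        |> List.sortBy (fun (_, q) -> -q)
    match candidates with
    | (m, _) :: _ -> Some m
    | [] -> None

let served = [ "application/xml"; "application/json" ]
let header = "application/json, application/xml;q=0.8"

// What route ordering does: first served type the client accepts at all.
let firstMatch =
    served
    |> List.tryFind (fun m ->
        parseAccept header |> List.exists (fun (a, q) -> a = m && q > 0.0))

// What negotiation does: honour the client's q values.
let negotiated = bestOf served header
```

With this header, firstMatch ends up as application/xml (the order of the routes), while negotiated is application/json (the client's actual preference).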

Many frameworks don't handle this properly. If you're planning to build a RESTful application I recommend testing the framework you'll use for this. For example, OpenRasta does the right thing, but Lift had issues until some months ago, and WCF Web API doesn't handle it correctly at the moment.

Using extensions

A common way to select a response media type is using extensions in the URL. Twitter used to do this, until they scrapped XML support altogether. MySpace still does it. Similarly, others use a query string parameter to select the media type.

This isn't really content negotiation as defined by HTTP, but some people do it in the name of simplicity, or to work around client issues. It could be considered as part of a client-driven negotiation, though.

At any rate, implementing extension-driven media types is quite easy. Similarly to the first example, we can build a dispatch table of extensions and then act on it:

let extensions = [
                    "xml", Result.xml
                    "json", Result.json
                    "html", html
                 ]
for ext,writer in extensions do
    let ifConneg4 = ifPathIsf "conneg4.%s" ext
    action (ifMethodIsGet &&. ifConneg4) (connegAction >>= writer)

Using extensions + conneg

Some prefer a compromise between conneg and extensions, by implementing an extensionless URL that supports content negotiation and then the same URL with extensions as a means to override conneg and work around possible client issues, or just selecting a media type without messing with headers.

For our example we'd want a URL /conneg5 that supports content negotiation, plus /conneg5.xml, /conneg5.json and /conneg5.html to force a particular media type.

As with the previous approaches, let's build a table:

let writers = [ 
                "xml", ["application/xml"; "text/xml"], Result.xml 
                "json", ["application/json"], Result.json 
                "html", ["text/html"], html 
              ]

Now let's map the extensions, just as in the last example:

let basePath = "conneg5"

for ext,_,writer in writers do 
    let ifBasePath = ifPathIsf "%s.%s" basePath ext 
    action (ifMethodIsGet &&. ifBasePath) (connegAction >>= writer)

Finally, the conneg'd URL:

let mediaTypes = List.map (fun (_,a,b) -> a,b) writers 
let ifBasePath = ifPathIs basePath 
action (ifMethodIsGet &&. ifBasePath) (negotiateActionMediaType mediaTypes connegAction)

If you do this for all your actions you could easily extract this to a reusable function.
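Here's one way such a reusable function might look, as a sketch only. It doesn't use Figment: Writer is a hypothetical string -> string stand-in for the real writers, resolve is a made-up name, and the negotiation step is passed in as a plain function (given the served media types, return the best one, if any) so the sketch stays independent of how negotiation is implemented:

```fsharp
// Hypothetical stand-in for a writer: serializes a value to a string.
type Writer = string -> string

// Pick a writer for a path: an extension forces the media type,
// the bare base path falls back to negotiation.
let resolve
        (basePath: string)
        (writers: (string * string list * Writer) list)
        (negotiate: string list -> string option)
        (path: string) : Writer option =
    let byExtension =
        writers
        |> List.tryPick (fun (ext, _, w) ->
            if path = sprintf "%s.%s" basePath ext then Some w else None)
    match byExtension with
    | Some w -> Some w
    | None when path = basePath ->
        let served = writers |> List.collect (fun (_, media, _) -> media)
        negotiate served
        |> Option.bind (fun best ->
            writers
            |> List.tryPick (fun (_, media, w) ->
                if List.exists ((=) best) media then Some w else None))
    | None -> None

// Toy writers and a toy negotiation that always prefers JSON.
let xml : Writer = sprintf "<string>%s</string>"
let json : Writer = sprintf "\"%s\""

let table = [
    "xml", [ "application/xml"; "text/xml" ], xml
    "json", [ "application/json" ], json
]

let preferJson served =
    if List.exists ((=) "application/json") served
    then Some "application/json"
    else None

let forced = resolve "conneg5" table preferJson "conneg5.xml" |> Option.map (fun w -> w "hi")
let negotiatedOutput = resolve "conneg5" table preferJson "conneg5" |> Option.map (fun w -> w "hi")
let unknown = resolve "conneg5" table preferJson "other" |> Option.map (fun w -> w "hi")
```

Note that the table is still just a list of tuples, so the same data drives both the extension routes and the negotiated one.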

Final words

As I said before, these are all potential integrations. I specifically want these libraries to be as non-opinionated as possible. But at the same time, I want to provide all the tools to let the developer easily create her own opinions/conventions to fit her project, using library calls and standard language constructs instead of having to learn framework extension points. For example, notice that the dispatch tables I explained are all regular lists of strings and functions, which are then mapped and iterated just like any other list. More on this in a future post.

All code posted here is part of the FigmentPlayground repository.