tag:blogger.com,1999:blog-86438578998061622802024-03-06T02:21:36.808-03:00Bug squashSquashing bugs all day longMauricio Schefferhttp://www.blogger.com/profile/15247972578064164206noreply@blogger.comBlogger15015tag:blogger.com,1999:blog-8643857899806162280.post-56868206398819359892014-05-23T19:25:00.001-03:002014-05-23T19:29:13.167-03:00Mapping JSON to objects with Fleece<p>In the <a href="http://bugsquash.blogspot.com/2014/05/mapping-objects-to-json-with-fleece.html">last post</a> I introduced <a href="https://github.com/mausch/Fleece">Fleece</a> and briefly explained how to use it to map objects to JSON. Sometimes I say “serialize” instead of “map”, but since the actual serialization is done by <a href="http://msdn.microsoft.com/en-us/library/system.json(v=vs.110).aspx">System.Json</a> I think the right term to use here is “map” (as in mapping an object to a tree of JsonValues), or maybe "encoding" / "decoding".</p> <p>Fleece can also do the opposite operation: map JSON to objects. There’s already an excellent F# library to deserialize JSON to typed objects: the <a href="http://fsharp.github.io/FSharp.Data/library/JsonProvider.html">JSON type provider from FSharp.Data</a> (previously <a href="http://www.navision-blog.de/blog/2012/03/25/typed-access-to-json-and-xml/">implemented in FSharpx</a>), and so it’s impossible to avoid comparisons.</p> <h3>Some drawbacks of the JSON type provider</h3> <p>Whenever you need to deserialize JSON, I recommend you to try the JSON type provider first. When the conditions are right, nothing beats its simplicity.</p> <p>But the conditions aren’t always right, and so the JSON type provider is sometimes not the best tool to use to deserialize JSON. Some of its drawbacks are:</p> <p>1. Not total: throws exceptions when parsing fails. Exceptions hurt composability and your ability to reason about the code. 
This is mostly just an annoyance, as we can easily work around it with a small higher-order function: </p> <pre class="code"><span style="background: white; color: blue">let inline </span><span style="background: white; color: black">protect f x =
</span><span style="background: white; color: blue">try
</span><span style="background: white; color: black">Choice1Of2 (f x)
</span><span style="background: white; color: blue">with </span><span style="background: white; color: black">e </span><span style="background: white; color: blue">-> </span><span style="background: white; color: black">Choice2Of2 e
</span></pre>
<p>(This is already part of <a href="https://github.com/fsprojects/fsharpx">FSharpx</a> by the way).</p>
<p>2. Another annoyance is that the current implementation of the JSON type provider outputs erased types, so the types inferred from the JSON sample are only available in F#, not other languages. So if you ever need to consume these types from C# or VB.NET you'll have to copy the implicit types explicitly and write the code to map them. This defeats the “no-code” benefit of using a type provider.</p>
<p>3. You’ll also usually want to write explicit types if you want to perform some additional validations. Think, for example, of a <a href="http://bugsquash.blogspot.com/2012/09/a-non-empty-list-type-for-net.html">NonEmptyList</a>. In fact <a href="http://stackoverflow.com/questions/20421160/providing-a-discriminated-union-from-an-f-type-provider">the type provider mechanism can’t generate records or discriminated unions</a>, so if you want precise typing you have no choice but to write your types and then map them from the output of the type provider, again defeating the “no-code” benefit of using a type provider.</p>
<p>4. If the JSON input is “dynamic”, i.e. its structure is not always exactly the same (it varies depending on request parameters, etc.), then the type provider becomes useless because you can’t rely on a single sample (or a manageable number of samples) to infer the types. In this case you want to start working with the types, not with JSON samples. That’s why FSharp.Data also <a href="http://fsharp.github.io/FSharp.Data/library/JsonValue.html">exposes and documents its underlying JSON parser/reader</a>.</p>
<p>5. The parser generated by the type provider is monolithic, so you can’t “customize” a parser, or introduce validations while parsing/mapping, etc.</p>
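<p>As a small aside on point 1, here is a minimal, self-contained F# sketch of how the <code>protect</code> helper can be used. It wraps the built-in <code>int</code> conversion, which throws on malformed input. The names <code>parseInt</code> and <code>describe</code> are made up for this illustration and are not part of FSharpx or FSharp.Data.</p>
<pre class="code">// Same helper as in point 1: turn a thrown exception into a value.
let inline protect f x =
    try
        Choice1Of2 (f x)
    with e -> Choice2Of2 e

// The built-in int conversion throws on malformed input;
// wrapped in protect it returns a Choice instead.
let parseInt (s: string) = protect int s

// Hypothetical helper that renders the result for display.
let describe r =
    match r with
    | Choice1Of2 n -> sprintf "parsed %d" n
    | Choice2Of2 (_: exn) -> "failed"

printfn "%s" (describe (parseInt "42"))   // parsed 42
printfn "%s" (describe (parseInt "abc"))  // failed
</pre>
<p>The caller now has to handle both cases explicitly, which is what makes the wrapped function total.</p>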
<p>I don’t mean to make this post about criticizing the JSON type provider, so I won’t go into detail about each of these points. As an exercise to illustrate them, try to use the JSON type provider to build a generic parser for the <a href="http://wiki.apache.org/solr/SolJSON">JSON output from Solr</a> that can be consumed from any .NET language.</p>
<p>The concrete case that motivated me to write Fleece was the <a href="https://github.com/mausch/EdmundsNet">Edmunds API</a>, which is very irregular (or “dynamic” depending on the point of view), plus I wanted specific types and needed to consume these API bindings from C#.</p>
<p>In a future post I might explore how to combine Fleece and the JSON type provider to take advantage of the benefits of each one where they are strong.</p>
<h3>Fleece: the FromJSON typeclass</h3>
<p>Back to Fleece: just as serialization is overloaded with the ToJSON fake typeclass, deserialization is built around a FromJSON typeclass. </p>
<p>The signature of the overloaded fromJSON function is:</p>
<pre class="code"><span style="background: white; color: black">fromJSON : JsonValue </span><span style="background: white; color: blue">-> </span><span style="background: white; color: black">'a ParseResult</span></pre>
<p>where 'a is the type to decode (it must be overloaded in the FromJSON typeclass) and ParseResult is a simple alias for <code>Choice<'a, string></code>, i.e. you get either the decoded value or an error. </p>
<p>There’s also a convenience function <code>parseJSON: string -> 'a ParseResult</code> that takes a raw JSON string as input.</p>
<p>Let’s start with a simple example. We have this tree of people and their children:</p>
<pre class="code"><span style="background: white; color: blue">let </span><span style="background: white; color: black">personJson = </span><span style="background: white; color: #a31515">"""
{
"name": "John",
"age": 44,
"children": [{
"name": "Katy",
"age": 5,
"children": []
}, {
"name": "Johnny",
"age": 7,
"children": []
}]
}
"""</span></pre>
<p>We can represent this with the following recursive type:</p>
<pre class="code"><span style="background: white; color: blue">type </span><span style="background: white; color: black">Person = {
Name: string
Age: int
Children: Person list
}</span></pre>
<p>Here’s one way to define FromJSON for the Person type:</p>
<pre class="code"><span style="background: white; color: blue">type </span><span style="background: white; color: black">Person </span><span style="background: white; color: blue">with
static member </span><span style="background: white; color: black">FromJSON (_: Person) =
</span><span style="background: white; color: blue">function
</span><span style="background: white; color: black">| JObject o </span><span style="background: white; color: blue">->
let </span><span style="background: white; color: black">name = o .@ </span><span style="background: white; color: #a31515">"name"
</span><span style="background: white; color: blue">let </span><span style="background: white; color: black">age = o .@ </span><span style="background: white; color: #a31515">"age"
</span><span style="background: white; color: blue">let </span><span style="background: white; color: black">children = o .@ </span><span style="background: white; color: #a31515">"children"
</span><span style="background: white; color: blue">match </span><span style="background: white; color: black">name, age, children </span><span style="background: white; color: blue">with
</span><span style="background: white; color: black">| Success name, Success age, Success children </span><span style="background: white; color: blue">->
</span><span style="background: white; color: black">Success {
Person.Name = name
Age = age
Children = children
}
| x </span><span style="background: white; color: blue">-> </span><span style="background: white; color: black">Failure (sprintf </span><span style="background: white; color: #a31515">"Error parsing person: %A" </span><span style="background: white; color: black">x)
| x </span><span style="background: white; color: blue">-> </span><span style="background: white; color: black">Failure (sprintf </span><span style="background: white; color: #a31515">"Expected person, found %A" </span><span style="background: white; color: black">x)</span></pre>
<p>Note the unused parameter of type Person: this is needed to make overloads unique and get the compiler to choose the right overloads.</p>
<p>Other than that, this is a function <code>JsonValue -> Person ParseResult</code>. </p>
<p><code>JObject</code> is an <a href="http://msdn.microsoft.com/en-us/library/dd233248.aspx">active pattern</a> identifying a JSON object (as opposed to a string, number, null, etc).</p>
<p><code>Success</code> and <code>Failure</code> are simple aliases for the constructors <code>Choice1Of2</code> and <code>Choice2Of2</code> respectively, giving them more meaningful names. They’re also available as active patterns so we can use them in pattern matching.</p>
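<p>To make the idea of these aliases concrete, here is a minimal sketch of how such names could be defined on top of <code>Choice</code>. This is an illustrative re-creation, and the library's actual definitions may differ. The <code>halve</code> and <code>show</code> functions are hypothetical and exist only to exercise the aliases.</p>
<pre class="code">// Constructor aliases with more meaningful names.
let inline Success x = Choice1Of2 x
let inline Failure e = Choice2Of2 e

// A total active pattern, so the same names also work in pattern matching.
let (|Success|Failure|) r =
    match r with
    | Choice1Of2 x -> Success x
    | Choice2Of2 e -> Failure e

// Hypothetical function that can fail.
let halve n =
    if n % 2 = 0 then Success (n / 2)
    else Failure (sprintf "%d is odd" n)

let show r =
    match r with
    | Success v -> sprintf "ok: %d" v
    | Failure msg -> sprintf "error: %s" msg

printfn "%s" (show (halve 10))  // ok: 5
printfn "%s" (show (halve 7))   // error: 7 is odd
</pre>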
<p>The <code>.@</code> operator tries to get a mapped value from a JSON object by key. That is, you can only call it for types that have a suitable <code>FromJSON</code> defined. Otherwise you get a compile-time error.</p>
<p>There’s also an operator <code>.@?</code> intended for optional keys in a JSON object, i.e. it returns <code>Success None</code> when the key isn't found, whereas <code>.@</code> returns <code>Failure "key xxx not found"</code>.</p>
<p>If you don’t like operators you can use the equivalent named functions <code>jget</code> / <code>jgetopt</code>.</p>
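<p>The difference between a required and an optional key can be sketched without Fleece at all. The snippet below uses a plain F# <code>Map</code> of strings as a stand-in for a JSON object, and it skips the decoding step that the real operators perform through <code>FromJSON</code>. The names <code>requiredKey</code>, <code>optionalKey</code>, <code>showRequired</code> and <code>showOptional</code> are assumptions made for this illustration.</p>
<pre class="code">// Required lookup: a missing key is a failure.
let requiredKey key o =
    match Map.tryFind key o with
    | Some v -> Choice1Of2 v
    | None -> Choice2Of2 (sprintf "key %s not found" key)

// Optional lookup: a missing key is a success carrying None.
let optionalKey key o =
    match Map.tryFind key o with
    | Some v -> Choice1Of2 (Some v)
    | None -> Choice1Of2 None

let showRequired r =
    match r with
    | Choice1Of2 v -> sprintf "Success %s" v
    | Choice2Of2 e -> sprintf "Failure %s" e

let showOptional r =
    match r with
    | Choice1Of2 (Some v) -> sprintf "Success (Some %s)" v
    | Choice1Of2 None -> "Success None"
    | Choice2Of2 e -> sprintf "Failure %s" e

// Stand-in for a JSON object with a single key.
let john = Map.ofList [ "name", "John" ]

printfn "%s" (showRequired (requiredKey "name" john))      // Success John
printfn "%s" (showRequired (requiredKey "nickname" john))  // Failure key nickname not found
printfn "%s" (showOptional (optionalKey "nickname" john))  // Success None
</pre>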
<p>That’s it, now we can parse JSON into Person:</p>
<pre class="code"><span style="background: white; color: blue">let </span><span style="background: white; color: black">john : Person ParseResult = parseJSON personJson</span></pre>
<p>Just as with serialization, deserialization in Fleece is total and ad-hoc polymorphic, and we get full compile-time checking. The same <a href="http://bugsquash.blogspot.com/2014/05/on-parametric-polymorphism-and-json.html">arguments about not breaking parametricity</a> apply here.</p>
<p>Now, pattern matching each parsed value for Success/Failure doesn't sound like fun. Since these parsers return <code>Choice<'value, 'error></code> we can code monadically instead, so we can focus on the happy path, as <a href="https://gist.github.com/mausch/8227399">Erik Meijer says</a>. We can use the <code>Choice.choose</code> computation expression in <a href="http://www.nuget.org/packages/FSharp.Core/">FSharpx.Core</a>, or the generic monad computation expression in <a href="http://www.nuget.org/packages/FSharpPlus/">FSharpPlus</a>. Since Fleece already has a dependency on FSharpPlus, let’s use that:</p>
<pre class="code"><span style="background: white; color: blue">type </span><span style="background: white; color: black">Person </span><span style="background: white; color: blue">with
static member </span><span style="background: white; color: black">FromJSON (_: Person) =
</span><span style="background: white; color: blue">function
</span><span style="background: white; color: black">| JObject o </span><span style="background: white; color: blue">->
</span><span style="background: white; color: black">monad {
</span><span style="background: white; color: blue">let! </span><span style="background: white; color: black">name = o .@ </span><span style="background: white; color: #a31515">"name"
</span><span style="background: white; color: blue">let! </span><span style="background: white; color: black">age = o .@ </span><span style="background: white; color: #a31515">"age"
</span><span style="background: white; color: blue">let! </span><span style="background: white; color: black">children = o .@ </span><span style="background: white; color: #a31515">"children"
</span><span style="background: white; color: blue">return </span><span style="background: white; color: black">{
Person.Name = name
Age = age
Children = children
}
}
| x </span><span style="background: white; color: blue">-> </span><span style="background: white; color: black">Failure (sprintf </span><span style="background: white; color: #a31515">"Expected person, found %A" </span><span style="background: white; color: black">x)</span></pre>
<p>This reads much better. The ‘monad’ computation expression gets the compiler to infer the concrete type for the monad, in this case <code>Choice<'a, 'e></code>. </p>
<p>We can write it even more compactly using applicative functors, though we need a curried constructor for Person:</p>
<pre class="code"><span style="background: white; color: blue">type </span><span style="background: white; color: black">Person </span><span style="background: white; color: blue">with
static member </span><span style="background: white; color: black">Create name age children = { Person.Name = name; Age = age; Children = children }
</span><span style="background: white; color: blue">static member </span><span style="background: white; color: black">FromJSON (_: Person) =
</span><span style="background: white; color: blue">function
</span><span style="background: white; color: black">| JObject o </span><span style="background: white; color: blue">-> </span><span style="background: white; color: black">Person.Create <!> (o .@ </span><span style="background: white; color: #a31515">"name"</span><span style="background: white; color: black">) <*> (o .@ </span><span style="background: white; color: #a31515">"age"</span><span style="background: white; color: black">) <*> (o .@ </span><span style="background: white; color: #a31515">"children"</span><span style="background: white; color: black">)
| x </span><span style="background: white; color: blue">-> </span><span style="background: white; color: black">Failure (sprintf </span><span style="background: white; color: #a31515">"Expected person, found %A" </span><span style="background: white; color: black">x)</span></pre>
<p>FSharpPlus already comes with overloaded applicative operators for the common applicatives (Choice, Option, etc).</p>
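<p>For readers who want to see what the monadic and applicative styles boil down to for <code>Choice</code>, here is a hand-written sketch using named functions instead of operators. The names <code>bind</code>, <code>map</code>, <code>apply</code>, <code>create</code>, <code>person</code> and <code>show</code> are illustrative assumptions. They are not the FSharpPlus API, which is generic over many types rather than tied to <code>Choice</code>.</p>
<pre class="code">// bind: continue with f on success, pass the failure through untouched.
let bind f r =
    match r with
    | Choice1Of2 x -> f x
    | Choice2Of2 e -> Choice2Of2 e

// map: apply a plain function to a successful value.
let map f r = bind (fun x -> Choice1Of2 (f x)) r

// apply: both the function and its argument are wrapped in Choice;
// the first failure encountered is the one returned.
let apply rf rx = bind (fun f -> map f rx) rf

// A curried "constructor", standing in for Person.Create.
let create name age = sprintf "%s (%d)" name age

// Applicative style: lift create over two independent results.
let person nameResult ageResult = apply (map create nameResult) ageResult

let show r =
    match r with
    | Choice1Of2 s -> "ok: " + s
    | Choice2Of2 e -> "error: " + e

printfn "%s" (show (person (Choice1Of2 "John") (Choice1Of2 44)))
// ok: John (44)
printfn "%s" (show (person (Choice2Of2 "key name not found") (Choice1Of2 44)))
// error: key name not found
</pre>
<p>The happy path is written once, and any failure short-circuits the rest of the computation.</p>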
<p>We could go further and wrap that pattern match into a function, but let's just leave it at that.</p>
<p>What’s different about decoding JSON with Fleece is in the <code>.@</code> operator. Just as the <code>.=</code> operator does for encoding with the <code>ToJSON</code> typeclass, <code>.@</code> only works for types that have a suitable <code>FromJSON</code> defined. Otherwise you get a compile-time error.</p>
<p>Not only that, but you also get decoder composition “for free”. Note how in the previous example <code>o .@ "children"</code> is inferring a <code>Person list</code>, which composes the decoder for <code>'a list</code> with the very same decoder we’re defining for Person.
<br />Fleece includes many decoders for common types, so if you want to decode, say, a <code>Choice<(int * string) list, Choice<decimal option, string>></code> you don’t really need to do anything, and it’s all statically type-safe, pure and total, not breaking parametricity.</p>
<h3>Roundtrips</h3>
<p>When you need to both serialize and deserialize a type to JSON, it’s useful to make it roundtrip-safe, i.e. if you call <code>toJSON</code> and then <code>fromJSON</code> you get the original value again.</p>
<p>You can encode this property like this:</p>
<pre class="code"><span style="background: white; color: blue">let inline </span><span style="background: white; color: black">roundtrip p =
</span><span style="background: white; color: blue">let </span><span style="background: white; color: black">actual = p |> toJSON |> fromJSON
actual = Success p</span></pre>
<p>Use <a href="https://github.com/fsharp/FsCheck">FsCheck</a> to check this property, which will run it through a large number of instances for the type you want to check. <a href="https://github.com/mausch/Fleece/blob/630477e479f3ea8776ab0d01b4089686aecc6afd/Tests/Tests.fs#L201">Fleece does this</a> for primitive types.</p>
<p>Also note the “inline” in this definition, which makes it possible to write generic code without having to specify any particular type. If you hover the mouse in Visual Studio over “roundtrip” it says <code>val roundtrip: ('a -> bool) (requires member ToJSON and member FromJSON and equality)</code>, which means that the compiler is inferring the necessary constraints that the type must satisfy.</p>
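<p>To show the shape of the roundtrip property without pulling in Fleece or FsCheck, here is a dependency-free sketch. The <code>encode</code> and <code>decode</code> functions are hypothetical stand-ins for <code>toJSON</code> and <code>fromJSON</code> restricted to <code>int</code>, and a fixed list of samples replaces the generated inputs FsCheck would provide.</p>
<pre class="code">// Hypothetical encoder: an int becomes its decimal string.
let encode (n: int) = string n

// Hypothetical decoder: total, returns a Choice instead of throwing.
let decode (s: string) =
    match System.Int32.TryParse(s) with
    | true, n -> Choice1Of2 n
    | false, _ -> Choice2Of2 (sprintf "not an int: %s" s)

// The roundtrip property: decoding an encoded value yields the original.
let roundtrip p =
    let actual = p |> encode |> decode
    actual = Choice1Of2 p

// A fixed sample list; a property-based tool would generate these instead.
let samples = [ 0; 1; -1; 42; System.Int32.MaxValue; System.Int32.MinValue ]

printfn "%b" (List.forall roundtrip samples)  // true
</pre>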
<p>In the next post we’ll do some deeper analysis of the pros and cons of this typeclass-based approach to JSON encoding.</p> Mauricio Schefferhttp://www.blogger.com/profile/15247972578064164206noreply@blogger.com1tag:blogger.com,1999:blog-8643857899806162280.post-7982699267737529122014-05-13T16:03:00.001-03:002014-05-13T16:03:51.868-03:00Mapping objects to JSON with Fleece<p>In the <a href="http://bugsquash.blogspot.com/2014/05/on-parametric-polymorphism-and-json.html">last post</a> I briefly explained some of the consequences of breaking parametricity, in particular for JSON serialization. I used JSON serialization here as a particular case only to introduce <a href="https://github.com/mausch/Fleece">Fleece</a>, but breaking parametricity anywhere has similar consequences.</p> <p>So how can we serialize JSON without breaking parametricity?</p> <p>The simplest thing we could do is to write multiple monomorphic Serialize functions, one for each type we want to serialize, e.g:</p> <pre class="code"><span style="background: white; color: blue">class </span><span style="background: white; color: #2b91af">Person </span><span style="background: white; color: black">{
</span><span style="background: white; color: blue">public readonly int </span><span style="background: white; color: black">Id;
</span><span style="background: white; color: blue">public readonly string </span><span style="background: white; color: black">Name;
</span><span style="background: white; color: blue">public </span><span style="background: white; color: black">Person(</span><span style="background: white; color: blue">int </span><span style="background: white; color: black">id, </span><span style="background: white; color: blue">string </span><span style="background: white; color: black">name) {
Id = id;
Name = name;
}
}
</span><span style="background: white; color: blue">string </span><span style="background: white; color: black">Serialize(Person p) {
</span><span style="background: white; color: blue">return </span><span style="background: white; color: #a31515">@"{""Id"": " </span><span style="background: white; color: black">+ p.Id + </span><span style="background: white; color: #a31515">@", ""Name"": """ </span><span style="background: white; color: black">+ p.Name + </span><span style="background: white; color: #a31515">@"""}"</span><span style="background: white; color: black">;
}
</span></pre>
<p>You’ll notice that this code doesn’t escape the <code>Name</code> value and therefore will generate broken JSON in general. Why doesn’t this happen with the <code>Id</code>? Types! The lowly <code>int</code> type has restricted the values it can have and therefore we can statically assure that it doesn’t need any escaping. Cool, huh?</p>
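<p>The escaping problem is easy to reproduce. The following F# sketch mirrors the string-concatenating serializer above. The function name <code>serialize</code> and its parameter names are chosen for this illustration only.</p>
<pre class="code">// Naive serializer that builds JSON by string concatenation.
let serialize (personId: int) (name: string) =
    "{\"Id\": " + string personId + ", \"Name\": \"" + name + "\"}"

printfn "%s" (serialize 1 "Tim")
// {"Id": 1, "Name": "Tim"}

// A quote inside the name ends the JSON string early,
// so the output is no longer valid JSON.
printfn "%s" (serialize 2 "Tim \"Monkey Island\" Schafer")
// {"Id": 2, "Name": "Tim "Monkey Island" Schafer"}
</pre>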
<p>So how do we solve the escaping problem for the Name field? With more types, of course! Instead of handling JSON as strings, we create some types to model more precisely what JSON can do, and let the compiler check that for us. <a href="http://msdn.microsoft.com/en-us/library/system.json(v=vs.110).aspx">System.Json</a> does just that, so let’s use it. Strictly speaking it’s still a bit too loose, but it’s decent enough. Our code becomes:</p>
<pre class="code"><span style="background: white; color: black">JsonObject Serialize(Person p) {
</span><span style="background: white; color: blue">return new </span><span style="background: white; color: black">JsonObject {
{</span><span style="background: white; color: #a31515">"Id"</span><span style="background: white; color: black">, </span><span style="background: white; color: red">p</span><span style="background: white; color: black">.Id},
{</span><span style="background: white; color: #a31515">"Name"</span><span style="background: white; color: black">, p.Name},
};
}
</span></pre>
<p>Now we don’t have to worry about encoding issues, <code>JsonObject</code> takes care of that. Also the function now returns a <code>JsonObject</code> instead of a string, which allows us to safely compose the result into larger JSON objects. When we’re done composing, we call <code>ToString()</code> to get the serialized JSON.
<br />One way to look at this is that we’re defining a tiny language here, with <code>JsonValue.ToString()</code> being one possible interpreter.
<br />I’m sorry if these explanations sound a bit patronizing, but I really want to emphasize how types make our job easier.</p>
<p>Let’s raise the abstraction bar a bit and write a function that serializes an <code>IEnumerable<T></code>. Since we don’t want to break parametricity, we must ensure that this will work for any type T such that T can itself be serialized. But we don’t know how to turn any arbitrary type T into a <code>JsonValue</code>, and we’ve ruled out runtime type inspection, so we have to pass the conversion explicitly:</p>
<pre class="code"><span style="background: white; color: #2b91af">JsonArray </span><span style="background: white; color: black">Serialize<T>(</span><span style="background: white; color: #2b91af">IEnumerable</span><span style="background: white; color: black"><T> list, </span><span style="background: white; color: #2b91af">Func</span><span style="background: white; color: black"><T, </span><span style="background: white; color: #2b91af">JsonValue</span><span style="background: white; color: black">> serializerT) {
</span><span style="background: white; color: blue">return new </span><span style="background: white; color: #2b91af">JsonArray</span><span style="background: white; color: black">(list.Select(serializerT));
}
</span></pre>
<p>That works, but it’s not very nice. We have to compose these serializer functions manually! Every time we want to serialize an <code>IEnumerable<T></code> we’ll have to look up where the function to serialize <code>T</code> is. Is there any way to avoid that?</p>
<p>We could pass the conversion by using interfaces. If we have:</p>
<pre class="code"><span style="background: white; color: blue">public interface </span><span style="background: white; color: #2b91af">IToJson </span><span style="background: white; color: black">{
</span><span style="background: white; color: #2b91af">JsonValue </span><span style="background: white; color: black">ToJson();
}
</span></pre>
<p>We could make <code>Person</code> implement <code>IToJson</code> simply by moving the <code>Serialize(Person p)</code> function above to the definition of <code>Person</code>:</p>
<pre class="code"><span style="background: white; color: blue">class </span><span style="background: white; color: #2b91af">Person</span><span style="background: white; color: black">: </span><span style="background: white; color: #2b91af">IToJson </span><span style="background: white; color: black">{
</span><span style="background: white; color: blue">public readonly int </span><span style="background: white; color: black">Id;
</span><span style="background: white; color: blue">public readonly string </span><span style="background: white; color: black">Name;
</span><span style="background: white; color: blue">public </span><span style="background: white; color: black">Person(</span><span style="background: white; color: blue">int </span><span style="background: white; color: black">id, </span><span style="background: white; color: blue">string </span><span style="background: white; color: black">name) {
Id = id;
Name = name;
}
</span><span style="background: white; color: blue">public </span><span style="background: white; color: #2b91af">JsonValue </span><span style="background: white; color: black">ToJson() {
</span><span style="background: white; color: blue">return new </span><span style="background: white; color: #2b91af">JsonObject </span><span style="background: white; color: black">{
{</span><span style="background: white; color: #a31515">"Id"</span><span style="background: white; color: black">, Id},
{</span><span style="background: white; color: #a31515">"Name"</span><span style="background: white; color: black">, Name},
};
}
}
</span></pre>
<p>Then serializing a list can be restricted to types implementing IToJson:</p>
<pre class="code"><span style="background: white; color: #2b91af">JsonValue </span><span style="background: white; color: black">ToJson<T>(</span><span style="background: white; color: blue">this </span><span style="background: white; color: #2b91af">IEnumerable</span><span style="background: white; color: black"><T> list) </span><span style="background: white; color: blue">where </span><span style="background: white; color: black">T: </span><span style="background: white; color: #2b91af">IToJson </span><span style="background: white; color: black">{
</span><span style="background: white; color: blue">return new </span><span style="background: white; color: #2b91af">JsonArray</span><span style="background: white; color: black">(list.Select(x => x.ToJson()));
}
</span></pre>
<p>But this only works for types we control. We can’t make <code>IEnumerable<T></code> implement <code>IToJson</code>. We can wrap it:</p>
<pre class="code"><span style="background: white; color: blue">class </span><span style="background: white; color: #2b91af">JsonList</span><span style="background: white; color: black"><T>: </span><span style="background: white; color: #2b91af">IToJson </span><span style="background: white; color: blue">where </span><span style="background: white; color: black">T: </span><span style="background: white; color: #2b91af">IToJson </span><span style="background: white; color: black">{
</span><span style="background: white; color: blue">public readonly </span><span style="background: white; color: #2b91af">IEnumerable</span><span style="background: white; color: black"><T> List;
</span><span style="background: white; color: blue">public </span><span style="background: white; color: black">JsonList(</span><span style="background: white; color: #2b91af">IEnumerable</span><span style="background: white; color: black"><T> list) {
List = list;
}
</span><span style="background: white; color: blue">public </span><span style="background: white; color: #2b91af">JsonValue </span><span style="background: white; color: black">ToJson() {
</span><span style="background: white; color: blue">return new </span><span style="background: white; color: #2b91af">JsonArray</span><span style="background: white; color: black">(List.Select(x => x.ToJson()));
}
}
</span></pre>
<p>But this is inconvenient. Suppose we have a class <code>Company</code> with a list of <code>Person</code> as employees. Here’s what serialization would look like:</p>
<pre class="code"><span style="background: white; color: blue">class </span><span style="background: white; color: #2b91af">Company</span><span style="background: white; color: black">: </span><span style="background: white; color: #2b91af">IToJson </span><span style="background: white; color: black">{
</span><span style="background: white; color: blue">public readonly string </span><span style="background: white; color: black">Name;
</span><span style="background: white; color: blue">public readonly </span><span style="background: white; color: #2b91af">List</span><span style="background: white; color: black"><</span><span style="background: white; color: #2b91af">Person</span><span style="background: white; color: black">> Employees;
</span><span style="background: white; color: blue">public </span><span style="background: white; color: black">Company(</span><span style="background: white; color: blue">string </span><span style="background: white; color: black">name, </span><span style="background: white; color: #2b91af">List</span><span style="background: white; color: black"><</span><span style="background: white; color: #2b91af">Person</span><span style="background: white; color: black">> employees) {
Name = name;
Employees = employees;
}
</span><span style="background: white; color: blue">public </span><span style="background: white; color: #2b91af">JsonValue </span><span style="background: white; color: black">ToJson() {
</span><span style="background: white; color: blue">return new </span><span style="background: white; color: #2b91af">JsonObject </span><span style="background: white; color: black">{
{</span><span style="background: white; color: #a31515">"Name"</span><span style="background: white; color: black">, Name},
{</span><span style="background: white; color: #a31515">"Employees"</span><span style="background: white; color: black">, </span><span style="background: white; color: blue">new </span><span style="background: white; color: #2b91af">JsonList</span><span style="background: white; color: black"><</span><span style="background: white; color: #2b91af">Person</span><span style="background: white; color: black">>(Employees).ToJson()}
};
}
}
</span></pre>
<p>That’s not much better than manually inlining the code for <code>JsonList.ToJson()</code>. Ideally, we just want to call a simple function <code>ToJson</code> and have the compiler somehow figure out if there’s a suitable function declared for the type of the argument. Overloading would be great if the generic <code>ToJson</code> function for lists could somehow recursively look into which overloads are defined and check whether one of them matches <code>T</code>, at compile time.</p>
<p>As far as I know C# can’t do that but it turns out that F# can.</p>
<h3>Enter Fleece</h3>
<p>With <a href="https://github.com/mausch/Fleece">Fleece</a>, converting Person and Company to JSON goes like this, without implementing any IToJson interface:</p>
<pre class="code"><span style="background: white; color: blue">type </span><span style="background: white; color: black">Person </span><span style="background: white; color: blue">with
static member </span><span style="background: white; color: black">ToJSON (x: Person) =
jobj [
</span><span style="background: white; color: #a31515">"Id" </span><span style="background: white; color: black">.= x.Id
</span><span style="background: white; color: #a31515">"Name" </span><span style="background: white; color: black">.= x.Name
]
</span><span style="background: white; color: blue">type </span><span style="background: white; color: black">Company </span><span style="background: white; color: blue">with
static member </span><span style="background: white; color: black">ToJSON (x: Company) =
jobj [
</span><span style="background: white; color: #a31515">"Name" </span><span style="background: white; color: black">.= x.Name
</span><span style="background: white; color: #a31515">"Employees" </span><span style="background: white; color: black">.= x.Employees
]
</span><span style="background: white; color: blue">let </span><span style="background: white; color: black">company = { Company.Name = </span><span style="background: white; color: #a31515">"Double Fine"</span><span style="background: white; color: black">; Employees = [{ Person.Id = 1; Name = </span><span style="background: white; color: #a31515">"Tim Schafer"</span><span style="background: white; color: black">}] }
</span><span style="background: white; color: blue">let </span><span style="background: white; color: black">jsonAsString = (toJSON company).ToString()
</span></pre>
<p>Fleece here is statically ensuring that the type of the argument of <code>toJSON</code> has a definition of a static member <code>ToJSON</code>. It also checks statically that every type to the right of a <code>.=</code> expression has a suitable definition of <code>ToJSON</code>, and chooses it automatically. If there isn’t a suitable definition, you get a compile-time error.</p>
<p>Moreover, it "composes" (probably not the right term to use here) the serializers automatically at compile-time. In previous examples, we had to compose the list serializer with the <code>Person</code> serializer “manually” to serialize a concrete list of persons. Fleece defines a parametric serializer for lists, and then the compiler composes this serializer with the serializer for <code>Person</code> we have defined here.</p>
<p>What makes this possible in F# are inline functions and static member constraints. Here’s what the definition of <code>ToJSON</code> for a list looks like:</p>
<pre class="code"><span style="background: white; color: blue">type </span><span style="background: white; color: black">ToJSONClass </span><span style="background: white; color: blue">with
static member inline </span><span style="background: white; color: black">ToJSON (x: 'a list) =
JArray (listAsReadOnly (List.map toJSON x))
</span></pre>
<p>If you hover over <code>ToJSON</code> in this definition in Visual Studio, the tooltip says <code>ToJSON: 'a list -> JsonValue (requires member ToJSON)</code>. This means that F# is inferring that the type <code>'a</code> must have a <code>ToJSON</code> member in order to call <code>toJSON</code> on a list of <code>'a</code>.</p>
<p>This is equivalent to Haskell’s:</p>
<pre>instance (ToJSON a) => ToJSON [a] where
toJSON = Array . V.fromList . map toJSON</pre>
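<p>To see the instances compose without pulling in Aeson, here’s a minimal self-contained sketch. The <code>Json</code> type, the <code>render</code> function and the <code>Person</code> record are hypothetical stand-ins made up for this example (they are not Aeson’s or Fleece’s API), and <code>render</code> leans on <code>show</code> for string quoting, which is only adequate for plain ASCII text:</p>
<pre>import Data.List (intercalate)

-- Hypothetical, minimal JSON tree (a stand-in for Aeson's Value).
data Json
  = JNumber Int
  | JString String
  | JArray [Json]
  | JObject [(String, Json)]

class ToJSON a where
  toJSON :: a -> Json

instance ToJSON Int where
  toJSON = JNumber

-- The (ToJSON a) constraint lets the compiler pick the element
-- serializer and compose it with the list serializer.
instance ToJSON a => ToJSON [a] where
  toJSON = JArray . map toJSON

data Person = Person { personId :: Int, personName :: String }

instance ToJSON Person where
  toJSON p = JObject [("Id", toJSON (personId p)), ("Name", JString (personName p))]

-- Naive renderer; show is used for quoting (fine for ASCII only).
render :: Json -> String
render (JNumber n) = show n
render (JString s) = show s
render (JArray xs) = "[" ++ intercalate "," (map render xs) ++ "]"
render (JObject kvs) = "{" ++ intercalate "," (map pair kvs) ++ "}"
  where pair (k, v) = show k ++ ":" ++ render v

main :: IO ()
main = putStrLn (render (toJSON [Person 1 "Tim Schafer"]))
-- prints [{"Id":1,"Name":"Tim Schafer"}]</pre>
<p>Nobody wrote a serializer for <code>[Person]</code> here: the compiler assembled it from the list instance and the <code>Person</code> instance. Something like <code>toJSON [not]</code> is rejected at compile time, because there is no <code>ToJSON</code> instance for functions.</p>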
<p>Haskell has dedicated syntax for such constraints (the <code>(ToJSON a) =></code> part), which makes them explicit in the source, whereas in F# the constraint is inferred.</p>
<p>This has long been used in F# for <a href="http://tomasp.net/blog/fsharp-generic-numeric.aspx/">generic math</a>. F# also has an ad-hoc, limited form of this in the <a href="http://msdn.microsoft.com/en-us/library/ee353419.aspx">EqualityConditionalOn attribute</a>, where the equality of a type depends on a type argument having “equality”. So for example, this in F#:</p>
<pre class="code"><span style="background: white; color: blue">type </span><span style="background: white; color: black">MyBox<[<EqualityConditionalOn>] 'a> = ...</span></pre>
<p>Roughly corresponds to Haskell’s:</p>
<pre>instance (Eq a) => Eq (MyBox a) where ...</pre>
<p>With the difference that <a href="http://blogs.msdn.com/b/dsyme/archive/2009/11/08/equality-and-comparison-constraints-in-f-1-9-7.aspx">F# can sometimes derive equality automatically depending on the equality of type arguments</a>.</p>
<p>The bottom line here is that the toJSON function is overloaded only for supported types, instead of being parametrically polymorphic, so it does not break parametricity. This overloading is also called <a href="http://www.haskell.org/haskellwiki/Polymorphism#Ad-hoc_polymorphism">ad-hoc polymorphism</a>. Haskell achieves ad-hoc polymorphism thanks to typeclasses, and this technique is based on that. In fact, Fleece is based on <a href="http://hackage.haskell.org/package/aeson">Aeson</a>, the de-facto standard JSON library for Haskell.</p>
<p>Anton Tayanovskyy had also <a href="http://t0yv0.blogspot.com/2011/12/hacking-type-classes-in-f.html">prototyped</a> a similar typeclass-based JSON library for F# some time ago. </p>
<p>Fleece only scratches the surface of what’s possible with inline-encoded typeclasses (a.k.a. “type methods”). I recommend reading <a href="http://www.nut-cracker.com.ar/">Gustavo León’s blog</a> to learn more about this technique and checking out the <a href="https://github.com/gmpl/FsControl">FsControl</a> and <a href="https://github.com/gmpl/FSharpPlus">FSharpPlus</a> projects.</p>
<h3>Refactor safety</h3>
<p>Now, in my last post I claimed that this approach would save code by saving tests. But here we see that we have to write serializer code for each type that we want to serialize, unlike reflection-based libraries! That’s quite a bit of boilerplate! I could argue that it’s a small price to pay to get code you can reason about, but the truth is this is pretty trivial code that could be easily generated at compile-time.
<br />In fact, Haskell can do that, either with Template Haskell or the more recent GHC.Generics, in which case you only need to <a href="http://no-fucking-idea.com/blog/2014/03/23/shortest-way-to-work-with-json-in-haskell/">make your type derive Generic and declare the typeclass instance</a>. The actual serialization code is filled in by the compiler.</p>
<p>But there is a bigger problem with both reflection- and GHC.Generics-derived serialization: they tie JSON output to identifiers in your code. So whenever you rename some identifier (for example a field name in a record) in a type that is used to model some JSON output, <em>you’re implicitly changing your JSON schema</em>. You’re making a breaking change in the output of the program in what’s normally a safe operation. To quote a tweet:</p>
<blockquote lang="en" class="twitter-tweet">
<p>any reason, other than a fondness of imminent schema and versioning issues, to use reflection-based JSON serializer in F#?</p>
— Lev Gorodinski (@eulerfx) <a href="https://twitter.com/eulerfx/statuses/451122104768151552">April 1, 2014</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>Or more bluntly:</p>
<blockquote lang="en" class="twitter-tweet">
<p>OH on <a href="https://twitter.com/search?q=%23scalaz&src=hash">#scalaz</a>: “If I walk up to your code and rename all the fields [or anything!] and the program changes, I think you should go to jail.”</p>
— Sukant Hajra (@shajra) <a href="https://twitter.com/shajra/statuses/421407115995926528">January 9, 2014</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>Still, it might be useful to start prototyping with a reflection-based serializer while being aware of these issues, then switch to explicit serialization once the initial prototype stage is done. Some Haskellers do this with GHC.Generics-derived serialization (thanks to <a href="https://twitter.com/jfischoff">Jonathan Fischoff</a> for confirming this on #haskell IRC).</p>
<p>Even in C#, without these F# fake typeclasses, you should be very wary of breaking parametricity and the maintenance cost it implies. In my opinion, the boilerplate is worth it to avoid breaking parametricity.</p>
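<p>To make the refactoring hazard concrete, here’s a small Haskell sketch. It uses a derived <code>Show</code> instance as a stand-in for reflection- or GHC.Generics-derived serialization (an assumption made purely to keep the example dependency-free), next to a hand-written encoder whose keys are string literals. The <code>Person</code> record and <code>encodePerson</code> are hypothetical names for this example:</p>
<pre>data Person = Person { personId :: Int, fullName :: String }
  deriving Show

-- Explicit mapping: the keys are literals, independent of the field names.
-- show is used for quoting, which is only adequate for plain ASCII text.
encodePerson :: Person -> String
encodePerson p =
  "{\"Id\":" ++ show (personId p) ++ ",\"Name\":" ++ show (fullName p) ++ "}"

main :: IO ()
main = do
  let p = Person 1 "Tim Schafer"
  -- Derived output embeds the identifiers, so renaming a field changes it:
  putStrLn (show p)          -- Person {personId = 1, fullName = "Tim Schafer"}
  -- Explicit output only changes when the string literals change:
  putStrLn (encodePerson p)  -- {"Id":1,"Name":"Tim Schafer"}</pre>
<p>Rename <code>fullName</code> to anything else and the first line of output changes while the second stays put. That second property is what you want from code that defines a wire format.</p>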
<p>In the next post we’ll see how to use Fleece to map in the opposite direction: JSON to objects. We’ll also see some of the drawbacks of these fake typeclasses.</p> Mauricio Schefferhttp://www.blogger.com/profile/15247972578064164206noreply@blogger.com3tag:blogger.com,1999:blog-8643857899806162280.post-74891283230535399682014-05-06T14:10:00.001-03:002014-05-13T16:04:32.135-03:00On parametric polymorphism and JSON serialization<p>A couple of months ago I wrote <a href="https://github.com/mausch/Fleece">Fleece</a>, a JSON mapper for F#. What does that mean? It provides a library of functions to help map a JSON value tree onto .NET typed instances. And some more functions to do the inverse operation: map typed .NET values onto a JSON tree. <br />Fleece delegates the actual JSON parsing and serialization to <a href="http://msdn.microsoft.com/en-us/library/system.json(v=vs.110).aspx">System.Json</a>.</p> <p>But what’s the purpose of Fleece? Why another JSON library? Why F#? Is Fleece merely another case of Not-Invented-Here Syndrome? After all, anyone can serialize things easily with something like <a href="https://github.com/ServiceStack/ServiceStack.Text">ServiceStack.Text</a>, for example:</p> <pre class="code"><span style="background: white; color: blue">class </span><span style="background: white; color: #2b91af">Person </span><span style="background: white; color: black">{
</span><span style="background: white; color: blue">public int </span><span style="background: white; color: black">Id { </span><span style="background: white; color: blue">get</span><span style="background: white; color: black">; </span><span style="background: white; color: blue">set</span><span style="background: white; color: black">; }
</span><span style="background: white; color: blue">public string </span><span style="background: white; color: black">Name { </span><span style="background: white; color: blue">get</span><span style="background: white; color: black">; </span><span style="background: white; color: blue">set</span><span style="background: white; color: black">; }
}
</span></pre>
<pre class="code"><span style="background: white; color: blue">var </span><span style="background: white; color: black">john = ServiceStack.Text.</span><span style="background: white; color: #2b91af">JsonSerializer</span><span style="background: white; color: black">.SerializeToString(</span><span style="background: white; color: blue">new </span><span style="background: white; color: #2b91af">Person </span><span style="background: white; color: black">{Id = 1, Name = </span><span style="background: white; color: #a31515">"John"</span><span style="background: white; color: black">});</span></pre>
<p>Right?</p>
<p>However there’s a problem here. How do we know that the code above works for this definition of <code>Person</code>? Well, of course we can just run it and get the expected result stored in the <code>john</code> variable. But can you be sure that it will work for this <code>Person</code> type, without running it? It seems obvious that it will work, otherwise the library wouldn’t be useful, would it?</p>
<p>And yet, if we slightly change the definition of <code>Person</code>:</p>
<pre class="code"><span style="background: white; color: blue">class </span><span style="background: white; color: #2b91af">Person </span><span style="background: white; color: black">{
</span><span style="background: white; color: blue">public readonly int </span><span style="background: white; color: black">Id;
</span><span style="background: white; color: blue">public readonly string </span><span style="background: white; color: black">Name;
</span><span style="background: white; color: blue">public </span><span style="background: white; color: black">Person(</span><span style="background: white; color: blue">int </span><span style="background: white; color: black">id, </span><span style="background: white; color: blue">string </span><span style="background: white; color: black">name) {
Id = id;
Name = name;
}
}
</span></pre>
<pre class="code"><span style="background: white; color: blue">var </span><span style="background: white; color: black">john = ServiceStack.Text.</span><span style="background: white; color: #2b91af">JsonSerializer</span><span style="background: white; color: black">.SerializeToString(</span><span style="background: white; color: blue">new </span><span style="background: white; color: #2b91af">Person</span><span style="background: white; color: black">(id: 1, name: </span><span style="background: white; color: #a31515">"John"</span><span style="background: white; color: black">));</span></pre>
<p>It will compile, but <code>john</code> will contain the string "{}", i.e. an empty object. Definitely not what anyone would want! Yes, <a href="http://stackoverflow.com/questions/17315543/servicestack-jsonserializer-not-serializing-public-members/17321711#17321711">set the magic <code>IncludePublicFields</code> flag</a> and it works, but why do we have to guess this? Would it make any difference if it threw an exception instead of generating an empty JSON object? We spend a lot of time compiling things, can’t the compiler check this for us?</p>
<p>Even worse, <code>SerializeToString</code> will happily accept any instance of any type, even if it doesn’t make any sense:</p>
<pre class="code"><span style="background: white; color: blue">var </span><span style="background: white; color: black">json = ServiceStack.Text.</span><span style="background: white; color: #2b91af">JsonSerializer</span><span style="background: white; color: black">.SerializeToString(</span><span style="background: white; color: blue">new </span><span style="background: white; color: #2b91af">Func</span><span style="background: white; color: black"><</span><span style="background: white; color: blue">int</span><span style="background: white; color: black">, </span><span style="background: white; color: blue">int</span><span style="background: white; color: black">>(x => x + 1));
</span><span style="background: white; color: #2b91af">Console</span><span style="background: white; color: black">.WriteLine(json); </span><span style="background: white; color: green">// empty string
</span></pre>
<p>By the way, don’t think this is to bash ServiceStack in particular. Most JSON libraries for .NET have this problem:</p>
<pre class="code"><span style="background: white; color: blue">static void </span><span style="background: white; color: black">NewtonsoftJson() {
</span><span style="background: white; color: blue">var </span><span style="background: white; color: black">json = Newtonsoft.Json.</span><span style="background: white; color: #2b91af">JsonConvert</span><span style="background: white; color: black">.SerializeObject(</span><span style="background: white; color: blue">new </span><span style="background: white; color: #2b91af">Func</span><span style="background: white; color: black"><</span><span style="background: white; color: blue">int</span><span style="background: white; color: black">, </span><span style="background: white; color: blue">int</span><span style="background: white; color: black">>(x => x + 1));
</span><span style="background: white; color: #2b91af">Console</span><span style="background: white; color: black">.WriteLine(json);
</span><span style="background: white; color: green">// {"Delegate":{},"method0":{"Name":"<Newtonsoft>b__3","AssemblyName":"SerializationLies, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null","ClassName":"SerializationLies.Program","Signature":"Int32 <Newtonsoft>b__3(Int32)","Signature2":"System.Int32 <Newtonsoft>b__3(System.Int32)","MemberType":8,"GenericArguments":null}}
</span><span style="background: white; color: black">}
</span></pre>
<p><code>DataContractJsonSerializer</code> at least acknowledges its partiality and throws:</p>
<pre class="code"><span style="background: white; color: blue">static void </span><span style="background: white; color: black">DataContract() {
</span><span style="background: white; color: blue">using </span><span style="background: white; color: black">(</span><span style="background: white; color: blue">var </span><span style="background: white; color: black">ms = </span><span style="background: white; color: blue">new </span><span style="background: white; color: #2b91af">MemoryStream</span><span style="background: white; color: black">()) {
</span><span style="background: white; color: blue">new </span><span style="background: white; color: #2b91af">DataContractJsonSerializer</span><span style="background: white; color: black">(</span><span style="background: white; color: blue">typeof</span><span style="background: white; color: black">(</span><span style="background: white; color: #2b91af">Func</span><span style="background: white; color: black"><</span><span style="background: white; color: blue">int</span><span style="background: white; color: black">, </span><span style="background: white; color: blue">int</span><span style="background: white; color: black">>)).WriteObject(ms, </span><span style="background: white; color: blue">new </span><span style="background: white; color: #2b91af">Func</span><span style="background: white; color: black"><</span><span style="background: white; color: blue">int</span><span style="background: white; color: black">, </span><span style="background: white; color: blue">int</span><span style="background: white; color: black">>(x => x + 1));
</span><span style="background: white; color: #2b91af">Console</span><span style="background: white; color: black">.WriteLine(</span><span style="background: white; color: #2b91af">Encoding</span><span style="background: white; color: black">.ASCII.GetString(ms.ToArray()));
}
</span><span style="background: white; color: green">/* throws:
Unhandled Exception: System.Runtime.Serialization.SerializationException: DataContractJsonSerializer does not support the setting of the FullTypeName of the object to be serialized to a value other than the default FullTypeName.
Attempted to serialize object with full type name 'System.DelegateSerializationHolder' and default full type name
'System.Func`2[[System.Int32, mscorlib, Version=4.0.0.0,Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.Int32, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]'.
*/
</span><span style="background: white; color: black">}
</span></pre>
<p>And by extension, pretty much all web frameworks (probably the most common producers of JSON) have this problem too:</p>
<p>ASP.NET MVC 4</p>
<pre class="code"><span style="background: white; color: blue">public class </span><span style="background: white; color: #2b91af">HomeController </span><span style="background: white; color: black">: Controller {
</span><span style="background: white; color: blue">public </span><span style="background: white; color: black">ActionResult Index() {
</span><span style="background: white; color: blue">return </span><span style="background: white; color: black">Json(</span><span style="background: white; color: blue">new </span><span style="background: white; color: #2b91af">Func</span><span style="background: white; color: black"><</span><span style="background: white; color: blue">int</span><span style="background: white; color: black">, </span><span style="background: white; color: blue">int</span><span style="background: white; color: black">>(x => x + 1), JsonRequestBehavior.AllowGet);
}
}
</span></pre>
<p>Throws <code>System.InvalidOperationException: A circular reference was detected while serializing an object of type 'System.Reflection.RuntimeModule'.</code></p>
<p>And therefore also <a href="http://bugsquash.blogspot.com/search/label/figment">Figment</a>, since I based it on ASP.NET MVC:</p>
<pre>get "/" (json ((+) 1))</pre>
<p>In ASP.NET Web API your controllers can return any object and the serializers will try to serialize it according to the result of content negotiation. In the case of JSON, the default serializer is Newtonsoft so you end up with the same result I showed above for Newtonsoft.</p>
<p>In <a href="http://nancyfx.org/">NancyFX</a>:</p>
<pre class="code"><span style="background: white; color: blue">public class </span><span style="background: white; color: #2b91af">Home </span><span style="background: white; color: black">: NancyModule {
</span><span style="background: white; color: blue">public </span><span style="background: white; color: black">Home() {
Get[</span><span style="background: white; color: #a31515">"/home"</span><span style="background: white; color: black">] = _ => </span><span style="background: white; color: blue">new </span><span style="background: white; color: #2b91af">Func</span><span style="background: white; color: black"><</span><span style="background: white; color: blue">int</span><span style="background: white; color: black">, </span><span style="background: white; color: blue">int</span><span style="background: white; color: black">>(x => x + 1);
}
}
</span></pre>
<p>Visit <code>/home.json</code> and get an <code>InvalidOperationException: Circular reference detected.</code></p>
<p>In <a href="https://servicestack.net/">ServiceStack</a>:</p>
<pre class="code"><span style="background: white; color: black">[Route(</span><span style="background: white; color: #a31515">"/home"</span><span style="background: white; color: black">)]
</span><span style="background: white; color: blue">public class </span><span style="background: white; color: #2b91af">Home </span><span style="background: white; color: black">{ }
</span><span style="background: white; color: blue">public class </span><span style="background: white; color: #2b91af">HomeService </span><span style="background: white; color: black">: Service {
</span><span style="background: white; color: blue">public object </span><span style="background: white; color: black">Any(</span><span style="background: white; color: #2b91af">Home </span><span style="background: white; color: black">h) {
</span><span style="background: white; color: blue">return new </span><span style="background: white; color: #2b91af">Func</span><span style="background: white; color: black"><</span><span style="background: white; color: blue">int</span><span style="background: white; color: black">, </span><span style="background: white; color: blue">int</span><span style="background: white; color: black">>(x => x + 1);
}
}
</span></pre>
<p>Visit <code>/home?format=json</code> and you get:</p>
<pre>{"Method":{"__type":"System.Reflection.RuntimeMethodInfo, mscorlib","Name":"<any>b__0","DeclaringType":"ServiceStackTest.HomeService, ServiceStackTest","ReflectedType":"ServiceStackTest.HomeService, ServiceStackTest","MemberType":8,"MetadataToken":100663310,"Module":{"__type":"System.Reflection.RuntimeModule, mscorlib","MDStreamVersion":131072,"FullyQualifiedName":"G:\\Windows\\Microsoft.NET\\Framework\\v4.0.30319\\Temporary ASP.NET Files\\root\\c11d5664\\d0efee07\\assembly\\dl3\\51c16a64\\5cdba327_3150cf01\\ServiceStackTest.dll","ModuleVersionId":"125112c7a82d4c2099718b901637e950","MetadataToken":1,"ScopeName":"ServiceStackTest.dll","Name":"ServiceStackTest.dll","Assembly":{"__type":"System.Reflection.RuntimeAssembly, mscorlib","CodeBase":"file:///g:/prg/ServiceStackTest/ServiceStackTest/bin/ServiceStackTest.DLL","FullName":"ServiceStackTest, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null","DefinedTypes":["ServiceStackTest.Global, ServiceStackTest","ServiceStackTest.AppHost, ServiceStackTest","Service{"ResponseStatus":{"ErrorCode":"InvalidCastException","Message":"Unable to cast object of type 'System.Security.Policy.Zone' to type 'System.Security.Policy.Url'.","StackTrace":" at ServiceStack.Text.Common.WriteType`2.WriteProperties(TextWriter writer, Object value)\r\n at ServiceStack.Text.Common.WriteListsOfElements`1.WriteIEnumerable(TextWriter writer, Object oValueCollection)\r\n at ServiceStack.Text.Common.WriteType`2.WriteProperties(TextWriter writer, Object value)\r\n at ServiceStack.Text.Common.WriteType`2.WriteAbstractProperties(TextWriter writer, Object value)\r\n at ServiceStack.Text.Common.WriteType`2.WriteProperties(TextWriter writer, Object value)\r\n at ServiceStack.Text.Common.WriteType`2.WriteAbstractProperties(TextWriter writer, Object value)\r\n at ServiceStack.Text.Common.WriteType`2.WriteProperties(TextWriter writer, Object value)\r\n at ServiceStack.Text.Common.WriteType`2.WriteAbstractProperties(TextWriter writer, Object value)\r\n 
at ServiceStack.Text.Common.WriteType`2.WriteProperties(TextWriter writer, Object value)\r\n at ServiceStack.Text.JsonSerializer.SerializeToStream(Object value, Type type, Stream stream)\r\n at ServiceStack.Text.JsonSerializer.SerializeToStream[T](T value, Stream stream)\r\n at ServiceStack.Serialization.JsonDataContractSerializer.SerializeToStream[T](T obj, Stream stream)\r\n at ServiceStack.Host.ContentTypes.<getstreamserializer>b__5(IRequest r, Object o, Stream s)\r\n at ServiceStack.Host.ContentTypes.<>c__DisplayClass2.<getresponseserializer>b__1(IRequest httpReq, Object dto, IResponse httpRes)\r\n at ServiceStack.HttpResponseExtensionsInternal.WriteToResponse(IResponse response, Object result, ResponseSerializerDelegate defaultAction, IRequest request, Byte[] bodyPrefix, Byte[] bodySuffix)"}}</pre>
<p>You get the point.</p>
<p>I don’t think anyone really wants this, but what are the alternatives?</p>
<p>Many would say “just write a test for it”, but that would mean writing a test for every type we serialize, hardly a good use of our time and very easy to forget. Since we’re working in a statically-typed language, can’t we make the compiler work for us?</p>
<p>The first step is understanding why this is really wrong. When you write a function signature like this in C#:</p>
<pre class="code"><span style="background: white; color: blue">string </span><span style="background: white; color: black">Serialize<T>(T obj)</span></pre>
<p>when interpreted under the <a href="http://en.wikibooks.org/wiki/Haskell/The_Curry-Howard_isomorphism">Curry-Howard isomorphism</a> this is actually saying: “I propose a function named ‘Serialize’ which turns any value of any type into a string”. Which is a blatant lie when implemented, since we’ve seen that you can’t get a meaningful string out of many types. The only logical implementation for such a function, without breaking <a href="https://en.wikipedia.org/wiki/Parametricity">parametricity</a>, is a constant string. </p>
<p>Well, in .NET we could also implement it by calling ToString() on the argument, but you’ll notice that Object.ToString() has essentially the same signature as our Serialize above, and therefore the same arguments apply.</p>
<p>And you have to give up side effects too, otherwise you could implement this function simply by ignoring the argument and reading a string from a file. The same goes for runtime type inspection (i.e. reflection), as it breaks parametricity.</p>
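<p>Haskell makes a handy illustration here because it has neither reflection nor unrestricted side effects in pure functions. The sketch below is only an illustration: <code>serializeAny</code> and the <code>Serialize</code> class are hypothetical names made up for this example. With a fully unconstrained type parameter there is essentially nothing to do but return a constant, whereas a constrained signature is honest about which types are supported:</p>
<pre>-- Unconstrained: nothing is known about 'a', so the argument cannot be
-- inspected. The result cannot depend on it.
serializeAny :: a -> String
serializeAny _ = "?"

-- Constrained: only types with an instance are accepted.
class Serialize a where
  serialize :: a -> String

instance Serialize Int where
  serialize = show

instance Serialize Bool where
  serialize b = if b then "true" else "false"

main :: IO ()
main = do
  putStrLn (serializeAny not)        -- accepts a function, prints ?
  putStrLn (serialize (42 :: Int))   -- prints 42
  putStrLn (serialize True)          -- prints true
  -- putStrLn (serialize not)        -- rejected: no Serialize instance for functions</pre>
<p>The commented-out line is the interesting one: the mistake that produced garbage or a runtime exception in the C# examples above becomes a compile-time error.</p>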
<p>These restrictions and the property of parametricity are important because they enable code you can <a href="http://daniel.yokomizo.org/2011/12/understanding-higher-order-code-for.html">reason about</a>, and <a href="http://dl.dropboxusercontent.com/u/7810909/media/doc/parametricity.pdf">very precisely</a>. They will help you <a href="http://davesquared.net/2014/02/splitting-responsibilities.html">refactor towards more general code</a>. You can even <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9875">derive theorems</a> just from their signatures.</p>
<p>I won't explain Curry-Howard or parametricity here, but if you're not familiar with these concepts I highly recommend following the links above. I found them to be very important concepts, especially in statically-typed languages.</p>
<p>You may think that all this reasoning and these theorems are only for academics and don’t apply to your job, but we’ll see how by following these restrictions we can get the compiler to check what otherwise would have taken a test for each serialized type. This means less code and fewer runtime errors, a very real benefit! The more you abide by these restrictions, the more the compiler and types will help you.</p>
<p>The opposite of this programming by reasoning is <a href="http://pragprog.com/the-pragmatic-programmer/extracts/coincidence">programming by coincidence</a>. You start "trying out things" to somehow hit those magical lines of code that do what you wanted to do... at least for the inputs you have considered.</p>
<p>So to sum up: use of unconstrained type parameters (generics) in concrete methods/classes for things that don’t truly represent “for all types” is a logic flaw.</p>
<p>Now that I briefly and badly explained the problem, what not to do and why, in the <a href="http://bugsquash.blogspot.com/2014/05/mapping-objects-to-json-with-fleece.html">next post we'll see what we can do</a>.</p> Mauricio Schefferhttp://www.blogger.com/profile/15247972578064164206noreply@blogger.com0tag:blogger.com,1999:blog-8643857899806162280.post-33567624696723155172014-03-20T13:44:00.001-03:002014-04-19T20:34:00.337-03:00Better dictionary types<p>Say you need to call a function like:</p> <!-- HTML generated using hilite.me --> <div style="border-top-style: none; overflow: auto; width: auto; border-bottom-style: none; border-right-style: none; border-left-style: none"> <pre style="margin: 0px; line-height: 125%"><span style="font-weight: bold; color: #204a87">int</span> <span style="color: #000000">DoSomething</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000"><</span><span style="font-weight: bold; color: #204a87">string</span><span style="font-weight: bold; color: #000000">,</span> <span style="font-weight: bold; color: #204a87">string</span><span style="font-weight: bold; color: #000000">></span> <span style="color: #000000">data</span><span style="font-weight: bold; color: #000000">)</span></pre>
</div>
<p>Do you know what kind of data you should feed to that function? Evidently, the keys and values of the dictionary are strings, but what should the comparer for the keys be? Should keys be case-sensitive or case-insensitive? Does it even matter for this function?</p>
<p>For example, if <code>DoSomething</code> was processing HTTP headers, it would need to receive a case-insensitive dictionary, as <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2">HTTP header names are case-insensitive</a>. Yet the type doesn’t enforce it; it doesn’t even give us a hint.</p>
<p>How do other typed languages deal with this? Let’s take a look at Haskell first.</p>
<p>Haskell’s <a href="http://www.haskell.org/ghc/docs/7.6.2/html/libraries/containers-0.5.0.0/Data-Map.html">Data.Map</a> requires the key type to have an instance for the Ord typeclass. Since you can’t have more than one typeclass instance per type, there is no possible ambiguity about how keys are compared. This property of having at most one instance per typeclass per type is called “coherence” and it’s a good property to have in a typeclass system as it keeps things simple, both for the programmer and the compiler. If you wanted a case-insensitive Map, you’d use <a href="http://hackage.haskell.org/package/case-insensitive">Data.CaseInsensitive</a> and your concrete Map type would reflect its case-insensitive behavior, e.g. </p>
<!-- HTML generated using hilite.me -->
<div style="overflow: auto; width: auto; padding-bottom: 0.2em; padding-top: 0.2em; padding-left: 0.6em; padding-right: 0.6em">
<pre style="margin: 0px; line-height: 125%"><span style="font-weight: bold; color: #204a87">import</span> <span style="font-weight: bold; color: #204a87">qualified</span> <span style="color: #000000">Data.Map</span> <span style="font-weight: bold; color: #204a87">as</span> <span style="color: #000000">M</span>
<span style="font-weight: bold; color: #204a87">import</span> <span style="font-weight: bold; color: #204a87">qualified</span> <span style="color: #000000">Data.CaseInsensitive</span> <span style="font-weight: bold; color: #204a87">as</span> <span style="color: #000000">CI</span> <span style="font-weight: bold; color: #000000">(</span> <span style="color: #000000">mk</span> <span style="font-weight: bold; color: #000000">)</span>
<span style="color: #000000">main</span> <span style="font-weight: bold; color: #204a87">=</span> <span style="font-weight: bold; color: #204a87">do</span>
<span style="font-weight: bold; color: #204a87">let</span> <span style="color: #000000">m</span> <span style="font-weight: bold; color: #204a87">=</span> <span style="font-weight: bold; color: #204a87">M</span><span style="font-weight: bold; color: #ce5c00">.</span><span style="color: #000000">fromList</span> <span style="font-weight: bold; color: #000000">[(</span><span style="font-weight: bold; color: #204a87">CI</span><span style="font-weight: bold; color: #ce5c00">.</span><span style="color: #000000">mk</span> <span style="color: #4e9a06">"one"</span><span style="font-weight: bold; color: #000000">,</span> <span style="font-weight: bold; color: #0000cf">1</span><span style="font-weight: bold; color: #000000">)]</span>
<span style="color: #000000">print</span> <span style="font-weight: bold; color: #ce5c00">$</span> <span style="font-weight: bold; color: #204a87">M</span><span style="font-weight: bold; color: #ce5c00">.</span><span style="color: #000000">lookup</span> <span style="font-weight: bold; color: #000000">(</span><span style="font-weight: bold; color: #204a87">CI</span><span style="font-weight: bold; color: #ce5c00">.</span><span style="color: #000000">mk</span> <span style="color: #4e9a06">"One"</span><span style="font-weight: bold; color: #000000">)</span> <span style="color: #000000">m</span></pre>
</div>
<p>Here the type of <code>m</code> is <code>Map (CI String) Integer</code>. You can’t confuse it with the case-sensitive <code>Map String Integer</code> because the compiler simply won’t let you. They’re different types!</p>
<p>.NET doesn’t have typeclasses, but we could achieve something similar in this case if we could redesign <code>System.Collections.Generic.Dictionary</code> and remove the constructors that accept an instance of <code>IEqualityComparer<TKey></code>. The dictionary would then always use the default comparer for the key type. And if we wanted a case-insensitive dictionary, we’d just wrap our string keys in a type implementing case-insensitive equality, e.g.:</p>
<div style="border-top-style: none; overflow: auto; width: auto; border-bottom-style: none; border-right-style: none; border-left-style: none">
<pre style="margin: 0px; line-height: 125%"><span style="font-weight: bold; color: #204a87">sealed</span> <span style="font-weight: bold; color: #204a87">class</span> <span style="color: #000000">CaseInsensitiveString</span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="font-weight: bold; color: #204a87">readonly</span> <span style="font-weight: bold; color: #204a87">string</span> <span style="color: #000000">Value</span><span style="font-weight: bold; color: #000000">;</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="color: #000000">CaseInsensitiveString</span><span style="font-weight: bold; color: #000000">(</span><span style="font-weight: bold; color: #204a87">string</span> <span style="font-weight: bold; color: #204a87">value</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{</span>
<span style="color: #000000">Value</span> <span style="font-weight: bold; color: #000000">=</span> <span style="font-weight: bold; color: #204a87">value</span><span style="font-weight: bold; color: #000000">;</span>
<span style="font-weight: bold; color: #000000">}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="font-weight: bold; color: #204a87">override</span> <span style="font-weight: bold; color: #204a87">bool</span> <span style="color: #000000">Equals</span><span style="font-weight: bold; color: #000000">(</span><span style="font-weight: bold; color: #204a87">object</span> <span style="color: #000000">obj</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">if</span> <span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">ReferenceEquals</span><span style="font-weight: bold; color: #000000">(</span><span style="font-weight: bold; color: #204a87">null</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">obj</span><span style="font-weight: bold; color: #000000">))</span> <span style="font-weight: bold; color: #204a87">return</span> <span style="font-weight: bold; color: #204a87">false</span><span style="font-weight: bold; color: #000000">;</span>
<span style="font-weight: bold; color: #204a87">if</span> <span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">ReferenceEquals</span><span style="font-weight: bold; color: #000000">(</span><span style="font-weight: bold; color: #204a87">this</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">obj</span><span style="font-weight: bold; color: #000000">))</span> <span style="font-weight: bold; color: #204a87">return</span> <span style="font-weight: bold; color: #204a87">true</span><span style="font-weight: bold; color: #000000">;</span>
<span style="font-weight: bold; color: #204a87">if</span> <span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">obj</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">GetType</span><span style="font-weight: bold; color: #000000">()</span> <span style="font-weight: bold; color: #000000">!=</span> <span style="color: #000000">GetType</span><span style="font-weight: bold; color: #000000">())</span> <span style="font-weight: bold; color: #204a87">return</span> <span style="font-weight: bold; color: #204a87">false</span><span style="font-weight: bold; color: #000000">;</span>
<span style="font-weight: bold; color: #204a87">return</span> <span style="color: #000000">StringComparer</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">InvariantCultureIgnoreCase</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">Equals</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">Value</span><span style="font-weight: bold; color: #000000">,</span> <span style="font-weight: bold; color: #000000">((</span><span style="color: #000000">CaseInsensitiveString</span><span style="font-weight: bold; color: #000000">)</span><span style="color: #000000">obj</span><span style="font-weight: bold; color: #000000">).</span><span style="color: #000000">Value</span><span style="font-weight: bold; color: #000000">);</span>
<span style="font-weight: bold; color: #000000">}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="font-weight: bold; color: #204a87">override</span> <span style="font-weight: bold; color: #204a87">int</span> <span style="color: #000000">GetHashCode</span><span style="font-weight: bold; color: #000000">()</span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">return</span> <span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">Value</span> <span style="font-weight: bold; color: #000000">!=</span> <span style="font-weight: bold; color: #204a87">null</span> <span style="font-weight: bold; color: #000000">?</span> <span style="color: #000000">StringComparer</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">InvariantCultureIgnoreCase</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">GetHashCode</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">Value</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">:</span> <span style="font-weight: bold; color: #0000cf">0</span><span style="font-weight: bold; color: #000000">);</span>
<span style="font-weight: bold; color: #000000">}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="font-weight: bold; color: #204a87">override</span> <span style="font-weight: bold; color: #204a87">string</span> <span style="color: #000000">ToString</span><span style="font-weight: bold; color: #000000">()</span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">return</span> <span style="color: #000000">Value</span><span style="font-weight: bold; color: #000000">;</span>
<span style="font-weight: bold; color: #000000">}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="font-weight: bold; color: #204a87">static</span> <span style="font-weight: bold; color: #204a87">implicit</span> <span style="font-weight: bold; color: #204a87">operator</span> <span style="color: #000000">CaseInsensitiveString</span> <span style="font-weight: bold; color: #000000">(</span><span style="font-weight: bold; color: #204a87">string</span> <span style="color: #000000">a</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">return</span> <span style="font-weight: bold; color: #204a87">new</span> <span style="color: #000000">CaseInsensitiveString</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">a</span><span style="font-weight: bold; color: #000000">);</span>
<span style="font-weight: bold; color: #000000">}</span>
<span style="font-weight: bold; color: #000000">}</span>
<span style="font-weight: bold; color: #204a87">var</span> <span style="color: #000000">data</span> <span style="font-weight: bold; color: #000000">=</span> <span style="font-weight: bold; color: #204a87">new</span> <span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">CaseInsensitiveString</span><span style="font-weight: bold; color: #000000">,</span> <span style="font-weight: bold; color: #204a87">string</span><span style="font-weight: bold; color: #000000">></span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #000000">{</span><span style="color: #4e9a06">"one"</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #4e9a06">"1"</span><span style="font-weight: bold; color: #000000">},</span>
<span style="font-weight: bold; color: #000000">};</span>
<span style="color: #000000">Console</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">WriteLine</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">data</span><span style="font-weight: bold; color: #000000">[</span><span style="color: #4e9a06">"One"</span><span style="font-weight: bold; color: #000000">]);</span></pre>
</div>
<p>But since <code>Dictionary</code> has constructors that allow overriding the equality comparer, this is not enough. In other words, the type <code>Dictionary<CaseInsensitiveString, string></code> does not guarantee that the dictionary is case-insensitive. We can easily work around this by creating a new type that limits <code>Dictionary</code>’s constructors:</p>
<div style="border-top-style: none; overflow: auto; width: auto; border-bottom-style: none; border-right-style: none; border-left-style: none">
<pre style="margin: 0px; line-height: 125%"><span style="font-weight: bold; color: #204a87">sealed</span> <span style="font-weight: bold; color: #204a87">class</span> <span style="color: #000000">Dictionary2</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TValue</span><span style="font-weight: bold; color: #000000">>:</span> <span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TValue</span><span style="font-weight: bold; color: #000000">></span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="color: #000000">Dictionary2</span><span style="font-weight: bold; color: #000000">()</span> <span style="font-weight: bold; color: #000000">{}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="color: #000000">Dictionary2</span><span style="font-weight: bold; color: #000000">(</span><span style="font-weight: bold; color: #204a87">int</span> <span style="color: #000000">capacity</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">:</span> <span style="font-weight: bold; color: #204a87">base</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">capacity</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="color: #000000">Dictionary2</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">IDictionary</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TValue</span><span style="font-weight: bold; color: #000000">></span> <span style="color: #000000">dictionary</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">:</span> <span style="font-weight: bold; color: #204a87">base</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">dictionary</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{}</span>
<span style="font-weight: bold; color: #000000">}</span></pre>
</div>
<p>And now we can guarantee that a <code>Dictionary2<CaseInsensitiveString, string></code> will do key equality as defined by <code>CaseInsensitiveString</code> and thus it will be case-insensitive. </p>
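<p>To make the loophole concrete, here is a minimal sketch of what the plain <code>Dictionary</code> permits. The wrapper is a trimmed copy of the one above; <code>OrdinalWrapperComparer</code> and <code>Demo</code> are hypothetical names introduced only for this illustration. Both dictionaries have the same static type, yet only one of them is case-insensitive:</p>

```csharp
using System;
using System.Collections.Generic;

// Trimmed copy of the wrapper from the post: equality ignores case.
sealed class CaseInsensitiveString {
    public readonly string Value;
    public CaseInsensitiveString(string value) { Value = value; }
    public override bool Equals(object obj) {
        var other = obj as CaseInsensitiveString;
        return other != null
            && StringComparer.InvariantCultureIgnoreCase.Equals(Value, other.Value);
    }
    public override int GetHashCode() {
        return Value != null ? StringComparer.InvariantCultureIgnoreCase.GetHashCode(Value) : 0;
    }
    public static implicit operator CaseInsensitiveString(string a) {
        return new CaseInsensitiveString(a);
    }
}

// Hypothetical comparer (not from the post): it bypasses the wrapper's own
// Equals and compares the underlying strings ordinally, i.e. case-sensitively.
sealed class OrdinalWrapperComparer : IEqualityComparer<CaseInsensitiveString> {
    public bool Equals(CaseInsensitiveString x, CaseInsensitiveString y) {
        return StringComparer.Ordinal.Equals(x.Value, y.Value);
    }
    public int GetHashCode(CaseInsensitiveString obj) {
        return StringComparer.Ordinal.GetHashCode(obj.Value);
    }
}

static class Demo {
    static void Main() {
        // Default comparer: uses CaseInsensitiveString.Equals/GetHashCode.
        var ok = new Dictionary<CaseInsensitiveString, string> { { "one", "1" } };
        // Same static type, but the injected comparer makes lookups case-sensitive.
        var broken = new Dictionary<CaseInsensitiveString, string>(new OrdinalWrapperComparer()) {
            { "one", "1" }
        };
        Console.WriteLine(ok.ContainsKey("One"));     // True
        Console.WriteLine(broken.ContainsKey("One")); // False
    }
}
```

<p>With <code>Dictionary2</code> the second construction is no longer expressible, because no constructor accepts a comparer.</p>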
<p>There is one downside to this solution: we need to wrap all the keys when we want a different comparer. This means more allocations and therefore worse performance. Haskell can avoid this penalty by making the wrapper a <a href="http://stackoverflow.com/a/5889784">newtype</a> (although <code>Data.CaseInsensitive</code> in particular is not defined as one!), which we don’t have in .NET. Can we do better?</p>
<p>The main problem here was that the type doesn’t uniquely determine an equality comparer. If we don’t make that a part of the key type, then couldn’t we make the comparer part of the dictionary type?</p>
<p>This is precisely what <a href="https://ocaml.janestreet.com/ocaml-core/latest/doc/">ocaml-core</a> does. The <a href="https://ocaml.janestreet.com/ocaml-core/latest/doc/core_kernel/#Core_map">Map type</a> is determined by the types of the map’s keys and values, and by the comparison function used to order the keys. The book <a href="https://realworldocaml.org/">Real World OCaml</a> explains how <a href="https://realworldocaml.org/v1/en/html/maps-and-hash-tables.html#idm181613585056">including the comparator in the type is important because certain operations that work on multiple maps require that they have the same comparison function</a>. As we’ve seen, <code>System.Collections.Generic.Dictionary</code> can’t enforce that.</p>
<p>Following that same design principle, now instead of forbidding all constructors that accept an equality comparer, we do the opposite: forbid all constructors that don’t take a comparer, thus making it always explicit, and include the comparer as an additional type parameter:</p>
<div style="border-top-style: none; overflow: auto; width: auto; border-bottom-style: none; border-right-style: none; border-left-style: none">
<pre style="margin: 0px; line-height: 125%"><span style="font-weight: bold; color: #204a87">sealed</span> <span style="font-weight: bold; color: #204a87">class</span> <span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TValue</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TEqualityComparer</span><span style="font-weight: bold; color: #000000">></span> <span style="font-weight: bold; color: #000000">:</span> <span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TValue</span><span style="font-weight: bold; color: #000000">></span> <span style="font-weight: bold; color: #204a87">where</span> <span style="color: #000000">TEqualityComparer</span> <span style="font-weight: bold; color: #000000">:</span> <span style="color: #000000">IEqualityComparer</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">></span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">TEqualityComparer</span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">:</span> <span style="font-weight: bold; color: #204a87">base</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000">(</span><span style="font-weight: bold; color: #204a87">int</span> <span style="color: #000000">capacity</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TEqualityComparer</span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">:</span> <span style="font-weight: bold; color: #204a87">base</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">capacity</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">IDictionary</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TValue</span><span style="font-weight: bold; color: #000000">></span> <span style="color: #000000">dictionary</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TEqualityComparer</span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">:</span> <span style="font-weight: bold; color: #204a87">base</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">dictionary</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{}</span>
<span style="font-weight: bold; color: #000000">}</span></pre>
</div>
<p>A small helper to aid with type inference in C#:</p>
<div style="border-top-style: none; overflow: auto; width: auto; border-bottom-style: none; border-right-style: none; border-left-style: none">
<pre style="margin: 0px; line-height: 125%"><span style="font-weight: bold; color: #204a87">static</span> <span style="font-weight: bold; color: #204a87">class</span> <span style="color: #000000">Dict</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TValue</span><span style="font-weight: bold; color: #000000">></span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="font-weight: bold; color: #204a87">static</span> <span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TValue</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TEqualityComparer</span><span style="font-weight: bold; color: #000000">></span> <span style="color: #000000">Create</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TEqualityComparer</span><span style="font-weight: bold; color: #000000">>(</span><span style="color: #000000">TEqualityComparer</span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #204a87">where</span> <span style="color: #000000">TEqualityComparer</span><span style="font-weight: bold; color: #000000">:</span> <span style="color: #000000">IEqualityComparer</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">></span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">return</span> <span style="font-weight: bold; color: #204a87">new</span> <span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">TKey</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TValue</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">TEqualityComparer</span><span style="font-weight: bold; color: #000000">>(</span><span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">);</span>
<span style="font-weight: bold; color: #000000">}</span>
<span style="font-weight: bold; color: #000000">}</span></pre>
</div>
<p>Another small helper class to ease the definition of comparer types based on the ones we already have:</p>
<div style="border-top-style: none; overflow: auto; width: auto; border-bottom-style: none; border-right-style: none; border-left-style: none">
<pre style="margin: 0px; line-height: 125%"><span style="font-weight: bold; color: #204a87">class</span> <span style="color: #000000">DelegatingEqualityComparer</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">T</span><span style="font-weight: bold; color: #000000">>:</span> <span style="color: #000000">IEqualityComparer</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">T</span><span style="font-weight: bold; color: #000000">></span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">private</span> <span style="font-weight: bold; color: #204a87">readonly</span> <span style="color: #000000">IEqualityComparer</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">T</span><span style="font-weight: bold; color: #000000">></span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">;</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="color: #000000">DelegatingEqualityComparer</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">IEqualityComparer</span><span style="font-weight: bold; color: #000000"><</span><span style="color: #000000">T</span><span style="font-weight: bold; color: #000000">></span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">this</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">comparer</span> <span style="font-weight: bold; color: #000000">=</span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">;</span>
<span style="font-weight: bold; color: #000000">}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="font-weight: bold; color: #204a87">bool</span> <span style="color: #000000">Equals</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">T</span> <span style="color: #000000">x</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">T</span> <span style="color: #000000">y</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">return</span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">Equals</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">x</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">y</span><span style="font-weight: bold; color: #000000">);</span>
<span style="font-weight: bold; color: #000000">}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="font-weight: bold; color: #204a87">int</span> <span style="color: #000000">GetHashCode</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">T</span> <span style="color: #000000">obj</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">return</span> <span style="color: #000000">comparer</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">GetHashCode</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">obj</span><span style="font-weight: bold; color: #000000">);</span>
<span style="font-weight: bold; color: #000000">}</span>
<span style="font-weight: bold; color: #000000">}</span></pre>
</div>
<p>Now we can easily create new comparer types like this:</p>
<div style="border-top-style: none; overflow: auto; width: auto; border-bottom-style: none; border-right-style: none; border-left-style: none">
<pre style="margin: 0px; line-height: 125%"><span style="font-weight: bold; color: #204a87">sealed</span> <span style="font-weight: bold; color: #204a87">class</span> <span style="color: #000000">StringComparerInvariantCultureIgnoreCase</span><span style="font-weight: bold; color: #000000">:</span> <span style="color: #000000">DelegatingEqualityComparer</span><span style="font-weight: bold; color: #000000"><</span><span style="font-weight: bold; color: #204a87">string</span><span style="font-weight: bold; color: #000000">></span> <span style="font-weight: bold; color: #000000">{</span>
<span style="font-weight: bold; color: #204a87">private</span> <span style="color: #000000">StringComparerInvariantCultureIgnoreCase</span><span style="font-weight: bold; color: #000000">()</span> <span style="font-weight: bold; color: #000000">:</span> <span style="font-weight: bold; color: #204a87">base</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">StringComparer</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">InvariantCultureIgnoreCase</span><span style="font-weight: bold; color: #000000">)</span> <span style="font-weight: bold; color: #000000">{}</span>
<span style="font-weight: bold; color: #204a87">public</span> <span style="font-weight: bold; color: #204a87">static</span> <span style="font-weight: bold; color: #204a87">readonly</span> <span style="color: #000000">StringComparerInvariantCultureIgnoreCase</span> <span style="color: #000000">Instance</span> <span style="font-weight: bold; color: #000000">=</span> <span style="font-weight: bold; color: #204a87">new</span> <span style="color: #000000">StringComparerInvariantCultureIgnoreCase</span><span style="font-weight: bold; color: #000000">();</span>
<span style="font-weight: bold; color: #000000">}</span></pre>
</div>
<p>Finally we can use this new Dictionary type like this:</p>
<div style="border-top-style: none; overflow: auto; width: auto; border-bottom-style: none; border-right-style: none; border-left-style: none">
<pre style="margin: 0px; line-height: 125%"><span style="font-weight: bold; color: #204a87">var</span> <span style="color: #000000">data</span> <span style="font-weight: bold; color: #000000">=</span> <span style="color: #000000">Dict</span><span style="font-weight: bold; color: #000000"><</span><span style="font-weight: bold; color: #204a87">string</span><span style="font-weight: bold; color: #000000">,</span> <span style="font-weight: bold; color: #204a87">string</span><span style="font-weight: bold; color: #000000">>.</span><span style="color: #000000">Create</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">StringComparerInvariantCultureIgnoreCase</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">Instance</span><span style="font-weight: bold; color: #000000">);</span>
<span style="color: #000000">data</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">Add</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #4e9a06">"one"</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #4e9a06">"1"</span><span style="font-weight: bold; color: #000000">);</span>
<span style="color: #000000">Console</span><span style="font-weight: bold; color: #000000">.</span><span style="color: #000000">WriteLine</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">data</span><span style="font-weight: bold; color: #000000">[</span><span style="color: #4e9a06">"One"</span><span style="font-weight: bold; color: #000000">]);</span></pre>
</div>
<p>Back to our original function: if we want to enforce a case-insensitive dictionary, we can now use this new Dictionary type and change the signature to:</p>
<div style="border-top-style: none; overflow: auto; width: auto; border-bottom-style: none; border-right-style: none; border-left-style: none">
<pre style="margin: 0px; line-height: 125%"><span style="font-weight: bold; color: #204a87">int</span> <span style="color: #000000">DoSomething</span><span style="font-weight: bold; color: #000000">(</span><span style="color: #000000">Dictionary</span><span style="font-weight: bold; color: #000000"><</span><span style="font-weight: bold; color: #204a87">string</span><span style="font-weight: bold; color: #000000">,</span> <span style="font-weight: bold; color: #204a87">string</span><span style="font-weight: bold; color: #000000">,</span> <span style="color: #000000">StringComparerInvariantCultureIgnoreCase</span><span style="font-weight: bold; color: #000000">></span> <span style="color: #000000">data</span><span style="font-weight: bold; color: #000000">)</span></pre>
</div>
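<p>The same design also covers the multi-map case that Real World OCaml mentions. The following is only a sketch under assumptions: <code>IgnoreCase</code>, <code>DictOps.Union</code> and <code>Demo</code> are hypothetical names that do not appear elsewhere in this post, and the three-parameter <code>Dictionary</code> is trimmed to a single constructor. Because the comparer is a type parameter, a function over two dictionaries can require that both use the same comparer type:</p>

```csharp
using System;
using System.Collections.Generic;

// Dictionary with the comparer in its type, trimmed to one constructor.
sealed class Dictionary<TKey, TValue, TEqualityComparer> : Dictionary<TKey, TValue>
    where TEqualityComparer : IEqualityComparer<TKey> {
    public Dictionary(TEqualityComparer comparer) : base(comparer) {}
}

// Hypothetical comparer type with a single instance, so the type
// alone identifies the comparison behaviour.
sealed class IgnoreCase : IEqualityComparer<string> {
    private IgnoreCase() {}
    public static readonly IgnoreCase Instance = new IgnoreCase();
    public bool Equals(string x, string y) {
        return StringComparer.InvariantCultureIgnoreCase.Equals(x, y);
    }
    public int GetHashCode(string s) {
        return StringComparer.InvariantCultureIgnoreCase.GetHashCode(s);
    }
}

static class DictOps {
    // Hypothetical helper: both arguments must share the comparer type TCmp,
    // so the result has one well-defined notion of key equality.
    public static Dictionary<TKey, TValue, TCmp> Union<TKey, TValue, TCmp>(
        Dictionary<TKey, TValue, TCmp> left,
        Dictionary<TKey, TValue, TCmp> right,
        TCmp comparer) where TCmp : IEqualityComparer<TKey> {
        var result = new Dictionary<TKey, TValue, TCmp>(comparer);
        foreach (var kv in left) result[kv.Key] = kv.Value;
        foreach (var kv in right) result[kv.Key] = kv.Value; // right wins on equal keys
        return result;
    }
}

static class Demo {
    static void Main() {
        var a = new Dictionary<string, string, IgnoreCase>(IgnoreCase.Instance);
        a.Add("one", "1");
        var b = new Dictionary<string, string, IgnoreCase>(IgnoreCase.Instance);
        b.Add("ONE", "uno");
        var u = DictOps.Union(a, b, IgnoreCase.Instance);
        Console.WriteLine(u.Count);  // 1
        Console.WriteLine(u["one"]); // uno
    }
}
```

<p>Passing a dictionary built with a different comparer type as the second argument would be rejected at compile time, which is the kind of guarantee the plain <code>Dictionary</code> can’t give.</p>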
<h2><font style="font-weight: normal">Epilogue</font></h2>
<p>Types are a terrific tool to reason about our code, but only if we use them correctly. Throwing around types in an impure, partial language like C# or F# does not mean you're using types in a meaningful way. Consider what your types allow and what they don’t allow. <a href="https://blogs.janestreet.com/effective-ml-revisited/">Make illegal states unrepresentable</a>. With precise types it becomes easier to reason about your code. When invariants are enforced through the type system, the compiler makes it impossible to write programs that violate them.</p>
<p>When you find yourself in need of inspiration for your types, see what other typed languages do, especially OCaml and Haskell. Their type systems are much more powerful than .NET’s, but often you can extract some of the underlying design principles and adapt them to less powerful type systems.</p>
<h2><font style="font-weight: normal">Addendum</font></h2>
<p>Judging by some comments I've read around the web, it seems the goal of this post wasn't clear. It's definitely not about dictionaries specifically. It's about the thought process of recognizing flaws in type design and fixing them so that some interesting invariant can be enforced through the type system. The dictionary types here are merely used to illustrate this process.</p>
Generating immutable instances in C# with FsCheck (published 2014-02-14)
<p>If you do any functional programming in C# you’ll probably have lots of classes that look like this:</p> <pre class="code"><span style="background: white; color: blue">class </span><span style="background: white; color: #2b91af">Person </span><span style="background: white; color: black">{
</span><span style="background: white; color: blue">private readonly string </span><span style="background: white; color: black">name;
</span><span style="background: white; color: blue">private readonly </span><span style="background: white; color: #2b91af">DateTime </span><span style="background: white; color: black">dateOfBirth;
</span><span style="background: white; color: blue">public </span><span style="background: white; color: black">Person(</span><span style="background: white; color: blue">string </span><span style="background: white; color: black">name, </span><span style="background: white; color: #2b91af">DateTime </span><span style="background: white; color: black">dateOfBirth) {
</span><span style="background: white; color: blue">this</span><span style="background: white; color: black">.name = name;
</span><span style="background: white; color: blue">this</span><span style="background: white; color: black">.dateOfBirth = dateOfBirth;
}
</span><span style="background: white; color: blue">public string </span><span style="background: white; color: black">Name {
</span><span style="background: white; color: blue">get </span><span style="background: white; color: black">{ </span><span style="background: white; color: blue">return </span><span style="background: white; color: black">name; }
}
</span><span style="background: white; color: blue">public </span><span style="background: white; color: #2b91af">DateTime </span><span style="background: white; color: black">DateOfBirth {
</span><span style="background: white; color: blue">get </span><span style="background: white; color: black">{ </span><span style="background: white; color: blue">return </span><span style="background: white; color: black">dateOfBirth; }
}
// equality members...
}
</span></pre>
<p>That is, immutable classes, similar to F# records.
<br />Now let’s say we want to test some property involving this Person class. For example, suppose we have a serializer:</p>
<pre class="code"><span style="background: white; color: blue">interface </span><span style="background: white; color: #2b91af">IPersonSerializer </span><span style="background: white; color: black">{
</span><span style="background: white; color: blue">string </span><span style="background: white; color: black">Serialize(</span><span style="background: white; color: #2b91af">Person </span><span style="background: white; color: black">p);
</span><span style="background: white; color: #2b91af">Person </span><span style="background: white; color: black">Deserialize(</span><span style="background: white; color: blue">string </span><span style="background: white; color: black">source);
}
</span></pre>
<p>Never mind the unsafety of this serializer; we want to test that roundtrip serialization works as expected. So we grab <a href="http://www.nuget.org/packages/FsCheck/">FsCheck</a> and write:</p>
<pre class="code"><span style="background: white">Spec</span><span style="background: white; color: black">.ForAny((</span><span style="background: white; color: #2b91af">Person </span><span style="background: white; color: black">p) => serializer.Deserialize(serializer.Serialize(p)).Equals(p))
.QuickCheckThrowOnFailure();
</span></pre>
<p>Only to be greeted with: </p>
<pre>System.Exception: Geneflect: type not handled Person</pre>
<p>Ok, so we write the generator explicitly:</p>
<pre class="code"><span style="background: white; color: blue">var </span><span style="background: white; color: black">personGen =
</span><span style="background: white; color: blue">from </span><span style="background: white; color: black">name </span><span style="background: white; color: blue">in </span><span style="background: white">Any</span><span style="background: white; color: black">.OfType<</span><span style="background: white; color: blue">string</span><span style="background: white; color: black">>()
</span><span style="background: white; color: blue">from </span><span style="background: white; color: black">dob </span><span style="background: white; color: blue">in </span><span style="background: white">Any</span><span style="background: white; color: black">.OfType<</span><span style="background: white; color: #2b91af">DateTime</span><span style="background: white; color: black">>()
</span><span style="background: white; color: blue">select new </span><span style="background: white; color: #2b91af">Person</span><span style="background: white; color: black">(name, dob);
</span><span style="background: white">Spec</span><span style="background: white; color: black">.For(personGen, p => serializer.Deserialize(serializer.Serialize(p)).Equals(p))
.QuickCheckThrowOnFailure();
</span></pre>
<p>And all is fine.</p>
<p>But the generator code is trivially derivable from the class definition. And when you have lots of immutable classes, this boilerplate becomes really annoying. Couldn’t we automate that somehow? </p>
<p>As it turns out, FsCheck already does this for F# records, using reflection. With a bit of code <a href="https://github.com/fsharp/FsCheck/commit/288e1366a8e74794288a02e4edee17f46c56991a">we can extend the reflection-based generator (built into FsCheck) to make it generate instances of immutable classes</a>. With this change, we can go back to writing Spec.ForAny and FsCheck will automatically derive the generator for our Person class!</p>
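<p>To make the idea concrete, here is a minimal sketch of constructor-based derivation using plain .NET reflection. It is not FsCheck's actual implementation (which lives in the linked commit and produces proper generators); <code>ValueSource</code> and <code>ReflectiveFactory</code> are hypothetical names, and the value source is only a stand-in for whatever produces each constructor argument:</p>
<pre class="code">using System;
using System.Reflection;

// Hypothetical stand-in for a generator: returns a value of the requested type.
delegate object ValueSource(Type type);

static class ReflectiveFactory {
    // Builds an instance by feeding the single public constructor
    // one value per parameter, in declaration order.
    public static object Create(Type type, ValueSource source) {
        ConstructorInfo[] ctors = type.GetConstructors(); // public instance constructors only
        if (ctors.Length != 1)
            throw new ArgumentException("Expected exactly one public constructor on " + type);
        ParameterInfo[] parameters = ctors[0].GetParameters();
        object[] args = new object[parameters.Length];
        int i = 0;
        foreach (ParameterInfo p in parameters)
            args[i++] = source(p.ParameterType);
        return ctors[0].Invoke(args);
    }
}
</pre>
<p>Calling <code>ReflectiveFactory.Create(typeof(Person), source)</code> with a source that can supply a string and a DateTime returns a Person (typed as object), with no Person-specific code involved. That is the essence of what a reflection-based generator automates.</p>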
<p>For FsCheck to generate instances of a class this way, the class must satisfy these restrictions:</p>
<ul>
<li>Must be a concrete class. No interfaces or abstract classes; FsCheck can’t guess a concrete implementation. </li>
<li>Must have exactly one public constructor. Otherwise, which one would FsCheck choose? </li>
<li>All public fields and properties must be readonly. Otherwise it would be ambiguous which ones to set and which ones to leave alone. </li>
<li>Must not be recursively defined. FsCheck doesn’t generate recursively defined F# records by reflection either. I assume this is to keep the implementation simple. </li>
<li>Must not have type parameters. This restriction could probably be relaxed, but since I haven’t needed it so far, I decided to play it safe. Left as an exercise for the reader :-) </li>
</ul>
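<p>Most of the restrictions above can be expressed as a handful of reflection checks. The following is a rough sketch only, assuming my reading of the list is right; <code>GenerableCheck</code> and <code>WhyNotGenerable</code> are hypothetical names that do not exist in FsCheck, and the check for recursively defined types is left out:</p>
<pre class="code">using System;
using System.Reflection;

static class GenerableCheck {
    // Returns null if the type passes the checks, otherwise the reason it fails.
    public static string WhyNotGenerable(Type type) {
        if (!type.IsClass || type.IsAbstract)
            return "not a concrete class";
        if (type.IsGenericType)
            return "has type parameters";
        if (type.GetConstructors().Length != 1)
            return "must have exactly one public constructor";
        // Public fields must be readonly (constants are fine too).
        foreach (FieldInfo f in type.GetFields())
            if (!f.IsInitOnly &amp;&amp; !f.IsLiteral)
                return "public field " + f.Name + " is writable";
        // Public properties must not expose a public setter.
        foreach (PropertyInfo p in type.GetProperties())
            if (p.GetSetMethod() != null)
                return "public property " + p.Name + " has a public setter";
        return null; // assumption: recursive definitions are not detected here
    }
}
</pre>
<p>For the Person class shown earlier this returns null, since it is a concrete, non-generic class with a single public constructor and getter-only properties.</p>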
<p>This is available in FsCheck as of version 0.9.2.</p>
<p>Happy FsChecking in C# and VB.NET!</p>