Tuesday, June 9, 2009

Analyzing SolrNet with NDepend

I've been using NDepend off and on for the last couple of years, mainly to diagnose build issues and to help me write build scripts for complex legacy applications. Legacy applications with no automated build often end up with cyclic dependencies between different solutions, and NDepend's assembly build order feature is great for untangling them. But I've never had the time to really dive into the myriad other features this product offers.

A couple of weeks ago Patrick Smacchia kindly sent me a Pro license for NDepend (thanks Patrick!), so I thought it would be a great opportunity to use it to analyze SolrNet. Now, SolrNet is by all measures a tiny project (the ohloh stats are inflated because Sandcastle, SHFB, etc. are in the trunk), but that doesn't mean it can't benefit from NDepend.

First of all, if you're analyzing a library, I recommend that you include your tests in the analysis, so that NDepend can see how the library is actually used. Otherwise, NDepend will suggest that you mark some things as private when they shouldn't be. Don't worry about the tests raising warnings in the analysis; you can filter them out, as I'll explain later. Plus, being able to analyze the tests themselves can be pretty handy too.

For example, we can easily issue some CQL queries to get the code/test ratio:

SELECT ASSEMBLIES WHERE NameLike "Tests"   // 1972 LOC (as defined here)
SELECT ASSEMBLIES WHERE !NameLike "Tests"  // 1550 LOC

So the code:test ratio is 1:1.27 (1972 / 1550 ≈ 1.27): more lines of test code than of library code! However, keep in mind that this metric alone says nothing about how good the coverage actually is.

Browsing the default queries, I find that "Fields that could be declared internal" caught a

public readonly string query;

Oops, fixing right away!

Under "Purity / Immutability / Side-Effects", "Fields should be marked as ReadOnly when possible" showed 29 results, which seemed strange since I always try to make my objects immutable. 24 of these were <>l__initialThreadId fields, one of the fields of the iterator class that the C# compiler generates when you use yield return. Compiler-generated code like this also showed up in the results of the "Methods too big", "Methods too complex" and "Potentially unused methods" queries.

Of course, you can edit or delete the default CQL queries. For example, the "Potentially unused methods" query is defined by default as:

// <Name>Potentially unused methods</Name>
WARN IF Count > 0 IN SELECT TOP 10 METHODS WHERE 
 MethodCa == 0 AND            // Ca=0 -> No Afferent Coupling -> The method is not used in the context of this application.
 !IsPublic AND                // Public methods might be used by client applications of your assemblies.
 !IsEntryPoint AND            // Main() method is not used by-design.
 !IsExplicitInterfaceImpl AND // The IL code never explicitly calls explicit interface method implementations.
 !IsClassConstructor AND      // The IL code never explicitly calls class constructors.
 !IsFinalizer                 // The IL code never explicitly calls finalizers.

We can easily add another condition so that compiler-generated methods don't bother us: AND !FullNameLike "__"
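
Put together, the edited query might look like the sketch below. It is simply the default query quoted above with the extra condition appended (the original comments are left out for brevity). The assumption here is that a double underscore in the full name is a good enough marker for compiler-generated members. That holds for names like <>l__initialThreadId, but it would also hide any hand-written method whose full name happens to contain "__".

```
// <Name>Potentially unused methods</Name>
WARN IF Count > 0 IN SELECT TOP 10 METHODS WHERE 
 MethodCa == 0 AND
 !IsPublic AND
 !IsEntryPoint AND
 !IsExplicitInterfaceImpl AND
 !IsClassConstructor AND
 !IsFinalizer AND
 !FullNameLike "__"           // Assumption: names containing "__" are compiler-generated (e.g. iterator classes).
```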

One of the most useful features of NDepend is comparing two versions of a project. I compared the latest release of SolrNet against trunk. Here's a chart of added methods (83 in total, in blue):

[Image: 83 methods added]

34 methods changed:

[Image: methods changed]

API breaking changes (no graphic here, just a list):

[Image: list of API breaking changes]

Don't worry, these are all internal breaking changes; they won't affect consumers of the library.

From these graphics you can immediately see that there aren't really many changes, and that they are scattered across the code base rather than concentrated in one area. The reason is that most of the changes are minor bugfixes plus a couple of small added features.

I've only scratched the surface of what's possible with NDepend, but as you can see, small projects can also profit from it, so check it out!