Sunday, September 30, 2007

Code search engines mini-review

A couple of weeks ago I got sick of searching our codebase with Total Commander. So I went googling for a code search engine. This is what I found:

  • cs2project: (open-source) developed by Simone as an academic project, using Lucene.Net to index. It's very basic as it's still a new project, but looks really promising. For some reason it only indexed 740 files and then stopped. I'll turn on logging later to see what happened...
  • Koders Pro: (commercial) led by Phil Haack, supports over 30 languages and provides lots of statistics. I installed the demo, and after the indexing finished I went browsing and searching, and suddenly it asked me to sign up for a (free) account. Hmm, no thanks.
  • Krugle Enterprise: (commercial) seems cool, but I couldn't find a trial version.
  • Fisheye: (commercial) seems more repository-oriented than code-oriented. I mean, you can do full-text queries, but it doesn't give you the exact LoC where the query matched and it doesn't cross-reference classes and types. But when it comes to repository analysis, I think no other product has so many features. It even has a pseudo-SQL language called EyeQL to query the repository! Too bad the trial crashed while indexing our repository...
  • OpenGrok: (open-source) developed in Java, under the wing of OpenSolaris, OpenGrok uses Lucene to index source code. It's only about searching and doesn't offer statistics like the commercial products, but it's very good at what it does. It groks (parses) several languages and formats, including C/C++, Java, shell scripts (ksh, bash), Perl, Makefiles, XML/HTML/SGML files, ELF files, Java class files, Java Jar files, archive files (TAR, GZip, BZip2, Zip), man page formats like troff and more, but sadly, still no .NET languages. For the languages it groks, it provides cross-referencing of classes and types. And it gives you repository history search, too!


So I kept OpenGrok, and after installing and configuring it, I announced it to the team. But our web designer (one of the coolest guys I have ever worked with) heard "OGrok" instead of OpenGrok ("ogro" means ogre in Spanish). He went on calling it OGrok, and then he even put together an alternative logo, featuring the most famous ogre :-)

It has since become an invaluable tool for us. I can't recommend enough that you install one of these code search engines: it really improves collaboration with your teammates.

Wednesday, September 26, 2007

Keep your CI server on a fast machine

The build for our main site was taking nearly 8 minutes on the CI server, getting dangerously close to the recommended 10-minute limit. And I'm talking about the basic commit build. However, a "nant all" on my machine took about 2 minutes! WTF?!? Then I remembered that I had installed CruiseControl on a crappy old machine with 512 MB of RAM!

I moved it to a newer, faster server, and problem solved :-)

Wednesday, September 19, 2007

HttpInterfaces for ASP.NET 1.1

Inspired by Phil Haack's HttpInterfaces, I wrote a similar set of interfaces for ASP.NET 1.1 (which is what we still use at work, sigh...), so we can better test our huge legacy codebase. The most significant difference is that DuckTyping doesn't work in 1.1 AFAIK... so I had to write adapter classes to wrap System.Web.HttpApplication, etc., which was pretty trivial thanks to ReSharper.

Let's see how we could use these interfaces to test a common legacy WebForm. Suppose you have a page which puts the content of a QueryString parameter in a Label, for example:


public class MyPage : Page {
  protected Label Label1;

  private void Page_Load(object sender, EventArgs e) {
    Label1.Text = Request.QueryString["text"];
  }

  protected override void OnInit(EventArgs e) {
    InitializeComponent();
    base.OnInit(e);
  }

  private void InitializeComponent() {
    this.Load += new System.EventHandler(this.Page_Load);
  }
}

We build a BasePage from which MyPage will inherit, that will allow injection of Request, Response, etc:


public class BasePage : Page {
  private IHttpRequest requesto;

  // Hides Page.Request so tests can inject their own IHttpRequest.
  public new IHttpRequest Request {
    get {
      if (requesto == null)
        requesto = new HttpRequestAdapter(HttpContext.Current.Request);
      return requesto;
    }
    set { requesto = value; }
  }
}

Make MyPage inherit from BasePage instead of Page, and set the Request to whatever you like... Using this, we can test that Label1 actually gets the QueryString parameter. Just create a stub for the request, assign it to an instance of MyPage, and call Page_Load(). For example:


[Test]
public void PageLoad() {
  MockRepository mocks = new MockRepository();
  IHttpRequest req = (IHttpRequest) mocks.CreateMock(typeof (IHttpRequest));
  string text = "hello world";
  NameValueCollection queryString = new NameValueCollection();
  queryString["text"] = text;
  // Record the expectation: reading QueryString returns our collection.
  Expect.Call(req.QueryString).Return(queryString);
  mocks.ReplayAll();

  MyPage p = (MyPage) ReflectionHelper.CreatePageWithControls(typeof (MyPage));
  p.Request = req;
  ReflectionHelper.InvokeMethod(p, "Page_Load", null, null);
  Label Label1 = (Label) ReflectionHelper.GetPageControl(p, "Label1");
  Assert.AreEqual(text, Label1.Text);
}

Here I used Rhino.Mocks to create the request stub. Note that we have to use reflection to call Page_Load() and get the controls since they are not public... But with minimal changes to the code, we gained a lot of testability!
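
If pulling in a mocking framework feels like overkill, the same seam should also accept a hand-written stub. Here is a minimal sketch of that idea. Note the assumptions: the IHttpRequest below is a cut-down stand-in with only the QueryString property (the real interface in the download has more members), and StubHttpRequest and StubDemo are names I made up for illustration.

```csharp
using System;
using System.Collections.Specialized;

// Cut-down stand-in for the real interface (assumption): only the
// member that MyPage actually reads.
public interface IHttpRequest {
  NameValueCollection QueryString { get; }
}

// Hand-written stub (hypothetical name): hands back whatever the test stored.
public class StubHttpRequest : IHttpRequest {
  private NameValueCollection queryString = new NameValueCollection();

  public NameValueCollection QueryString {
    get { return queryString; }
  }
}

public class StubDemo {
  public static void Main() {
    StubHttpRequest req = new StubHttpRequest();
    req.QueryString["text"] = "hello world";

    // The code under test only sees the interface, like BasePage.Request.
    IHttpRequest request = req;
    Console.WriteLine(request.QueryString["text"]);
  }
}
```

In the test above you would then assign a StubHttpRequest to p.Request instead of the Rhino.Mocks object. The trade-off is that a stub like this doesn't verify that QueryString was actually read, which is what the mock's expectations give you.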

With TypeMock, a very powerful mocking framework (although not free), we could write the same test without depending on any interface and without making any changes to the original code:

[Test]
public void PageLoad_WithTypeMock() {
    Mock requesto = MockManager.Mock(typeof (HttpRequest), Constructor.Mocked);
    string text = "hello world";
    NameValueCollection queryString = new NameValueCollection();
    queryString["text"] = text;
    requesto.ExpectGet("QueryString", queryString);
    Mock myPage = MockManager.Mock(typeof (MyPage), Constructor.NotMocked);
    myPage.ExpectGet("Request", new HttpRequest(null, null, null));
    myPage.Strict = false;

    MyPage p = (MyPage) ReflectionHelper.CreatePageWithControls(typeof (MyPage));
    ReflectionHelper.InvokeMethod(p, "Page_Load", null, null);
    Label Label1 = (Label) ReflectionHelper.GetPageControl(p, "Label1");
    Assert.AreEqual(text, Label1.Text);
}

But, like I said, TypeMock is not free. There is a community edition, though. I think TypeMock is great for initial legacy testing, but in the long run, it pays off to refactor, to add new seams so that legacy code stops being legacy. The beauty of refactoring is that it can be done progressively, so don't be afraid to make changes!

Oh, I almost forgot, here's the code, have fun! :-)

Sunday, September 16, 2007

Duplicate files finder

Oh no, not another duplicate file finder!! There are already hundreds of these... and still, I had to hack my own, since I couldn't find a single one that kept hashes in a database (for quick subsequent searches), supported lots of files, and was free. So I wrote dupito (stupid name, I know). It uses SQL Server Compact Edition to store and query the hashes (SHA-512). I tried SQLite at first, but it was way too slow... and since I used Castle ActiveRecord, it was just a matter of configuration to change that to SQLServerCE.

Anyway, usage is as follows: when called without arguments, it indexes files in the current directory and subdirectories, then prints a list of duplicate files.

Command-line arguments:

  • c: cleans up database, deleting rows that reference nonexistent files
  • r: rehashes all files in database 
  • l: lists duplicate files currently in database

And here it is.

(As a sidenote, the exe is pretty big because I merged in Castle.ActiveRecord.dll + all of its dependencies...)
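
For the curious, the core idea (hash every file, then group paths by hash) can be sketched in a few lines. This is not dupito's actual code: it's a simplified illustration that keeps the hashes in an in-memory Dictionary instead of SQL Server CE, so nothing is remembered between runs, and the names DuplicateFinderSketch, HashFile and FindDuplicates are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public static class DuplicateFinderSketch {
  // Returns the SHA-512 hash of a file's contents as a hex string.
  public static string HashFile(string path) {
    using (SHA512 sha = SHA512.Create())
    using (FileStream stream = File.OpenRead(path)) {
      return BitConverter.ToString(sha.ComputeHash(stream)).Replace("-", "");
    }
  }

  // Groups every file under root (including subdirectories) by content
  // hash and returns only the groups that contain more than one file.
  public static List<List<string>> FindDuplicates(string root) {
    Dictionary<string, List<string>> byHash = new Dictionary<string, List<string>>();
    foreach (string path in Directory.GetFiles(root, "*", SearchOption.AllDirectories)) {
      string hash = HashFile(path);
      List<string> paths;
      if (!byHash.TryGetValue(hash, out paths)) {
        paths = new List<string>();
        byHash[hash] = paths;
      }
      paths.Add(path);
    }

    List<List<string>> duplicates = new List<List<string>>();
    foreach (List<string> files in byHash.Values) {
      if (files.Count > 1) duplicates.Add(files);
    }
    return duplicates;
  }

  // With no arguments, scans the current directory, like dupito does.
  public static void Main(string[] args) {
    string root = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();
    foreach (List<string> files in FindDuplicates(root)) {
      Console.WriteLine(string.Join(" = ", files.ToArray()));
    }
  }
}
```

Storing the (path, hash) pairs in a database, as dupito does, is what makes later searches quick: only new files need hashing, at the cost of having to clean up rows for files that no longer exist.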