Thursday, November 22, 2007

Introducing SolrNet

UPDATE 2/19/2009: by now most of this is obsolete, please check out more recent releases

For the past month I've been working my a** off integrating Solr into our main site. The first step was to find out how to communicate with the Solr server. Naturally, I came to SolrSharp. But I found it to be really IoC-unfriendly: lots of inheritance, no interfaces, no unit tests, so it would have been a real PITA to integrate it with Castle. So, instead of wrapping it, I built SolrNet.

Before explaining how it works, a disclaimer: I'm a complete newbie to Solr, Lucene and full-text searching in general. The code works on my machine and does what I need it to do for the task that I have at hand. This project is not, and might never be, feature complete like SolrSharp. Currently it doesn't support facets (UPDATE 8/20/08: I added facet support) or highlights, and maybe some other stuff. If you absolutely need those features right now, either use SolrSharp or write a patch for SolrNet. However, the next step in the integration is implementing faceted search, so I will definitely implement facets sooner or later.

Usage

First we have to map the Solr document to a class (Solr supports only one document type per instance at the moment). Let's use a subset of the default schema that comes with the Solr distribution:

 

public class TestDocument : ISolrDocument {
    private ICollection<string> cat;
    private ICollection<string> features;
    private string id;
    private bool inStock;
    private string manu;
    private string name;
    private int popularity;
    private double price;
    private string sku;

    [SolrField("cat")]
    public ICollection<string> Cat {
        get { return cat; }
        set { cat = value; }
    }

    [SolrField("features")]
    public ICollection<string> Features {
        get { return features; }
        set { features = value; }
    }

    [SolrUniqueKey]
    [SolrField("id")]
    public string Id {
        get { return id; }
        set { id = value; }
    }

    [SolrField("inStock")]
    public bool InStock {
        get { return inStock; }
        set { inStock = value; }
    }

    [SolrField("manu")]
    public string Manu {
        get { return manu; }
        set { manu = value; }
    }

    [SolrField("name")]
    public string Name {
        get { return name; }
        set { name = value; }
    }

    [SolrField("popularity")]
    public int Popularity {
        get { return popularity; }
        set { popularity = value; }
    }

    [SolrField("price")]
    public double Price {
        get { return price; }
        set { price = value; }
    }

    [SolrField("sku")]
    public string Sku {
        get { return sku; }
        set { sku = value; }
    } 
}

 

It's just a POCO with a marker interface (ISolrDocument)[1] and some attributes: SolrField maps a property to a Solr field, and SolrUniqueKey (optional) marks a property as the Solr unique key field. Let's add a document (make sure you have a running Solr instance first):

[Test]
public void AddOne() {
    ISolrOperations<TestDocument> solr = new SolrServer<TestDocument>("http://localhost:8983/solr");
    TestDocument doc = new TestDocument();
    doc.Id = "123456";
    doc.Name = "some name";
    doc.Cat = new string[] {"cat1", "cat2"};
    solr.Add(doc);
    solr.Commit();
}

Let's see if the document is there:

[Test]
public void QueryAll() {
    ISolrOperations<TestDocument> solr = new SolrServer<TestDocument>("http://localhost:8983/solr");
    ISolrQueryResults<TestDocument> r = solr.Query("*:*");
    Assert.AreEqual("123456", r[0].Id);
}

For more examples, see the tests.
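
To make the attribute mapping above more concrete, here is a minimal sketch of how an attribute such as SolrField could be read through reflection and turned into Solr's XML <add> message. This is not SolrNet's actual code: SketchFieldAttribute, SketchDocument and SketchSerializer are hypothetical names invented for this illustration, and the sketch uses newer C# syntax than the listings above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;
using System.Reflection;
using System.Xml.Linq;

// Hypothetical stand-in for SolrField; NOT the real SolrNet attribute.
[AttributeUsage(AttributeTargets.Property)]
public sealed class SketchFieldAttribute : Attribute {
    public string FieldName { get; }
    public SketchFieldAttribute(string fieldName) { FieldName = fieldName; }
}

// Hypothetical document type, a cut-down version of TestDocument.
public class SketchDocument {
    [SketchField("id")]
    public string Id { get; set; }

    [SketchField("name")]
    public string Name { get; set; }

    [SketchField("cat")]
    public ICollection<string> Cat { get; set; }
}

// Hypothetical serializer: walks the annotated properties via reflection.
public static class SketchSerializer {
    // Builds <add><doc><field name="...">value</field>...</doc></add>.
    public static XElement ToAddCommand(object doc) {
        XElement docNode = new XElement("doc");
        foreach (PropertyInfo property in doc.GetType().GetProperties()) {
            SketchFieldAttribute attr = property.GetCustomAttribute<SketchFieldAttribute>();
            if (attr == null) continue;   // property is not mapped to a field
            object value = property.GetValue(doc);
            if (value == null) continue;  // unset fields are skipped
            if (value is IEnumerable items && !(value is string)) {
                // Multi-valued field: emit one <field> element per item.
                foreach (object item in items)
                    docNode.Add(Field(attr.FieldName, item));
            } else {
                docNode.Add(Field(attr.FieldName, value));
            }
        }
        return new XElement("add", docNode);
    }

    private static XElement Field(string name, object value) {
        return new XElement("field",
            new XAttribute("name", name),
            Convert.ToString(value, CultureInfo.InvariantCulture));
    }
}
```

Calling SketchSerializer.ToAddCommand on a document with Id, Name and two Cat values should yield an <add> element wrapping a single <doc>, with one <field> per value (so two cat fields); properties left null are skipped. A real client would also have to POST this XML to the server's update handler and deal with type-specific formatting such as dates, which this sketch leaves out.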

DSL

Since DSLs are such a hot topic nowadays, I decided to give it a try and see what would happen. I just defined the syntax I wanted in a test, then wrote the interfaces to comply with that syntax and chain the methods, then built the implementations for those interfaces. The result is pretty much self-explanatory:

[SetUp]
public void setup() {
    Solr.Connection = new SolrConnection("http://localhost:8983/solr");
}

[Test]
public void QueryById() {    
    ISolrQueryResults<TestDocument> r = Solr.Query<TestDocument>().By("id").Is("123456").Run();
}

[Test]
public void QueryByRange() {
    ISolrQueryResults<TestDocument> r = Solr.Query<TestDocument>().By("id").Between(123).And(456).OrderBy("id", Order.ASC).Run();
}

[Test]
public void DeleteByQuery() {
    Solr.Delete.ByQuery<TestDocument>("id:123456");
}

Run() is the explicit kicker method [1]. The DSL is defined in a separate DLL, in case you don't want/need it. There are some more examples in the tests.
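
The approach described above (syntax first, then chained interfaces, then implementations) can be sketched roughly as follows. This is only an illustration of the technique, not the code in SolrNet.DSL: every name here (SketchQuery, IFieldStep, IRangeStep, IRunStep) is hypothetical, and Run() merely returns the Solr query string instead of sending it to a server.

```csharp
// Hypothetical names throughout; this is NOT SolrNet.DSL's implementation.

// Each interface exposes only the methods that are valid at that step.
public interface IFieldStep {
    IRunStep Is(string value);
    IRangeStep Between(int lower);
}

public interface IRangeStep {
    IRunStep And(int upper);
}

public interface IRunStep {
    // The kicker method: ends the chain. Here it only returns the query text.
    string Run();
}

public static class SketchQuery {
    public static IFieldStep By(string field) {
        return new Builder(field);
    }

    // One class implements every step; callers only ever see the interfaces.
    private sealed class Builder : IFieldStep, IRangeStep, IRunStep {
        private readonly string field;
        private string clause;
        private int lower;

        public Builder(string field) { this.field = field; }

        public IRunStep Is(string value) {
            // No escaping of special query characters in this sketch.
            clause = field + ":" + value;
            return this;
        }

        public IRangeStep Between(int lower) {
            this.lower = lower;
            return this;
        }

        public IRunStep And(int upper) {
            // Inclusive range in Lucene/Solr query syntax: field:[a TO b]
            clause = field + ":[" + lower + " TO " + upper + "]";
            return this;
        }

        public string Run() { return clause; }
    }
}
```

With this sketch, SketchQuery.By("id").Is("123456").Run() returns "id:123456" and SketchQuery.By("id").Between(123).And(456).Run() returns "id:[123 TO 456]". Because each step returns a narrower interface, the compiler only accepts calls that make sense at that point in the chain, which is what makes the syntax read like a sentence.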

I TDDd most of the project, so the code coverage is near 75%. I'll add the remaining tests if/when I have the time. Of course, as usual, patches/bugfixes are more than welcome :-)

[1] I might drop this requirement in the future.

5 comments:

Anonymous said...

Hi Mauricio -

I'm always encouraged when others in the dotnet community find solr to be useful and explore different ways of interacting with that system.

Interesting approach here. I'm not overly familiar with implemented IoC, though I've read up on Martin Fowler. Specific to solrsharp, I don't have an understanding of how integration to Castle would be problematic, though that is due to my own unfamiliarity with Castle. On the surface, it seems oddly restrictive that Castle cannot take advantage of the solrsharp library. Per your disclaimer, adding support for all of solr's features certainly makes for more code. I'll be interested to see how solrnet grows over time.

Timely point on the unit tests: solrsharp was initially released without unit tests, then time got the best of me (and others who use it.) Mock objects are currently in discussion to address this most glaring need for solrsharp.

Nonetheless, glad to hear there's more than one way to skin this cat.

-- jeff r.

Mauricio Scheffer said...

Hi Jeff, first of all, I didn't mean to bash your project, and I'm sorry if I sounded too rude, it wasn't my intention. I actually read your code to look up features and other stuff.
About integrating SolrSharp to Castle or other IoC containers, I think it could be possible using the factory facility and some wrappers. On second thought, maybe it isn't as hard as I thought. I'll give it a shot and let you know how it worked.
About adding tests to SolrSharp, this is a nice article that explains the relationship between dependency injection and testability, maybe it can help you refactor to better accommodate the tests. It doesn't cover TDD, though.
There are many advantages to coding to interfaces, favoring composition over inheritance, and using dependency injection: not only does it enhance testability, it also creates seams that make changing behavior easier and more maintainable, and it makes code reuse much easier. For example, I was able to build SolrNet.DSL just by writing a few small classes that basically glued together the classes that I already had defined in the base assembly.
The nice thing about TDD is that it gets you to code that way very naturally, thus getting all the benefits I mentioned...
Have you seen LINQ to Lucene? It would be very nice to make it talk to a Solr server. If only I had more time (sigh)

Anonymous said...

Hi Mauricio - I didn't think you bashed the Solrsharp project, so no worries. It was totally constructive criticism in context with the Castle project. Your comments are totally appreciated.

I agree that there may be an adaptability to Solrsharp using factory methods and perhaps moving some of the defined implementation to interfaces. Given your background with IoC, it would be great if you could provide suggestions to how this could be achieved.

It's been a long while since I've looked at LINQ-Lucene; it was in the discussion stage the last time I paid any attention to it. Eventually having solr function with an interface driver would be so ideal.

The one thing I've found with solr is that there are so many configuration elements at the server level that dictate how queries are executed against it, where those occur, etc. And solr continues to add configurability that affects standard operation. Trying to hide those effects within Solrsharp has definitely been a challenge.

Anonymous said...

Hi Jeff,
I'm wondering how the following line compiles, because the SolrServer constructor accepts 2 parameters, and not the URL string.

ISolrOperations<TestDocument> solr = new SolrServer<TestDocument>("http://localhost:8983/solr");

Can you explain how to connect to the Solr server and add test documents to it?
Thanks in advance...

Mauricio Scheffer said...

@Anonymous: this article is very old, things have changed. Check out the project wiki for updated documentation