I've recently received several questions about the relationship between Solr, SolrNet, NHibernate, Lucene, Lucene.Net, etc, how they fit together, how they should be used, what features does each provide. Here's an attempt at elucidating the topic:
Let's start from the bottom up:
- RDBMS: every programmer knows what these are. Oracle, SQL Server, MySQL, etc. Everyone uses them, to the point that it's often used as a Golden Hammer. RDBMS can be stand-alone programs (client-server architecture) or embedded (running within your application).
- Lucene was written to do full-text indexing and searching. The most known example of full-text searching is Google. You throw words at it and it returns a ranked set of documents that match those words.
In terms of data structures, Lucene at its core implements an inverted index, while relational databases use B-tree variants. Fundamentally different beasts.
Lucene is a Java library, this means that it's not a stand-alone application but instead embedded in your program.
- Full-text functions in relational databases: nowadays almost all major RDBMS offer some full-text capabilities: MySQL, SQL Server, Oracle, etc. As far as I know, they are all behind Lucene in terms of performance and features. They can be easier to use at first, but they're proprietary. If you ever need some advanced feature, switching to Lucene could be a PITA.
- Lucene.Net is a port of Java Lucene to the .Net platform. Nothing more, nothing less. It aims to be fully API compatible so all docs on Java Lucene can be applied to Lucene.Net with minimal translation effort. Index format is also the same, so indices created with Java Lucene can be used by Lucene.Net and vice versa.
- NHibernate is a port of Java Hibernate to the .Net platform. It's an ORM (object-relational mapper), which basically means that it talks to relational databases and maps your query results as objects for easier consumption in object-oriented languages.
- NHibernate.Search is a NHibernate contrib project that integrates NHibernate with Lucene.Net. It's a port of the Java Hibernate Search project. It keeps a Lucene index in sync with a relational database and hides some of the complexity of raw Lucene, making it easier to index and query.
This article explains its basic usage.
- Solr is a search server. It's a stand-alone Java application that uses Lucene to provide full-text indexing and searching through a XML/HTTP interface. This means that it can be used from any platform/language. It can be embedded in your own Java programs, but it's not its primary design purpose.
While very flexible, it's easier to use than raw Lucene and provides features commonly used in search applications, like faceted search and hit highlighting. It also handles caching, replication, sharding, and has a nice web admin interface.
This article is a very good tour of Solr's basic features.
- SolrNet is a library to talk to a Solr instance from a .Net application. It provides an object-oriented interface to Solr's operations. It also acts as an object-Solr mapper: query results are mapped to POCOs.
The latest version also includes Solr-NHibernate integration. This is similar to NHibernate.Search: it keeps a Solr index in sync with a relational database and lets you query Solr from the NHibernate interface.
Unlike NHibernate and NHibernate.Search, which can respectively create a DB schema and a Lucene index, SolrNet can't automatically create the Solr schema. Solr does not have this capability yet. You have to manually configure Solr and set up its schema.
In case this wasn't totally clear, here's a diagram depicting a possible NHibernate-SolrNet architecture: