In a previous post (hereafter referred to as "part one") I described a console application that demonstrated the very basics of working with Lucene.NET. In this second part we'll get to know Lucene.NET a little better by rewriting that console application so that it will:
- Persist the index to a directory on a hard drive and
- Enable us to manually input text that will be indexed and
- Perform searches from the console
Setting up the directory and analyzer
The first thing we'll have to do is import some necessary namespaces and set up a directory to work with. You might recall from part one that a directory is the place where Lucene stores the data we add to it, the Documents. In part one we were not interested in storing the index anywhere other than RAM, so we used a RAMDirectory. This time, however, we want to be able to add some text and still be able to search it the next time we run the application, as that is closer to a real usage scenario. It will also be more convenient when we use the application for testing. So, instead of a RAMDirectory we'll use an FSDirectory, which stores the index in files in a specified directory on a hard drive.
An FSDirectory is created by invoking the static FSDirectory.GetDirectory() method. It takes two parameters. The first specifies the location of the directory and the second is a boolean that determines whether existing data in that location should be read or overwritten. In our case we want to keep data from previous executions of the program, so we'll set the second parameter to false if the directory already exists. If the directory doesn't exist yet we must instead set it to true so that the directory will be created.
We'll also add a StandardAnalyzer as a member variable as it will be used by several of our methods.
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;

namespace Example.LuceneTest2
{
    class Program
    {
        // Location of the index, relative to the working directory.
        private static System.IO.FileInfo _path = new System.IO.FileInfo("indexes");
        private static Directory _directory;
        private static Analyzer _analyzer;

        static void Main(string[] args)
        {
            // FileInfo.Exists is always false for a directory, so test the directory itself.
            // System.IO.Directory is fully qualified to avoid a clash with Lucene.Net.Store.Directory.
            bool directoryExists = System.IO.Directory.Exists(_path.FullName);
            bool createDirectory = !directoryExists;
            _directory = FSDirectory.GetDirectory(_path, createDirectory);
            _analyzer = new StandardAnalyzer();
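One detail worth checking when computing the create flag: System.IO.FileInfo.Exists only reports true for files, so a path that points at a directory is better tested with System.IO.Directory.Exists. The following is a minimal sketch using only the base class library; the class name, the ShouldCreate helper and the temporary path are assumptions made for illustration and are not part of the sample project.

```csharp
using System;
using System.IO;

public static class CreateFlagSketch
{
    // Hypothetical helper: true when the index directory is missing,
    // meaning the create parameter should be set to true.
    public static bool ShouldCreate(string indexPath)
    {
        return !Directory.Exists(indexPath);
    }

    public static void Main()
    {
        // A unique temporary location, so the sketch never touches a real index.
        string path = Path.Combine(Path.GetTempPath(), "lucene-sketch-" + Guid.NewGuid().ToString("N"));

        Console.WriteLine(ShouldCreate(path));          // True: nothing there yet
        Directory.CreateDirectory(path);
        Console.WriteLine(new FileInfo(path).Exists);   // False: the path is a directory, not a file
        Console.WriteLine(ShouldCreate(path));          // False: keep the existing data

        Directory.Delete(path);
    }
}
```

The value returned by ShouldCreate is what would be passed as the second argument to FSDirectory.GetDirectory().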
The main interface
The next step is to create a basic interface that lets the user do three things:
- Add a new text to be indexed (as in part one we'll imagine that the text actually is a blog entry)
- Perform a search in the texts that have been added and
- Quit the application
Once the user quits the application we'll close the directory so we won't have any locks on the files in it.
static void Main(string[] args)
{
    ...
    ...
    while (true)
    {
        Console.WriteLine("Press (A) to add an entry. Press (S) to search. Press (Q) to quit.");
        char actionChar = Console.ReadKey().KeyChar;
        string action = actionChar.ToString().ToLower();
        Console.Clear();

        if (action == "a")
            AddText();
        else if (action == "s")
            Search();
        else if (action == "q")
            break;
    }

    _directory.Close();
}
Adding text
When the user chooses to add text in the main menu the AddText() method is invoked. In it we'll let the user enter some text that will be written to the index. If you've read part one it will look quite familiar, with one important exception. The IndexWriter we use here is instantiated with the create parameter (the third parameter of the constructor) set to false if an index already exists. That makes the IndexWriter append to the existing index instead of overwriting it, as it would have done if the parameter were set to true.
private static void AddText()
{
    Console.Write("Enter text to index: ");
    string textToIndex = Console.ReadLine();

    // Only create a new index if none exists yet; otherwise append to it.
    bool indexExists = IndexReader.IndexExists(_path);
    bool createIndex = !indexExists;
    IndexWriter indexWriter = new IndexWriter(_directory, _analyzer, createIndex);

    // Store the text and tokenize it so it can be both searched and printed.
    Document document = new Document();
    Field bodyField = new Field("blogEntryBody", textToIndex, Field.Store.YES, Field.Index.TOKENIZED);
    document.Add(bodyField);
    indexWriter.AddDocument(document);

    indexWriter.Close();
}
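Since an open IndexWriter holds a write lock on the index, it can be worth making sure Close() runs even when something goes wrong while adding the document. Below is a minimal sketch of that idea, assuming the same Lucene.NET 2.x API used in this post. The SafeAddSketch class and its AddText helper are hypothetical names, and a RAMDirectory is used only to keep the sketch free of files on disk.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

public static class SafeAddSketch
{
    // Hypothetical helper, not part of the article's Program class.
    public static void AddText(Directory directory, Analyzer analyzer, string text)
    {
        // Append when an index is already present, create one otherwise.
        bool createIndex = !IndexReader.IndexExists(directory);
        IndexWriter indexWriter = new IndexWriter(directory, analyzer, createIndex);
        try
        {
            Document document = new Document();
            document.Add(new Field("blogEntryBody", text, Field.Store.YES, Field.Index.TOKENIZED));
            indexWriter.AddDocument(document);
        }
        finally
        {
            // Runs even if AddDocument throws, so the writer is always closed.
            indexWriter.Close();
        }
    }

    public static void Main()
    {
        Directory directory = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();

        AddText(directory, analyzer, "first entry");   // no index yet: creates one
        AddText(directory, analyzer, "second entry");  // index exists: appends to it

        // Print the number of documents now in the index.
        IndexReader reader = IndexReader.Open(directory);
        Console.WriteLine(reader.NumDocs());
        reader.Close();
        directory.Close();
    }
}
```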
Performing searches and printing results
The final step is to add the Search() method, which lets the user enter one or several words to search for. The implementation is straightforward and is pretty much a rewrite of what we did in part one.
private static void Search()
{
    Console.Write("Enter text to search for: ");
    string textToSearchFor = Console.ReadLine();

    IndexSearcher indexSearcher = new IndexSearcher(_directory);
    QueryParser queryParser = new QueryParser("blogEntryBody", _analyzer);
    Query query = queryParser.Parse(textToSearchFor);
    Hits hits = indexSearcher.Search(query);

    // Hits loads documents lazily through the searcher, so print before closing it.
    PrintHits(hits);
    indexSearcher.Close();
    // _directory stays open here; Main closes it when the user quits.
}

private static void PrintHits(Hits hits)
{
    int numberOfResults = hits.Length();
    string numberOfResultsHeader = string.Format("The search returned {0} results.", numberOfResults);
    Console.WriteLine(numberOfResultsHeader);

    for (int i = 0; i < hits.Length(); i++)
    {
        float score = hits.Score(i);
        string hitHeader = string.Format("\nHit number {0}, with a score of {1}:", i, score);
        Console.WriteLine(hitHeader);
        Console.WriteLine(hits.Doc(i).Get("blogEntryBody"));
    }
}
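Because the search text goes straight into QueryParser.Parse(), input that isn't valid query syntax can make the method throw. The sketch below shows one way to guard against that, assuming the Lucene.NET 2.x API used in this post and assuming the failure surfaces as a ParseException; depending on the version, some malformed input may raise a different exception type. SafeParseSketch and TryParse are hypothetical names introduced for illustration.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

public static class SafeParseSketch
{
    // Hypothetical helper: returns null instead of throwing on malformed input.
    public static Query TryParse(QueryParser queryParser, string userInput)
    {
        try
        {
            return queryParser.Parse(userInput);
        }
        catch (ParseException)
        {
            return null;
        }
    }

    public static void Main()
    {
        QueryParser queryParser = new QueryParser("blogEntryBody", new StandardAnalyzer());

        // Plain words, required/prohibited terms, a phrase, and an unbalanced parenthesis.
        string[] inputs = { "lucene index", "+lucene -java", "\"getting to know\"", "(lucene" };

        foreach (string input in inputs)
        {
            Query query = TryParse(queryParser, input);
            if (query == null)
                Console.WriteLine("Could not parse: " + input);
            else
                Console.WriteLine(input + " => " + query);
        }
    }
}
```

In the console application, Search() could print a short message and return to the menu when TryParse gives back null.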
Conclusion
This second part really hasn't introduced any new features except storing the index on disk. We do now, however, have a simple yet effective application for testing how Lucene performs its searches. We have also discussed the important create parameters of the FSDirectory.GetDirectory() method and of the IndexWriter constructor.
Sample project
The above code can be downloaded as a Visual Studio 2008 project here.