Introduction to Lucene.Net

Written By
CodeGuru Staff
Jan 18, 2012

What is Lucene.Net?

Lucene.Net is a C# port of the original Lucene search engine
library, which is written in Java. It provides a framework (a set of APIs) for
creating applications with full text search.

Lucene.Net can be downloaded from http://incubator.apache.org/lucene.net/download.html.
At the time of writing, it is undergoing incubation at the Apache Software Foundation (ASF).

Why Use Lucene.Net?

You can use Lucene.Net to add more power to an existing
search in your ASP.Net web application or website. It can also be used to index
and search the text of documents (Word, PDF, etc.) within your application.

This article describes how we can use Lucene.Net to add full
text search to our ASP.Net applications. Any search function consists of two
basic steps: first index the text, then search it. We will use
Lucene.Net for both steps.
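
To make the two steps concrete before bringing in the library, here is a minimal sketch in plain C# (no Lucene.Net involved). It is only an illustration of the index-then-search idea, not how Lucene.Net is implemented; the sample documents and the Search helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample documents, keyed by document id.
var documents = new Dictionary<int, string>
{
    [1] = "Lucene is a search library",
    [2] = "ASP.NET web application",
    [3] = "Full text search in a web application",
};

// Step 1: index. Map every lower-cased word to the ids of the documents containing it.
var index = new Dictionary<string, SortedSet<int>>();
foreach (var (id, text) in documents)
{
    var words = text.ToLowerInvariant()
                    .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    foreach (var word in words)
    {
        if (!index.TryGetValue(word, out var ids))
        {
            ids = new SortedSet<int>();
            index[word] = ids;
        }
        ids.Add(id);
    }
}

// Step 2: search. A lookup in the index replaces a scan of every document.
List<int> Search(string term) =>
    index.TryGetValue(term.ToLowerInvariant(), out var found)
        ? found.ToList()
        : new List<int>();

Console.WriteLine(string.Join(", ", Search("search")));      // prints "1, 3"
Console.WriteLine(string.Join(", ", Search("application"))); // prints "2, 3"
```

The expensive work (splitting the text into words) happens once, at indexing time, which is why the search step can stay fast.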

In this example we will read the content of a text file
and index it using Lucene.Net. First download the DLL and add a reference to
it in the project.

How to Use Lucene.Net

Indexing the text

There are a few things to understand before we start indexing.

1. Analyzer
– Reads the text and breaks it into words (tokens). It can also be used to
remove ‘noise words’ (common words which you would not want to index).

2. Fields
– Content holders with a name and a value.

3. Documents
– The unit of indexing and search. A document is a collection of fields. Documents
are added to the index and are returned as a list of results.

4. Index
– A collection of documents.

5. IndexWriter
– Writes documents to the index files.
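
As a rough illustration of what an analyzer does, the sketch below tokenizes a sentence and drops noise words in plain C#. It is not Lucene.Net's StandardAnalyzer; the stop-word list and the Analyze helper are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical noise-word list; a real analyzer ships with its own defaults.
var stopWords = new HashSet<string> { "a", "an", "and", "is", "of", "the", "to" };

// Lower-case the text, split it on anything that is not a letter or digit,
// and drop empty pieces and noise words.
List<string> Analyze(string text) =>
    Regex.Split(text.ToLowerInvariant(), @"[^\p{L}\p{Nd}]+")
         .Where(token => token.Length > 0 && !stopWords.Contains(token))
         .ToList();

Console.WriteLine(string.Join(" | ", Analyze("The quick fox is jumping to the index.")));
// prints "quick | fox | jumping | index"
```

The same analysis has to be applied to the query text at search time, which is why the search code later in this article creates the same kind of analyzer again.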

Code for creating the index file

//These using directives belong at the top of the file.
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;

string strIndexDir = @"D:\Index";
Lucene.Net.Store.Directory indexDir = Lucene.Net.Store.FSDirectory.Open(new System.IO.DirectoryInfo(strIndexDir));
//The Version parameter is used for backward compatibility. Stop words can also be passed to avoid indexing certain words.
Analyzer std = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29);
//Create an IndexWriter object. Passing true creates a new index, overwriting any existing one.
IndexWriter idxw = new IndexWriter(indexDir, std, true, IndexWriter.MaxFieldLength.UNLIMITED);
//Build a document with a single "text" field holding the content of the file.
Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
Lucene.Net.Documents.Field fldText = new Lucene.Net.Documents.Field("text", System.IO.File.ReadAllText(@"d:\test.txt"), Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.ANALYZED, Lucene.Net.Documents.Field.TermVector.YES);
doc.Add(fldText);
//write the document to the index
idxw.AddDocument(doc);
//optimize and close the writer
idxw.Optimize();
idxw.Close();
Response.Write("Indexing Done");

The parameters passed when creating the Field are:

1. Lucene.Net.Documents.Field.Store.YES
– The field value is stored in the index and is returned with search results. Passing NO
does not store the value, so it cannot be shown in the results.

2. Lucene.Net.Documents.Field.Index.ANALYZED
– The field is run through the analyzer and can be searched. NO means it is not
searchable. NOT_ANALYZED means the field is searchable, but its value is indexed
as a single term without using the analyzer.

3. Lucene.Net.Documents.Field.TermVector.YES
– Stores, for each document, the list of terms in the field and the number of
times each one occurs. See the Lucene documentation for more on term vectors.
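
To show the kind of information a term vector holds, here is a small plain C# sketch that counts term occurrences for one field of one document. It does not use Lucene.Net, and the TermVector helper is a hypothetical name chosen for this example.

```csharp
using System;
using System.Collections.Generic;

// Count how often each term occurs in one field of one document.
SortedDictionary<string, int> TermVector(string fieldText)
{
    var frequencies = new SortedDictionary<string, int>(StringComparer.Ordinal);
    var terms = fieldText.ToLowerInvariant()
                         .Split(new[] { ' ', '.', ',', ';' }, StringSplitOptions.RemoveEmptyEntries);
    foreach (var term in terms)
    {
        frequencies.TryGetValue(term, out var count); // count is 0 for a new term
        frequencies[term] = count + 1;
    }
    return frequencies;
}

foreach (var entry in TermVector("Search the index, then search the results."))
    Console.WriteLine($"{entry.Key}: {entry.Value}");
// prints:
// index: 1
// results: 1
// search: 2
// the: 2
// then: 1
```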

It is recommended to call IndexWriter.Optimize() on
completion of the indexing. It merges the segments of the index so that
searches run as fast as possible, at the cost of extra work during the call.

The first part, indexing the text, is now complete. Next we will
search the index for the text entered in a text box.

Search the text

We will build the search query using the QueryParser class.
There are more Query classes available in Lucene.Net, such as TermQuery,
RangeQuery, etc., which can be used for different requirements. To create a
search query we need to pass the Analyzer object and the name of the field in
the index to search in.

string strIndexDir = @"D:\Index";
//Use the same analyzer that was used for indexing.
Analyzer std = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29);
Lucene.Net.QueryParsers.QueryParser parser = new Lucene.Net.QueryParsers.QueryParser(Lucene.Net.Util.Version.LUCENE_29, "text", std);
//Search.Text is the text entered in the search text box.
Lucene.Net.Search.Query qry = parser.Parse(Search.Text);

After creating the query object we will use an IndexReader object
to open the index in read-only mode and wrap it in an IndexSearcher.

Lucene.Net.Store.Directory directory = Lucene.Net.Store.FSDirectory.Open(new System.IO.DirectoryInfo(strIndexDir)); //Provide the directory where index is stored
Lucene.Net.Search.Searcher srchr = new Lucene.Net.Search.IndexSearcher(Lucene.Net.Index.IndexReader.Open(directory, true));//true opens the index in read only mode

Lucene.Net gathers the search results (documents) in
Collectors. There are different Collectors available in Lucene.Net. In this
example we will use “TopScoreDocCollector,” which keeps the top-scoring
documents, ordered by relevance score. The create method of “TopScoreDocCollector”
accepts two parameters: the maximum number of documents required (int) and a
flag (bool) stating whether documents will be collected in increasing document ID order.

TopScoreDocCollector cllctr = TopScoreDocCollector.create(100, true);

Once the collector object is ready we will perform the
search and get the results from the collector in a ScoreDoc array.

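//Run the query; the matching documents are gathered in the collector.
srchr.Search(qry, cllctr);
ScoreDoc[] hits = cllctr.TopDocs().scoreDocs;
for (int i = 0; i < hits.Length; i++)
{
    int docId = hits[i].doc;
    float score = hits[i].score;
    //Load the stored fields of the matching document.
    Lucene.Net.Documents.Document doc = srchr.Doc(docId);
    Response.Write("Searched from Text: " + doc.Get("text"));
}
//Release the index files once the results have been read.
srchr.Close();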
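
The idea behind a top-score collector can be sketched in plain C# as follows. This is only an illustration under simple assumptions: the (docId, score) pairs and the TopHits helper are hypothetical, and a real collector would typically use a priority queue rather than sorting every hit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical (docId, score) pairs, as a scorer might produce them.
var scored = new List<(int DocId, float Score)>
{
    (0, 0.12f), (1, 0.87f), (2, 0.45f), (3, 0.91f), (4, 0.30f),
};

// Keep only the best maxHits documents, highest score first.
List<(int DocId, float Score)> TopHits(IEnumerable<(int DocId, float Score)> hits, int maxHits) =>
    hits.OrderByDescending(h => h.Score)
        .ThenBy(h => h.DocId) // break ties by document id
        .Take(maxHits)
        .ToList();

// Lists documents 3, 1 and 2, in that order.
foreach (var hit in TopHits(scored, 3))
    Console.WriteLine($"doc {hit.DocId}: {hit.Score}");
```

Limiting the result to a fixed number of hits, as the 100 passed to the collector above does, keeps memory use bounded no matter how many documents match.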

This is just an introduction to Lucene.Net. There are a lot
of other areas to be explored, such as different Analyzers, QueryParsers,
Collectors, etc.

Happy learning.
