I am reading the book Lucene in Action, Second Edition: Covers Apache Lucene 3.0. Chapter one contains a basic Java program that does 101-level indexing and searching.
Here is a basic tutorial for doing that.
1. Only one core jar file is needed for the engine to run. You can download it from http://www.apache.org/dyn/closer.cgi/lucene/java/
2. Open Eclipse, create a Java project, and reference the core jar file.
3. Create a text file, put some content in it, and save it as test.txt.
4. Write some Java code to do the indexing and searching.
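import java.io.BufferedInputStream;
import org.apache.lucene.analysis.Analyzer;
// ... (remaining imports elided in the original listing)

public class Program {
    public static void main(String[] args) {
        try {
            // ... (elided in the original listing)
            Search("/Users/androidyou/Documents/lucence/index", "nonexistedkeyworld");
        } catch (Exception e) {
            // ... (elided in the original listing)
        }
    }

    private static void Search(String indexpath, String keyword) throws Exception, IOException {
        // ... (construction of searcher and query elided)
        TopDocs docs = searcher.search(query, 10);
        // ...
    }

    private static void IndexFile(String datafolder, String indexfolder)
            throws CorruptIndexException, LockObtainFailedException, IOException {
        // ... (construction of the writer iw elided)
        Document doc = new Document();
        // ... (fields added to doc)
        iw.addDocument(doc);
        // ...
    }
}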
Note that if you run the program three times, there will be three documents in the index, because each run adds the same file again.
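The three-documents behavior can be pictured with a small toy model. This is not Lucene code: ToyIndex and its methods are hypothetical names, and the field names are assumptions. It only sketches the difference between appending on every run and replacing by a key. In Lucene itself, IndexWriter.updateDocument(Term, Document) is the call that deletes matching documents before adding, which is probably what you want if re-running should not create duplicates.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for an index; NOT Lucene code. All names here are hypothetical.
class ToyIndex {
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Append-only, like the addDocument call in the listing: no duplicate check.
    void addDocument(String filename, String content) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("filename", filename);
        doc.put("content", content);
        docs.add(doc);
    }

    // Replace-by-key: drop any document with the same filename, then add.
    void updateDocument(String filename, String content) {
        docs.removeIf(d -> filename.equals(d.get("filename")));
        addDocument(filename, content);
    }

    int numDocs() {
        return docs.size();
    }

    public static void main(String[] args) {
        ToyIndex appended = new ToyIndex();
        ToyIndex updated = new ToyIndex();
        for (int run = 0; run < 3; run++) { // simulate three program runs
            appended.addDocument("test.txt", "some contents");
            updated.updateDocument("test.txt", "some contents");
        }
        System.out.println("append-only: " + appended.numDocs());   // prints "append-only: 3"
        System.out.println("update-by-key: " + updated.numDocs());  // prints "update-by-key: 1"
    }
}
```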
Here is the output I get:
Search keyword nonexistedkeyworld
You can also download the Lucene toolkit Luke and open the index directory with it.
From the screenshot above, you can see there are 3 documents in the index. Each document has two fields, and there are 58 + 1 = 59 terms in total.
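One way to see why three documents do not triple the term count: a term is counted once per field no matter how many documents contain it. The sketch below is a toy model, not Lucene code. ToyTermCount, the whitespace tokenizer, and the sample text are all hypothetical, and it assumes the filename field contributes a single untokenized term while the rest come from the content field.

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model of a term dictionary; NOT Lucene code. All names are hypothetical.
class ToyTermCount {
    // Add one "field:term" entry per lower-cased, whitespace-separated token.
    static void indexField(Set<String> terms, String field, String text) {
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(field + ":" + token);
            }
        }
    }

    // Index the same file `copies` times and count the unique field/term pairs.
    static int countTerms(int copies) {
        Set<String> terms = new TreeSet<>();
        for (int i = 0; i < copies; i++) {
            terms.add("filename:test.txt"); // assumed: one untokenized term
            indexField(terms, "content", "the quick fox and the lazy dog");
        }
        return terms.size();
    }

    public static void main(String[] args) {
        System.out.println(countTerms(1)); // prints 7 (6 content terms + 1 filename term)
        System.out.println(countTerms(3)); // prints 7 (same file three times adds no new terms)
    }
}
```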
For the content field, the constructor Field(name, reader) by default only indexes the field; it does not store it.
When you click the Documents tab, you can browse the documents individually. You can also verify that only the filename is stored in the index. For the content field, only the terms (the indexed content) are kept.
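The difference between indexed and stored can be sketched with a toy inverted index. Again, this is not Lucene's implementation: ToyStoredVsIndexed and its methods are hypothetical names, and the tokenizer is a plain whitespace split. The content is searchable through its terms, but only the filename can be read back from a hit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of indexed vs. stored fields; NOT Lucene code. Names are hypothetical.
class ToyStoredVsIndexed {
    // term -> ids of documents whose content contains it (the "indexed" part)
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // per-document stored values (the "stored" part); content is never put here
    private final List<Map<String, String>> stored = new ArrayList<>();

    int add(String filename, String content) {
        int docId = stored.size();
        Map<String, String> fields = new HashMap<>();
        fields.put("filename", filename); // stored, so it can be read back later
        stored.add(fields);
        for (String token : content.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Ids of documents whose content contains the term; empty if none.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), new TreeSet<>());
    }

    // Stored value of a field, or null when the field was not stored.
    String storedValue(int docId, String field) {
        return stored.get(docId).get(field);
    }

    public static void main(String[] args) {
        ToyStoredVsIndexed index = new ToyStoredVsIndexed();
        int id = index.add("test.txt", "hello lucene world");
        System.out.println(index.search("lucene"));             // prints [0]
        System.out.println(index.storedValue(id, "filename"));  // prints test.txt
        System.out.println(index.storedValue(id, "content"));   // prints null
        System.out.println(index.search("nonexistedkeyworld")); // prints []
    }
}
```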