Monday, August 22, 2011

How To: Auto-Suggesting Multi-Word Terms Using Solr ShingleFilterFactory

Google has a great feature called auto-suggestion. When you type a word, or even just the prefix of a word, it lists suggestions for you.
Google also has a little-known REST API that you can use to query its auto-completion word lists.


If you want Apache Solr to do the same thing Google does, here is one simple approach: use the ShingleFilter.

Create a text field type that uses this filter:

<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ShingleFilterFactory" maxShingleSize="4" outputUnigrams="true"/>
    </analyzer>
</fieldType>
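
To get a feel for what this analyzer chain produces, here is a rough Python sketch of the shingling step. It is only an approximation: the whitespace split stands in for StandardTokenizerFactory, and the function name and parameters are made up for illustration (they mirror maxShingleSize and outputUnigrams from the config above).

```python
def shingles(text, max_shingle_size=4, output_unigrams=True):
    """Approximate the terms the shingle analyzer would index for one value."""
    # Crude stand-in for StandardTokenizer + LowerCaseFilter:
    # lowercase the text and split on whitespace.
    tokens = text.lower().split()
    # With unigrams enabled, single words are kept next to the word n-grams.
    min_size = 1 if output_unigrams else 2
    result = []
    for start in range(len(tokens)):
        for size in range(min_size, max_shingle_size + 1):
            if start + size <= len(tokens):
                result.append(" ".join(tokens[start:start + size]))
    return result


print(shingles("Apache Solr search"))
```

Each multi-word shingle becomes its own indexed term, which is what later lets us list phrases instead of single words.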

Then create a field that uses that field type (the steps below refer to it as description).

After that, we can post a simple document to Solr. I will use one wiki page as the content.

You can just create one doc and post it to Solr using curl.
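
If you would rather script this step than use curl, the following is a minimal Python sketch of the same idea. It assumes Solr is listening at http://localhost:8983/solr/update and that the schema has an id field and a description field of the shingle type; the URL, the field names, and the helper names are assumptions for illustration, not taken from the original setup.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed update endpoint; adjust host, port, and path to your install.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update?commit=true"


def build_add_xml(doc_id, description):
    """Build a Solr <add><doc>...</doc></add> update message as bytes."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # "id" and "description" are assumed field names.
    for name, value in (("id", doc_id), ("description", description)):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="utf-8")


def build_update_request(payload, url=SOLR_UPDATE_URL):
    """Wrap the XML payload in an HTTP POST request object."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def send(request):
    """Send the request to Solr and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status


payload = build_add_xml("wiki-1", "Apache Solr is an open source search platform")
request = build_update_request(payload)
print(payload.decode("utf-8"))
```

Calling send(request) against a running Solr instance posts the document and commits it in one go, because of the commit=true parameter in the URL.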



Now you can go to the Solr admin page and check the terms of the description field. You will see the multi-word terms there.


For the application, we can run a facet query to get the keyword list and counts.
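
As a rough sketch of what such a facet query could look like from application code: the Python below builds the request URL and pairs up terms and counts from a JSON response. The select URL, the description field name, the helper names, and the use of facet.prefix to narrow the list to what the user has typed are all my assumptions; the sample response is hand-written for illustration, not real Solr output.

```python
import json
from urllib.parse import urlencode

# Assumed select endpoint; adjust to your install.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def suggest_url(prefix, field="description", limit=10, base_url=SOLR_SELECT_URL):
    """Build a facet query URL that lists indexed terms starting with prefix."""
    params = {
        "q": "*:*",
        "rows": 0,  # we only want facet counts, not documents
        "facet": "true",
        "facet.field": field,
        # The indexed terms are lowercased, so lowercase the prefix as well.
        "facet.prefix": prefix.lower(),
        "facet.limit": limit,
        "facet.mincount": 1,
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


def parse_suggestions(response_text, field="description"):
    """Turn Solr's flat [term, count, term, count, ...] list into pairs."""
    flat = json.loads(response_text)["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


# Hand-written sample response, for illustration only.
sample = (
    '{"facet_counts": {"facet_fields": '
    '{"description": ["apache solr", 3, "apache solr search", 1]}}}'
)
print(suggest_url("Apache s"))
print(parse_suggestions(sample))
```

Fetching the URL from a running Solr and passing the body to parse_suggestions gives a list of (term, count) pairs that can be shown as suggestions.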


Pretty simple to get started :)
