Monday, August 22, 2011

How To: Auto-Suggest Multi-Word Terms Using the Solr ShingleFilterFactory

Google has an amazing feature called auto-suggestion. When you type a word, or even just the prefix of a word, it lists suggested completions for you.
Beyond that, Google has a little-known REST API that you can use to query the auto-completion word lists it provides:
http://google.com/complete/search?output=toolbar&q=%22intel+s%22
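If you want to build that URL from code rather than by hand, here is a minimal Python sketch. The function name `build_google_suggest_url` is just something I made up for illustration, and the endpoint is unofficial, so it may change or disappear. The sketch only shows how the quoted phrase ends up encoded as `%22intel+s%22`.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL above; it is not a documented Google API.
GOOGLE_SUGGEST_URL = "http://google.com/complete/search"


def build_google_suggest_url(phrase):
    """Return the suggest URL for a phrase, wrapped in double quotes."""
    params = {
        "output": "toolbar",        # ask for the XML "toolbar" format
        "q": '"%s"' % phrase,       # quote the phrase, as in the example URL
    }
    # urlencode turns '"' into %22 and spaces into '+'
    return GOOGLE_SUGGEST_URL + "?" + urlencode(params)
```

Calling `build_google_suggest_url("intel s")` reproduces the URL shown above.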


For Apache Solr, if you want to do the same thing Google does, here is one simple approach: just use the ShingleFilter.

Create a text field type that uses this filter:

<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ShingleFilterFactory" maxShingleSize="4" outputUnigrams="true"/>
    </analyzer>
</fieldType>
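To get a feel for what this analyzer chain puts in the index, here is a rough Python sketch of the shingling step. It is not Solr's code: the regex split only approximates StandardTokenizer, and the function name `shingles` and its parameters are made up for illustration. It assumes the filter's default minimum shingle size of 2 and a single space as the token separator.

```python
import re


def shingles(text, max_size=4, min_size=2, output_unigrams=True):
    """Approximate the terms produced by lowercasing plus shingling."""
    # Rough stand-in for StandardTokenizer + LowerCaseFilter (an assumption).
    tokens = re.findall(r"\w+", text.lower())
    terms = []
    for start in range(len(tokens)):
        if output_unigrams:
            terms.append(tokens[start])          # the single word itself
        for size in range(min_size, max_size + 1):
            if start + size > len(tokens):
                break                            # not enough words left
            # join consecutive words into one multi-word term
            terms.append(" ".join(tokens[start:start + size]))
    return terms
```

For the input "Intel solid state drive" this gives the four single words plus phrases such as "intel solid" and "intel solid state drive". Those multi-word terms are what make prefix matching on whole phrases possible later.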

Then create a field that uses that field type. The facet query at the end of this post assumes the field is named description.

After that, we can post a simple document to Solr. I will use one Wikipedia page, http://en.wikipedia.org/wiki/X25-M.

You can just create one doc and post it to Solr using curl.
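If you would rather script this step than use curl, here is a minimal Python sketch of the same idea. It rests on a few assumptions: the Solr base URL is the one from the facet query later in this post, the update handler lives at `/update`, the schema has a unique key field called `id`, and the shingle field is called `description`. The helper names `build_add_xml` and `post_to_solr` are made up for illustration.

```python
import urllib.request
from xml.etree import ElementTree as ET

# Assumed update URL; commit=true makes the document searchable right away.
SOLR_UPDATE_URL = "http://localhost:8080/solrmaster/update?commit=true"


def build_add_xml(doc_id, description):
    """Build a Solr <add><doc>...</doc></add> message as UTF-8 bytes."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # "id" is an assumed unique key; "description" is the shingle field.
    for name, value in (("id", doc_id), ("description", description)):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="utf-8")


def post_to_solr(xml_bytes, url=SOLR_UPDATE_URL):
    """POST the XML message to Solr and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status

# Example (needs a running Solr instance):
# post_to_solr(build_add_xml("x25-m", "text of the Wikipedia page"))
```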


Now you can go to the Solr admin page and check the terms of the description field. You will see the multi-word terms there.

In an application, we can run a facet query to get the keyword list and counts:

http://localhost:8080/solrmaster/select/?q=description%3A*&version=2.2&start=0&rows=0&indent=on&facet=true&facet.field=description&facet.prefix=02g2
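To call this from an application, something like the following Python sketch could work. It builds the same kind of facet query and reads the suggestions out of Solr's JSON response. The helper names, the `wt=json` response format, and the `facet.limit` of 10 are my assumptions, not part of the URL above. The prefix is lowercased because `facet.prefix` is compared against the indexed terms as-is, and those were lowercased by the LowerCaseFilter.

```python
from urllib.parse import urlencode

# Base URL taken from the facet query above.
SOLR_SELECT_URL = "http://localhost:8080/solrmaster/select"


def build_suggest_url(prefix, field="description", limit=10):
    """Build a facet query URL returning terms that start with prefix."""
    params = {
        "q": field + ":*",               # match every doc that has the field
        "rows": 0,                       # we only want facets, not documents
        "wt": "json",                    # assumed: JSON instead of XML output
        "facet": "true",
        "facet.field": field,
        "facet.prefix": prefix.lower(),  # indexed terms are lowercased
        "facet.limit": limit,            # assumed cap on suggestions
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


def parse_suggestions(response, field="description"):
    """Turn Solr's flat [term, count, term, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))
```

The application would fetch the URL from `build_suggest_url`, decode the body with `json.loads`, and pass the result to `parse_suggestions` to get `(term, count)` pairs for the drop-down.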

Pretty simple to get started.
