Monday, April 30, 2012

Lucene Generic Highlighter

I have pushed a project to GitHub that shows how to create a generic highlighter with Apache Lucene.

The original Lucene Highlighter is too tightly coupled to snippet highlighting:
  • It does not make it easy to highlight a whole text
  • It only handles text, through a formatter that is strongly coupled to that text
I have modified the original Lucene Highlighter to allow highlighting of "anything". The highlighter now drives a callback instead of a formatter, and its purpose is to find terms, each with a score, in a whole text. I have used this code to highlight XML, PDF, HTML and other formats, with or without Solr.
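
To illustrate the callback idea, here is a minimal sketch in plain Java. It is not the code from the GitHub project and it does not use the Lucene API: the names `TermCallback`, `scan` and `markUp` are hypothetical, and a simple regex word scan with a hand-supplied score map stands in for Lucene's analysis and scoring. The point is only that the scanner reports offsets and scores, and the caller decides what to do with them.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CallbackHighlighterSketch {

    /** Hypothetical callback: receives each matched term instead of formatting it. */
    interface TermCallback {
        void onTerm(int start, int end, String term, float score);
    }

    /** Scans the whole text and reports every word present in scoredTerms. */
    static void scan(String text, Map<String, Float> scoredTerms, TermCallback callback) {
        // Assumption: a plain word regex replaces a real Lucene analyzer here.
        Matcher m = Pattern.compile("\\w+").matcher(text);
        while (m.find()) {
            Float score = scoredTerms.get(m.group().toLowerCase(Locale.ROOT));
            if (score != null) {
                callback.onTerm(m.start(), m.end(), m.group(), score);
            }
        }
    }

    /** One possible consumer: wraps matches in <em> tags over the full text. */
    static String markUp(String text, Map<String, Float> scoredTerms) {
        StringBuilder out = new StringBuilder();
        int[] last = {0}; // end offset of the previous match
        scan(text, scoredTerms, (start, end, term, score) -> {
            out.append(text, last[0], start).append("<em>").append(term).append("</em>");
            last[0] = end;
        });
        out.append(text, last[0], text.length()); // copy the tail after the last match
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Float> terms = Map.of("lucene", 1.0f, "solr", 0.5f);
        System.out.println(markUp("Lucene and Solr index text.", terms));
    }
}
```

Because the callback only receives offsets, the matched term and a score, a different consumer could just as well record positions for a PDF annotation layer or an XML DOM rather than building an HTML string.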

Note: This project is extracted from a larger project with submodules.
