Hi folks! We are back with another post. Today we will discuss in detail a Solr feature that is used by a large number of enterprise applications. It not only improves the look of an application but also shows end users clearly why each result matched what they searched for. The feature is called “Highlighting”.
In simple words, Highlighting in Solr allows fragments of documents that match the user’s query to be included with the query response. The fragments are included in a special section of the query response (the highlighting section), and the client uses the formatting clues to determine how to present the snippets to users. A fragment is a portion of a document field that contains matches from the query; fragments are sometimes also referred to as “snippets”.
Highlighting is extremely configurable. There are many parameters for fragment sizing, formatting, ordering and backup/alternate behavior, plus further options that are hard to categorise.
How to Use?
hl
- Use this parameter to enable or disable highlighting. The default is false. If you want to use highlighting, you must set this to true.
hl.method
- The highlighting implementation to use. Acceptable values are: unified, original, fastVector. The default is original.
hl.fl
- Specifies a list of fields to highlight, either comma- or space-delimited. These must be “stored”. A wildcard of * (asterisk) can be used to match field globs, such as text_* or even * to highlight all fields where highlighting is possible. When using *, consider adding hl.requireFieldMatch=true. The following example uses the local-params syntax and the edismax parser to highlight fields in hl.fl: &hl.fl=field1 field2&hl.q={!edismax qf=$hl.fl v=$q}&hl.qparser=lucene&hl.requireFieldMatch=true (along with other applicable parameters, of course).
hl.q
- A query to use for highlighting. This parameter allows you to highlight different terms or fields than those being used to search for documents. When setting this, you might also need to set hl.qparser. The default is the value of the q parameter (already parsed).
hl.qparser
- The query parser to use for the hl.q query. It only applies when hl.q is set. The default is the value of the defType parameter, which in turn defaults to lucene.
hl.requireFieldMatch
- The default is false, in which case all query terms are highlighted in each field listed in hl.fl, no matter which fields the parsed query refers to. If set to true, only query terms aligning with the field being highlighted will, in turn, be highlighted. If the query references fields different from the field being highlighted and they have different text analysis, the query may not highlight query terms it should have, and vice versa. The analysis used is that of the field being highlighted (hl.fl), not the query fields.
hl.usePhraseHighlighter
- If set to true, the default, Solr will highlight phrase queries (and other advanced position-sensitive queries) accurately, as phrases. If false, the parts of the phrase will be highlighted everywhere instead of only where they form the given phrase.
hl.highlightMultiTerm
- If set to true, the default, Solr will highlight wildcard queries (and other MultiTermQuery subclasses). If false, they won’t be highlighted at all.
hl.snippets
- Specifies the maximum number of highlighted snippets to generate per field. It is possible for any number of snippets from zero to this value to be generated. The default is 1.
hl.fragsize
- Specifies the approximate size, in characters, of fragments to consider for highlighting. The default is 100. Using 0 indicates that no fragmenting should be considered and the whole field value should be used.
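The parameters above are ordinary request parameters, so a client can put them together like any other query string. Here is a minimal Python sketch of that. The helper name build_highlight_params, the base URL and the collection name mycollection are assumptions made up for illustration; only the parameter names come from the list above.

```python
from urllib.parse import urlencode


def build_highlight_params(query, fields, snippets=1, fragsize=100, method=None):
    """Return a URL-encoded query string that enables highlighting.

    Hypothetical helper for illustration, not part of any Solr client.
    """
    params = {
        "q": query,
        "hl": "true",                # highlighting stays off unless hl=true
        "hl.fl": ",".join(fields),   # comma-delimited list of stored fields
        "hl.snippets": snippets,     # maximum number of snippets per field
        "hl.fragsize": fragsize,     # approximate fragment size in characters
    }
    if method is not None:
        params["hl.method"] = method  # unified, original or fastVector
    return urlencode(params)


# Assumed base URL and collection name; adjust for a real deployment.
base_url = "http://localhost:8983/solr/mycollection/select"
print(base_url + "?" + build_highlight_params("apple", ["manu"], snippets=2))
```

Leaving hl.method out of the request when it is not given lets the server apply its own default, which seems safer than hard-coding one in the client.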
Highlighting in a query response
In response to a query, Solr includes highlighting data in a section separate from the documents. It is up to a client to determine how to process this response and display the highlights to users.
Using the example documents included with Solr, we can see how this might work:
In response to a query such as:
We get a response such as:
Note the two sections, docs and highlighting. The docs section contains the fields of the document requested with the fl parameter of the query (only “id”, “name”, “manu”, and “cat”).
The highlighting section includes the ID of each document and the field that contains the highlighted portion. In this example, we used the hl.fl parameter to say we wanted query terms highlighted in the “manu” field. When there is a match to the query term in that field, it will be included for each document ID in the list.
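Since the snippets live apart from the documents, the client has to join the two sections by document ID. Here is a minimal Python sketch of that join. The sample response is hypothetical: the IDs and field values are invented, and the <em> markers are assumed to be the formatting clues the highlighter returns. It only mirrors the docs and highlighting sections described above.

```python
# Hypothetical response with the two sections described above.
sample_response = {
    "response": {
        "docs": [
            {"id": "DOC1", "name": "Example player", "manu": "Apple Computer Inc.", "cat": ["electronics"]},
            {"id": "DOC2", "name": "Example cable", "manu": "Other Corp.", "cat": ["electronics"]},
        ]
    },
    "highlighting": {
        "DOC1": {"manu": ["<em>Apple</em> Computer Inc."]},
        "DOC2": {},  # no match in the highlighted field for this document
    },
}


def attach_highlights(response, field):
    """Pair each document with its snippets for one field, keyed by document ID."""
    highlighting = response.get("highlighting", {})
    merged = []
    for doc in response["response"]["docs"]:
        # A document may have no entry, or no snippets for this field.
        snippets = highlighting.get(doc["id"], {}).get(field, [])
        merged.append({"id": doc["id"], "name": doc.get("name"), "snippets": snippets})
    return merged


for row in attach_highlights(sample_response, "manu"):
    print(row["id"], row["snippets"])
```

Falling back to an empty list keeps the display code simple: a document without snippets can just be shown with its stored field value.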
So, that is it for Highlighting in Solr. We will be back with another post on Solr very soon.