Solr querying
Docs and sources:
- Query Syntax and Parsers
- Common Query Parameters
- Standard Query Parser
- DisMax Query Parser
- Extended DisMax (eDisMax) Query Parser
How do I send queries to Solr?
Basic example:
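A minimal sketch of building such a request in Python, assuming a Solr instance at `http://localhost:8983/solr` and a hypothetical collection named `cars` (neither comes from these notes). It only builds the URL of the `/select` handler; no request is sent.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, params):
    """Return the URL of a request to the collection's /select handler."""
    # A list of pairs (instead of a dict) lets a parameter such as fq repeat.
    return f"{base_url}/{collection}/select?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983/solr",  # assumed local Solr instance
    "cars",                        # hypothetical collection name
    [("q", "name:golf"), ("rows", 5)],
)
print(url)  # http://localhost:8983/solr/cars/select?q=name%3Agolf&rows=5
```

The resulting URL can be opened in a browser or fetched with any HTTP client, for example `urllib.request.urlopen(url)`.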
About QueryParsers
The query syntax is strictly dependent on the QueryParser being used. Each parser has different syntax and capabilities.
Common query parameters
Full list and description: Common Query Parameters
Some examples:
param | description | examples | defaultValue |
---|---|---|---|
defType | which QueryParser should be used | | lucene |
q | main query | depending on QueryParser | none |
fq | filter query | fq=popularity:[10 TO *]&fq=section:0 | none |
sort | result set sorting order | | score desc |
start | for pagination: offset of page start | 0 | 0 |
rows | for pagination: number of documents | 72 | 10 |
fl | fields to retrieve | | * |
debug | return debug information | | none |
timeAllowed | the amount of time, in milliseconds, allowed for a search to complete | 2000 | none |
wt | output format | | json |
echoParams | should the query parameters be included in the response | | none |
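The parameters above can be combined as in the following sketch. The helper name `page_params`, the 1-based page numbering and the field names `id` and `name` are assumptions made for illustration; only the parameter names (`q`, `start`, `rows`, `fl`, `sort`, `fq`) come from the table.

```python
from urllib.parse import urlencode


def page_params(query, page, page_size, filters=(), fields="*", sort="score desc"):
    """Build common query parameters for a 1-based page number."""
    params = [
        ("q", query),
        ("start", (page - 1) * page_size),  # offset of the first document
        ("rows", page_size),                # number of documents per page
        ("fl", fields),
        ("sort", sort),
    ]
    # Each filter becomes its own fq parameter.
    params += [("fq", f) for f in filters]
    return params


params = page_params(
    "golf",
    page=3,
    page_size=72,
    filters=["popularity:[10 TO *]", "section:0"],
    fields="id,name",  # hypothetical field names
)
print(urlencode(params))
```

A list of pairs is used instead of a dict because `fq` may appear more than once in a single request.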
Additionally, you can specify:
Lucene Query Parser (default)
Example:
Docs: Standard Query Parser
param | description | defaultValue |
---|---|---|
q | main query (mandatory); see examples below | none |
q.op | AND or OR, relation between tokens | OR |
df | which (single) field should be searched, e.g. name | none |
The main query syntax by example:
- `vw golf` - searches for these words in the field defined in `df`, and joins them using `q.op`
- `"vw golf"` - quotes mark phrases. Tokens must be next to each other.
- `gol?`, `gol*` - wildcards are supported
- `golf~2` - tilde after a word (not a phrase) - fuzzy searching, finds similar words (here up to 2 edits away)
- `"vw golf"~3` - tilde after a phrase (not a word) - proximity search, tokens `vw` and `golf` must be within 3 words of each other
- `name:vw` - possibility to define explicit fields - `<field-name>:<query>`
- `name:"vw golf"`, `name:gol?`, `name:gol*`, `name:golf~2` - it's all supported
- `name:*` - finds documents which have some value set in field `name`
- `price:[52 TO 1000]` - ranges for numeric fields (including those borders)
- `price:{52 TO 1000}` - ranges for numeric fields (EXCLUDING those borders)
- `price:{52 TO 1000]`, `price:[* TO 1000]` - also possible
- `vw^4 golf` - boosting; token `vw` is boosted (more important) four times
- `vw OR golf`, `vw AND golf`, `vw || golf`, `vw && golf` - explicitly specify the operator between tokens
- `"vw golf" OR toyota` - also possible
- `(vw AND golf) OR toyota` - also possible
- `vw NOT golf`, `vw ! golf` - excluding tokens
- `+vw -golf` - must include `vw`, cannot include `golf`
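The field and range syntax above can be assembled programmatically. The following is a small sketch; the helper names `field_query` and `range_query` are hypothetical and not part of any Solr client library. It only builds query strings and does not escape special characters in user input.

```python
def field_query(field, term):
    """Restrict a term, phrase or wildcard expression to one field."""
    return f"{field}:{term}"


def range_query(field, low="*", high="*", include_low=True, include_high=True):
    """Build a range clause.

    Square brackets include a border, curly braces exclude it;
    "*" leaves that side of the range open.
    """
    left = "[" if include_low else "{"
    right = "]" if include_high else "}"
    return f"{field}:{left}{low} TO {high}{right}"


phrase = field_query("name", '"vw golf"')
price = range_query("price", 52, 1000, include_high=False)
query = f"{phrase} AND {price}"
print(query)  # name:"vw golf" AND price:[52 TO 1000}
```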
DisMax
Example:
The main goal of DisMax was to separate the user's query from how the query should be processed. DisMax doesn't support Lucene Query Parser's syntax!
Docs: DisMax Query Parser
param | description | defaultValue |
---|---|---|
q | main query (mandatory); only basic syntax is supported: +, - and quoted phrases (see the examples below) | none |
q.alt | query which should be used if the main query is empty | none |
qf | which fields should be searched, with their weights, e.g. brand^2.5 model^0.5 | none |
mm | minimum should match; number of SHOULD MATCH words that must match the document; might be absolute or percentage | none |
pf | phrase fields; if the tokens appear in close proximity in this field, the document is boosted, e.g. title^4 description^1 | none |
ps | phrase slop; the maximum distance between tokens to form a phrase | none |
tie | tie breaker: see the docs | none |
bq | boost query: see the docs | none |
bf | boost function: see the docs | none |
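As an illustration of how these parameters fit together, here is a sketch that assembles a DisMax request. The helper names `weighted_fields` and `dismax_params` are hypothetical; the field names are the ones used as examples in the table.

```python
from urllib.parse import urlencode


def weighted_fields(weights):
    """Format {"brand": 2, "model": None} as 'brand^2 model'."""
    return " ".join(
        name if weight is None else f"{name}^{weight}"
        for name, weight in weights.items()
    )


def dismax_params(q, qf, mm=None, pf=None, ps=None):
    """Build DisMax parameters; optional ones are sent only when given."""
    params = {"defType": "dismax", "q": q, "qf": weighted_fields(qf)}
    if mm is not None:
        params["mm"] = str(mm)
    if pf is not None:
        params["pf"] = weighted_fields(pf)
    if ps is not None:
        params["ps"] = str(ps)
    return params


params = dismax_params(
    "vw golf",
    qf={"brand": 2.5, "model": 0.5},
    mm="50%",
    pf={"title": 4, "description": 1},
    ps=4,
)
print(urlencode(params))
```

Keeping the user's text in `q` and everything else in separate parameters mirrors the goal stated above: the query is separated from how it should be processed.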
The main query syntax by example:
- `q=vw golf&qf=brand^2 model` - search for words `vw` (optional) and `golf` (optional) in fields `brand` (higher priority) and `model` (lower priority)
- `q=+vw golf&qf=brand^2 model` - search for words `vw` (mandatory) and `golf` (optional) in fields `brand` (higher priority) and `model` (lower priority)
- `q=+"vw golf"&qf=brand^2 model` - search for phrase `"vw golf"` (mandatory) in fields `brand` (higher priority) and `model` (lower priority)
- `q=+vw +golf&qf=brand^2 model` - search for words `vw` (mandatory) and `golf` (mandatory) in fields `brand` (higher priority) and `model` (lower priority)
- `q=+vw +golf -toyota&qf=brand^2 model` - search for words `vw` (mandatory) and `golf` (mandatory) in fields `brand` (higher priority) and `model` (lower priority); documents cannot contain `toyota`
- `q=vw golf hatchback&qf=brand^2 model&mm=2` - search for words `vw` (optional), `golf` (optional) and `hatchback` (optional) in fields `brand` (higher priority) and `model` (lower priority); the document is returned if at least two of these optional words have been found
- `q=vw golf hatchback&qf=brand^2 model&mm=50%` - search for words `vw` (optional), `golf` (optional) and `hatchback` (optional) in fields `brand` (higher priority) and `model` (lower priority); the document is returned if at least 50% of the optional words have been found (the computed number is rounded down, so one of the three words is enough here)
- `q=vw golf&qf=brand^2 model&pf=name^5` - search for words `vw` (optional) and `golf` (optional) in fields `brand` (higher priority) and `model` (lower priority); if the phrase "vw golf" is found in field `name`, boost the document
- `q=vw golf&qf=brand^2 model&pf=name^5&ps=4` - search for words `vw` (optional) and `golf` (optional) in fields `brand` (higher priority) and `model` (lower priority); if the phrase "vw golf" is found in field `name`, boost the document; the maximum allowed distance between these words is 4
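To make the `mm` arithmetic concrete, the sketch below computes how many optional words must match for simple `mm` values. It follows the Solr documentation's description (percentages are rounded down; negative values mean "this many may be missing"), handles only plain integers and plain percentages, and is not Solr's actual implementation. The function name `required_matches` is hypothetical.

```python
def required_matches(optional_clauses, mm):
    """Number of optional clauses that must match for a simple mm value.

    Handles plain integers ("2", "-1") and plain percentages ("50%", "-25%").
    Conditional expressions such as "2<75%" are out of scope for this sketch.
    """
    mm = mm.strip()
    if mm.endswith("%"):
        percent = int(mm[:-1])
        # Round the magnitude down before applying the sign.
        amount = optional_clauses * abs(percent) // 100
        if percent < 0:
            amount = -amount
    else:
        amount = int(mm)
    # A negative amount means "this many clauses may be missing".
    result = optional_clauses + amount if amount < 0 else amount
    # Assumption: the result is clamped to the range 0..optional_clauses.
    return max(0, min(optional_clauses, result))


print(required_matches(3, "2"))    # 2
print(required_matches(3, "50%"))  # 1
```

With three optional words, `mm=50%` gives 1.5, which rounds down to a single required match, while `mm=2` requires two.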
eDisMax
Example:
Extended DisMax: a combination of DisMax and the Lucene Query Parser, with add-ons.
As a first approximation, you can think of eDisMax as DisMax, where:
- you can use `AND` and `OR` inside the `q` parameter
- you can specify a `boost` function in a better way than in DisMax
- you have some additional enhancements
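A short sketch of an eDisMax request that uses both points: boolean operators inside `q` and a `boost` function, whose value is multiplied into the score. The helper name `edismax_params` and the numeric field `popularity` are assumptions made for illustration.

```python
from urllib.parse import urlencode


def edismax_params(q, qf, boost=None):
    """Build eDisMax parameters; boost is sent only when given."""
    params = {"defType": "edismax", "q": q, "qf": qf}
    if boost is not None:
        params["boost"] = boost  # function query multiplied into the score
    return params


params = edismax_params(
    q="(vw AND golf) OR toyota",     # operators are allowed inside q
    qf="brand^2 model",
    boost="log(sum(popularity,1))",  # popularity is a hypothetical numeric field
)
print(urlencode(params))
```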
For better understanding, refer to the docs: Extended DisMax (eDisMax) Query Parser