Queries and filters have been merged. Any query clause can now be used as a query in “query context” and as a filter in “filter context”.
As a general rule, filters should be used instead of queries whenever the results do not need to be ranked by relevance.
A search request for the word ‘crime’.
curl -XPOST "https://localhost:9200/movies/_search" -d'
{
  "query": {
    "query_string": {
      "query": "crime"
    }
  }
}'
Because four movies in our index contain the word “crime” in the _all field (from the category field), we get four hits for the above query. Now, imagine that we want to limit these hits to movies released in 1962. To do that, we need to apply a filter requiring the “year” field to equal 1962. To add such a filter, we modify our search request body so that our current top-level query, the query string query, is wrapped in a query of a different type, a filtered query:
The original query wrapped in a ‘filtered’ query.
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "crime"
        }
      },
      "filter": {
        // Filter to apply to the query
      }
    }
  }
}
A filtered query is a query that has two properties, query and filter. When executed, it filters the result of the query using the filter. To finalize the query, we’ll need to add a filter requiring the year field to have the value 1962. Elasticsearch’s query DSL has a wide range of filters to choose from. For this simple case, where a certain field should match a specific value, a term filter will work well.
Adding a ‘term’ filter to the filtered query’s filter property.
"filter": {
  "term": { "year": 1962 }
}
The complete search request now looks like this:
A search request searching for the word ‘crime’ limited to documents with year == 1962.
curl -XPOST "https://localhost:9200/movies/_search" -d'
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "crime"
        }
      },
      "filter": {
        "term": { "year": 1962 }
      }
    }
  }
}'
Perform the above request and you should get a single hit, To Kill a Mockingbird, which has “Crime” in its genre property and whose year property has 1962 as value.
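The same request body can also be assembled programmatically. The following is a minimal sketch using only Python's standard library; the helper names `filtered_query`, `to_bool_query`, and `search` are made up for illustration, and the URL is simply the one from the curl examples. Because the `filtered` query is deprecated from Elasticsearch 2.0 onwards (and removed in 5.0), the sketch also shows how the same request maps onto a `bool` query with a `filter` clause.

```python
import json
import urllib.request


def filtered_query(query, filter_clause):
    """Wrap a query and a filter in a 'filtered' query (Elasticsearch 1.x syntax)."""
    return {"query": {"filtered": {"query": query, "filter": filter_clause}}}


def to_bool_query(body):
    """Rewrite a 'filtered' request body as a 'bool' query (Elasticsearch 2.0+)."""
    filtered = body["query"]["filtered"]
    bool_clauses = {"filter": filtered["filter"]}
    if "query" in filtered:
        # The scoring part of the request goes into 'must'.
        bool_clauses["must"] = filtered["query"]
    return {"query": {"bool": bool_clauses}}


def search(body, url="https://localhost:9200/movies/_search"):
    """POST the body to the search endpoint. Needs a reachable cluster (assumed URL)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = filtered_query(
    {"query_string": {"query": "crime"}},
    {"term": {"year": 1962}},
)
print(json.dumps(body, indent=2))
print(json.dumps(to_bool_query(body), indent=2))
```

Calling `search(body)` against the movies index should then return the same hit as the curl request, provided the cluster version still accepts the `filtered` query.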
The difference between queries and filters can be confusing at first, especially if we try to relate the terminology to query languages for other data sources such as SQL. In Elasticsearch (and Lucene), a query is something that determines whether documents match given criteria and produces a relevance score that indicates how well each document matches. The score is a positive number used to rank hits; it is not normalized to a fixed range. For instance, if we tell Elasticsearch to index “The quick brown fox” it will by default index that as [“the”, “quick”, “brown”, “fox”]. If we search using a query that looks for the term “brown”, the indexed string will match, and its score will reflect that only one of its four terms matches the query.
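The default analysis step described above can be approximated in a few lines of Python. This is only a rough stand-in for the real analyzer (the function name `simple_analyze` is made up for illustration), but it shows why a search for the term “brown” matches the indexed string while the capitalized “Brown” is not among the indexed terms.

```python
import re


def simple_analyze(text):
    """Lowercase the text and split it into word tokens (rough approximation)."""
    return re.findall(r"\w+", text.lower())


tokens = simple_analyze("The quick brown fox")
print(tokens)
print("brown" in tokens)  # the lowercased term is present in the index
print("Brown" in tokens)  # the original casing is not stored as a term
```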
A filter, on the other hand, skips scoring. It only determines whether a document matches the filter or not. Using filters, we can limit the search result to documents that match given criteria without affecting the score. Also, as Elasticsearch doesn’t have to care about scoring for filters, they are faster and can be cached. Therefore, a general rule of thumb is to always use filters unless you need the results to be sorted by relevance according to the query.
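The distinction can be illustrated with a toy in-memory model. Everything in the sketch below is an assumption made for illustration: the three documents are invented, the "score" is a plain term count rather than Lucene's actual scoring, and `lru_cache` merely stands in for the idea that a yes/no filter result can be reused.

```python
from functools import lru_cache

# Invented sample documents, keyed by document ID.
DOCS = {
    1: {"genres": "crime drama", "year": 1962},
    2: {"genres": "crime crime thriller", "year": 1972},
    3: {"genres": "adventure", "year": 1962},
}


@lru_cache(maxsize=None)
def term_filter(field, value):
    """Filter: a yes/no decision per document, returned as a set of IDs."""
    return frozenset(
        doc_id for doc_id, doc in DOCS.items() if doc[field] == value
    )


def term_query(field, term):
    """Query: matching documents plus a score (here just the term count)."""
    scores = {}
    for doc_id, doc in DOCS.items():
        count = doc[field].split().count(term)
        if count:
            scores[doc_id] = float(count)
    return scores


def apply_filter(scores, allowed_ids):
    """Drop hits that fail the filter; surviving scores are left unchanged."""
    return {doc_id: s for doc_id, s in scores.items() if doc_id in allowed_ids}


scores = term_query("genres", "crime")
hits = apply_filter(scores, term_filter("year", 1962))
print(scores)
print(hits)
```

The filter removes document 2 from the result but does not change the score of document 1, which is the behaviour the paragraph above describes.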
If a query is not specified, it defaults to the match_all query. This means that the filtered query can be used to wrap just a filter, so that it can be used wherever a query is expected.
In the earlier filtered query example, we limited the results of a query_string query using a filter. But what if all we want to do is apply a filter, for instance to find all movies from 1962? In such cases, we still use the query property in the search request body, which expects a query. In other words, we can’t just add a filter; we need to wrap it in some sort of query. One solution is to modify our current search request, replacing the query string query in the filtered query with a match_all query, which is a query that simply matches everything:
Search request that only filters by year == 1962.
curl -XPOST "https://localhost:9200/movies/_search" -d'
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": { "year": 1962 }
      }
    }
  }
}'
Another, shorter, option is to use a constant score query:
Again filtering by year == 1962 but this time using a different type of query that can wrap a filter.
curl -XPOST "https://localhost:9200/movies/_search" -d'
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "year": 1962 }
      }
    }
  }
}'
Try both of these requests and you’ll see the same result for each: two movies from 1962.
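To make the comparison concrete, the two filter-only request bodies can be built side by side. This is a small sketch with hypothetical helper names (`filter_only_filtered` and `filter_only_constant_score`); it only constructs and prints the JSON and does not contact a cluster.

```python
import json


def filter_only_filtered(filter_clause):
    """Filter-only search using a 'filtered' query around match_all."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": filter_clause,
            }
        }
    }


def filter_only_constant_score(filter_clause):
    """Filter-only search using the shorter 'constant_score' wrapper."""
    return {"query": {"constant_score": {"filter": filter_clause}}}


year_filter = {"term": {"year": 1962}}
long_form = filter_only_filtered(year_filter)
short_form = filter_only_constant_score(year_filter)

for body in (long_form, short_form):
    print(json.dumps(body))
```

Both bodies carry the same term filter, so they select the same documents. With `constant_score`, every matching document receives the same score, which is fine here because no relevance ranking is being asked for.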
Using only filters, or rather a query composed only of filters, is very common outside of use cases related to free text search. In such cases, we usually want to sort the results by something other than relevance.
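As an illustration of sorting filter-only results, the sketch below combines a filter with the search body's top-level `sort` property. The helper name `filter_and_sort` is hypothetical, and the `range` filter over the 1960s is an assumption chosen so that sorting by the `year` field has a visible effect; neither comes from the examples above.

```python
import json


def filter_and_sort(filter_clause, sort_field, order="asc"):
    """Build a filter-only search whose hits are ordered by a field value."""
    return {
        "query": {"constant_score": {"filter": filter_clause}},
        "sort": [{sort_field: {"order": order}}],
    }


# Assumed example: all movies from the 1960s, newest first.
body = filter_and_sort(
    {"range": {"year": {"gte": 1960, "lte": 1969}}},
    sort_field="year",
    order="desc",
)
print(json.dumps(body, indent=2))
```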
Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.