Mindmajix

Explain about Filtered Query Search | Elasticsearch

Filters

Queries and filters have been merged. Any query clause can now be used as a query in “query context” and as a filter in “filter context”.

As a general rule, filters should be used instead of queries:

  • for binary yes/no searches
  • for queries on exact values

Filtering

A search request for the word ‘crime’.

curl -XPOST "http://localhost:9200/movies/_search" -d'
{
  "query": {
    "query_string": {
      "query": "crime"
    }
  }
}'

As we have four movies in our index containing the word “crime” in the _all field (from the category field), we get four hits for the above query. Now, imagine that we want to limit these hits to movies released in 1962. To do that, we need to apply a filter requiring the “year” field to equal 1962. To add such a filter, we modify our search request body so that our current top-level query, the query string query, is wrapped in a query of a different type, a filtered query:

The original query wrapped in a ‘filtered’ query.
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "crime"
        }
      },
      "filter": {
        // Filter to apply to the query
      }
    }
  }
}

A filtered query is a query that has two properties, query and filter. When executed, it filters the result of the query using the filter. To finalize the query, we’ll need to add a filter requiring the year field to have the value 1962. Elasticsearch’s query DSL has a wide range of filters to choose from. For this simple case, where a certain field should match a specific value, a term filter will work well.

Adding a ‘term’ filter to the filtered query’s filter property.

"filter": {
 "term": { "year": 1962 }
 }

The complete search request now looks like this:

A search request searching for the word ‘crime’ limited to documents with year == 1962.

curl -XPOST "http://localhost:9200/movies/_search" -d'
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "crime"
        }
      },
      "filter": {
        "term": { "year": 1962 }
      }
    }
  }
}'

Perform the above request and you should get a single hit, To Kill a Mockingbird, which has “Crime” in its genre property and whose year property has 1962 as value.
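The same request body can also be assembled programmatically, which makes the two parts of a filtered query easier to see. The following is only a minimal sketch: the helper name filtered_query is made up for this illustration and is not part of any Elasticsearch client, and the code only builds and prints the JSON body without sending it anywhere.

```python
import json


def filtered_query(query, filter_clause):
    """Build a search body that wraps `query` in a `filtered` query.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    return {"query": {"filtered": {"query": query, "filter": filter_clause}}}


# The scored part: free-text search for the word "crime".
text_query = {"query_string": {"query": "crime"}}

# The unscored part: exact match on the year field.
year_filter = {"term": {"year": 1962}}

body = filtered_query(text_query, year_filter)

# Print the JSON that would be passed to curl with -d.
print(json.dumps(body, indent=2))
```

Keeping the query and the filter as separate values makes it simple to swap either one without touching the other.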

Queries and filters

The difference between queries and filters can be confusing at first, especially if we try to relate the terminology to query languages for other data sources such as SQL. In Elasticsearch (and Lucene), a query determines whether documents match given criteria and also produces a relevance score (a positive floating-point number, returned as _score) that indicates how well each document matches. For instance, if we tell Elasticsearch to index “The quick brown fox” it will by default index that as [“the”, “quick”, “brown”, “fox”]. If we search using a query that looks for the term “brown”, the indexed string will match, and its score will reflect that only one of its words matches.

A filter, on the other hand, skips scoring. It only determines whether a document matches the filter or not. Using filters, we can limit the search result to documents that match given criteria without affecting the score. Also, as Elasticsearch doesn’t have to care about scoring for filters, they are faster and can be cached. Therefore, a general rule of thumb is to always use filters unless you need the results to be sorted by relevance according to the query.
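The distinction can be illustrated with a toy model. The sketch below is not how Lucene or Elasticsearch compute relevance: the scoring function simply counts matching terms, and the documents, titles and years are invented for the illustration. It only shows that a query yields a number per document while a filter yields yes or no and leaves the scores untouched.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a rough stand-in for analysis)."""
    return text.lower().split()


def toy_score(query_text, doc_text):
    """Toy relevance: how many query terms occur in the document.

    A score of 0 means the document does not match the query at all.
    """
    doc_terms = set(tokenize(doc_text))
    return sum(1 for term in tokenize(query_text) if term in doc_terms)


def toy_filter(field_value, wanted):
    """A filter only answers yes or no; it never produces a score."""
    return field_value == wanted


# Invented sample documents (not the movies index used in this article).
docs = [
    {"title": "The quick brown fox", "year": 1962},
    {"title": "A brown bear and a brown dog", "year": 1972},
    {"title": "The lazy dog", "year": 1962},
]

# Query only: every matching document gets a score, best match first.
hits = sorted(
    ((toy_score("quick brown", d["title"]), d["title"]) for d in docs
     if toy_score("quick brown", d["title"]) > 0),
    reverse=True,
)

# Query plus filter: the filter removes documents but does not change scores.
filtered_hits = [
    (score, title)
    for score, title in hits
    if toy_filter(next(d["year"] for d in docs if d["title"] == title), 1962)
]

print(hits)
print(filtered_hits)
```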

Filtering without a query

If a query is not specified, it defaults to the match_all query. This means that the filtered query can be used to wrap just a filter, so that it can be used wherever a query is expected.

In the above example, we limit the results of a query_string query using a filter. But what if all we want to do is apply a filter, for instance to find all movies from 1962? In such cases, we still use the query property in the search request body, which expects a query. In other words, we can’t just add a filter; we need to wrap it in some sort of query. One solution is to modify our current search request, replacing the query string query in the filtered query with a match_all query, which is a query that simply matches everything. Like this:

Search request that only filters by year == 1962.

curl -XPOST "http://localhost:9200/movies/_search" -d'
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": { "year": 1962 }
      }
    }
  }
}'

Another, shorter, option is to use a constant score query:

Again filtering by year == 1962 but this time using a different type of query that can wrap a filter.

curl -XPOST "http://localhost:9200/movies/_search" -d'
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "year": 1962 }
      }
    }
  }
}'

Try both of these requests out and you’ll see the same result for both, two movies from 1962.
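To compare the filter-only forms side by side, the sketch below builds each request body around the same term filter. It only constructs and prints the JSON; nothing is sent to a server. The third form is an addition to what the article covers: the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, and on those versions a bool query with a filter clause is the usual way to express the same thing.

```python
import json

# The one filter shared by all three forms.
year_filter = {"term": {"year": 1962}}

# Form 1: filtered query with an explicit match_all query.
filtered_body = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": year_filter,
        }
    }
}

# Form 2: constant_score query wrapping the filter directly.
constant_score_body = {"query": {"constant_score": {"filter": year_filter}}}

# Form 3: bool query with only a filter clause (Elasticsearch 2.0 and later).
bool_body = {"query": {"bool": {"filter": year_filter}}}

for name, body in [
    ("filtered", filtered_body),
    ("constant_score", constant_score_body),
    ("bool", bool_body),
]:
    print(name)
    print(json.dumps(body, indent=2))
```

All three select the same documents; they differ only in the wrapper around the filter and in the constant score each hit receives.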

Using only filters, or rather a query composed only of filters, is very common outside of use cases related to free text search. In such cases, we usually want to sort the results in other ways than by relevance.

