Mindmajix

How to Combine Aggregations and Filters in Elasticsearch

A natural extension to aggregation scoping is filtering. Because the aggregation operates in the context of the query scope, any filter applied to the query will also apply to the aggregation.

Combining aggregations and filters

Aggregations can be used to visualize aggregated values from the search results and to let users filter by them. If we did something similar for our movies, it might look something like this:

[Screenshot: elasticsearch aggregation filter]

To create a page such as the one above, we’d use a search request such as this:

A search request for all movies and terms aggregations for directors and genres.

curl -XPOST "http://localhost:9200/movies/movie/_search" -d'
{
  "aggregations": {
    "directors": {
      "terms": {
        "field": "director.original"
      }
    },
    "genres": {
      "terms": {
        "field": "genres.original"
      }
    }
  }
}'
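To render the filter lists on the page, the buckets have to be read out of the response. The following is a minimal sketch, assuming the response has already been parsed into a Python dict. A terms aggregation returns its results under "aggregations", with one "buckets" list per aggregation and a "key" and "doc_count" in each bucket. The function name facets_from_response and the sample response are hypothetical and only there for illustration.

```python
def facets_from_response(response):
    """Map each aggregation name to a list of (term, doc_count) pairs."""
    facets = {}
    for name, agg in response.get("aggregations", {}).items():
        facets[name] = [(b["key"], b["doc_count"]) for b in agg.get("buckets", [])]
    return facets


# Hypothetical response fragment, made up for illustration only.
sample_response = {
    "aggregations": {
        "directors": {
            "buckets": [{"key": "Francis Ford Coppola", "doc_count": 2}]
        },
        "genres": {
            "buckets": [
                {"key": "Drama", "doc_count": 2},
                {"key": "Crime", "doc_count": 1},
            ]
        },
    }
}

facets = facets_from_response(sample_response)
for name, buckets in facets.items():
    print(name, buckets)
```

Each (term, doc_count) pair can then be rendered as one clickable filter entry with its count next to it.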

Now, what if a user wants to filter by a director? On the web development side of things, we’d send the director name back to the server as a parameter of some sort. Once on the server, we’d need to modify our request to Elasticsearch to add a filter, like this:

The same request as the previous one, only this time with filtering for movies by a specific director.

curl -XPOST "http://localhost:9200/movies/movie/_search" -d'
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "director.original": "Francis Ford Coppola"
        }
      }
    }
  },
  "aggregations": {
    "directors": {
      "terms": {
        "field": "director.original"
      }
    },
    "genres": {
      "terms": {
        "field": "genres.original"
      }
    }
  }
}'
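On the server, the request body could be assembled from the incoming parameter roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: the function name build_search_body and the use_bool flag are hypothetical. The flag is there because the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, where a bool query with a filter clause takes its place.

```python
import json


def build_search_body(director=None, use_bool=False):
    """Build the search body, optionally filtered by one director.

    use_bool=False produces the "filtered" query used in this article;
    use_bool=True produces the bool/filter form for versions where the
    "filtered" query is no longer available.
    """
    body = {
        "aggregations": {
            "directors": {"terms": {"field": "director.original"}},
            "genres": {"terms": {"field": "genres.original"}},
        }
    }
    if director is not None:
        term_filter = {"term": {"director.original": director}}
        if use_bool:
            body["query"] = {"bool": {"filter": term_filter}}
        else:
            body["query"] = {
                "filtered": {"query": {"match_all": {}}, "filter": term_filter}
            }
    return body


# Print the body that would be sent as the request payload.
print(json.dumps(build_search_body("Francis Ford Coppola"), indent=2))
```

With no director selected, the function returns only the aggregations, which matches the first, unfiltered request.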

With the filtered response from Elasticsearch, we rebuild the web page based on the new response:

[Screenshot: the page rebuilt from the filtered response]

As the search response now only contains the two movies directed by Francis Ford Coppola, only two hits will be shown. Also, because aggregations are calculated over the document set that the query generates, the filters in the left part of the page have changed as well. Only the genres and directors found in the movies by Francis Ford Coppola are shown.
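The effect can be imitated with a small in-memory model. The sketch below is not how Elasticsearch computes aggregations internally. It only counts terms over whatever documents are passed in, which is enough to show why the buckets shrink once a filter narrows the document set. The sample documents and the helper name terms_counts are hypothetical.

```python
from collections import Counter

# Hypothetical sample documents, made up for illustration only.
MOVIES = [
    {"title": "The Godfather", "director": "Francis Ford Coppola", "genres": ["Crime", "Drama"]},
    {"title": "Apocalypse Now", "director": "Francis Ford Coppola", "genres": ["Drama", "War"]},
    {"title": "Kill Bill: Vol. 1", "director": "Quentin Tarantino", "genres": ["Action", "Crime"]},
]


def terms_counts(docs, field):
    """Count how many of the given documents contain each term in `field`."""
    counts = Counter()
    for doc in docs:
        values = doc[field]
        if isinstance(values, str):
            values = [values]  # treat a single value like a one-element list
        counts.update(values)
    return dict(counts)


# No filter: buckets are computed over every document.
all_directors = terms_counts(MOVIES, "director")

# With a director filter: buckets are computed only over the matching documents.
coppola_only = [m for m in MOVIES if m["director"] == "Francis Ford Coppola"]
filtered_directors = terms_counts(coppola_only, "director")
filtered_genres = terms_counts(coppola_only, "genres")

print(all_directors)
print(filtered_directors)
print(filtered_genres)
```

In this model the other director and the "Action" genre disappear after filtering, because no document in the filtered set carries those terms.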

Often this is the desired behavior, letting the aggregations reflect the result of applied queries and filters. However, sometimes it’s not. For instance, what if we want to allow users to filter by multiple directors? In such cases, we’d still want buckets for the other directors, even though there are no documents with them in the director field that match the current query.

To keep those buckets, we can add a min_doc_count parameter with a value of zero to our aggregations, like this:

Including empty buckets by setting the min_doc_count parameter in the aggregations.

curl -XPOST "http://localhost:9200/movies/movie/_search" -d'
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "director.original": "Francis Ford Coppola"
        }
      }
    }
  },
  "aggregations": {
    "directors": {
      "terms": {
        "field": "director.original",
        "min_doc_count": 0
      }
    },
    "genres": {
      "terms": {
        "field": "genres.original",
        "min_doc_count": 0
      }
    }
  }
}'

The min_doc_count parameter controls the minimum number of documents that must match a term for a terms aggregation to create a bucket for it. The default value is one. By setting it to zero, buckets will be created for terms even when no document in the search results has that term. For our page, this would mean that other genres and directors would still be listed:

[Screenshot: the page with the other genres and directors still listed]
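For the multiple-director case mentioned earlier, the request body might be built along these lines. This is a sketch, assuming a terms filter (which matches any of several values) in place of the single-value term filter; the function name build_facet_body is hypothetical. It keeps the filtered query used throughout this article. On versions where that query is unavailable, the same filter can go into the filter clause of a bool query instead.

```python
def build_facet_body(directors=(), min_doc_count=0):
    """Build a search body that filters by any of the selected directors
    while keeping empty buckets in both terms aggregations."""
    fields = (("directors", "director.original"), ("genres", "genres.original"))
    body = {
        "aggregations": {
            name: {"terms": {"field": field, "min_doc_count": min_doc_count}}
            for name, field in fields
        }
    }
    selected = list(directors)
    if selected:
        body["query"] = {
            "filtered": {
                "query": {"match_all": {}},
                # "terms" matches documents whose field holds any listed value.
                "filter": {"terms": {"director.original": selected}},
            }
        }
    return body


body = build_facet_body(["Francis Ford Coppola", "Quentin Tarantino"])
```

Because min_doc_count is zero, directors that are not selected still come back as buckets with a count of zero, so the page can keep offering them as additional filters.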

