
Combine Aggregations & Filters in ElasticSearch


A natural extension to aggregation scoping is filtering. Because the aggregation operates in the context of the query scope, any filter applied to the query will also apply to the aggregation.

COMBINING AGGREGATIONS AND FILTERS

Aggregations can be used for visualizing aggregated values from the search results and to allow users to filter by them. If we were to do something similar for our movies, it might look something like this:

[Figure: aggregations and filters]

To build a page like the one above, we’d use a search request such as this:
A search request for all movies and terms aggregations for directors and genres.


curl -XPOST "https://localhost:9200/movies/movie/_search" -d'
{
  "aggregations": {
    "directors": {
      "terms": {
        "field": "director.original"
      }
    },
    "genres": {
      "terms": {
        "field": "genres.original"
      }
    }
  }
}'

Now, what if a user wants to filter by a director? On the web side, we’d send the director name back to the server as a request parameter. On the server, we’d then modify our request to ElasticSearch to add a filter, like this:
The same request as the previous one, only this time with filtering for movies by a specific director.

curl -XPOST "https://localhost:9200/movies/movie/_search" -d'
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "director.original": "Francis Ford Coppola"
        }
      }
    }
  },
  "aggregations": {
    "directors": {
      "terms": {
        "field": "director.original"
      }
    },
    "genres": {
      "terms": {
        "field": "genres.original"
      }
    }
  }
}'
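
The server-side step described above, turning an optional director parameter into a filter, could be sketched roughly as follows. This is a minimal illustration in Python, not the code behind the page in this post: the function name build_search_body and its director parameter are hypothetical, and it assumes an ElasticSearch version that accepts the filtered query used in the curl request.

```python
import json


def build_search_body(director=None):
    """Build the search request body, optionally filtered by one director.

    Hypothetical helper: the name and signature are illustrative only.
    """
    body = {
        "aggregations": {
            "directors": {"terms": {"field": "director.original"}},
            "genres": {"terms": {"field": "genres.original"}},
        }
    }
    if director:
        # Same shape as the curl request: a filtered query wrapping match_all,
        # with a term filter on the director field.
        body["query"] = {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"term": {"director.original": director}},
            }
        }
    return body


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of the _search request.
    print(json.dumps(build_search_body("Francis Ford Coppola"), indent=2))
```

When no director is passed, the function returns the same body as the first, unfiltered request, so one code path can serve both the initial page and the filtered page.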


With the filtered response from ElasticSearch, we rebuild the web page based on the new response:

[Figure: the rebuilt web page]

As the search response now contains only the two movies directed by Francis Ford Coppola, only two hits are shown. And because aggregations are calculated over the document set that the query produces, the filters in the left part of the page have changed as well: only the genres and directors found in the movies by Francis Ford Coppola are shown.
Often this is the desired behavior, letting the aggregations reflect the result of the applied queries and filters. Sometimes, however, it’s not. For instance, what if we want to allow users to filter by multiple directors? Then we’d still want buckets for the other directors, even though no document matching the current query has them in the director field.
To get those buckets, we can add a min_doc_count parameter with a value of zero to our aggregations, like this:
Including empty buckets by setting the min_doc_count parameter in the aggregations.

 

curl -XPOST "https://localhost:9200/movies/movie/_search" -d'
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "director.original": "Francis Ford Coppola"
        }
      }
    }
  },
  "aggregations": {
    "directors": {
      "terms": {
        "field": "director.original",
        "min_doc_count": 0
      }
    },
    "genres": {
      "terms": {
        "field": "genres.original",
        "min_doc_count": 0
      }
    }
  }
}'

The min_doc_count parameter controls the minimum number of documents that must match a term for a terms aggregation to create a bucket for it. The default value is one. By setting it to zero, buckets will be created for terms even though no document in the search results has that term. For our page, this means that the other genres and directors would still be listed, each with a document count of zero.
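
To make the effect of min_doc_count concrete, here is a toy in-memory model of a terms aggregation. It is only a sketch of the behavior described above, not how ElasticSearch computes buckets internally; the function terms_buckets, the MOVIES sample documents, and the simplified field names are all made up for illustration.

```python
from collections import Counter

# Illustrative sample documents, not the data set behind the page in this post.
MOVIES = [
    {"title": "The Godfather", "director": ["Francis Ford Coppola"],
     "genres": ["Crime", "Drama"]},
    {"title": "Apocalypse Now", "director": ["Francis Ford Coppola"],
     "genres": ["Drama", "War"]},
    {"title": "Kill Bill: Vol. 1", "director": ["Quentin Tarantino"],
     "genres": ["Action", "Crime", "Thriller"]},
]


def terms_buckets(indexed_docs, matching_docs, field, min_doc_count=1):
    """Toy terms aggregation: candidate terms come from every indexed
    document, while counts come only from documents matching the query."""
    counts = Counter()
    for doc in matching_docs:
        for term in doc.get(field, []):
            counts[term] += 1
    candidates = {term for doc in indexed_docs for term in doc.get(field, [])}
    buckets = [
        {"key": term, "doc_count": counts[term]}
        for term in candidates
        if counts[term] >= min_doc_count
    ]
    # Highest count first, ties broken alphabetically for stable output.
    return sorted(buckets, key=lambda b: (-b["doc_count"], b["key"]))


if __name__ == "__main__":
    # Emulate the director filter from the request above.
    hits = [m for m in MOVIES if "Francis Ford Coppola" in m["director"]]
    print(terms_buckets(MOVIES, hits, "director"))
    print(terms_buckets(MOVIES, hits, "director", min_doc_count=0))
```

With the default of one, only Francis Ford Coppola gets a director bucket in this model; with zero, Quentin Tarantino appears too, with a doc_count of 0, which is what lets the page keep listing the other directors.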
