Elasticsearch Post Filter Aggregation

Post filter

The post_filter is applied to the search hits at the very end of a search request, after aggregations have already been calculated. Its purpose is best explained by example:
Imagine that you are selling shirts, and the user has specified two filters: color:red and brand:gucci. You only want to show them red shirts made by Gucci in the search results. Normally you would do this with a bool query:

curl -XGET 'localhost:9200/shirts/_search' -H 'Content-Type: application/json' -d '
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red"   }},
        { "term": { "brand": "gucci" }}
      ]
    } } }'

However, you would also like to use faceted navigation to display a list of other options that the user could click on. Perhaps you have a model field that would allow the user to limit their search results to red Gucci t-shirts or dress shirts.
This can be done with a terms aggregation:

curl -XGET 'localhost:9200/shirts/_search' -H 'Content-Type: application/json' -d '
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red"   }},
        { "term": { "brand": "gucci" }}
      ]
    } },
  "aggs": {
    "models": {
      "terms": { "field": "model" }
    } } }'

The models aggregation returns the most popular models of red shirts by Gucci.

But perhaps you would also like to tell the user how many Gucci shirts are available in other colors. If you just add a terms aggregation on the color field, you will only get back the color red, because your query returns only red shirts by Gucci.

Instead, you want to include shirts of all colors during aggregation, then apply the color filter only to the search results. This is the purpose of the post_filter:

curl -XGET 'localhost:9200/shirts/_search' -H 'Content-Type: application/json' -d '
{
  "query": {
    "bool": {
      "filter": {
        "term": { "brand": "gucci" }
      } } },
  "aggs": {
    "colors": {
      "terms": { "field": "color" }
    },
    "color_red": {
      "filter": {
        "term": { "color": "red" }
      },
      "aggs": {
        "models": {
          "terms": { "field": "model" }
        } } } },
  "post_filter": {
    "term": { "color": "red" }
  } }'
  • The main query now finds all shirts by Gucci, regardless of color.
  • The colors agg returns popular colors for shirts by Gucci.
  • The color_red agg limits the models sub-aggregation to red Gucci shirts.
  • Finally, the post_filter removes colors other than red from the search hits.
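
The order of operations described above can be illustrated with a small in-memory model. This is only a sketch of the semantics, not how Elasticsearch is implemented: the SHIRTS list, the term helper, and the search function are all hypothetical names invented for the illustration, and no cluster is involved.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the shirts index (made-up data).
SHIRTS = [
    {"brand": "gucci", "color": "red", "model": "t-shirt"},
    {"brand": "gucci", "color": "red", "model": "dress"},
    {"brand": "gucci", "color": "blue", "model": "t-shirt"},
    {"brand": "gucci", "color": "green", "model": "dress"},
    {"brand": "prada", "color": "red", "model": "t-shirt"},
]


def term(field, value):
    """Return a predicate that mimics an exact-match term filter."""
    return lambda doc: doc[field] == value


def search(docs, query, post_filter):
    """Model the request flow: query, then aggregations, then post filter."""
    # 1. The main query decides which documents the aggregations see.
    matched = [d for d in docs if query(d)]
    # 2. Aggregations are computed over everything the query matched.
    colors = Counter(d["color"] for d in matched)
    # The color_red agg filters first, then counts models.
    red_models = Counter(d["model"] for d in matched if d["color"] == "red")
    # 3. The post filter narrows only the hits, after aggregation.
    hits = [d for d in matched if post_filter(d)]
    return {
        "hits": hits,
        "colors": dict(colors),
        "color_red_models": dict(red_models),
    }


result = search(SHIRTS, term("brand", "gucci"), term("color", "red"))
print(result["colors"])      # counts for every Gucci color, not just red
print(len(result["hits"]))   # only the red Gucci shirts remain as hits
```

In this model the colors counts still include blue and green because they are computed before the post filter runs, while the hits contain red Gucci shirts only.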

Performance consideration

Use a post_filter only if you need to filter search results and aggregations differently. Sometimes people use post_filter for regular searches, which is a mistake.
Because the post_filter runs after the query, any performance benefit of filtering (such as caching) is lost completely.
The post_filter should be used only in combination with aggregations, and only when you need differential filtering.

Only use post_filter when needed

In early versions of Elasticsearch the post_filter parameter was named filter, and older releases still accept filter as an alias for backward compatibility. The name was changed for a reason. While it is certainly possible, and more convenient, to use post_filter instead of the query parameter when creating a request that should only filter the results, it does not perform as well as putting the filter in the query. So feel free to use post_filter while debugging even if you don't strictly need it, but against a production cluster use it only when you actually need it.
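
One way to apply this rule in client code is to decide where a filter goes when the request body is built. The helper below is a minimal sketch under that assumption: build_search_body and its parameter names are hypothetical, and it only produces a plain dict that could be serialized as the JSON body of a _search request.

```python
import json


def build_search_body(query_filters, hit_only_filter=None, aggs=None):
    """Build a search request body as a plain dict (hypothetical helper).

    query_filters: clauses that restrict both hits and aggregations.
    hit_only_filter: a clause intended to restrict the hits only.
    aggs: aggregation definitions, or None for a plain search.
    """
    filters = list(query_filters)
    body = {}
    if hit_only_filter is not None and not aggs:
        # No aggregations: nothing needs the wider document set,
        # so the clause belongs in the main query.
        filters.append(hit_only_filter)
    elif hit_only_filter is not None:
        # Aggregations need the unfiltered set: narrow the hits afterwards.
        body["post_filter"] = hit_only_filter
    body["query"] = {"bool": {"filter": filters}}
    if aggs:
        body["aggs"] = aggs
    return body


brand = {"term": {"brand": "gucci"}}
color = {"term": {"color": "red"}}

# Plain search: both clauses end up in the query, no post_filter is emitted.
plain = build_search_body([brand], color)

# Faceted search: the color clause moves to post_filter.
faceted = build_search_body(
    [brand], color, aggs={"colors": {"terms": {"field": "color"}}}
)
print(json.dumps(faceted, indent=2))
```

The design choice is that post_filter is emitted only when aggregations are present, so a regular search never pays for it by accident.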

Last updated: 03 Apr 2023
About Author

Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.
