Overview of Elasticsearch Aggregations

Bucket Script Aggregation

This functionality is experimental and may be changed or removed completely in a future release.
A parent pipeline aggregation that executes a script to perform per-bucket computations on specified metrics in the parent multi-bucket aggregation. The specified metrics must be numeric, and the script must return a numeric value.

Syntax

A bucket_script aggregation looks like this in isolation:

{
    "bucket_script": {
        "buckets_path": {
            "my_var1": "the_sum", 
            "my_var2": "the_value_count"
        },
        "script": "my_var1 / my_var2"
    }
}
   

Here, my_var1 is the name of the script variable bound to this buckets path, and the_sum is the path to the metric whose value that variable takes.

bucket_script Parameters

Parameter Name | Description | Required | Default Value
script | The script to run for this aggregation. The script can be inline, file or indexed. | Required |
buckets_path | A map of script variables and their associated paths to the buckets we wish to use for the variable | Required |
gap_policy | The policy to apply when gaps are found in the data | Optional | skip
format | Format to apply to the output value of this aggregation | Optional | null
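
To make the roles of buckets_path and gap_policy concrete, here is a minimal Python sketch that emulates how a bucket_script pass might evaluate a script over buckets that have already been computed. It is not Elasticsearch code: the function name, the use of None to mark a gap, and the Python callable standing in for the script are all assumptions made for illustration. It assumes the two gap policies skip (emit nothing for a bucket with a missing input) and insert_zeros (treat a missing input as 0).

```python
import math

def bucket_script(buckets, buckets_path, script, gap_policy="skip"):
    """Emulate a bucket_script pass over already-computed buckets.

    buckets      -- list of dicts mapping metric name -> numeric value (or None)
    buckets_path -- dict mapping script variable name -> metric name
    script       -- Python callable standing in for the script (an assumption)
    gap_policy   -- "skip" or "insert_zeros"
    """
    results = []
    for bucket in buckets:
        variables = {}
        skip = False
        for var, path in buckets_path.items():
            value = bucket.get(path)
            missing = value is None or math.isnan(value)
            if missing and gap_policy == "skip":
                # With "skip", a bucket with a missing input yields no output.
                skip = True
                break
            # With "insert_zeros", a missing input is treated as 0.
            variables[var] = 0.0 if missing else value
        results.append(None if skip else script(**variables))
    return results

# Hypothetical data: the second bucket has a gap in "the_sum".
buckets = [
    {"the_sum": 50.0, "the_value_count": 5},
    {"the_sum": None, "the_value_count": 2},
]
paths = {"my_var1": "the_sum", "my_var2": "the_value_count"}
ratio = lambda my_var1, my_var2: my_var1 / my_var2

skipped = bucket_script(buckets, paths, ratio)                 # [10.0, None]
zeroed = bucket_script(buckets, paths, ratio, "insert_zeros")  # [10.0, 0.0]
print(skipped, zeroed)
```

The variable names in paths match the ones used in the syntax example, so the callable plays the part of "my_var1 / my_var2".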

The following snippet calculates t-shirt sales as a percentage of total sales for each month:

{
    "aggs": {
        "sales_per_month": {
            "date_histogram": {
                "field": "date",
                "interval": "month"
            },
            "aggs": {
                "total_sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "t-shirts": {
                    "filter": {
                        "term": {
                            "type": "t-shirt"
                        }
                    },
                    "aggs": {
                        "sales": {
                            "sum": {
                                "field": "price"
                            }
                        }
                    }
                },
                "t-shirt-percentage": {
                    "bucket_script": {
                        "buckets_path": {
                            "tShirtSales": "t-shirts>sales",
                            "totalSales": "total_sales"
                        },
                        "script": "tShirtSales / totalSales * 100"
                    }
                }
            }
        }
    }
}
The response might look like this:

{
    "aggregations": {
        "sales_per_month": {
            "buckets": [
                {
                    "key_as_string": "2015/01/01 00:00:00",
                    "key": 1420070400000,
                    "doc_count": 3,
                    "total_sales": {
                        "value": 50
                    },
                    "t-shirts": {
                        "doc_count": 2,
                        "sales": {
                            "value": 10
                        }
                    },
                    "t-shirt-percentage": {
                        "value": 20
                    }
                },
                {
                    "key_as_string": "2015/02/01 00:00:00",
                    "key": 1422748800000,
                    "doc_count": 2,
                    "total_sales": {
                        "value": 60
                    },
                    "t-shirts": {
                        "doc_count": 1,
                        "sales": {
                            "value": 15
                        }
                    },
                    "t-shirt-percentage": {
                        "value": 25
                    }
                },
                {
                    "key_as_string": "2015/03/01 00:00:00",
                    "key": 1425168000000,
                    "doc_count": 2,
                    "total_sales": {
                        "value": 40
                    },
                    "t-shirts": {
                        "doc_count": 1,
                        "sales": {
                            "value": 20
                        }
                    },
                    "t-shirt-percentage": {
                        "value": 50
                    }
                }
            ]
        }
    }
}
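
As a quick sanity check on the numbers in this response, the following Python sketch recomputes the t-shirt-percentage value for each bucket from the sibling metrics. The buckets are copied from the response and trimmed to the fields the script reads. The resolve helper is a hypothetical stand-in for how a buckets_path such as "t-shirts>sales" is followed, not part of any Elasticsearch client.

```python
import math

# The three monthly buckets from the response, trimmed to the fields the script reads.
buckets = [
    {"total_sales": {"value": 50}, "t-shirts": {"sales": {"value": 10}},
     "t-shirt-percentage": {"value": 20}},
    {"total_sales": {"value": 60}, "t-shirts": {"sales": {"value": 15}},
     "t-shirt-percentage": {"value": 25}},
    {"total_sales": {"value": 40}, "t-shirts": {"sales": {"value": 20}},
     "t-shirt-percentage": {"value": 50}},
]

def resolve(bucket, path):
    """Follow a buckets_path such as "t-shirts>sales" down to its value."""
    node = bucket
    for name in path.split(">"):
        node = node[name]
    return node["value"]

def t_shirt_percentage(bucket):
    # Same arithmetic as the script: tShirtSales / totalSales * 100
    t_shirt_sales = resolve(bucket, "t-shirts>sales")
    total_sales = resolve(bucket, "total_sales")
    return t_shirt_sales / total_sales * 100

for bucket in buckets:
    computed = t_shirt_percentage(bucket)
    reported = bucket["t-shirt-percentage"]["value"]
    # Compare with a tolerance, since the division is done in floating point.
    assert math.isclose(computed, reported)
    print(f"computed={computed:.1f} reported={reported}")
```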

Scripted Metric Aggregation

This functionality is experimental and may be changed or removed completely in a future release.
A metric aggregation that executes scripts to produce a metric output.

Example:

{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "profit": {
            "scripted_metric": {
                "init_script": "_agg['transactions'] = []",
                "map_script": "if (doc['type'].value == \"sale\") { _agg.transactions.add(doc['amount'].value) } else { _agg.transactions.add(-1 * doc['amount'].value) }",
                "combine_script": "profit = 0; for (t in _agg.transactions) { profit += t }; return profit",
                "reduce_script": "profit = 0; for (a in _aggs) { profit += a }; return profit"
            }
        }
    }
}

Of the four scripts, map_script is the only required parameter.

The above aggregation demonstrates how one would use the scripted metric aggregation to compute the total profit from sale and cost transactions.

The response for the above aggregation:

{
    ...
    "aggregations": {
        "profit": {
            "value": 170
        }
    }
}
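
The four scripts can be easier to follow when written out as ordinary functions. The Python sketch below simulates the phases under the usual description of this aggregation: the init script runs once per shard, the map script once per matching document, the combine script once per shard, and the reduce script once over the per-shard results. The shard layout and the four sample documents are hypothetical, chosen only so that the total matches the 170 shown in the response.

```python
def init_script():
    # Mirrors: _agg['transactions'] = []
    return {"transactions": []}

def map_script(agg, doc):
    # Sales count as positive amounts, everything else as negative.
    if doc["type"] == "sale":
        agg["transactions"].append(doc["amount"])
    else:
        agg["transactions"].append(-1 * doc["amount"])

def combine_script(agg):
    # Collapse one shard's state into a single number.
    return sum(agg["transactions"])

def reduce_script(aggs):
    # Merge the per-shard results into the final value.
    return sum(aggs)

def scripted_metric(shards):
    shard_results = []
    for docs in shards:
        agg = init_script()            # once per shard
        for doc in docs:
            map_script(agg, doc)       # once per document
        shard_results.append(combine_script(agg))
    return reduce_script(shard_results)

# Hypothetical documents spread over two hypothetical shards.
shards = [
    [{"type": "sale", "amount": 80}, {"type": "cost", "amount": 10}],
    [{"type": "cost", "amount": 30}, {"type": "sale", "amount": 130}],
]
profit = scripted_metric(shards)
print(profit)  # 170
```

Splitting the work this way means each shard only sends one number back, rather than its full list of transactions.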

The above example can also be specified using file scripts as follows:

{
    "query" : {
        "match_all" : {}
    },
    "aggs": {
        "profit": {
            "scripted_metric": {
                "init_script" : {
                    "file": "my_init_script"
                },
                "map_script" : {
                    "file": "my_map_script"
                },
                "combine_script" : {
                    "file": "my_combine_script"
                },
                "params": {
                    "field": "amount"
                },
                "reduce_script" : {
                    "file": "my_reduce_script"
                }
            }
        }
    }
}

 

Script parameters for the init, map, and combine scripts must be specified in a global params object so that they can be shared between the scripts.
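
One way to keep the shared params object and the per-phase script references consistent is to build the request body programmatically. The sketch below is only an illustration: the helper name scripted_metric_body is hypothetical, and the {"file": ...} script form is taken from the example above rather than from any particular client library.

```python
import json

def scripted_metric_body(agg_name, field, script_files):
    """Build a scripted_metric request body with one shared params object.

    script_files maps a phase name ("init", "map", "combine", "reduce")
    to the name of the file script for that phase.
    """
    scripted = {
        f"{phase}_script": {"file": name}
        for phase, name in script_files.items()
    }
    # A single params object, defined once at the aggregation level.
    scripted["params"] = {"field": field}
    return {
        "query": {"match_all": {}},
        "aggs": {agg_name: {"scripted_metric": scripted}},
    }

body = scripted_metric_body(
    "profit",
    "amount",
    {
        "init": "my_init_script",
        "map": "my_map_script",
        "combine": "my_combine_script",
        "reduce": "my_reduce_script",
    },
)
print(json.dumps(body, indent=4))
```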

What are aggregations good for?

By now it should, hopefully, be clear that aggregations are generated values based on the documents that match a search request.

There are many use cases for aggregations. If we use Elasticsearch to analyze logs or statistical data, we can use aggregations to extract information from the data, such as the number of HTTP requests per URL, the average call time to a call center per day of the week, or the number of restaurants that are open on Sundays in different geographical areas.

One especially powerful and interesting aggregation type when analyzing data is the significant_terms aggregation. This aggregation type allows us to find terms that are unusually common in a foreground set compared to a background set (such as all support tickets).

Another use case for aggregations is navigation. In such cases, we may use aggregations to generate a list of categories based on the content on a website to build a menu, or we may aggregate values from many different fields from documents that match a search query to allow users to narrow their search. The screenshot below from Amazon illustrates an example of the latter:

An example of how facets/aggregations are used to filter search results on Amazon.com.

Last updated: 03 Apr 2023
About Author

Yamuna Karumuri is a content writer at Mindmajix.com. Her passion lies in writing articles on IT platforms including Machine learning, PowerShell, DevOps, Data Science, Artificial Intelligence, Selenium, MSBI, and so on. You can connect with her via  LinkedIn.
