Elasticsearch Bucket Script and Scripted Metric Aggregations

Bucket Script Aggregation

This functionality is experimental and may be changed or removed completely in a future release.
A parent pipeline aggregation that executes a script to perform per-bucket computations on specified metrics in the parent multi-bucket aggregation. The specified metrics must be numeric, and the script must return a numeric value.

Syntax

A bucket_script aggregation looks like this in isolation:

{
    "bucket_script": {
        "buckets_path": {
            "my_var1": "the_sum", 
            "my_var2": "the_value_count"
        },
        "script": "my_var1 / my_var2"
    }
}

Here, my_var1 is the name of the script variable bound to this buckets path, and the_sum is the path to the metric whose value that variable takes.

bucket_script Parameters

Parameter Name	Description	Required	Default Value
script	The script to run for this aggregation. The script can be inline, file, or indexed.	Required	(none)
buckets_path	A map of script variables to the paths of the bucket metrics to use for each variable.	Required	(none)
gap_policy	The policy to apply when gaps are found in the data.	Optional	skip
format	The format to apply to the output value of this aggregation.	Optional	null
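
The parameters above can be assembled programmatically. The following is a minimal Python sketch, assuming the request body is built as a plain dict before being sent to the cluster. build_bucket_script is a hypothetical helper, not part of any Elasticsearch client, and insert_zeros is used here as an example of a gap policy other than the default skip.

```python
import json


def build_bucket_script(buckets_path, script, gap_policy=None, format_pattern=None):
    """Return a bucket_script aggregation body as a plain dict.

    Hypothetical helper for illustration only; it is not part of any
    Elasticsearch client library.
    """
    if not buckets_path:
        raise ValueError("buckets_path must map at least one variable to a metric path")
    body = {"buckets_path": dict(buckets_path), "script": script}
    # Optional parameters are emitted only when set, so the server-side
    # defaults (gap_policy: skip, format: null) apply otherwise.
    if gap_policy is not None:
        body["gap_policy"] = gap_policy
    if format_pattern is not None:
        body["format"] = format_pattern
    return {"bucket_script": body}


agg = build_bucket_script(
    {"my_var1": "the_sum", "my_var2": "the_value_count"},
    "my_var1 / my_var2",
    gap_policy="insert_zeros",
)
print(json.dumps(agg, indent=4))
```

Leaving the optional keys out of the dict, rather than writing their defaults explicitly, keeps the request identical to the isolated syntax shown earlier when no options are passed.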

The following snippet calculates t-shirt sales as a percentage of total sales for each month:

{
    "aggs": {
        "sales_per_month": {
            "date_histogram": {
                "field": "date",
                "interval": "month"
            },
            "aggs": {
                "total_sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "t-shirts": {
                    "filter": {
                        "term": {
                            "type": "t-shirt"
                        }
                    },
                    "aggs": {
                        "sales": {
                            "sum": {
                                "field": "price"
                            }
                        }
                    }
                },
                "t-shirt-percentage": {
                    "bucket_script": {
                        "buckets_path": {
                            "tShirtSales": "t-shirts>sales",
                            "totalSales": "total_sales"
                        },
                        "script": "tShirtSales / totalSales * 100"
                    }
                }
            }
        }
    }
}

The response might look like this:

{
    "aggregations": {
        "sales_per_month": {
            "buckets": [
                {
                    "key_as_string": "2015/01/01 00:00:00",
                    "key": 1420070400000,
                    "doc_count": 3,
                    "total_sales": {
                        "value": 50
                    },
                    "t-shirts": {
                        "doc_count": 2,
                        "sales": {
                            "value": 10
                        }
                    },
                    "t-shirt-percentage": {
                        "value": 20
                    }
                },
                {
                    "key_as_string": "2015/02/01 00:00:00",
                    "key": 1422748800000,
                    "doc_count": 2,
                    "total_sales": {
                        "value": 60
                    },
                    "t-shirts": {
                        "doc_count": 1,
                        "sales": {
                            "value": 15
                        }
                    },
                    "t-shirt-percentage": {
                        "value": 25
                    }
                },
                {
                    "key_as_string": "2015/03/01 00:00:00",
                    "key": 1425168000000,
                    "doc_count": 2,
                    "total_sales": {
                        "value": 40
                    },
                    "t-shirts": {
                        "doc_count": 1,
                        "sales": {
                            "value": 20
                        }
                    },
                    "t-shirt-percentage": {
                        "value": 50
                    }
                }
            ]
        }
    }
}
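
To make the per-bucket computation concrete, here is a small Python sketch that mimics what the bucket_script step does with the three monthly buckets from the response. It is a client-side illustration under stated assumptions, not how Elasticsearch implements the aggregation; resolve_path and apply_bucket_script are hypothetical names, and the script is stood in for by a Python lambda.

```python
def resolve_path(bucket, path):
    """Follow a buckets_path such as "t-shirts>sales" down to its numeric value."""
    node = bucket
    for name in path.split(">"):
        node = node[name]
    return node["value"]


def apply_bucket_script(buckets, buckets_path, compute):
    """Bind each script variable to its metric and evaluate compute once per bucket."""
    results = []
    for bucket in buckets:
        variables = {var: resolve_path(bucket, path) for var, path in buckets_path.items()}
        results.append(compute(**variables))
    return results


# Metric values copied from the three monthly buckets in the response.
buckets = [
    {"total_sales": {"value": 50}, "t-shirts": {"doc_count": 2, "sales": {"value": 10}}},
    {"total_sales": {"value": 60}, "t-shirts": {"doc_count": 1, "sales": {"value": 15}}},
    {"total_sales": {"value": 40}, "t-shirts": {"doc_count": 1, "sales": {"value": 20}}},
]

percentages = apply_bucket_script(
    buckets,
    {"tShirtSales": "t-shirts>sales", "totalSales": "total_sales"},
    lambda tShirtSales, totalSales: tShirtSales / totalSales * 100,
)
print(percentages)
```

The computed values should match the t-shirt-percentage entries of 20, 25, and 50 in the response, up to floating-point representation.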

Scripted Metric Aggregation

This functionality is experimental and may be changed or removed completely in a future release.
A metric aggregation that executes using scripts to provide a metric output.

Example:

{
  "query" : {
      "match_all" : {}
  },
  "aggs": {
      "profit": {
          "scripted_metric": {
              "init_script" : "_agg['transactions'] = []",
              "map_script" : "if (doc['type'].value == \"sale\") { _agg.transactions.add(doc['amount'].value) } else { _agg.transactions.add(-1 * doc['amount'].value) }",
              "combine_script" : "profit = 0; for (t in _agg.transactions) { profit += t }; return profit",
              "reduce_script" : "profit = 0; for (a in _aggs) { profit += a }; return profit"
         }
      }
   }
}

map_script is the only required parameter.

The above aggregation demonstrates how to use the scripted metric aggregation to compute the total profit from sale and cost transactions.

The response for the above aggregation:

{
    ...
    "aggregations": {
        "profit": {
            "value": 170
        }
   }
}
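
The four scripts in the example run at different stages. The Python sketch below simulates that flow on the client side: a fresh state per shard (init), one call per document (map), one summary per shard (combine), and a final merge of the shard summaries (reduce). The documents, their split across two shards, and the name run_scripted_metric are assumptions made for illustration; the amounts were chosen so that the total matches the 170 shown in the response.

```python
def run_scripted_metric(shards):
    """Simulate the init, map, combine, and reduce phases of the profit example."""
    shard_results = []
    for docs in shards:
        # init_script: create a fresh state object for this shard.
        agg = {"transactions": []}
        # map_script: runs once per matching document on the shard.
        for doc in docs:
            if doc["type"] == "sale":
                agg["transactions"].append(doc["amount"])
            else:
                agg["transactions"].append(-1 * doc["amount"])
        # combine_script: runs once per shard and returns that shard's profit.
        shard_results.append(sum(agg["transactions"]))
    # reduce_script: runs once over the per-shard results.
    return sum(shard_results)


# Hypothetical documents, split across two hypothetical shards.
shards = [
    [
        {"type": "sale", "amount": 80},
        {"type": "cost", "amount": 10},
        {"type": "sale", "amount": 130},
    ],
    [
        {"type": "cost", "amount": 30},
    ],
]

profit = run_scripted_metric(shards)
print(profit)  # 170
```

Summing per shard before the final merge means only one number per shard has to be passed to the reduce step, instead of every transaction.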

The above example can also be specified using file scripts as follows:

{
    "query" : {
        "match_all" : {}
    },
    "aggs": {
        "profit": {
            "scripted_metric": {
                "init_script" : {
                    "file": "my_init_script"
                },
                "map_script" : {
                    "file": "my_map_script"
                },
                "combine_script" : {
                    "file": "my_combine_script"
                },
                "params": {
                    "field": "amount" 
                },
                "reduce_script" : {
                    "file": "my_reduce_script"
                }
            }
        }
    }
}

Script parameters for the init, map, and combine scripts must be specified in a global params object so that they can be shared between the scripts.

What are aggregations good for?

By now it should, hopefully, be clear that aggregations are generated values based on the documents that match a search request.

There are many use cases for aggregations. If we use Elasticsearch to analyze logs or statistical data, we can use aggregations to extract information from the data, such as the number of HTTP requests per URL, the average call time to a call center per day of the week, or the number of restaurants that are open on Sundays in different geographical areas.
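
As a rough illustration of the first use case, counting HTTP requests per URL can be expressed with a terms aggregation. The Python sketch below only builds the request body as a dict. The field name url, the aggregation name requests_per_url, and the helper terms_count_request are all assumptions, since the article does not describe a log index.

```python
import json


def terms_count_request(agg_name, field, top_n=10):
    """Build a search body that counts documents per distinct value of a field.

    Hypothetical helper; the field passed in must exist in your own mapping.
    """
    return {
        # size 0 asks for no search hits, only the aggregation results.
        "size": 0,
        "aggs": {
            agg_name: {
                "terms": {"field": field, "size": top_n}
            }
        },
    }


body = terms_count_request("requests_per_url", "url")
print(json.dumps(body, indent=4))
```

Each bucket in the response to such a request would carry a key (the URL) and a doc_count (the number of matching log documents).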

One especially powerful and interesting aggregation type when analyzing data is the significant_terms aggregation. This aggregation type allows us to find terms that are unusually frequent in a foreground set compared to a background set (such as all support tickets).

Another use case for aggregations is navigation. In such cases, we may use aggregations to generate a list of categories based on the content of a website to build a menu, or we may aggregate values from many different fields in the documents that match a search query to allow users to narrow their search. The screenshot below from Amazon illustrates an example of the latter:

An example of how facets/aggregations are used to filter search results on Amazon.com.

Last updated: 03 Apr 2023

About Author

Yamuna Karumuri

Yamuna Karumuri is a content writer at Mindmajix.com. Her passion lies in writing articles on IT platforms including Machine learning, PowerShell, DevOps, Data Science, Artificial Intelligence, Selenium, MSBI, and so on. You can connect with her via LinkedIn.
