Calculates aggregate statistics over the results set, such as average, count, and sum. This is similar to SQL aggregation. If stats are used without a by clause only one row is returned, which is the aggregation over the entire incoming result set. If you use a by clause one row is returned for each distinct value specified in the by clause. The stats command calculates statistics based on the fields in your events.
Accelerate Your career with splunk Training and become expertise in splunk Enroll For Free Splunk Training Demo!
Simple: stats (stats-function(field) [AS field])... [BY field-list]
Complete: stats [partitions=] [allnum=] [delim=] ( ... | ... ) 
- Syntax: ( | ) [AS ]
- Description: statistical aggregation function. The function can be applied to an eval expression, or to a field or set of fields. Use the AS clause to place the result into a new field with a name that you specify. You can use wildcard characters in field names.
- Syntax: [AS ]
- Description: sparkline aggregation function. Use the AS clause to place the result into a new field with a name that you specify. You can use wildcard characters in the field name.
- syntax: allnum=
- Description: If true, computes numerical statistics on each field if and only if all of the values of that field are numerical.
- Default: false
- Syntax: delim=
- Description: Specifies how the values in the list() or values() aggregation are delimited.
- Default: a single space
- Syntax: BY
- Description: The name of one or more fields to group by. You cannot use a wildcard character to specify multiple fields with similar names. You must specify each field separately.
- Syntax: partitions=
- Description: If specified, partitions the input data based on the split-by fields for multithreaded reduce.
- Default: 1
Stats function options
- Syntax: avg() | c() | count() | dc() | distinct_count() | earliest() | estdc() | estdc_error() | exactperc() | first() | last() | latest() | list() | max() | median() | min() | mode() | p() | perc() | range() | stdev() | stdevp() | sum() | sumsq() | upperperc() | values() | var() | varp()
- Description: Functions used with the stats command. Each time you invoke the stats command, you can use more than one function. However, you can use only one BY clause.
Sparkline function options
Subscribe to our youtube channel to get new updates..!
Sparklines are inline charts that appear within table cells in search results to display time-based trends associated with the primary key of each row.
- Syntax: sparkline (count(), ) | sparkline ((), )
- Description: A sparkline specifier, which takes the first argument of an aggregation function on a field and an optional timespan specifier. If no timespan specifier is used, an appropriate timespan is chosen based on the time range of the search. If the sparkline is not scoped to a field, only the count aggregator is permitted. You can use wildcard characters in the field name. See the Usage section.
- Syntax: c() | count() | dc() | mean() | avg() | stdev() | stdevp() | var() | varp() | sum() | sumsq() | min() | max() | range()
- Description: Aggregation function to use to generate sparkline values. Each sparkline value is produced by applying this aggregation to the events that fall into each particular time bin.
Eval expressions with statistical functions
When you use the stats command, you must specify either a statistical function or a sparkline function. When you use a statistical function, you can use an eval expression as part of the statistical function. For example:
index=* | stats count(eval(status="404")) AS count_status BY sourcetype
Related Page: Splunk Eval Commands With Examples
Functions and memory usage
Some functions are inherently more expensive, from a memory standpoint, than other functions. For example, the distinct_count function requires far more memory than the count function. The values and list functions also can consume a lot of memory.
If you are using the distinct_count function without a split-by field or with a low-cardinality split-by by field, consider replacing the distinct_count function with the estdc function (estimated distinct count). The estdc function might result in significantly lower memory usage and run times.
Event order functions
Using the first and last functions when searching based on time does not produce accurate results.
- To locate the first value based on time order, use the earliest function, instead of the first function.
- To locate the last value based on time order, use the latest function, instead of the last function.
For example, consider the following search.
index=test sourcetype=testDb | eventstats first(LastPass) as LastPass, last(_time) as mostRecentTestTime BY testCaseId | where startTime==LastPass OR _time==mostRecentTestTime | stats first(startTime) AS startTime, first(status) AS status, first(histID) AS currentHistId, last(histID) AS lastPassHistId BY testCaseId
Replace the first and last functions when you use the stats and eventstats commands for ordering events based on time. The following search shows the function changes.
index=test sourcetype=testDb | eventstats latest(LastPass) AS LastPass, earliest(_time) AS mostRecentTestTime BY testCaseId | where startTime==LastPass OR _time==mostRecentTestTime | stats latest(startTime) AS startTime, latest(status) AS status, latest(histID) AS currentHistId, earliest(histID) AS lastPassHistId BY testCaseId
Wildcards in BY clauses
The stats command does not support wildcard characters in field values in BY clauses.
For example, you cannot specify | stats count BY source*.
You cannot rename one field with multiple names. For example, if you have field A, you cannot rename A as B, A as C. The following example is not valid.
... | stats first(host) AS site, first(host) AS report
1. Return the average transfer rate for each host
sourcetype=access* | stats avg(kbps) BY host
2. Search the access logs, and return the total number of hits from the top 100 values of "referer_domain"
Search the access logs, and return the total number of hits from the top 100 values of "referer_domain". The "top" command returns a count and percent value for each "referer_domain".
sourcetype=access_combined | top limit=100 referer_domain | stats sum(count) AS total
Related Page: Splunk Enterprise Security
1. Count the number of events by HTTP status and host
Count the number of events for a combination of HTTP status code values and host:
sourcetype=access_* | chart count BY status, host
This creates the following table
2. Determine how much email comes from each domain
This example will show how much mail coming from which domain.
sourcetype="cisco_esa" mailfrom=* | eval accountname=split(mailfrom,"@") | eval from_domain=mvindex(accountname,-1) | stats count(eval(match(from_domain, "[^nrs]+.com"))) AS ".com", count(eval(match(from_domain, "[^nrs]+.net"))) AS ".net", count(eval(match(from_domain, "[^nrs]+.org"))) AS ".org", count(eval(NOT match(from_domain, "[^nrs]+.(com|net|org)"))) AS "other"
The first half of this search uses eval to break up the email address in the mail from the field and define the from_domain as the portion of the mail from the field after the @ symbol.
The results are then piped into the stats command. The count() function is used to count the results of the eval expression. Here, eval uses the match() function to compare the from_domain to a regular expression that looks for the different suffixes in the domain. If the value of from_domain matches the regular expression, the count is updated for each suffix, .com, .net, and .org. Other domain suffixes are counted as other.
Related Page: Splunk Streamstats Command
This produces the following results table: