Mindmajix

What is Elasticsearch Pagination and Retrieving of Documents

Pagination

In the same way as SQL uses the LIMIT keyword to return a single “page” of results, Elasticsearch accepts the from and size parameters:

size
Indicates the number of results that should be returned. Defaults to 10.
from
Indicates the number of initial results that should be skipped. Defaults to 0.
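
To make the relationship between page numbers and these two parameters concrete, here is a minimal Python sketch. The helper name page_params and the 1-based page numbering are assumptions made for illustration; neither is part of Elasticsearch.

```python
def page_params(page, size=10):
    """Translate a 1-based page number into Elasticsearch's from/size pair.

    page_params is a hypothetical helper, not an Elasticsearch API.
    """
    if page < 1 or size < 0:
        raise ValueError("page must be >= 1 and size must be >= 0")
    # Skip every hit that belongs to the preceding pages.
    return {"from": (page - 1) * size, "size": size}


print(page_params(1))      # {'from': 0, 'size': 10}
print(page_params(3, 20))  # {'from': 40, 'size': 20}
```

The returned dictionary can be merged into a search request body next to the query.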

Elasticsearch Pagination

If a search request matches more than ten documents, Elasticsearch returns only the first ten hits by default. To retrieve more or fewer hits, add a size parameter to the search request body. For instance, the request below finds all movies, orders them by year, and returns the first two:

A search request that matches everything in the ‘movies’ index, sorts the results by the ‘year’ property in descending order, and returns only the first two hits.

curl -XPOST "http://localhost:9200/movies/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "year": "desc"
    }
  ],
  "size": 2
}'

It’s also possible to skip the first N hits, which you would typically do when building pagination functionality. To do so, use the from parameter, and inspect the total property under hits in the response from Elasticsearch to know when to stop paging.

The same request as the previous one, only this time the first two hits are skipped and hits 3 and 4 are returned.

curl -XPOST "http://localhost:9200/movies/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "year": "desc"
    }
  ],
  "size": 2,
  "from": 2
}'

Don’t set the size parameter to some huge number, or Elasticsearch will respond with an exception. In Elasticsearch 2.1 and later, the sum of from and size may not exceed the index.max_result_window setting, which defaults to 10,000. While it’s typically fine to retrieve tens, hundreds, or even thousands of results with a single request, you shouldn’t ask for millions of results using the size parameter.

If you truly need to fetch a huge number of results, or even all of them, you can iterate over them using size + from, although that becomes increasingly expensive for deep pages and is subject to the same limit. Better yet, use the scroll API.
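
As a rough illustration of the scroll approach, the Python sketch below follows a scroll cursor until a batch comes back empty. It assumes the JSON form of the scroll request (a POST to /_search/scroll with scroll and scroll_id in the body), which very old releases may not accept. The names scroll_all, post, and make_fake_post are hypothetical: post stands for any function that sends an HTTP POST with a JSON body and returns the decoded response, and make_fake_post builds an in-memory substitute so the sketch runs without a cluster.

```python
def scroll_all(post, index, size=1000, keep_alive="1m"):
    """Yield every hit in `index` by following a scroll cursor.

    `post` is a hypothetical callable: post(path, body) sends an HTTP POST
    with a JSON body and returns the decoded JSON response.
    """
    # The first request opens the scroll context and returns the first batch.
    response = post(
        f"/{index}/_search?scroll={keep_alive}",
        {"query": {"match_all": {}}, "size": size},
    )
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        # Each follow-up request passes the most recent scroll ID back.
        response = post(
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": response["_scroll_id"]},
        )


def make_fake_post(doc_ids):
    """Build an in-memory stand-in for a cluster (demo only)."""
    state = {"pos": 0, "size": 0}

    def post(path, body):
        # A path other than the scroll endpoint starts a new search.
        if not path.startswith("/_search/scroll"):
            state["pos"], state["size"] = 0, body["size"]
        batch = doc_ids[state["pos"]:state["pos"] + state["size"]]
        state["pos"] += len(batch)
        return {
            "_scroll_id": "fake-id",
            "hits": {"hits": [{"_id": i} for i in batch]},
        }

    return post


post = make_fake_post(["1", "2", "3", "4", "5"])
print([hit["_id"] for hit in scroll_all(post, "movies", size=2)])
```

Unlike size + from, each scroll request continues where the previous batch ended, so the cost per batch does not grow with the depth of the page.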

Retrieving only parts of documents

So far, in all of the examples that we’ve looked at, the entire JSON object that we indexed has been included as the value of the _source property in each search hit. In many cases, that’s not needed and only results in unused data being sent over the wire.

There are a couple of ways to control what is returned for each hit. One is by adding a _source parameter to the search request body. This parameter can have values of different kinds. For instance, we can give it the value false to not include a _source property at all in the hits:

A search request that retrieves only a single hit and instructs Elasticsearch not to include the _source property in the hits.

curl -XPOST "http://localhost:9200/movies/_search" -H 'Content-Type: application/json' -d'
{
  "size": 1,
  "_source": false
}'

The response to the above request will look something like the one below. Note that the only things returned for each hit are the score and the metadata.

A response without the _source property.

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 1,
    "hits": [
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "4",
        "_score": 1
      }
    ]
  }
}

Another way to use the _source parameter is to give it a string with a single field name. As a result, the _source property in each hit contains only that field.

A search request instructing Elasticsearch to include only the ‘title’ property in the _source.

curl -XPOST "http://localhost:9200/movies/_search" -H 'Content-Type: application/json' -d'
{
  "size": 1,
  "_source": "title"
}'

An excerpt of the response to the request, in which the _source contains only the ‘title’ property.

"hits": [
  {
    "_index": "movies",
    "_type": "movie",
    "_id": "4",
    "_score": 1,
    "_source": {
      "title": "Apocalypse Now"
    }
  }
]

If we want multiple fields, we can instead use an array:

A search request instructing Elasticsearch to include only the ‘title’ and ‘director’ properties in the _source.

curl -XPOST "http://localhost:9200/movies/_search" -H 'Content-Type: application/json' -d'
{
  "size": 1,
  "_source": ["title", "director"]
}'

It’s also possible to include, and exclude, fields whose names match one or more patterns.
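
As a sketch of what such a pattern-based filter might look like, the Python snippet below builds a request body whose _source value is an object with includes and excludes lists of wildcard patterns. The helper source_filter and the ‘description’ field are hypothetical, introduced only for illustration. The plural key names are those used by recent Elasticsearch versions; some older releases used the singular include and exclude, so check the documentation for the version you run.

```python
import json


def source_filter(includes=None, excludes=None):
    """Build a _source value that filters fields by wildcard pattern.

    source_filter is a hypothetical helper. The "includes"/"excludes" keys
    follow recent Elasticsearch versions; some older releases used the
    singular "include"/"exclude".
    """
    source = {}
    if includes:
        source["includes"] = list(includes)
    if excludes:
        source["excludes"] = list(excludes)
    return source


body = {
    "size": 1,
    # Keep fields starting with "t" or "d", but drop the (hypothetical)
    # "description" field even though it matches "d*".
    "_source": source_filter(includes=["t*", "d*"], excludes=["description"]),
}
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the request body in the same way as the earlier curl examples.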

