Introduction to Elasticsearch Aggregations


by Anurag Srivastava, Aug 14, 2018, 4:47:56 PM | 4 minutes

Aggregations let us group our data and extract statistics from it. They give us insight into the data and can be applied to a wide range of problems. For example, we can use Elasticsearch aggregations to build a recommendation engine for a website.

Now, let us jump into Elasticsearch aggregations and learn how to apply them to our data. There are mainly four types of aggregations in Elasticsearch:


  • Metric: These aggregations compute metrics over a set of documents. For example, on a numeric field we can get the average, max, min, etc.
  • Matrix: These aggregations work on multiple fields of the documents. They extract the values from those fields and produce a matrix that gives insight into how the fields relate.
  • Bucketing: Bucketing aggregations are like GROUP BY in an RDBMS. Each bucket has a criterion, and every document that matches the criterion falls into that bucket, so we can group the data into different buckets.
  • Pipeline: These aggregations work on the output of other aggregations rather than on the documents themselves.
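To make the difference between a metric and a bucket concrete, here is a small plain-Python analogy. It does not talk to Elasticsearch and says nothing about how Elasticsearch computes these results internally; the three documents are invented for illustration only.

```python
from collections import Counter
from statistics import mean

# Hypothetical documents, invented for this illustration.
docs = [
    {"category_name": "Cars", "views": 148},
    {"category_name": "Programming", "views": 320},
    {"category_name": "Programming", "views": 72},
]

# Metric-style result: a single number computed over the whole document set.
average_views = mean(doc["views"] for doc in docs)

# Bucket-style result: documents grouped by a criterion, with a count per group.
docs_per_category = Counter(doc["category_name"] for doc in docs)

print(average_views)
print(dict(docs_per_category))
```

A metric collapses the documents into one value, while a bucketing aggregation splits them into groups that can each be inspected further.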

We will now look at aggregations in more detail, starting with their syntax:

"aggregations|aggs" : {
    "<name of aggregation>" : {
        "<type of aggregation>" : {
            <body of aggregation>
        }
    }
}

This is the simplest representation of an Elasticsearch aggregation. Now let us see what each line of the example means.

- The first line is the aggregation keyword, which can be either "aggregations" or "aggs".
- In the second line, we specify a name for the aggregation.
- In the third line, we specify the type of aggregation, such as terms.
- Finally, we specify the actual aggregation body.
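The same nesting can be sketched as a small Python helper that builds the request body as a dictionary. The function name `build_aggregation` and the placeholder values `my_aggregation` and `some_field` are hypothetical, chosen only for this illustration.

```python
import json

def build_aggregation(name, agg_type, body):
    """Nest name -> type -> body under the "aggs" keyword."""
    return {"aggs": {name: {agg_type: body}}}

# Placeholder name and field, used only to show the resulting shape.
request_body = build_aggregation("my_aggregation", "terms", {"field": "some_field"})

print(json.dumps(request_body, indent=2))
```

Printing the result shows the four levels described above: the keyword, the aggregation name, the aggregation type, and the body.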

Now let us look at the format of the data I am going to use for the aggregation:

{
        "_index": "bqstack",
        "_type": "blogs",
        "_id": "EwJnGWQBnhG38eKPq5Bo",
        "_score": 1,
        "_source": {
          "category_name": "Cars",
          "name": "Rocky Paul",
          "edit_approved": false,
          "email": "rocky.paul.9867@xyz.com",
          "edited_blog_content": null,
          "category_id": 35,
          "author_id": 75,
          "create_date": "2018-05-09T13:28:20.917Z",
          "preview_image": "blog_57.png",
          "approved": false,
          "views": 148,
          "@version": "1",
          "blog_content": """
<p><span class="storyText"><p class="MsoNormal"><span lang="EN-GB">The central government approved green licence plates for electric vehicles </span>
""",
          "tags": "",
          "id": 57,
          "blog_title": "Centre approves green licence plates for electric cars",
          "update_date": "2018-05-16T18:30:22.669Z",
          "category_image": "cars.jpg",
          "@timestamp": "2018-06-19T18:56:20.427Z"
        }
      }

The above document is taken from the index bqstack and will be used to demonstrate Elasticsearch aggregations. Since this is an introductory blog on aggregations, I will explain only the simplest form of an Elasticsearch aggregation here. See the example below:

GET bqstack/_search?size=0
{
  "aggs": {
    "blog_categories" : {
      "terms" : {
        "field" : "category_name",
        "size" : 5
      }
    }
  }
}

In the above example we are doing the following:
- We pass size=0 to the _search API so that the matching documents are not listed in the response.
- The keyword "aggs" tells Elasticsearch that we are going to apply an aggregation. We can use "aggregations" instead of "aggs".
- I have named the aggregation "blog_categories" to keep the name meaningful, because we are going to bucket on category names.
- After the aggregation name, we specify the aggregation type, "terms", and inside it the field to bucket on.
- I have also added "size": 5, as there are multiple categories and I am interested in the top 5 categories only.
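For readers who want to issue the same request outside the Kibana console, here is a minimal sketch using only the Python standard library. It builds the HTTP request but does not send it. The base URL `http://localhost:9200` and the function name `build_terms_search` are assumptions for illustration; adjust them to your own cluster.

```python
import json
import urllib.request

def build_terms_search(base_url, index, field, size):
    """Build (but do not send) a _search request with one terms aggregation."""
    body = {
        "aggs": {
            "blog_categories": {
                "terms": {"field": field, "size": size}
            }
        }
    }
    # size=0 in the query string suppresses the list of matching documents.
    url = f"{base_url}/{index}/_search?size=0"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Assumed local cluster address; nothing is sent over the network here.
request = build_terms_search("http://localhost:9200", "bqstack", "category_name", 5)
print(request.get_method(), request.full_url)
```

Passing the request object to `urllib.request.urlopen` would send it, which requires a running cluster that holds the bqstack index.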

After executing the above request, we get the following response:

{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 54,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "blog_categories": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 19,
      "buckets": [
        {
          "key": "programming",
          "doc_count": 11
        },
        {
          "key": "devops",
          "doc_count": 9
        },
        {
          "key": "news",
          "doc_count": 8
        },
        {
          "key": "poetry",
          "doc_count": 5
        },
        {
          "key": "informational",
          "doc_count": 4
        }
      ]
    }
  }
}
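As a sketch of how such a response could be consumed in code, the snippet below parses a trimmed copy of it in Python and turns the buckets into a dictionary of counts. It assumes the top-level key is `aggregations`, which is how Elasticsearch returns aggregation results; the variable names are illustrative.

```python
import json

# Trimmed copy of the response, keeping only the aggregation part.
response_text = """
{
  "aggregations": {
    "blog_categories": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 19,
      "buckets": [
        {"key": "programming", "doc_count": 11},
        {"key": "devops", "doc_count": 9},
        {"key": "news", "doc_count": 8},
        {"key": "poetry", "doc_count": 5},
        {"key": "informational", "doc_count": 4}
      ]
    }
  }
}
"""

response = json.loads(response_text)
aggregation = response["aggregations"]["blog_categories"]

# Map each bucket key to its document count, keeping the returned order.
counts = {bucket["key"]: bucket["doc_count"] for bucket in aggregation["buckets"]}

# Document count covered by the returned buckets versus all remaining buckets.
counted_in_top_buckets = sum(counts.values())
counted_elsewhere = aggregation["sum_other_doc_count"]

for key, doc_count in counts.items():
    print(f"{key}: {doc_count}")
print("in returned buckets:", counted_in_top_buckets, "| in other buckets:", counted_elsewhere)
```

The `sum_other_doc_count` value is the combined document count of the buckets that did not make it into the top 5 requested with "size".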

In this way, we can create buckets for any field of the document. This was a basic introduction to aggregations. In my next blog on aggregations, I will explain more complex examples that give better insight into our data.

Other Blogs on Elastic Stack:
Introduction to Elasticsearch

Elasticsearch Installation and Configuration on Ubuntu 14.04
Log analysis with Elastic stack 
Elasticsearch Rest API
Basics of Data Search in Elasticsearch
Wildcard and Boolean Search in Elasticsearch
Configure Logstash to push MySQL data into Elasticsearch 
Metrics Aggregation in Elasticsearch
Bucket Aggregation in Elasticsearch
How to create Elasticsearch Cluster

If you found this article interesting, you can explore the books “Mastering Kibana 6.0”, “Kibana 7 Quick Start Guide”, “Learning Kibana 7”, and “Elasticsearch 7 Quick Start Guide” to learn more about the Elastic Stack, how to perform data analysis, and how to create dashboards for key performance indicators using Kibana.

You can also follow me on:

- LinkedIn: https://www.linkedin.com/in/anubioinfo/

- Twitter: https://twitter.com/anu4udilse

- Medium: https://anubioinfo.medium.com



