Elasticsearch Aggregations (Metrics) Part - 5

In the previous post we saw how to create an index and how to map its fields in Elasticsearch. Now we will see how aggregations work.

If you have not read the previous post, skim through it first and then come back to this tutorial.

In the previous post we prepared an Elasticsearch index named company. It has a type called employee with several fields beneath it.

Now let's insert some documents to run the aggregations against.

POST company/employee/1
{
  "employee_company_name":"jobs beach",
  "employee_dateofjoining": "2015-01-01",
  "employee_designation":"software developer",
  "employee_experience":2,
  "employee_name":"rushabh thakkar",
  "employee_salary":19000,
  "employee_technology":"java"
}

POST company/employee/2
{
  "employee_company_name":"jobs sky",
  "employee_dateofjoining": "2016-01-01",
  "employee_designation":"software developer",
  "employee_experience":3,
  "employee_name":"jacos patel",
  "employee_salary":39000,
  "employee_technology":".net"
}

POST company/employee/3
{
  "employee_company_name":"jobs fire",
  "employee_dateofjoining": "2018-01-01",
  "employee_designation":"software developer",
  "employee_experience":7,
  "employee_name":"yogesh patel",
  "employee_salary":69000,
  "employee_technology":"angular"
}


In Elasticsearch we will look at 2 types of aggregations, namely

  1. Metrics Aggregations
  2. Bucket Aggregations

Metrics aggregations can be single-value or multi-value. Let's understand this with an example.

Suppose you are surveying the records you have, i.e. your documents, and want to know the average salary of the developers in them.

You can achieve this with a metrics aggregation of type avg.

Let's see how.

AVERAGE

GET company/employee/_search
{
  "size": 0,
  "aggs": {
    "avg_salary": {
      "avg": {
        "field": "employee_salary"
      }
    }
  }
}


Results:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "avg_salary": {
      "value": 42333.333333333336
    }
  }
}

"aggs" is the keyword used to request aggregations. Beneath it, "avg_salary" is the name we assign to this aggregation. Under that, "avg" is the keyword for a single-value metrics aggregation: it returns just the average of the salary as the aggregation output.

Note: I have set size to 0. If it is omitted, the response also returns the matching documents in hits (up to 10 by default), which here means all three.
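To sanity-check the number above without a cluster, here is a minimal Python sketch that repeats the arithmetic on the three sample salaries. The avg helper is a hypothetical local function written for this post, not part of any Elasticsearch client.

```python
# Salaries copied from the three sample documents indexed above.
salaries = [19000, 39000, 69000]


def avg(values):
    """Arithmetic mean of the values, i.e. sum divided by count."""
    return sum(values) / len(values)


average_salary = avg(salaries)
# 127000 / 3, roughly 42333.33, in line with the "value" in the response above
print(average_salary)
```

This only mirrors the result on our tiny data set; the real aggregation is computed by Elasticsearch across the shards.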

Let's see some other basic metrics aggregations.

value_count

value_count simply returns the number of values found for the given field.

Take the same example, a value_count on salary. We have 3 records and each has a salary, so the answer is 3. But if we had 3 records and only 2 of them had a salary, the answer would be 2.

Let's see how to achieve this.

GET company/employee/_search
{
  "size": 0,
  "aggs": {
    "Total count salary": {
      "value_count": {
        "field": "employee_salary"
      }
    }
  }
}


Results:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "Total count salary": {
      "value": 3
    }
  }
}
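The "3 records but only 2 salaries" case can be sketched locally in Python. This is only an illustration under assumptions: value_count below is a hypothetical helper that imitates the behaviour for single-valued fields, and the second data set, where one employee has no salary, is made up for the example.

```python
# The three sample employees, reduced to the fields we need here.
docs = [
    {"employee_name": "rushabh thakkar", "employee_salary": 19000},
    {"employee_name": "jacos patel", "employee_salary": 39000},
    {"employee_name": "yogesh patel", "employee_salary": 69000},
]

# Hypothetical variation: the third employee has no salary field at all.
docs_missing_one = docs[:2] + [{"employee_name": "yogesh patel"}]


def value_count(documents, field):
    """Count the documents that actually carry a value for the field."""
    return sum(1 for doc in documents if doc.get(field) is not None)


all_present = value_count(docs, "employee_salary")
one_missing = value_count(docs_missing_one, "employee_salary")
print(all_present)  # 3
print(one_missing)  # 2
```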

Let's take another example. What if we want the minimum and maximum salary? Note that these aggregations return the salary value itself, not the employee document.

Let's see.

GET company/employee/_search
{
  "size": 0,
  "aggs": {
    "minimum salaried Person": {
      "min": {
        "field": "employee_salary"
      }
    }
  }
}

Results:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "minimum salaried Person": {
      "value": 19000
    }
  }
}

Now just replace the word min with max and observe the result: we get the maximum salary out of all the documents.
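Since the min and max requests differ only in that one keyword, the body can be generated from a small template. The Python sketch below just builds and prints the JSON body; metric_request and the names min_salary and max_salary are hypothetical, chosen for this illustration, and nothing is sent to Elasticsearch.

```python
import json


def metric_request(agg_name, metric, field):
    """Build a search body with size 0 and one single-value metrics aggregation."""
    return {"size": 0, "aggs": {agg_name: {metric: {"field": field}}}}


min_body = metric_request("min_salary", "min", "employee_salary")
max_body = metric_request("max_salary", "max", "employee_salary")

# Print the max variant; it can be pasted under GET company/employee/_search.
print(json.dumps(max_body, indent=2))
```

The same helper would also produce the avg and value_count bodies shown earlier, because they all follow the same name, keyword, field layout.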

Now let's talk about multi-value metrics aggregations.

The keyword we will use here is stats.

GET company/employee/_search
{
  "size": 0,
  "aggs": {
    "stats aggregations": {
      "stats": {
        "field": "employee_salary"
      }
    }
  }
}

Results:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "stats aggregations": {
      "count": 3,
      "min": 19000,
      "max": 69000,
      "avg": 42333.333333333336,
      "sum": 127000
    }
  }
}

Here we can see the count, min, max, avg and sum values from a single stats request. That is why it is called a multi-value aggregation.
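As a quick cross-check of those five numbers, here is a small Python sketch that computes the same figures from the sample salaries. The stats function is a hypothetical local helper for this post and only imitates the shape of the response; it is not how Elasticsearch computes the values internally.

```python
# Salaries copied from the three sample documents indexed above.
salaries = [19000, 39000, 69000]


def stats(values):
    """Return count, min, max, avg and sum in one dictionary."""
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


salary_stats = stats(salaries)
print(salary_stats["count"], salary_stats["min"], salary_stats["max"], salary_stats["sum"])
# prints: 3 19000 69000 127000
```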

In the next post we will see some examples of Bucket aggregations.
