Data Aggregation in Elasticsearch: A Guide

Kacper Bąk
2 min read · Aug 14, 2023

Elasticsearch offers a wide range of aggregations for data analysis. In this guide, we’ll walk through a hands-on lab in which you aggregate data and draw insights from a 3-node Elasticsearch cluster.

Elasticsearch: The Heartbeat of Big Data

The Scenario

You’re a data analyst at an online banking company. The company uses a 3-node Elasticsearch cluster as a NoSQL database to manage active accounts. Your mission is to answer the questions below with the Elasticsearch search API, returning only aggregation results and no documents.

Questions:

  1. How many unique employers exist among account holders?
  2. How many accounts are there in each of the 50 US states?
  3. What’s the average account balance for each of the 50 US states, and which state boasts the highest average balance?

Tools & Resources:

  1. Elasticsearch Cluster Nodes
    - Master-1
    - Data-1
    - Data-2
  2. Kibana instance: runs on the Master-1 node.
Kibana Login Page

Step-By-Step Aggregation Walkthrough

1. Aggregating Unique Employers
In the Kibana console tool, run:

GET bank/_search
{
  "size": 0,
  "aggs": {
    "employers": {
      "cardinality": {
        "field": "employer.keyword"
      }
    }
  }
}
Unique Employers Aggregation Result
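
The cardinality aggregation returns an approximate count (it is based on the HyperLogLog++ algorithm), so on fields with many distinct values the reported number can differ slightly from the exact one. The Python sketch below builds the same request body with the optional `precision_threshold` setting and reads the count from a response. The helper names and the sample response are hypothetical, made up for illustration; they are not output from the lab cluster.

```python
import json


def employers_cardinality_body(precision_threshold=3000):
    """Build the body for GET bank/_search that counts distinct employers."""
    return {
        "size": 0,  # return aggregation results only, no documents
        "aggs": {
            "employers": {
                "cardinality": {
                    "field": "employer.keyword",
                    # Counts below this threshold are expected to be close to exact.
                    "precision_threshold": precision_threshold,
                }
            }
        },
    }


def unique_employers(response):
    """Read the distinct-employer count from a search response dict."""
    return response["aggregations"]["employers"]["value"]


if __name__ == "__main__":
    # The printed JSON can be pasted as the body of GET bank/_search.
    print(json.dumps(employers_cardinality_body(), indent=2))
    # Hypothetical response, trimmed to the part that matters here.
    sample = {"hits": {"hits": []}, "aggregations": {"employers": {"value": 3}}}
    print(unique_employers(sample))
```

Raising `precision_threshold` trades memory for accuracy, so it is worth setting only when the approximate count is not good enough.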

2. Aggregating Accounts by State
Execute:

GET bank/_search
{
  "size": 0,
  "aggs": {
    "state": {
      "terms": {
        "field": "state.keyword",
        "size": 50
      }
    }
  }
}
State Account Aggregation Result
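
In the response, each state appears as a bucket under `aggregations.state.buckets`, with the state in `key` and the number of accounts in `doc_count`. A minimal Python sketch of reading those buckets follows. The function name and the sample response are hypothetical and only illustrate the shape of a terms aggregation result.

```python
def accounts_per_state(response):
    """Map each state key to its account count from a terms aggregation response."""
    buckets = response["aggregations"]["state"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


if __name__ == "__main__":
    # Made-up sample with two buckets; a real response would list up to 50.
    sample = {
        "aggregations": {
            "state": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {"key": "TX", "doc_count": 30},
                    {"key": "MD", "doc_count": 28},
                ],
            }
        }
    }
    print(accounts_per_state(sample))
```

A non-zero `sum_other_doc_count` would mean that some documents fell into buckets beyond the requested `size`, so it is worth checking before trusting the per-state totals.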

3. Aggregating Average Balance & Finding the State with the Highest Average
Execute:

--

--
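
One way to answer the third question is a terms aggregation on the state with an avg sub-aggregation, sorted by that average so the top state comes first. The Python sketch below builds such a request body and picks the highest-average bucket from a response. It assumes the balance is stored in a numeric field named `balance` (the field name is not given in this guide). The aggregation name `avg_balance`, the helper names, and the sample response are likewise hypothetical.

```python
import json


def avg_balance_by_state_body(balance_field="balance"):
    """Build the body for GET bank/_search: average balance per state, highest first."""
    return {
        "size": 0,  # aggregation results only, no documents
        "aggs": {
            "state": {
                "terms": {
                    "field": "state.keyword",
                    "size": 50,
                    # Sort buckets by the sub-aggregation defined below.
                    "order": {"avg_balance": "desc"},
                },
                "aggs": {
                    # Assumed field name; adjust it to the index mapping.
                    "avg_balance": {"avg": {"field": balance_field}}
                },
            }
        },
    }


def highest_average_state(response):
    """Return (state, average balance) for the bucket with the largest average."""
    buckets = response["aggregations"]["state"]["buckets"]
    top = max(buckets, key=lambda bucket: bucket["avg_balance"]["value"])
    return top["key"], top["avg_balance"]["value"]


if __name__ == "__main__":
    # The printed JSON can be pasted as the body of GET bank/_search.
    print(json.dumps(avg_balance_by_state_body(), indent=2))
    # Made-up sample response with two buckets, not real lab output.
    sample = {
        "aggregations": {
            "state": {
                "buckets": [
                    {"key": "WA", "doc_count": 25, "avg_balance": {"value": 43000.5}},
                    {"key": "AL", "doc_count": 25, "avg_balance": {"value": 41000.0}},
                ]
            }
        }
    }
    print(highest_average_state(sample))
```

Because the buckets are ordered by `avg_balance` in descending order, the first bucket in the response is already the state with the highest average; the `max` call simply avoids relying on that ordering.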