In this Tech Bite, we will see how to use Elasticsearch’s Java API to perform some common Elasticsearch queries.

Elasticsearch query types

Elasticsearch queries can be broadly classified into leaf and compound queries. Leaf queries search for specific values in a certain field (or fields) and can be used on their own. Examples include the match, term, and range queries. Compound queries wrap other leaf or compound queries, combining multiple queries to achieve their target results.
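
The leaf/compound split can be pictured with a small toy model. The sketch below is plain Java, not the Elasticsearch client: the QueryDslSketch class, its leaf and bool helpers, and the category field are hypothetical names used only for illustration, and no JSON escaping is attempted.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Toy model of the query DSL: leaf queries stand alone, compound queries wrap other queries. */
final class QueryDslSketch {

    /** Anything that can render itself as a query DSL JSON fragment. */
    interface Query {
        String toJson();
    }

    /** Leaf query: looks for one value in one field (for example match or term). */
    static Query leaf(String type, String field, String value) {
        // Assumes type, field and value need no JSON escaping.
        return () -> "{\"" + type + "\":{\"" + field + "\":\"" + value + "\"}}";
    }

    /** Compound query: one bool clause ("must", "should", ...) over leaf or compound queries. */
    static Query bool(String clause, List<Query> children) {
        return () -> children.stream()
                .map(Query::toJson)
                .collect(Collectors.joining(",", "{\"bool\":{\"" + clause + "\":[", "]}}"));
    }

    public static void main(String[] args) {
        Query match = leaf("match", "name", "Pizza");
        Query term = leaf("term", "category", "food"); // hypothetical field and value
        System.out.println(match.toJson());
        System.out.println(bool("must", List.of(match, term)).toJson());
    }
}
```

Because bool accepts any Query, a compound query can itself be passed as a child, which is the nesting the bool query relies on later in this article.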

Basic search query

The “match” query is one of the most basic and commonly used queries in Elasticsearch and functions as a full-text query. We can use this query to search for text, numbers, or boolean values.

GET index/_search
{
  "query": {
    "match": {
      "name": "Pizza"
    }
  }
}

The QueryBuilders class provides static factory methods for building query objects that find specific entries in the cluster. Converting the JSON query above to a QueryBuilder object gives the following:

final QueryBuilder qb = QueryBuilders.matchQuery("name", "Pizza");

Both the JSON query and the query built with the QueryBuilders class return all documents whose name field matches “Pizza.” The match query analyzes the provided search term before performing the search. Keep in mind that, unless you define a custom mapping, Elasticsearch also analyzes text fields at index time and lowercases their tokens, so “Pizza” and “pizza” match the same documents. To search for the exact term you provide, without analysis, use the term query.
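
The difference between match and term can be illustrated with a toy inverted index. This is a simplified sketch in plain Java, not Elasticsearch itself: the MatchVsTermSketch class is hypothetical, and its analyzer only lowercases and splits on whitespace, whereas the real standard analyzer does more.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy inverted index showing why match finds "Pizza" while term does not. */
final class MatchVsTermSketch {
    // token -> ids of the documents that contain it
    private final Map<String, Set<Integer>> index = new HashMap<>();

    /** Simplified analyzer (an assumption): lowercase, then split on whitespace. */
    static String[] analyze(String text) {
        return text.toLowerCase(Locale.ROOT).trim().split("\\s+");
    }

    /** Indexing analyzes the text, so only lowercased tokens are stored. */
    void indexDocument(int id, String text) {
        for (String token : analyze(text)) {
            index.computeIfAbsent(token, k -> new HashSet<>()).add(id);
        }
    }

    /** match: analyze the search text, then look up every resulting token. */
    Set<Integer> match(String text) {
        Set<Integer> hits = new HashSet<>();
        for (String token : analyze(text)) {
            hits.addAll(index.getOrDefault(token, Set.of()));
        }
        return hits;
    }

    /** term: look up the search text exactly as given, with no analysis. */
    Set<Integer> term(String exact) {
        return new HashSet<>(index.getOrDefault(exact, Set.of()));
    }

    public static void main(String[] args) {
        MatchVsTermSketch es = new MatchVsTermSketch();
        es.indexDocument(1, "Pizza");
        es.indexDocument(2, "Pasta");
        System.out.println(es.match("Pizza")); // [1]
        System.out.println(es.term("Pizza"));  // []
        System.out.println(es.term("pizza"));  // [1]
    }
}
```

The term lookup for “Pizza” comes back empty because only the lowercased token was stored, which is why term queries are usually aimed at fields that are not analyzed.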

To find documents whose field value falls within a given range, use the rangeQuery method with the field name, followed by the bounds to check. The bounds can be set with from and to plus explicit inclusion flags, or with the gte and lte shorthands:

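// Explicit bounds: startDate < field <= endDate
final QueryBuilder exclusiveLowerRange = QueryBuilders
    .rangeQuery(field)
    .from(startDate)
    .to(endDate)
    .includeLower(false)
    .includeUpper(true);

// Shorthand bounds: fromTimeInMillis <= field <= toTimeInMillis
final QueryBuilder inclusiveRange = QueryBuilders
    .rangeQuery(field)
    .gte(fromTimeInMillis)
    .lte(toTimeInMillis);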
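
The effect of the inclusion flags is easiest to see on the boundary values themselves. Below is a minimal sketch in plain Java of the bound semantics only; RangeBoundsSketch is a hypothetical class, not part of the Elasticsearch API, and it assumes numeric (long) values.

```java
/** Mirrors the meaning of from/to with includeLower/includeUpper on a numeric field. */
final class RangeBoundsSketch {
    private final long from;
    private final long to;
    private final boolean includeLower;
    private final boolean includeUpper;

    RangeBoundsSketch(long from, long to, boolean includeLower, boolean includeUpper) {
        this.from = from;
        this.to = to;
        this.includeLower = includeLower;
        this.includeUpper = includeUpper;
    }

    /** True when the value lies inside the range, honoring each inclusion flag. */
    boolean contains(long value) {
        boolean aboveLower = includeLower ? value >= from : value > from; // gte vs gt
        boolean belowUpper = includeUpper ? value <= to : value < to;     // lte vs lt
        return aboveLower && belowUpper;
    }

    public static void main(String[] args) {
        // Same shape as includeLower(false) + includeUpper(true)
        RangeBoundsSketch halfOpen = new RangeBoundsSketch(10, 20, false, true);
        // Same shape as gte + lte
        RangeBoundsSketch closed = new RangeBoundsSketch(10, 20, true, true);
        System.out.println(halfOpen.contains(10)); // false
        System.out.println(halfOpen.contains(20)); // true
        System.out.println(closed.contains(10));   // true
    }
}
```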

Bool query for logical expressions

The AND/OR/NOT operators can be used to fine-tune search queries to provide more relevant or specific results. Now, let’s convert a more complex logical expression into a QueryBuilder object. For the sake of simplicity and readability, this expression is divided into three parts.

Figure 1. An example of a logical expression

Elasticsearch lets us combine these queries with the bool query, one of its compound queries.

GET index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "term": {
                  "field1": {
                    "value": "value1"
                  }
                }
              },
              {
                "term": {
                  "field2": {
                    "value": "value2"
                  }
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must": [
              {
                "term": {
                  "field1": {
                    "value": "value3"
                  }
                }
              }
            ],
            "must_not": [
              {
                "term": {
                  "field2": {
                    "value": "value3"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

The same expression built with the QueryBuilders class:

final BoolQueryBuilder first = QueryBuilders.boolQuery()
    .must(QueryBuilders.termQuery(field1, value1))
    .must(QueryBuilders.termQuery(field2, value2));

final BoolQueryBuilder second = QueryBuilders.boolQuery()
    .must(QueryBuilders.termQuery(field1, value3))
    .mustNot(QueryBuilders.termQuery(field2, value3));

final BoolQueryBuilder filter = QueryBuilders.boolQuery()
    .should(first)
    .should(second);
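
To check that the bool structure really encodes the intended logic, the expression (field1 = value1 AND field2 = value2) OR (field1 = value3 AND NOT field2 = value3) can be evaluated against a few sample documents. The sketch below uses plain Java predicates rather than the Elasticsearch client; BoolQuerySketch and its helpers are hypothetical, and documents are simplified to string maps.

```java
import java.util.Map;
import java.util.function.Predicate;

/** Evaluates the article's logical expression with predicates standing in for bool clauses. */
final class BoolQuerySketch {

    /** Leaf clause: the document's field equals the given value (like a term query). */
    static Predicate<Map<String, String>> term(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    /** Combines the two inner expressions the same way the outer bool query does. */
    static Predicate<Map<String, String>> filter() {
        Predicate<Map<String, String>> first = term("field1", "value1")  // must
                .and(term("field2", "value2"));                          // must
        Predicate<Map<String, String>> second = term("field1", "value3") // must
                .and(term("field2", "value3").negate());                 // must_not
        return first.or(second);                                         // should, should
    }

    public static void main(String[] args) {
        System.out.println(filter().test(Map.of("field1", "value1", "field2", "value2"))); // true
        System.out.println(filter().test(Map.of("field1", "value3", "field2", "value3"))); // false
        System.out.println(filter().test(Map.of("field1", "value3", "field2", "other")));  // true
    }
}
```

Here and plays the role of must, negate the role of must_not, and or the role of should in a bool query that contains only should clauses.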

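Table 1. Boolean operations with the Bool Query Fields analogy

Boolean operation | Bool query field | QueryBuilders method
AND               | must             | must()
OR                | should           | should()
NOT               | must_not         | mustNot()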

Using a nested query with nested objects

To run a nested query, the index must have a mapping that declares the field as nested. Elasticsearch has no concept of inner objects, so by default it flattens an array of objects into multi-value fields and loses the association between the fields of each object. The nested type is a specialized version of the object data type that allows arrays of objects to be indexed so that each object can be queried independently of the others. Let’s say this is the kind of data we have stored in the Elasticsearch index:

{
   "name": "pizza",
   "main_ingredient": {
      "id": 1,
      "name": "cheese",
      "weight": 200
   },
   "other_ingredients": [
      {"id": 10, "name": "onion", "weight": 80},
      {"id": 11, "name": "tomato", "weight": 100},
      {"id": 12, "name": "olive", "weight": 70},
      {"id": 13, "name": "pepperoni", "weight": 150},
      {"id": 14, "name": "mushroom", "weight": 180}
   ]
}
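
To see why the nested mapping matters for a document like this, consider searching other_ingredients for an onion weighing at least 150. The sketch below is plain Java and only imitates the two matching behaviors; NestedVsFlattenedSketch, Ingredient, and the method names are hypothetical.

```java
import java.util.List;

/** Contrasts matching on flattened object arrays with matching on nested objects. */
final class NestedVsFlattenedSketch {

    static final class Ingredient {
        final String name;
        final int weight;

        Ingredient(String name, int weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    /** The other_ingredients array from the sample document. */
    static final List<Ingredient> OTHER_INGREDIENTS = List.of(
            new Ingredient("onion", 80),
            new Ingredient("tomato", 100),
            new Ingredient("olive", 70),
            new Ingredient("pepperoni", 150),
            new Ingredient("mushroom", 180));

    /** Flattened arrays: each condition may be satisfied by a different object. */
    static boolean flattenedMatch(List<Ingredient> items, String name, int minWeight) {
        boolean anyName = items.stream().anyMatch(i -> i.name.equals(name));
        boolean anyWeight = items.stream().anyMatch(i -> i.weight >= minWeight);
        return anyName && anyWeight;
    }

    /** Nested objects: both conditions must hold on the same object. */
    static boolean nestedMatch(List<Ingredient> items, String name, int minWeight) {
        return items.stream().anyMatch(i -> i.name.equals(name) && i.weight >= minWeight);
    }

    public static void main(String[] args) {
        // The onion weighs 80, so no single ingredient satisfies both conditions.
        System.out.println(flattenedMatch(OTHER_INGREDIENTS, "onion", 150)); // true (false positive)
        System.out.println(nestedMatch(OTHER_INGREDIENTS, "onion", 150));    // false
    }
}
```

The flattened version reports a hit because the name comes from the onion and the weight from the pepperoni or mushroom, which is the kind of cross-object match the nested type prevents.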

We want to find recipes that have cheese as the main ingredient (weight at most 200) along with two other ingredients: tomato (weight at most 100) and mushroom (weight at least 50).

GET index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "main_ingredient.name": "cheese"
          }
        },
        {
          "range": {
            "main_ingredient.weight": {
              "lte": 200
            }
          }
        },
        {
          "nested": {
            "path": "other_ingredients",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "other_ingredients.name": "tomato"
                    }
                  },
                  {
                    "range": {
                      "other_ingredients.weight": {
                        "lte": 100
                      }
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "other_ingredients",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "other_ingredients.name": "mushroom"
                    }
                  },
                  {
                    "range": {
                      "other_ingredients.weight": {
                        "gte": 50
                      }
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

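The nested query searches nested field objects as if they were indexed as separate documents. Since no single ingredient object can be both tomato and mushroom, each ingredient gets its own nested clause. The equivalent query built with QueryBuilders: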

import org.apache.lucene.search.join.ScoreMode;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

// Some other ingredient is tomato and weighs at most 100
final QueryBuilder firstNestedQuery = QueryBuilders.nestedQuery(
    "other_ingredients",
    QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("other_ingredients.name", "tomato"))
        .must(QueryBuilders.rangeQuery("other_ingredients.weight").lte(100)),
    ScoreMode.None
);

// Some other ingredient is mushroom and weighs at least 50
final QueryBuilder secondNestedQuery = QueryBuilders.nestedQuery(
    "other_ingredients",
    QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("other_ingredients.name", "mushroom"))
        .must(QueryBuilders.rangeQuery("other_ingredients.weight").gte(50)),
    ScoreMode.None
);

// All four conditions must hold for a recipe to match
final QueryBuilder queryBuilder = QueryBuilders.boolQuery()
    .must(QueryBuilders.matchQuery("main_ingredient.name", "cheese"))
    .must(QueryBuilders.rangeQuery("main_ingredient.weight").lte(200))
    .must(firstNestedQuery)
    .must(secondNestedQuery);
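
As a sanity check, the four must clauses can be evaluated by hand against the sample pizza document. The following is a plain Java approximation, not the Elasticsearch client: RecipeQuerySketch and its methods are hypothetical, and the match on the ingredient name is simplified to an exact string comparison.

```java
import java.util.List;

/** Evaluates the recipe query's four must clauses against in-memory data. */
final class RecipeQuerySketch {

    static final class Ingredient {
        final String name;
        final int weight;

        Ingredient(String name, int weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    /** One nested clause: a single ingredient has the given name and a weight in [min, max]. */
    static boolean nested(List<Ingredient> others, String name, int min, int max) {
        return others.stream()
                .anyMatch(i -> i.name.equals(name) && i.weight >= min && i.weight <= max);
    }

    /** All four must clauses of the recipe query combined with AND. */
    static boolean matches(Ingredient main, List<Ingredient> others) {
        return main.name.equals("cheese")                               // match on main_ingredient.name
                && main.weight <= 200                                   // range lte 200
                && nested(others, "tomato", Integer.MIN_VALUE, 100)     // first nested clause
                && nested(others, "mushroom", 50, Integer.MAX_VALUE);   // second nested clause
    }

    public static void main(String[] args) {
        Ingredient cheese = new Ingredient("cheese", 200);
        List<Ingredient> others = List.of(
                new Ingredient("onion", 80),
                new Ingredient("tomato", 100),
                new Ingredient("olive", 70),
                new Ingredient("pepperoni", 150),
                new Ingredient("mushroom", 180));
        System.out.println(matches(cheese, others)); // true
    }
}
```

The sample document matches only because the bounds are inclusive: the cheese weighs exactly 200 and the tomato exactly 100.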

 


“Converting logical expressions into ElasticSearch’s Java API using QueryBuilders class” Tech Bite was brought to you by Kerim Kadušić, Software Engineer at Atlantbh.

Tech Bites are tips, tricks, snippets or explanations about various programming technologies and paradigms, which can help engineers with their everyday job.
