Querying in Elasticsearch (Part 3)

Let's move on to querying. All the queries in this post run against the data we indexed in part 2, so if you have not read it yet, please go through it first.

Suppose you remember only one word, say cricket: you want to find the students who play cricket, you have some student records, and all you know is the database (index) name and the table (type) name.

We need a query that finds every record containing cricket in that database.

POST /college/student/_search
{
    "query" :
    {
        "query_string":
        {
           "query": "cricket"
        }
    }
}

This is a search request to the _search endpoint, and inside its query we use query_string.
The query_string query parses the input and splits the text around operators. Each textual part is then analyzed independently of the others.

Here the input is just the single word cricket, so the query searches all the fields for that word and lists every record that matches it.
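
The request above contains a single word, so there is nothing to split. As a minimal sketch of where an operator would go, the Python snippet below builds the same kind of request body with two words joined by AND. The second word "books" is only an illustrative assumption, and the snippet just prints the JSON body to send to POST /college/student/_search; it does not contact Elasticsearch.

```python
import json

# Request body for POST /college/student/_search.
# "AND" is a query_string operator, so "cricket" and "books" are
# treated as two separate parts and both are required.
body = {
    "query": {
        "query_string": {
            "query": "cricket AND books"
        }
    }
}

print(json.dumps(body, indent=4))
```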

Now suppose you do remember the name of the key that contains "cricket". In that case we can also provide a default_field:

POST /college/student/_search
{
    "query" :
    {
        "query_string":
        {
           "default_field": "hobbies",
           "query": "cricket"
        }
    }
}

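The default field can be mapped as text or as keyword. If it is a keyword field, the value is stored as one unanalyzed term, so searching for "cricket" as a word will not work: it matches only when the whole stored value is exactly "cricket", not when the word appears inside a longer value.

For more details on text and keyword fields, refer to part 2.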
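
To make the difference concrete, here is a toy Python model of the two mappings. It is only a sketch under simplifying assumptions: a text field is approximated by lowercasing and splitting on whitespace (the real analyzer does more), a keyword field is treated as one unmodified term, and the function names are made up for illustration.

```python
def text_terms(value):
    """Rough stand-in for a text field: lowercase, then split on whitespace."""
    return set(value.lower().split())


def keyword_terms(value):
    """Stand-in for a keyword field: the whole value is a single term."""
    return {value}


hobbies = "play cricket read books and writing blogs"

# The single word is one of the indexed terms of the text field...
print("cricket" in text_terms(hobbies))     # True
# ...but the keyword field only holds the complete value as one term.
print("cricket" in keyword_terms(hobbies))  # False
# A keyword field matches only when the whole value is supplied.
print(hobbies in keyword_terms(hobbies))    # True
```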

Let's cross-check that the search really is restricted to the default field.

POST /college/student/_search
{
    "query" :
    {
        "query_string":
        {
           "default_field": "studentId ",
           "query": "cricket"
        }
    }
}

Run the above query and observe the result.

We get an exception like this:

{
  "error": {
    "root_cause": [
      {
        "type": "query_shard_exception",
        "reason": "failed to create query: {\n  \"query_string\" : {\n    \"query\" : \"cricket\",\n    \"default_field\" : \"studentId \",\n    \"fields\" : [ ],\n    \"type\" : \"best_fields\",\n    \"default_operator\" : \"or\",\n    \"max_determinized_states\" : 10000,\n    \"enable_position_increments\" : true,\n    \"fuzziness\" : \"AUTO\",\n    \"fuzzy_prefix_length\" : 0,\n    \"fuzzy_max_expansions\" : 50,\n    \"phrase_slop\" : 0,\n    \"escape\" : false,\n    \"auto_generate_synonyms_phrase_query\" : true,\n    \"fuzzy_transpositions\" : true,\n    \"boost\" : 1.0\n  }\n}",
        "index_uuid": "suTgVPZDQziSPLZXTE8QPA",
        "index": "college"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "college",
        "node": "hoHWVaa9S2OMMQXXwJLpIg",
        "reason": {
          "type": "query_shard_exception",
          "reason": "failed to create query: {\n  \"query_string\" : {\n    \"query\" : \"cricket\",\n    \"default_field\" : \"studentId \",\n    \"fields\" : [ ],\n    \"type\" : \"best_fields\",\n    \"default_operator\" : \"or\",\n    \"max_determinized_states\" : 10000,\n    \"enable_position_increments\" : true,\n    \"fuzziness\" : \"AUTO\",\n    \"fuzzy_prefix_length\" : 0,\n    \"fuzzy_max_expansions\" : 50,\n    \"phrase_slop\" : 0,\n    \"escape\" : false,\n    \"auto_generate_synonyms_phrase_query\" : true,\n    \"fuzzy_transpositions\" : true,\n    \"boost\" : 1.0\n  }\n}",
          "index_uuid": "suTgVPZDQziSPLZXTE8QPA",
          "index": "college",
          "caused_by": {
            "type": "number_format_exception",
            "reason": "For input string: \"cricket\""
          }
        }
      }
    ]
  },
  "status": 400
}

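So it is clear that the query searches only the field we provided: the caused_by section reports a number_format_exception because Elasticsearch tried to read "cricket" as a number for the numeric studentId field, instead of looking for the word in the other fields.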


Now a slightly more complicated case: you don't remember whether it is the hobbies key or the sport key that contains the word cricket. Then we can search both:


POST /college/student/_search
{
    "query" :
    {
        "query_string":
        {
           "fields": ["sport", "hobbies"],
           "query": "cricket"
        }
    }
}

Here default_field is replaced by fields, which takes a list of the field names to search.
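
The three request shapes seen so far (no field, one field, several fields) can be sketched with one small Python helper. The helper name query_string_body and its "one field means default_field" convention are assumptions made for this illustration; it only builds and prints the JSON body and does not talk to Elasticsearch.

```python
import json


def query_string_body(query_text, fields=None):
    """Build a query_string request body (hypothetical helper).

    No fields   -> no field restriction is added to the body.
    One field   -> sent as "default_field".
    Many fields -> sent as "fields".
    """
    query_string = {"query": query_text}
    if fields:
        if len(fields) == 1:
            query_string["default_field"] = fields[0]
        else:
            query_string["fields"] = list(fields)
    return {"query": {"query_string": query_string}}


print(json.dumps(query_string_body("cricket", ["sport", "hobbies"]), indent=4))
```

Calling query_string_body("cricket") or query_string_body("cricket", ["hobbies"]) gives the bodies of the first two requests in this post.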

Now let's move on to the bool query.

For the explanation we will work through a scenario rather than dive deep into the API.

Suppose we want the records of the students whose student ID is greater than or equal to 1. For that we can use a bool query with a range filter on the student ID.

Here is the query:

POST /college/student/_search
{
    "query" :
    {
        "bool": {
            "filter": {
                "range": {
                    "studentId ": {
                        "gte": 1
                    }
                }
            }
        }
    }
}


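Inside bool we use a filter clause, which means we are filtering the records by a range condition: here, a student ID greater than or equal to 1 (gte). For strictly greater than we use gt, and for less than we use lt. Filter clauses do not contribute to scoring, which is why every hit below has a _score of 0. Now run the query and observe the result.

We will get the following result: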

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": [
      {
        "_index": "college",
        "_type": "student",
        "_id": "2",
        "_score": 0,
        "_source": {
          "studentName ": "Yogesh Dhedhi",
          "studentId ": 2,
          "hobbies": "Active discussions and debate",
          "sport": "pool and snokker"
        }
      },
      {
        "_index": "college",
        "_type": "student",
        "_id": "4",
        "_score": 0,
        "_source": {
          "studentName ": "Darshan",
          "studentId ": 4,
          "hobbies": "shopping",
          "sport": "football"
        }
      },
      {
        "_index": "college",
        "_type": "student",
        "_id": "1",
        "_score": 0,
        "_source": {
          "studentName ": "kushal baldev",
          "studentId ": 1,
          "hobbies": "play cricket read books and writing blogs",
          "sport": "cricket"
        }
      },
      {
        "_index": "college",
        "_type": "student",
        "_id": "3",
        "_score": 0,
        "_source": {
          "studentName ": "John",
          "studentId ": 3,
          "hobbies": "horse riding",
          "sport": "Long jump"
        }
      }
    ]
  }
}

What if we want student IDs greater than 1 and less than 3? Let's try:

POST /college/student/_search
{
    "query" :
    {
        "bool": {
            "filter": {
                "range": {
                    "studentId ": {
                        "gt": 1,
                        "lt": 3
                    }
                }
            }
        }
    }
}

Similarly, gte and lte mean greater than or equal to and less than or equal to; we can use them in place of gt and lt, and vice versa.
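
The difference between the four bounds can be checked locally with a small Python sketch. This is only a model of the comparison rules, not how Elasticsearch evaluates a range filter, and the function name in_range is made up for the illustration.

```python
def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Check one value against optional bounds, mirroring the range keywords."""
    if gt is not None and not value > gt:
        return False
    if gte is not None and not value >= gte:
        return False
    if lt is not None and not value < lt:
        return False
    if lte is not None and not value <= lte:
        return False
    return True


student_ids = [1, 2, 3, 4]  # the studentId values from the result above

print([i for i in student_ids if in_range(i, gte=1)])         # [1, 2, 3, 4]
print([i for i in student_ids if in_range(i, gt=1, lt=3)])    # [2]
print([i for i in student_ids if in_range(i, gte=1, lte=3)])  # [1, 2, 3]
```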

In the next post we will work with new operators and new scenarios using different query structures.
