A Look Into Elasticsearch

Jun 15, 2018

Elasticsearch is an open-source, RESTful, distributed, and highly scalable search and analytics engine capable of handling large and growing volumes of data.
Benefits
1) Fast and real-time
Elasticsearch searches large volumes of data very quickly. In a conventional SQL database, a query over a large dataset can take more than 10 seconds, whereas Elasticsearch can return the matching data in around 10 milliseconds. In addition, a newly indexed or updated document usually becomes searchable within about one second, which makes Elasticsearch suitable for use cases such as application monitoring and anomaly detection.

2) Easy to use API
It provides a REST API over a simple HTTP interface and uses schema-free JSON documents, which makes it easy to index your documents and to search and manage large datasets.

3) Tooling and plugins
Elasticsearch has a visualization and reporting tool known as Kibana, from which we can search, visualize, and manage data along with the mapping of an index. Logstash is another tool, a data collection engine with real-time pipelining capabilities.

4) Support for various programming languages
It has various open-source clients that make life easier for developers. It supports Java, Python, PHP, Ruby, Node.js, JavaScript, and others.

The basic components of Elasticsearch can be related to those of a relational database management system (RDBMS), which makes them easier to understand.

Elasticsearch        RDBMS
Index                Database
Shard                Shard
Mapping              Table
Field                Field
JSON object          Tuple

After installing Elasticsearch together with a compatible version of Kibana, we can manage data as follows.
Start Elasticsearch and Kibana, then open the Kibana interface in your browser. On localhost, the Kibana URL is http://localhost:5601 by default.

Datatypes
Like a relational database management system, Elasticsearch has its own datatypes, which are:

  • String, in the form of text and keyword (text is analyzed for full-text search; keyword is used for searching exact matching values, e.g. a username or password)
  • Numeric datatypes (long, integer, short, byte, double, float, half_float, scaled_float)
  • Date
  • Boolean
  • Range datatypes (integer_range, float_range, long_range, double_range, date_range)

DSL Queries (Note: these queries are meant to be run in the Kibana console)

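Query to create an index with a mapping (in RDBMS terms, creating a database with a table). Replace the placeholders with your own names; note that Elasticsearch index names must be lowercase.

PUT CustomIndexName
{
  "mappings": {
    "CustomMappingName": {
      "properties": {
        "fieldName1": {"type": "datatype"},
        "fieldName2": {"type": "datatype"}
      }
    }
  }
}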
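
The same request body can be generated from a plain field list, which helps when a mapping has many fields. This is only a sketch in Python: the helper name build_create_index_body, the mapping name "employees", and the sample fields are assumptions made for illustration. It prints the JSON that would be pasted under the PUT line.

```python
import json


def build_create_index_body(mapping_name, fields):
    """Return the body for creating an index with a single mapping.

    `fields` maps each field name to its datatype,
    for example {"first_name": "keyword", "income": "integer"}.
    """
    properties = {name: {"type": datatype} for name, datatype in fields.items()}
    return {"mappings": {mapping_name: {"properties": properties}}}


# Hypothetical mapping and fields, used only for illustration.
body = build_create_index_body(
    "employees", {"first_name": "keyword", "income": "integer"}
)
print(json.dumps(body, indent=2))
```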

Query to update the mapping with a new field

PUT IndexName/_mapping/mappingName
{
  "properties": {
    "fieldName3": {"type": "datatype"}
  }
}

Query to insert data (nodeID is the document ID, which can be set by the user)

POST IndexName/mappingName/nodeID
{
  "fieldName1": "value",
  "fieldName2": "value"
}

Query to update data (the fields to change are wrapped in "doc")

POST IndexName/mappingName/nodeID/_update
{
  "doc": {
    "fieldName1": "value",
    "fieldName2": "value"
  }
}
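
An update request changes only the fields it sends; the other fields of the stored document are kept, and nested objects are merged rather than replaced. The Python sketch below imitates that merge on plain dictionaries. The helper name merge_partial and the sample document are made up for illustration, and the real merge happens inside Elasticsearch.

```python
def merge_partial(existing, partial):
    """Merge `partial` into a copy of `existing`, recursing into nested objects."""
    merged = dict(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged key by key.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Plain values and arrays are replaced.
            merged[key] = value
    return merged


# Hypothetical stored document and partial update.
stored = {"first_name": "aki", "address": {"city": "osaka", "zip": "530"}}
update = {"address": {"city": "tokyo"}, "income": 3000}
print(merge_partial(stored, update))
```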

Search Queries

Simple Queries

Query to search for a value across the whole index

GET IndexName/_search?q=searchdata

Query to search for a value in a specific field

GET IndexName/_search?q=fieldName:data
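
When these simple queries are sent from code rather than typed into Kibana, the search text has to be URL-encoded. A small sketch using only the Python standard library; the function name uri_search_path and the index and field names are illustrative assumptions.

```python
from urllib.parse import quote, urlencode


def uri_search_path(index, text, field=None):
    """Build the path and query string for a simple URI search."""
    q = f"{field}:{text}" if field else text
    return f"/{quote(index, safe='')}/_search?{urlencode({'q': q})}"


# Hypothetical index and field names.
print(uri_search_path("organization", "ak", field="first_name"))
print(uri_search_path("organization", "john smith"))
```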

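Operators

The following clauses are combined inside a bool query.

must
All of these clauses must match. The equivalent of AND.

must_not
None of these clauses may match. The equivalent of NOT.

should
At least one of these clauses must match when the query has no must or filter clauses; otherwise they are optional and only raise the score. The equivalent of OR.

filter
Clauses that must match, but that are run in non-scoring filter mode, so they do not affect the relevance score.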
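
To make the matching rules concrete, here is a simplified in-memory imitation in Python. It is a sketch under loud assumptions: every clause is reduced to an exact (field, value) comparison, scoring is ignored, and the function names and sample staff records are invented for this example.

```python
def clause_matches(doc, clause):
    """A clause is a (field, value) pair that matches on exact equality."""
    field, value = clause
    return doc.get(field) == value


def bool_matches(doc, must=(), must_not=(), should=(), filters=()):
    """Decide whether `doc` matches, ignoring scoring entirely."""
    required = list(must) + list(filters)
    if not all(clause_matches(doc, c) for c in required):
        return False
    if any(clause_matches(doc, c) for c in must_not):
        return False
    # Without must/filter clauses, at least one should clause has to match.
    if should and not required:
        return any(clause_matches(doc, c) for c in should)
    return True


# Hypothetical documents.
staff = [
    {"name": "aki", "dept": "sales", "city": "osaka"},
    {"name": "ben", "dept": "dev", "city": "tokyo"},
    {"name": "chi", "dept": "dev", "city": "osaka"},
]


def names(**clauses):
    return [d["name"] for d in staff if bool_matches(d, **clauses)]


print(names(must=[("dept", "dev"), ("city", "osaka")]))
print(names(should=[("dept", "sales"), ("city", "tokyo")]))
```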

Some examples of complex queries

Multi-match search
Search for one value in multiple fields

GET IndexName/mappingName/_search
{
  "query": {
    "multi_match": {
      "query": "search text",
      "fields": ["fieldName1", "fieldName2"]
    }
  }
}

Wildcards search
Fields can be specified with wildcards.
E.g. (query the title, first_name, and last_name fields)

GET indexName/mappingName/_search
{
  "query": {
    "multi_match": {
      "query": "search text",
      "fields": ["title", "*_name"]
    }
  }
}

This query searches the title field and all other fields whose names end with _name, so we don't need to list every such field individually.
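
The effect of a field pattern such as *_name can be previewed locally. The sketch below uses Python's fnmatch module as a stand-in for the pattern expansion; it is not how Elasticsearch resolves fields internally, and the helper name expand_fields and the field list are assumptions for illustration.

```python
from fnmatch import fnmatchcase


def expand_fields(patterns, available_fields):
    """Return the available fields that match at least one pattern."""
    return [
        field
        for field in available_fields
        if any(fnmatchcase(field, pattern) for pattern in patterns)
    ]


# Hypothetical fields of an index.
fields = ["title", "first_name", "last_name", "email"]
print(expand_fields(["title", "*_name"], fields))
```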

best_fields search
The best_fields type is useful when searching for multiple words that are best found in the same field. For instance, "google creation" in a single field is more meaningful than "google" in one field and "creation" in another. The request can be sent to either IndexName/mappingName/_search or IndexName/_search.

GET IndexName/_search
{
  "query": {
    "multi_match": {
      "query": "search text",
      "type": "best_fields",
      "fields": ["field1", "field2"],
      "tie_breaker": 0.3
    }
  }
}
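
The tie_breaker value controls how much the other matching fields add to the score: the document receives the score of its best matching field plus tie_breaker times the scores of the remaining matching fields. A worked sketch in Python follows; the per-field scores are invented numbers and the function name is hypothetical.

```python
def best_fields_score(field_scores, tie_breaker=0.0):
    """Combine per-field scores the way best_fields with a tie_breaker does."""
    scores = [score for score in field_scores.values() if score > 0]
    if not scores:
        return 0.0
    best = max(scores)
    # Best field counts fully, every other matching field is scaled down.
    return best + tie_breaker * (sum(scores) - best)


# Invented scores for one document.
scores = {"field1": 2.0, "field2": 1.0}
print(best_fields_score(scores, tie_breaker=0.3))
print(best_fields_score(scores))
```

With tie_breaker left at 0, only the best field counts; with 1.0, the field scores are simply added together.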

Multiple search conditions across multiple fields

GET indexName/mappingName/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"fieldName1": "value1"}},
        {"match": {"fieldName2": "value2"}}
      ]
    }
  }
}

Here we are using the should clause, which acts as an OR, so we get documents that match on fieldName1, on fieldName2, or on both. In place of the should clause we can use must or must_not, depending on our search conditions.

Range query
Matches records whose field values fall within the defined range. The underlying query depends on the field type: TermRangeQuery for string fields and NumericRangeQuery for number and date fields.

gte => greater than or equal to
lte => less than or equal to
gt => greater than
lt => less than

Example of a numeric range query
Query to search for all staff whose monthly income is between 2,000 and 4,000, inclusive

GET indexName/mappingName/_search
{
  "query": {
    "range": {
      "income": {
        "gte": 2000,
        "lte": 4000
      }
    }
  }
}
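
The difference between the inclusive bounds (gte, lte) and the exclusive ones (gt, lt) shows up at the edges of the range. A short Python sketch that applies the same four bounds to a list of numbers; the helper name in_range and the income values are made up for this example.

```python
import operator

# Each range key maps to the comparison it stands for.
RANGE_OPERATORS = {
    "gte": operator.ge,
    "lte": operator.le,
    "gt": operator.gt,
    "lt": operator.lt,
}


def in_range(value, bounds):
    """Return True when `value` satisfies every bound in `bounds`."""
    return all(RANGE_OPERATORS[name](value, limit) for name, limit in bounds.items())


# Invented monthly incomes.
incomes = [1500, 2000, 3200, 4000, 4500]
inclusive = [i for i in incomes if in_range(i, {"gte": 2000, "lte": 4000})]
exclusive = [i for i in incomes if in_range(i, {"gt": 2000, "lt": 4000})]
print(inclusive)
print(exclusive)
```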

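Sort by query
Query to sort the results in either ascending (asc) or descending (desc) order. Note that sort is a top-level key of the request body, next to query rather than inside it.

GET indexName/_search
{
  "sort": [
    {"fieldName": {"order": "desc"}}
  ]
}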

Select specific field query
The _source option returns only certain fields in the result. It is like SELECT field1, field2 FROM Table in an RDBMS.

GET indexName/_search
{
  "_source": ["field1", "field2"]
}
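
The sort and _source options can be combined with a query in one request body, where all three sit side by side at the top level. A minimal Python sketch that assembles such a body; the helper build_search_body and the field names are hypothetical and only illustrate the layout.

```python
import json


def build_search_body(query=None, sort_field=None, order="asc", source_fields=None):
    """Assemble a search body from optional query, sort, and _source parts."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    body = {}
    if query is not None:
        body["query"] = query
    if sort_field is not None:
        body["sort"] = [{sort_field: {"order": order}}]
    if source_fields is not None:
        body["_source"] = list(source_fields)
    return body


# Hypothetical field names.
example = build_search_body(
    query={"prefix": {"first_name": "ak"}},
    sort_field="income",
    order="desc",
    source_fields=["first_name", "income"],
)
print(json.dumps(example, indent=2))
```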

Terms query
The terms query searches one field for any of several values; it also works when the field holds an array of values.

GET indexName/_search
{
  "query": {
    "terms": {"field": ["value1", "value2"]}
  }
}

Prefix query
The prefix query finds documents in which a specific field starts with the given characters or term.

GET indexName/_search
{
  "query": {
    "prefix": {"field": "value"}
  }
}

Example

GET organization/employes/_search
{
  "query": {
    "prefix": {"first_name": "ak"}
  }
}

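Wildcard query
The supported wildcards are *, which matches any character sequence (including the empty one), and ?, which matches any single character. Wildcard queries can be slow, so the pattern should not start with one of the wildcards * or ?. In the example below, the pattern is the first characters of the search text, then *, then the last characters of the search text.

GET indexName/_search
{
  "query": {
    "wildcard": {"field": "first*last"}
  }
}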
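
The two wildcards can be tried out locally by translating a pattern into a regular expression. This Python sketch only imitates the matching rules described above (it says nothing about how Elasticsearch executes the query), and the function names and sample terms are assumptions.

```python
import re


def wildcard_to_regex(pattern):
    """Translate * and ? into a regular expression; escape everything else."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")  # any character sequence, including none
        elif char == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_matches(pattern, term):
    """Return True when the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


print(wildcard_matches("a*h", "akash"))
print(wildcard_matches("a?h", "ah"))
```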

 

Ids query
Filters documents by the IDs (node IDs) that we provided when adding the documents

GET indexName/_search
{
  "query": {
    "ids": {
      "type": "_doc",
      "values": ["node1", "node2"]
    }
  }
}

"type" is optional and can be omitted

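Phrase matching
The match_phrase query performs a full-text search for an exact phrase: the words must appear in the field in the same order as in the search text.

GET indexName/_search
{
  "query": {
    "match_phrase": {
      "field": "value"
    }
  }
}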
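
What sets phrase matching apart from an ordinary match is that word order and adjacency matter. The Python sketch below imitates this with a deliberately naive tokenizer (lowercase, split on whitespace). Real analyzers do more, and the function names and sample texts are invented for illustration.

```python
def tokenize(text):
    """Naive stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def phrase_matches(field_text, phrase):
    """Return True when the phrase tokens appear consecutively and in order."""
    tokens, wanted = tokenize(field_text), tokenize(phrase)
    if not wanted:
        return False
    return any(
        tokens[i:i + len(wanted)] == wanted
        for i in range(len(tokens) - len(wanted) + 1)
    )


print(phrase_matches("The Google creation story", "google creation"))
print(phrase_matches("creation of Google", "google creation"))
```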

Using Elasticsearch from various programming languages will be covered in the next blog post.

References

https://aws.amazon.com/elasticsearch-service/what-is-elasticsearch/
https://www.elastic.co/