Elasticsearch is an open-source, widely used, readily scalable, enterprise-grade search engine. Accessible through an extensive API, it supports very fast searches over large amounts of data, which makes it well suited to powering discovery applications.
We built an application that searches a large amount of GIS data to highlight geographical features such as roads, paths, and lanes for a given location. The GIS data was made available in GeoJSON, a JSON-based format [for more details refer: ‘http://geojson.org/’]. The GeoJSON data comprises road-specific information (highway, city road, one-way, paved/unpaved, latitude, longitude, etc.). The objective was to import this information into an Elasticsearch instance in the most efficient way and then power an API library that would provide all the coordinates of a road for a search query. For example, if a user gives the name of a road, Elasticsearch returns all the waypoints or coordinates for that road, and the end application then highlights those coordinates to identify the road on the map and displays the metadata alongside.
The API library was developed using Node.js, and the user interface was built using AngularJS and OpenLayers 3.0 [for more details refer: ‘https://openlayers.org/’].
Below is a step-by-step explanation of our approach:
To understand what we did to develop our application and which challenges we faced during development, the reader should know the basic structure of Elasticsearch and Node.js.
Step 1: Implement Elasticsearch and integrate it with Node.js
The above image shows the comparison between Elasticsearch and a relational database (RDBMS). Elasticsearch has a simpler structure, which is built up in the following steps:
Step 1: Create an index (my_index in the image).
Step 2: Create a type (my_type in the image).
Step 3: Upload JSON data to the created type as documents, with a separate index for each parametric data point (A, B, ... X, Y in the image).
Elasticsearch supports a tree structure: Index > Type > Document > Fields. Multiple indexes can be created in Elasticsearch. Each index can hold multiple types, each type can hold multiple documents, and each document consists of multiple fields.
After installation (from https://www.elastic.co), Elasticsearch runs on port 9200 by default. In our case, we used a Node.js application to create the index and the type, to upload JSON data in bulk to Elasticsearch, and to develop the API that provides all the coordinates of a road through an Elasticsearch search query. To connect Node.js with Elasticsearch, we need to install ‘elasticsearch.js’ in our Node.js application. elasticsearch.js is the official Elasticsearch client for Node.js, with which we can create indexes and types and upload bulk JSON data to Elasticsearch.
Step 2: Establish Connection between Elasticsearch and Node.js Application
We keep the connection setup in a file named Connection.js, which creates the client via elasticsearch.Client. Exporting it with the command ‘module.exports = client’ lets any module in the Node.js application import Connection.js and reuse the same Elasticsearch connection.
Step 3: Create Index
Indexing in Elasticsearch is not quite like indexing in other databases. In Elasticsearch, an index is a place to store related documents. To create an index from a Node.js application, we first import Connection.js and then call the client to create the index. The following illustrates importing the Elasticsearch connection and creating an index.
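A minimal sketch of the index-creation step might look like this. It assumes `client` is an elasticsearch.Client instance from the legacy `elasticsearch` package (for example, the one exposed by Connection.js). The helper name `createIndex` is hypothetical, and the index name `bulkimport` is the one used later in this article.

```javascript
// createIndex.js (sketch): create an index through an existing client.
// Assumption: `client` is a legacy elasticsearch.Client instance.

function createIndex(client, indexName) {
  // indices.create returns a promise when no callback is supplied.
  return client.indices.create({ index: indexName });
}

// Example usage, given a connected client:
//   createIndex(client, 'bulkimport')
//     .then((resp) => console.log('index created', resp))
//     .catch((err) => console.error('index creation failed', err.message));

module.exports = { createIndex };
```

Taking the client as a parameter keeps the helper independent of how the connection is set up.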
Step 4: Bulk import nested JSON structures
The above sections give a basic understanding of how Elasticsearch and a Node.js application can work together. With that in place, it is easier to understand and overcome the challenges that typically arise around bulk imports. While there are multiple mechanisms and tools for uploading data into Elasticsearch, such as Kibana or Logstash, our objective was to use a custom-built Node.js application to upload the data in bulk.
Our objective was to upload a large amount of GeoJSON data to Elasticsearch. GeoJSON is a JSON-based format with a complex, nested tree structure, which raised two questions:
- How do we upload a nested JSON file structure into Elasticsearch using Node.js? Does Elasticsearch support a nested JSON format?
- How do we index each data point separately, rather than ending up with large quantities of data indexed as a single entry, as can happen with a naive bulk upload?
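To make the nesting concrete, here is a small hand-written sample in the general shape of a GeoJSON FeatureCollection. The road, its property names (`name`, `ref`, `highway`, `oneway`, `surface`), and the coordinates are all invented for illustration and are not taken from the project's data.

```javascript
// A hypothetical GeoJSON FeatureCollection with a single road.
// GeoJSON positions are ordered [longitude, latitude].
const sampleGeoJson = {
  type: 'FeatureCollection',
  features: [
    {
      type: 'Feature',
      properties: {
        name: 'Example Road',
        ref: 'A1',
        highway: 'primary',
        oneway: 'no',
        surface: 'paved',
      },
      geometry: {
        type: 'LineString',
        coordinates: [
          [72.571, 23.022],
          [72.573, 23.024],
          [72.576, 23.027],
        ],
      },
    },
  ],
};

// Walk the nesting: collection -> feature -> geometry -> coordinate pairs.
const firstRoad = sampleGeoJson.features[0];
console.log(firstRoad.properties.name, firstRoad.geometry.coordinates.length);
// prints: Example Road 3
```

Each entry in `features` is a natural candidate for one Elasticsearch document, since it carries both the road's metadata and its waypoints.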
It is easy to load large amounts of flat JSON data into Elasticsearch so that every data row is indexed separately. Our objective, however, was to upload a GeoJSON file, which has a nested structure, to Elasticsearch using a Node.js application. We therefore had to write Node.js code that could upload the nested JSON data to Elasticsearch in bulk.
We got an error while parsing our GeoJSON file with the command result=JSON.parse(‘./filename.geojson’); as shown in the above image. JSON.parse expects a string of JSON text, not a file path, so the path itself is rejected as invalid JSON.
‘jsonfile’ is a Node.js package that has to be installed in the Node.js application (for installation and details refer: ‘https://www.npmjs.com/package/jsonfile’). ‘jsonfile’ works with any JSON data, whether flat or deeply nested. It reads the file from disk and parses its contents into a JavaScript object, which can then be sent to Elasticsearch in bulk.
We installed the jsonfile package in our Node.js application. We then imported ‘jsonfile’ and modified the code of bulkupload.js accordingly.
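The failure can be reproduced in a few lines. The sketch below uses only Node's built-in modules to show why passing a path to JSON.parse throws, and how reading the file contents first avoids the problem. The helper names and the temporary demo file are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// JSON.parse treats its argument as JSON text. A path such as
// './filename.geojson' is not valid JSON, so parsing it throws.
function parsePathAsJson(filePath) {
  try {
    return JSON.parse(filePath);
  } catch (err) {
    return err.name; // the name of the thrown error
  }
}

// Read the file contents first, then parse the text.
function readGeoJson(filePath) {
  const text = fs.readFileSync(filePath, 'utf8');
  return JSON.parse(text);
}

// Demo with a throwaway file in the system temp directory.
const demoPath = path.join(os.tmpdir(), 'sample-roads-demo.geojson');
fs.writeFileSync(demoPath, JSON.stringify({ type: 'FeatureCollection', features: [] }));

console.log(parsePathAsJson('./filename.geojson')); // SyntaxError
console.log(readGeoJson(demoPath).type); // FeatureCollection
```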
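The original bulkupload.js is not reproduced here, so the following is only a sketch of how such a script could look. It assumes the `jsonfile` package and a legacy elasticsearch.Client instance passed in as `client`. The helper names `buildBulkBody` and `bulkUpload` are hypothetical, and using one document per GeoJSON feature with sequential ids is an assumption about the intended layout.

```javascript
// bulkupload.js (sketch): index every GeoJSON feature as its own document.

// The bulk API expects an array that alternates an action line with the
// document it applies to.
function buildBulkBody(geoJson, indexName, typeName) {
  const body = [];
  geoJson.features.forEach((feature, i) => {
    // Sequential ids (an assumption) make re-running the upload overwrite
    // existing documents instead of duplicating them.
    body.push({ index: { _index: indexName, _type: typeName, _id: i + 1 } });
    body.push(feature);
  });
  return body;
}

function bulkUpload(client, filePath, indexName, typeName) {
  // Loaded lazily; requires `npm install jsonfile`.
  const jsonfile = require('jsonfile');
  // readFileSync reads the file and parses it into a JavaScript object.
  const geoJson = jsonfile.readFileSync(filePath);
  return client.bulk({ body: buildBulkBody(geoJson, indexName, typeName) });
}

module.exports = { buildBulkBody, bulkUpload };
```

For very large files, it may be necessary to split the body into smaller batches before calling `client.bulk`; that is left out here to keep the sketch short.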
By running this script with Node.js, we uploaded a large amount of GeoJSON data into Elasticsearch in bulk, with each record indexed separately. We can open the following URL in the browser to cross-check that each record of the GeoJSON data was imported: ‘localhost:9200/bulkimport/bulkdoc/_search’.
- In the URL, localhost:9200 indicates that Elasticsearch is running on your local machine on port 9200.
- bulkimport is the index you created in Elasticsearch, and bulkdoc is the type you created in that index.
- _search is a REST API of Elasticsearch. We can also run a curl command against the _search API to perform the same check from the command line, or run the query in ‘Kibana’.
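The same check can also be done from Node.js. The sketch below builds a match-all search request for the `bulkimport` index and `bulkdoc` type and reads the total hit count from the response. The helper names are hypothetical, and the response is assumed to be the body returned by the legacy client's `client.search`.

```javascript
// Request that matches every document, returning only a few of them.
function buildMatchAllRequest(size = 5) {
  return {
    index: 'bulkimport',
    type: 'bulkdoc',
    body: { query: { match_all: {} }, size },
  };
}

// hits.total is a plain number on older Elasticsearch versions and an
// object of the form { value, relation } from 7.x onwards.
function totalHits(response) {
  const total = response.hits.total;
  return typeof total === 'number' ? total : total.value;
}

// Example usage, given a connected client:
//   client.search(buildMatchAllRequest())
//     .then((resp) => console.log('documents indexed:', totalHits(resp)));

module.exports = { buildMatchAllRequest, totalHits };
```

Comparing the reported total with the number of features in the source file gives a quick sanity check on the import.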
After importing the GeoJSON file into Elasticsearch, we developed an HTTP API that calls the Elasticsearch search query. This was built with Node.js and Express.js.
The image given below shows the UI of our application. End users can search for a particular section of a road, or the entire road, by its name or road reference number. For the road reference number entered in the search box in the image, the highlighted part of the map is the result for the required location.
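A minimal sketch of such an API is shown below, under several assumptions: the route `/roads`, the query parameter `q`, and the field names `properties.name` and `properties.ref` are hypothetical, and `client` is a legacy elasticsearch.Client whose `search` call resolves to the response body.

```javascript
// Build a search request for a road name or reference number.
function buildRoadSearch(term) {
  return {
    index: 'bulkimport',
    type: 'bulkdoc',
    body: {
      query: {
        multi_match: { query: term, fields: ['properties.name', 'properties.ref'] },
      },
    },
  };
}

// Reduce a search response to the road names and their coordinates.
function extractCoordinates(response) {
  return response.hits.hits.map((hit) => ({
    name: hit._source.properties.name,
    coordinates: hit._source.geometry.coordinates,
  }));
}

// Wire the two helpers into an Express route (requires `npm install express`).
function createApp(client) {
  const express = require('express');
  const app = express();
  app.get('/roads', async (req, res) => {
    if (!req.query.q) {
      res.status(400).json({ error: 'missing query parameter q' });
      return;
    }
    try {
      const response = await client.search(buildRoadSearch(req.query.q));
      res.json(extractCoordinates(response));
    } catch (err) {
      res.status(500).json({ error: err.message });
    }
  });
  return app;
}

module.exports = { buildRoadSearch, extractCoordinates, createApp };
```

With a connected client, `createApp(client).listen(3000)` would serve requests such as `GET /roads?q=A1`, and the front end could draw the returned coordinates on the map.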