This post is about my GSoC project, which I worked on during the summer of 2016. I worked under the LabLua organization on adding a test suite and improving the documentation for elasticsearch-lua.
Elasticsearch is a distributed, scalable, full-text search engine based on Lucene. It provides an HTTP web interface and handles JSON documents. It is presently ranked first in the ‘Search engines’ category.
My GSoC project this year was entitled ‘Improve elasticsearch-lua tests and builds’ and was a continuation of the work that I had done last year. Apart from adding a test suite for elasticsearch-lua and making it robust, I also decided to work on the documentation of the code.
Test suite for elasticsearch-lua
The tests are divided into unit, integration and stress tests. Note that all these tests run for Lua 5.1, 5.2, 5.3 and LuaJIT 2.0. Code coverage is measured for unit tests and integration tests. Coveralls was chosen to measure and maintain code coverage. As of now, around 91% of the code is covered with tests.
There are many different modules within elasticsearch-lua. For every such module, there is a corresponding unit test. Unit tests can be found in the tests/ directory. Care was taken to test all the endpoints extensively. Some key points to note:
Some modules were ‘mocked’ to intercept external calls.
Not only return values (success or failure) but every internal parameter was ‘deep’ checked. A deep check involves checking each nested parameter recursively. For example, a Lua table might have another table inside it.
Travis was chosen for continuous integration. Every time code is pushed, a build is triggered on Travis and the unit tests are run. The success or failure status is reported back.
A number of bugs (pertaining to generating the target URL for an endpoint, and to listing source files in the rockspec file) were found by running the tests. All were fixed.
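The two points above about mocking and deep checks can be illustrated with a minimal sketch. It is written in Python rather than Lua purely for illustration and is not the project's actual test code: the `search` function, the `transport` object and the request layout are all hypothetical stand-ins for a client endpoint and the module it calls.

```python
from unittest import mock


def deep_equal(expected, actual):
    """Recursively compare nested dicts and lists, value by value."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        if expected.keys() != actual.keys():
            return False
        return all(deep_equal(expected[k], actual[k]) for k in expected)
    if isinstance(expected, list) and isinstance(actual, list):
        if len(expected) != len(actual):
            return False
        return all(deep_equal(e, a) for e, a in zip(expected, actual))
    return expected == actual


def search(transport, index, body):
    """Hypothetical endpoint: builds a request and hands it to the transport."""
    return transport.request("GET", "/" + index + "/_search", body)


# The transport is mocked, so no HTTP call leaves the process.
transport = mock.Mock()
transport.request.return_value = {"hits": {"total": 0, "hits": []}}

query = {"query": {"match": {"type": "PushEvent"}}}
result = search(transport, "github", query)

# Inspect what the endpoint passed to the mocked module, not just what it returned.
method, path, body = transport.request.call_args[0]
assert method == "GET"
assert path == "/github/_search"
assert deep_equal(body, {"query": {"match": {"type": "PushEvent"}}})
```

Python's `==` already compares nested dictionaries by value, so the explicit recursion is only there to show the idea. In Lua, `==` on two tables compares identity rather than contents, which is why a recursive helper of this kind is needed there.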
The diff of changes due to unit tests can be seen here.
Apart from testing every component individually, it is equally important that the components work together when interacting with each other. To make elasticsearch-lua robust, it was necessary to add some integration tests.
Integration tests involve calling an API function in a real environment and testing parameters at every point. Wrappers for some API functions were developed so as to avoid repeated code.
We believe that using real data for integration tests is always a good practice. Also, the test dataset should stress the system a bit and, thus, it should not be very small. Therefore, we opted to use part of the data freely available from www.githubarchive.org. A mirror is maintained here. The dataset is not a part of the main repository due to its size, so it is downloaded on the fly while running tests on Travis.
Common operations (such as search, index, get, delete and bulk) were tested in a single run, with the operations intermixed.
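The idea of wrapping API calls to avoid repeated checks, and of intermixing operations in one run, can be sketched as follows. This is a Python illustration under stated assumptions, not the project's Lua code: `InMemoryClient` is a hypothetical stand-in that lets the sketch run without a server (a real integration test would talk to a live Elasticsearch instance), and `checked` is a hypothetical wrapper name.

```python
class InMemoryClient:
    """Hypothetical stand-in for a real client; stores documents in a dict."""

    def __init__(self):
        self.docs = {}

    def index(self, index, doc_id, body):
        created = (index, doc_id) not in self.docs
        self.docs[(index, doc_id)] = body
        return {"created": created}, (201 if created else 200)

    def get(self, index, doc_id):
        if (index, doc_id) in self.docs:
            return {"found": True, "_source": self.docs[(index, doc_id)]}, 200
        return {"found": False}, 404

    def delete(self, index, doc_id):
        if self.docs.pop((index, doc_id), None) is not None:
            return {"found": True}, 200
        return {"found": False}, 404


def checked(call, expected_status, *args):
    """Wrapper: perform the call and verify the status in one place."""
    body, status = call(*args)
    assert status == expected_status, f"expected {expected_status}, got {status}"
    return body


# Operations are intermixed in a single run, each one verified by the wrapper.
client = InMemoryClient()
checked(client.index, 201, "github", "1", {"type": "PushEvent"})
fetched = checked(client.get, 200, "github", "1")
assert fetched["_source"] == {"type": "PushEvent"}
checked(client.delete, 200, "github", "1")
checked(client.get, 404, "github", "1")
```

Keeping the status check inside the wrapper means each test line states only the operation and the expected outcome.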
The diff of changes due to integration tests can be seen here.
Stress tests involve testing elasticsearch-lua limits. By having these tests, the client will be able to prove its stability in an effective manner.
A separate framework for stress testing was designed, considering that it might take a few hours to finish. In short, every successful (unit + integration tests) build triggers a new build, which runs the stress tests, provided that no such build is already running.
The status of stress tests is reported through a separate badge in the README.
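The triggering rule described above (start a stress build after every successful build, unless one is already running) boils down to a small guard. The sketch below is a Python illustration only; the function name and the shape of the `running_builds` records are assumptions, not the actual framework.

```python
def should_trigger_stress_build(build_passed, running_builds):
    """Return True when a new stress build should be started.

    build_passed: whether the unit + integration build succeeded.
    running_builds: list of dicts with a hypothetical "kind" field.
    """
    stress_already_running = any(b["kind"] == "stress" for b in running_builds)
    return build_passed and not stress_already_running


# A passing build with no stress build in flight starts one.
assert should_trigger_stress_build(True, [{"kind": "unit"}])
# A stress build already running suppresses a second one.
assert not should_trigger_stress_build(True, [{"kind": "stress"}])
```

The guard matters because a stress run may take hours, so pushes made in the meantime should not queue up overlapping runs.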
The diff of changes due to stress tests can be seen here.
Having good documentation is very important for any library. It helps developers understand the functionality without having to investigate the code. Moreover, it helps adoption of the library, as new developers can use it as a guide to get started. Although this was initially not a task for the GSoC project, after realizing its importance, I opted to invest a lot of time in the documentation and added it to the GSoC project timeline.
The guides consist of documents and tutorials that help developers to install, use and customize elasticsearch-lua. The guides explain the most frequently used functionalities along with some internals. These pages are hosted here.
The API documentation lists all the functions provided by elasticsearch-lua. Each function name is accompanied by the parameters that it accepts. The API documentation is published here.
The diff of changes pertaining to documentation can be seen here.
Additional tasks (Not part of GSoC)
Apart from the tasks mentioned above, I worked on the following as well:
While working with elasticsearch-lua, I had to frequently switch between different versions of Lua while developing the test suite. Switching is not simple and I often faced the following issues:
Building different Lua versions required some effort, such as downloading the version source, unzipping, installing and handling any dependencies. Also, the previous version had to be deleted completely in order to avoid any ambiguity.
The LuaRocks installation depends on the Lua version. Switching Lua versions can mess up the installed rocks.
To solve these issues I used workarounds, such as editing the source code of some existing rocks.
Sometimes, these code changes broke the entire rock. In such cases, I had to remove all existing rocks, rebuild LuaRocks and then reinstall the needed rocks.
I also wrote a separate blog post about luaver and you can support the project here. Initially, I didn’t expect to spend much time on it and figured that I could manage both GSoC and luaver development simultaneously. However, at some point, I got too involved in luaver, which put me one week behind the timeline that I had proposed for GSoC. Nevertheless, I caught up soon afterwards.
It is important that the client implements all the features provided by Elasticsearch. Also, Elasticsearch is evolving a lot and releasing at a fast pace, so it is important that clients stay up to date. Some features were missing and the client version was 1.6 while Elasticsearch was at 2.3. Therefore, I decided to update existing features and implement some missing ones.
Benefits of working on the same project for two consecutive years
I myself had written the client. The codebase was already at my fingertips. I could spend more time working than understanding and getting comfortable with the code.
I wanted to further consolidate my client and make it stable. I couldn’t find much time during the rest of the year to work on its development in earnest. Google Summer of Code offered a nice incentive.
I had already worked with the Lua community. Being in a familiar environment, I was able to work and think freely. luaver was created to benefit the open source Lua community. If this had been my first time, I wouldn’t even have thought about developing it.