Posted by Kelvin on 02 Jul 2018 at 04:41 pm | Tagged as: Lucene / Solr / Elasticsearch / Nutch
To recap, in a previous article, we saw that a SearchComponent comprises a prepare() and a process() method, and that in standalone Solr the application flow of a search request is, more or less, a call to every SearchComponent's prepare() followed by a call to every SearchComponent's process().
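As a reminder of that standalone flow, here is a minimal, self-contained sketch. The Component interface and the named() helper are hypothetical stand-ins invented for illustration, not Solr classes; the only thing the sketch is meant to show is the ordering of calls, with every prepare() running before any process().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StandaloneFlowSketch {
    // Hypothetical stand-in for Solr's SearchComponent, reduced to its two methods.
    interface Component {
        void prepare(List<String> log);
        void process(List<String> log);
    }

    // Builds a component that records each call it receives.
    static Component named(final String name) {
        return new Component() {
            public void prepare(List<String> log) { log.add(name + ".prepare"); }
            public void process(List<String> log) { log.add(name + ".process"); }
        };
    }

    // Standalone flow: prepare() on every component, then process() on every component.
    static List<String> handle(List<Component> components) {
        List<String> log = new ArrayList<>();
        for (Component c : components) c.prepare(log);
        for (Component c : components) c.process(log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(handle(Arrays.asList(named("query"), named("facet"))));
    }
}
```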
SolrCloud high-level search request lifecycle
When a search request is received by a SolrCloud collection, the following sequence of events takes place:
1. A node receives the search request.
2. On this node, each SearchComponent's prepare() method is called.
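3. A loop is entered where the request is handled in stages. The 'standard' stages, as defined in ResponseBuilder, are STAGE_START, STAGE_PARSE_QUERY, STAGE_TOP_GROUPS, STAGE_EXECUTE_QUERY, STAGE_GET_FIELDS and STAGE_DONE.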
In each stage, each SearchComponent has the ability to 'vote' on the next stage. The lowest stage voted for by any SearchComponent becomes the actual next stage.
The request may not go through every stage. It all depends on which SearchComponents are handling this request.
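The voting rule can be sketched in a few lines. This is a self-contained model, not Solr code: the constant values are assumed to mirror those in ResponseBuilder, and the two voters in demoVoters() are hypothetical components made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

class StageVoteSketch {
    // Assumed to mirror the stage constants in Solr's ResponseBuilder.
    static final int STAGE_START = 0;
    static final int STAGE_PARSE_QUERY = 1000;
    static final int STAGE_TOP_GROUPS = 1500;
    static final int STAGE_EXECUTE_QUERY = 2000;
    static final int STAGE_GET_FIELDS = 3000;
    static final int STAGE_DONE = Integer.MAX_VALUE;

    // Each voter maps the current stage to the stage it wants next; the lowest vote wins.
    static int nextStage(int current, List<IntUnaryOperator> voters) {
        int next = STAGE_DONE;
        for (IntUnaryOperator v : voters) next = Math.min(next, v.applyAsInt(current));
        return next;
    }

    // Two hypothetical components: one only cares about STAGE_EXECUTE_QUERY,
    // the other takes no part in distributed processing at all.
    static List<IntUnaryOperator> demoVoters() {
        List<IntUnaryOperator> voters = new ArrayList<>();
        voters.add(s -> s < STAGE_EXECUTE_QUERY ? STAGE_EXECUTE_QUERY : STAGE_DONE);
        voters.add(s -> STAGE_DONE);
        return voters;
    }

    public static void main(String[] args) {
        int stage = STAGE_START;
        while (stage != STAGE_DONE) {
            System.out.println("handling stage " + stage);
            stage = nextStage(stage, demoVoters());
        }
    }
}
```

With these two voters the request visits only STAGE_START and STAGE_EXECUTE_QUERY, which is the sense in which a request may skip stages.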
For illustrative purposes, this is what QueryComponent's stage handling code looks like:
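What follows is a simplified, self-contained model of that stage handling rather than the Solr source itself. The overall shape (do this stage's work, then return the next stage wanted) follows the description above; the action names createDistributedStats and createRetrieveDocs are recalled from QueryComponent and may differ between Solr versions, so treat them as assumptions. Only createMainQuery is named in this article.

```java
import java.util.ArrayList;
import java.util.List;

class QueryStageSketch {
    // Assumed to mirror the stage constants in Solr's ResponseBuilder.
    static final int STAGE_START = 0, STAGE_PARSE_QUERY = 1000, STAGE_EXECUTE_QUERY = 2000,
                     STAGE_GET_FIELDS = 3000, STAGE_DONE = Integer.MAX_VALUE;

    // Records the work done per stage, standing in for the ShardRequests
    // a real component would create at that point.
    final List<String> actions = new ArrayList<>();

    // Does the work for the current stage (if any) and votes for the next stage.
    int distributedProcess(int stage) {
        if (stage < STAGE_PARSE_QUERY) return STAGE_PARSE_QUERY;
        if (stage == STAGE_PARSE_QUERY) {
            actions.add("createDistributedStats");
            return STAGE_EXECUTE_QUERY;
        }
        if (stage < STAGE_EXECUTE_QUERY) return STAGE_EXECUTE_QUERY;
        if (stage == STAGE_EXECUTE_QUERY) {
            actions.add("createMainQuery");
            return STAGE_GET_FIELDS;
        }
        if (stage < STAGE_GET_FIELDS) return STAGE_GET_FIELDS;
        if (stage == STAGE_GET_FIELDS) {
            actions.add("createRetrieveDocs");
            return STAGE_DONE;
        }
        return STAGE_DONE;
    }

    // Drives a single component from STAGE_START to STAGE_DONE.
    static List<String> run() {
        QueryStageSketch qc = new QueryStageSketch();
        int stage = STAGE_START;
        while (stage != STAGE_DONE) stage = qc.distributedProcess(stage);
        return qc.actions;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```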
It is beyond the scope of this article to fully go into each stage, but it is sufficient for you to know that search requests are handled in stages.
4. At each stage, SearchComponents may send out requests to the other shards, requesting data in some form. This is known as a ShardRequest. Every ShardRequest has an integer field called purpose. The ShardRequest purpose should not be confused with the ResponseBuilder stage. They are both integer fields, but that's where the similarity ends.
For example, in QueryComponent's STAGE_EXECUTE_QUERY, createMainQuery() is called, in which a ShardRequest is sent out to the other shards, requesting them to execute the query and return the top matching docs. This ShardRequest has the purpose of ShardRequest.PURPOSE_GET_TOP_IDS.
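To make the difference between purpose and stage concrete, here is a small sketch. MiniShardRequest is a hypothetical stand-in for ShardRequest, and the two flag values are assumed to mirror Solr's constants. The point it illustrates is that a stage is a position on an ordered scale, whereas a purpose behaves like a set of bit flags: it is tested with a bitwise AND, and one request can carry more than one purpose.

```java
class PurposeSketch {
    // Assumed to mirror two of Solr's ShardRequest.PURPOSE_* bit flags.
    static final int PURPOSE_GET_TOP_IDS = 0x04;
    static final int PURPOSE_GET_FIELDS = 0x40;

    // Hypothetical stand-in for ShardRequest; only the purpose field is modeled.
    static final class MiniShardRequest {
        int purpose;
    }

    // A purpose is checked with a bitwise AND, not with == or <.
    static boolean hasPurpose(MiniShardRequest sreq, int flag) {
        return (sreq.purpose & flag) != 0;
    }

    // The kind of request described above: fetch the top matching doc ids.
    static MiniShardRequest topIdsRequest() {
        MiniShardRequest sreq = new MiniShardRequest();
        sreq.purpose = PURPOSE_GET_TOP_IDS;
        return sreq;
    }

    public static void main(String[] args) {
        MiniShardRequest sreq = topIdsRequest();
        System.out.println(hasPurpose(sreq, PURPOSE_GET_TOP_IDS));
        sreq.purpose |= PURPOSE_GET_FIELDS; // one request, two purposes
        System.out.println(hasPurpose(sreq, PURPOSE_GET_FIELDS));
    }
}
```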
5. A SearchComponent handles a ShardRequest and its specific purpose in 2 separate methods: process() and handleResponses().
process() is executed on each shard which receives the ShardRequest.
handleResponses() is executed on the shard that initiated the ShardRequest, where usually some process of merging and collation takes place. handleResponses() would be much better named handleShardResponses(). The signature of handleResponses() is public void handleResponses(ResponseBuilder rb, ShardRequest sreq).
The responses of each shard to this ShardRequest are available from sreq.responses.
This step is potentially the most confusing conceptual bit of the entire lifecycle, so let's stop here to recap.
The key realization is that even though the ShardRequest creation, processing and ShardResponse handling all take place in a single SearchComponent, they are executed on different shards.
For example, take 2 shards, ShardA and ShardB. Suppose the request arrives at ShardB and we are at STAGE_EXECUTE_QUERY in QueryComponent.
a. In ShardB's QueryComponent.createMainQuery(), a ShardRequest with the purpose ShardRequest.PURPOSE_GET_TOP_IDS is created and sent to all shards, including ShardB itself.
b. ShardA and ShardB's QueryComponents receive the ShardRequest and respond to it in process()
c. ShardB's QueryComponent.handleResponses() is called with the ShardRequest containing the 2 ShardResponses and collates the results.
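Steps a to c can be simulated in-process. In the sketch below, Shard is a hypothetical stand-in for one shard's QueryComponent, documents are reduced to bare integer scores, and no networking is involved. It only aims to show which method runs where: process() on every shard, handleResponses() on the initiating shard alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class TwoShardSketch {
    // Hypothetical stand-in for one shard's QueryComponent; docs are just scores.
    static final class Shard {
        final List<Integer> localScores;

        Shard(Integer... scores) {
            this.localScores = Arrays.asList(scores);
        }

        // (b) Runs on every shard that receives the ShardRequest: local top N.
        List<Integer> process(int rows) {
            return topN(localScores, rows);
        }

        // (c) Runs only on the shard that created the ShardRequest: merge and collate.
        List<Integer> handleResponses(List<List<Integer>> responses, int rows) {
            List<Integer> merged = new ArrayList<>();
            for (List<Integer> r : responses) merged.addAll(r);
            return topN(merged, rows);
        }
    }

    // Returns the highest `rows` scores in descending order.
    static List<Integer> topN(List<Integer> scores, int rows) {
        List<Integer> sorted = new ArrayList<>(scores);
        sorted.sort(Collections.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(rows, sorted.size())));
    }

    // (a) The initiating shard sends the request to every shard, itself included.
    static List<Integer> search(Shard initiator, List<Shard> allShards, int rows) {
        List<List<Integer>> responses = new ArrayList<>();
        for (Shard s : allShards) responses.add(s.process(rows));
        return initiator.handleResponses(responses, rows);
    }

    public static void main(String[] args) {
        Shard shardA = new Shard(9, 3, 7);
        Shard shardB = new Shard(8, 5);
        System.out.println(search(shardB, Arrays.asList(shardA, shardB), 3));
    }
}
```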
6. Once a stage is complete, each SearchComponent's finishStage() is called. The signature of finishStage() is public void finishStage(ResponseBuilder rb).
7. The above steps are repeated until we arrive at the ResponseBuilder stage of STAGE_DONE.
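Putting the steps together, the loop on the receiving node can be modeled roughly as below. This is a sketch under stated assumptions, not Solr's SearchHandler: the Component interface is hypothetical, ShardRequests are reduced to plain strings, and responses are assumed to arrive immediately. It shows the order of events within one stage (vote and queue requests, handle responses, finish the stage) and the exit condition.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleSketch {
    static final int STAGE_EXECUTE_QUERY = 2000, STAGE_GET_FIELDS = 3000,
                     STAGE_DONE = Integer.MAX_VALUE;

    // Hypothetical stand-in for a distributed SearchComponent.
    interface Component {
        int distributedProcess(int stage, List<String> outgoing); // vote, maybe queue requests
        void handleResponses(String request, List<String> log);
        void finishStage(int stage, List<String> log);
    }

    static List<String> run(List<Component> components) {
        List<String> log = new ArrayList<>();
        int nextStage = 0;
        do {
            int stage = nextStage;
            nextStage = STAGE_DONE;
            List<String> outgoing = new ArrayList<>();
            // Every component votes on the next stage; the lowest vote wins.
            for (Component c : components)
                nextStage = Math.min(nextStage, c.distributedProcess(stage, outgoing));
            // Every component gets to look at every completed request.
            for (String request : outgoing)
                for (Component c : components) c.handleResponses(request, log);
            for (Component c : components) c.finishStage(stage, log);
        } while (nextStage != STAGE_DONE);
        return log;
    }

    // A made-up component that queues one request in each of two stages.
    static Component demoComponent() {
        return new Component() {
            public int distributedProcess(int stage, List<String> outgoing) {
                if (stage < STAGE_EXECUTE_QUERY) return STAGE_EXECUTE_QUERY;
                if (stage == STAGE_EXECUTE_QUERY) { outgoing.add("GET_TOP_IDS"); return STAGE_GET_FIELDS; }
                if (stage == STAGE_GET_FIELDS) { outgoing.add("GET_FIELDS"); return STAGE_DONE; }
                return STAGE_DONE;
            }
            public void handleResponses(String request, List<String> log) { log.add("handle:" + request); }
            public void finishStage(int stage, List<String> log) { log.add("finish:" + stage); }
        };
    }

    public static void main(String[] args) {
        List<Component> components = new ArrayList<>();
        components.add(demoComponent());
        System.out.println(run(components));
    }
}
```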
This concludes the core lifecycle of a search request in SolrCloud.
In summary, a SolrCloud collection is made up of a number of shards. For a search request to be handled, each shard needs to provide data to the search request.
A distributed search request is handled in (ResponseBuilder) stages.
In any of the ResponseBuilder stages, a SearchComponent may initiate ShardRequests to the shards in a collection.
That same SearchComponent (though potentially residing on a different machine) also responds to that ShardRequest (in process()), and collates the ShardResponses from the various shards to that ShardRequest (in handleResponses() or finishStage()).
In this fashion, the necessary computation required to build the response to the search request is executed.
1. For simplicity's sake, I opted to use the terminology of shards as opposed to replicas, since this is more of a conceptual overview. I am aware that it is actually replicas that initiate and respond to ShardRequests.
2. Similarly, I use the terminology of machines as opposed to nodes. Nodes is the 'proper' term used in Solr documentation and code.
- Introducing this series on developing Solr SearchComponents
- Developing a Solr SearchComponent for standalone Solr
- Introducing SolrCloud
- SolrCloud search request lifecycle and writing distributed Solr SearchComponents