Installing and using Solr 6.1 cloud in AWS EC2 (with some notes on upgrading)

Like any company, we have some legacy code. Our code was using Solr 3 and I was going to upgrade it to the latest version (6.1). The upgrade itself is not such a big deal: just fire up a new setup and convert the old schema to the new schema format, which differs only in its XML format. I am not going through that here, as you can easily get a sample schema from the latest version and compare it to your own. Once done, you can start the new Solr with your converted schema and it will start giving errors! But with patience and hard work you can resolve them one by one.

Anyway, the upgrade process is not such a big deal, but working with the new Solr is, especially if you want to use the cloud version, which uses ZooKeeper to manage the configs, shards, replications, leaders, etc. During the upgrade, all you are likely to run into is a deprecated or missing class, which you can download.

In my case I found this page very useful to find the deprecated classes of Solr 3.6.

Before I jump into Solr cloud 6.1, you may need to know some concepts:

  1. Collection: A single search index.
  2. Shard: A logical section of a single collection (also called a Slice). Sometimes people talk about a “Shard” in a physical sense (a manifestation of a logical shard). A shard is literally one part of your data: if you have 3 shards, all your data (documents) is distributed across 3 parts. It also means that if one of the shards is missing, you are in trouble!
  3. Replica: A physical manifestation of a logical Shard, implemented as a single Lucene index on a SolrCore. A replica is a copy of a shard, so with a replication factor of 2 you will have 2 copies of each shard.
  4. Leader: One Replica of every Shard will be designated as a Leader to coordinate indexing for that Shard. The leader is the master node in a shard, so if you have two replicas, the leader is the boss!
  5. SolrCore: Encapsulates a single physical index. One or more make up logical shards (or slices) which make up a collection.
  6. Node: A single instance of Solr. A single Solr instance can have multiple SolrCores that can be part of any number of collections.
  7. Cluster: All of the nodes you are using to host SolrCores.
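
To make the relationship between shards and replicas concrete, here is a minimal Python sketch. It only builds the URL for a Collections API CREATE call (using the `numShards` and `replicationFactor` parameters) and counts how many SolrCores the cluster would need to host. The host `localhost:8983` and the collection name `products` are placeholders I made up for illustration, not values from my setup, and nothing is actually sent to Solr.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build (but do not send) a Collections API CREATE request URL."""
    params = {
        "action": "CREATE",
        "name": name,                            # hypothetical collection name
        "numShards": num_shards,                 # logical parts of the index
        "replicationFactor": replication_factor, # copies of each shard
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def total_cores(num_shards, replication_factor):
    """Each shard is stored replication_factor times, one SolrCore per copy."""
    return num_shards * replication_factor


# Assumed values: a local Solr node and a collection called "products".
url = create_collection_url("http://localhost:8983/solr", "products", 3, 2)
print(url)
print(total_cores(3, 2))
```

With 3 shards and a replication factor of 2, the cluster hosts 6 SolrCores in total, and one replica in each shard acts as the leader.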

In the rest of this post, I will go through installing and using this whole setup.
