Guide to installing ZooKeeper for Solr

The steps below walk through installing Solr versions 4.1 to 4.6 on Tomcat. I tested the Solr 4.6 installation on Windows 7 and Windows 8.1.

1. Download and install Java SDK [I installed jdk-7u45-windows-x64.exe].

2. Download and install Apache Tomcat [I installed apache-tomcat-8.0.0-RC5.exe]. Tomcat listens on port 8080 by default; the rest of this guide assumes port 8983 [but this is optional: you can specify your own port].

3. Test the Tomcat server in your browser. You should see the Tomcat welcome page.

4. You can configure the Tomcat server by going to Windows > Start > Monitor Tomcat

5. Stop the Tomcat server by going to Windows > Start > Monitor Tomcat > General Tab > Stop

6. Download Solr-4.6.0 and unzip it to a local directory, e.g. C:\solr-4.6.0 [download the zip file].

7. Go to the Solr folder from step 6 and copy the solr.war file to the Tomcat webapps folder, i.e. copy C:\solr-4.6.0\dist\solr-4.6.0.war to C:\Program Files\Apache Software Foundation\Tomcat 8.0\webapps [rename solr-4.6.0.war to solr.war].

8. Create an empty Solr home folder. i.e. C:\solr

9. Go to the Solr folder from step 6. Copy all files from the C:\solr-4.6.0\example\solr folder to C:\solr [your Solr home folder].

10. Look into C:\solr and you will see two folders named collection1 and bin.

11. Copy the jars from C:\solr-4.6.0\example\lib\ext [all jars] into C:\Program Files\Apache Software Foundation\Tomcat 8.0\lib [this is your Tomcat server main library directory].

12. Set the Java system property solr.solr.home to your Solr home. Go to Windows > Start > Monitor Tomcat > Java Tab > Java Options and add the following entry at the end: -Dsolr.solr.home=c:\solr

13. Restart Tomcat by going to Windows > Start > Monitor Tomcat > General Tab > Start

14. Test Solr by going to http://localhost:8983/solr/ in your browser. You should see the Solr admin page.

Let us now see how to implement an Apache Solr cloud in practice using an external ZooKeeper ensemble.

Let us assume we have 3 separate nodes, the smallest ensemble that can still form a quorum (a majority of 2) when one node fails. This can be scaled to more nodes depending on the horizontal scaling the use case requires. ZooKeeper and Solr instances can also reside on different nodes, and in my opinion that is how it should be, because it makes the cluster more robust and fault tolerant.

First of all, let's download ZooKeeper and Apache Solr on each of the 3 machines, whose IPs we assume are 10.17.153.112, 10.17.152.145 and 10.17.153.247, using the following steps for Mac OS or Linux. Windows users can directly download and unzip the packages.

Commands to download and extract Solr

Commands to download and extract ZooKeeper
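As a rough sketch of what such commands might look like: the versions below (Solr 7.7.3 and ZooKeeper 3.4.14) and the Apache archive URLs are my assumptions, not something this guide specifies, while the /opt/solr and /opt/zookeeper targets match the paths used in the later steps. The helper names are hypothetical.

```shell
#!/bin/sh
# Assumed versions; substitute the releases you actually want to run.
SOLR_VERSION="${SOLR_VERSION:-7.7.3}"
ZK_VERSION="${ZK_VERSION:-3.4.14}"

# Build the Apache archive URL for a given release (hypothetical helpers).
solr_url() {
  printf 'https://archive.apache.org/dist/lucene/solr/%s/solr-%s.tgz\n' "$1" "$1"
}
zk_url() {
  printf 'https://archive.apache.org/dist/zookeeper/zookeeper-%s/zookeeper-%s.tar.gz\n' "$1" "$1"
}

# Download an archive and unpack it into <dest>, dropping the top-level folder.
fetch_and_unpack() {
  url=$1
  dest=$2
  mkdir -p "$dest" &&
    curl -fsSL "$url" | tar -xzf - -C "$dest" --strip-components=1
}

echo "Solr:      $(solr_url "$SOLR_VERSION")"
echo "ZooKeeper: $(zk_url "$ZK_VERSION")"

# Uncomment to download (needs network access and write permission on /opt):
# fetch_and_unpack "$(solr_url "$SOLR_VERSION")" /opt/solr
# fetch_and_unpack "$(zk_url "$ZK_VERSION")" /opt/zookeeper
```

Run as written, the script only prints the two URLs so they can be checked before anything is downloaded.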

Now that we have Solr and ZooKeeper in place, let's start by setting up ZooKeeper.

Setting up Zookeeper Ensemble

Log in to 10.17.153.112 and do the following:

1. Make a new directory for ZooKeeper data and navigate into it using the following command

mkdir /opt/zookeeperdata && cd /opt/zookeeperdata

2. Create a new file at this location called myid, which should contain the server's ID within the ensemble. The value 1 indicates that server 10.17.153.112 is the 1st server of the ZooKeeper ensemble.

echo "1" > myid

3. Rename zoo_sample.cfg to zoo.cfg in conf folder of zookeeper

mv /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg

4. Comment out the dataDir and clientPort lines in the zoo.cfg file by prepending each line with #

5. Add the following lines to zoo.cfg

zoo.cfg

In server.x, the value of x must exactly match the value in the myid file created in step 2.

clientPort=2181 means that ZooKeeper on this node will listen on port 2181.
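Steps 4 and 5 can be sketched as a small script. This is only an illustration under assumptions: the function name configure_ensemble is hypothetical, the data directory is the /opt/zookeeperdata folder from step 1, and 2888 and 3888 are the conventional ZooKeeper quorum and leader-election ports rather than values taken from this guide.

```shell
#!/bin/sh
# Comment out dataDir/clientPort in a zoo.cfg file, then append ensemble settings.
# Usage: configure_ensemble <cfg-file> <data-dir> <host1> [<host2> ...]
configure_ensemble() {
  cfg_file=$1
  data_dir=$2
  shift 2
  tmp_file="$cfg_file.tmp"

  # Step 4: prefix the existing dataDir and clientPort lines with '#'.
  sed -e 's/^dataDir=/#&/' -e 's/^clientPort=/#&/' "$cfg_file" > "$tmp_file" &&
    mv "$tmp_file" "$cfg_file" || return 1

  # Step 5: append the new settings.
  {
    echo "dataDir=$data_dir"
    echo "clientPort=2181"
    id=1
    for host in "$@"; do
      # server.<id> must equal the number in that host's myid file.
      # Ports 2888 and 3888 are assumed (the conventional defaults).
      echo "server.$id=$host:2888:3888"
      id=$((id + 1))
    done
  } >> "$cfg_file"
}

# Demo on a scratch file so nothing under /opt is touched.
demo_cfg=$(mktemp)
printf 'tickTime=2000\ndataDir=/tmp/zookeeper\nclientPort=2181\n' > "$demo_cfg"
configure_ensemble "$demo_cfg" /opt/zookeeperdata 10.17.153.112 10.17.152.145 10.17.153.247
cat "$demo_cfg"
```

On a real node, point the function at /opt/zookeeper/conf/zoo.cfg instead of the scratch file. The same server.N list goes on every node; only the myid file differs.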

7. Now let us start the ZooKeeper instance by invoking the zkServer.sh script in ZooKeeper's bin folder

/opt/zookeeper/bin/zkServer.sh start /opt/zookeeper/conf/zoo.cfg

8. Repeat steps 1 to 7 on every server you wish to install ZooKeeper on, except that in step 2 you should echo that server's own number. E.g. "2" will be echoed on server 10.17.152.145 and "3" will be echoed on server 10.17.153.247. In this case all three servers will have ZooKeeper as well as Solr.

9. Ensure that there are no major errors in the log file using

cat /opt/zookeeper/bin/zookeeper.out
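Reading the whole file with cat gets tedious once the log grows, so a filter can help. The sketch below is one possible approach, assuming the log marks severity with the words ERROR or FATAL; the function name check_zk_log is hypothetical.

```shell
#!/bin/sh
# List ERROR/FATAL lines (with line numbers) from a ZooKeeper log.
# Returns 0 when none are found, 1 when at least one line matches.
check_zk_log() {
  log_file=$1
  if grep -nE 'ERROR|FATAL' "$log_file"; then
    return 1
  fi
  echo "no ERROR or FATAL lines in $log_file"
}

# Path taken from step 9; skipped quietly when the file does not exist.
ZK_LOG=/opt/zookeeper/bin/zookeeper.out
if [ -f "$ZK_LOG" ]; then
  check_zk_log "$ZK_LOG"
fi
```

Warnings are deliberately not matched here; you may want to widen the pattern if you also care about WARN lines.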

Now that the ZooKeeper ensemble is set up, let's start the Solr instances as well.

Setting up Apache Solr

Log in to 10.17.153.112 and do the following:

1. Navigate to the bin folder of Apache Solr

cd /opt/solr/bin

2. Start the Solr server, pointing it at each of the ZooKeeper instances started above.

./solr start -c -p 8983 -z 10.17.153.112:2181,10.17.152.145:2181,10.17.153.247:2181 -force

3. Repeat steps 1 and 2 on the other two servers.
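Since the same ZooKeeper list has to be typed on all three servers, it can be built once from the host names. A minimal sketch, with zk_connect_string as a hypothetical helper name:

```shell
#!/bin/sh
# Join hosts into a ZooKeeper connection string: host1:port,host2:port,...
# Usage: zk_connect_string <port> <host1> [<host2> ...]
zk_connect_string() {
  port=$1
  shift
  result=""
  for host in "$@"; do
    # Add a comma separator only when result is already non-empty.
    result="${result:+$result,}$host:$port"
  done
  printf '%s\n' "$result"
}

ZK_HOSTS=$(zk_connect_string 2181 10.17.153.112 10.17.152.145 10.17.153.247)

# Print the start command rather than running it, so it can be reviewed first.
echo "./solr start -c -p 8983 -z $ZK_HOSTS -force"
```

The printed line is the same start command used in step 2, so it can be pasted on each server from /opt/solr/bin.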

Creating collections

You can now create collections and set the number of shards and the replication factor for each newly created collection.

Sharding: Breaking data into parts is called sharding. Each part of the data is called a shard.

Replication factor: The factor by which each shard should be replicated. This ensures availability of shards [data] even when some nodes are down.

Use the following command to create a collection named TEST with 3 shards and 3 as its replication factor.

./solr create_collection -c TEST -s 3 -rf 3 -force
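The arithmetic behind these two settings is worth spelling out: the total number of replicas is shards times replication factor, and with an even spread each node hosts that total divided by the node count, rounded up. A small sketch, with hypothetical helper names:

```shell
#!/bin/sh
# Usage: total_replicas <shards> <replication-factor>
total_replicas() {
  echo $(( $1 * $2 ))
}

# Usage: replicas_per_node <shards> <replication-factor> <nodes>
replicas_per_node() {
  total=$(( $1 * $2 ))
  # Ceiling division: the busiest node's share under an even spread.
  echo $(( (total + $3 - 1) / $3 ))
}

echo "total replicas: $(total_replicas 3 3)"
echo "replicas per node: $(replicas_per_node 3 3 3)"
```

For the TEST collection this gives 9 replicas in total, or 3 per node across the three servers, assuming they are spread evenly.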

Validating cloud status

You can hit any of the three IPs to access the SolrCloud. For example, http://10.17.153.112:8983/solr will open the Solr UI in your browser. Navigate to the Cloud option on the left-hand side and you will see the TEST collection with its shards and replicas:

Cloud depiction of a collection

This confirms that everything is up and running. Apache Solr cloud has been implemented using an external Zookeeper ensemble.
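The same check can be made from the command line through the Solr Collections API, whose CLUSTERSTATUS action reports shards and replicas as JSON. The sketch below only builds and prints the request URL; the helper name cluster_status_url is hypothetical, and the host and port are the ones assumed throughout this guide.

```shell
#!/bin/sh
# Build a Collections API CLUSTERSTATUS URL.
# Usage: cluster_status_url <host> <port> <collection>
cluster_status_url() {
  printf 'http://%s:%s/solr/admin/collections?action=CLUSTERSTATUS&collection=%s&wt=json\n' \
    "$1" "$2" "$3"
}

STATUS_URL=$(cluster_status_url 10.17.153.112 8983 TEST)
echo "$STATUS_URL"

# To query the live cluster (requires network access to the node):
# curl -fsS "$STATUS_URL"
```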
