Thursday, April 18, 2013

ZooKeeper Summary

ZooKeeper is a distributed, open-source coordination service for distributed applications. It is used for synchronization, configuration maintenance, and groups and naming.


ZooKeeper allows distributed processes to coordinate with each other
through a shared hierarchical namespace, which is organized similarly to a standard file system.


 Like the distributed processes it coordinates, ZooKeeper itself is
intended to be replicated over a set of hosts called an ensemble.


The name space provided by ZooKeeper is much like that of a standard file system. A name
is a sequence of path elements separated by a slash (/). Every node in ZooKeeper's name
space is identified by a path.
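To make the path idea concrete, here is a minimal sketch of such a namespace in plain Python. It is a toy in-memory model, not the ZooKeeper API: the class name `Namespace` and its methods are made up for illustration, and the rule that a parent must exist before a child is created is an assumption of this sketch.

```python
class Namespace:
    """Toy in-memory model of a slash-separated namespace (hypothetical, not the ZooKeeper API)."""

    def __init__(self):
        # The root node "/" always exists.
        self.nodes = {"/": b""}

    @staticmethod
    def parent(path):
        # "/app/config" -> "/app", "/app" -> "/"
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, data=b""):
        # A valid path starts with "/", has no empty elements and no trailing slash.
        if not path.startswith("/") or path.endswith("/") or "//" in path:
            raise ValueError(f"invalid path: {path!r}")
        if path in self.nodes:
            raise KeyError(f"node exists: {path}")
        if self.parent(path) not in self.nodes:
            raise KeyError(f"no parent for: {path}")
        self.nodes[path] = data

    def children(self, path):
        # Names of the direct children of `path`, sorted.
        return sorted(p.rsplit("/", 1)[1] for p in self.nodes
                      if p != "/" and self.parent(p) == path)


ns = Namespace()
ns.create("/app")
ns.create("/app/config", b"v1")
print(ns.children("/app"))
```

Every node is addressed by its full path, so the whole tree can be kept in a flat dictionary keyed by path.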

In practice, you build a tree of nodes (znodes) that any client of the system can update. The model is shared by all clients, and all actions are ordered.
For example, to implement a distributed queue, you create a znode and store the queued data under child paths beneath it; on the other side, a consumer retrieves those values and removes them.
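The queue idea can be sketched in a few lines of Python. This is only an in-memory stand-in for the real service: the class `ToyQueue`, the `/queue` root and the `item-` prefix are hypothetical names, and numbering the children so the oldest sorts first is an assumption of this sketch.

```python
class ToyQueue:
    """Toy model of a queue kept as children of one node (hypothetical, not the ZooKeeper API)."""

    def __init__(self, root="/queue"):
        self.root = root
        self.children = {}   # child name -> data
        self.next_seq = 0    # counter used to build ordered child names

    def put(self, data):
        # Producer side: store the data under a new, ordered child path.
        name = f"item-{self.next_seq:010d}"
        self.next_seq += 1
        self.children[name] = data
        return f"{self.root}/{name}"

    def take(self):
        # Consumer side: read the oldest child, then remove it.
        if not self.children:
            return None
        oldest = min(self.children)
        return self.children.pop(oldest)


q = ToyQueue()
q.put(b"first")
q.put(b"second")
print(q.take(), q.take(), q.take())
```

Zero-padding the counter keeps the alphabetical order of the child names equal to the insertion order, so the consumer always takes the oldest item.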


API:
ZooKeeper Javadoc


General architecture

Every ZooKeeper server services clients. Clients connect to exactly one server to submit
requests. Read requests are serviced from the local replica of each server database. Requests
that change the state of the service, write requests, are processed by an agreement protocol.
As part of the agreement protocol all write requests from clients are forwarded to a single
server, called the leader. The rest of the ZooKeeper servers, called followers, receive
message proposals from the leader and agree upon message delivery. The messaging layer
takes care of replacing leaders on failures and syncing followers with leaders.
ZooKeeper uses a custom atomic messaging protocol. Since the messaging layer is atomic,
ZooKeeper can guarantee that the local replicas never diverge. When the leader receives a
write request, it calculates what the state of the system is when the write is to be applied and
transforms this into a transaction that captures this new state.
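The last point, turning a write request into a transaction that captures the new state, can be illustrated with a small simulation. This is a rough sketch under loud assumptions: the names `Txn`, `Replica`, `Leader` and `txn_id` are hypothetical, the per-node version counter is an assumption of this sketch, and the agreement (voting) step is skipped entirely, so every transaction is simply delivered to every replica.

```python
from typing import NamedTuple


class Txn(NamedTuple):
    """A transaction records the resulting state, not the original request."""
    txn_id: int
    path: str
    data: bytes
    version: int


class Replica:
    def __init__(self):
        self.state = {}       # path -> (data, version)
        self.last_txn_id = 0  # id of the last transaction applied here

    def apply(self, txn):
        # Applying the same transaction twice is harmless: it is skipped.
        if txn.txn_id <= self.last_txn_id:
            return
        self.state[txn.path] = (txn.data, txn.version)
        self.last_txn_id = txn.txn_id


class Leader(Replica):
    def __init__(self, followers):
        super().__init__()
        self.followers = followers

    def write(self, path, data):
        # Work out what the state will be once the write is applied...
        _, old_version = self.state.get(path, (None, -1))
        txn = Txn(self.last_txn_id + 1, path, data, old_version + 1)
        # ...then deliver that state to all replicas (the vote is omitted in this sketch).
        for replica in [self] + self.followers:
            replica.apply(txn)
        return txn


followers = [Replica(), Replica()]
leader = Leader(followers)
leader.write("/cfg", b"a")
leader.write("/cfg", b"b")
print(all(f.state == leader.state for f in followers))
```

Because a transaction carries the final data and version rather than an instruction such as "bump the version", replaying it on a replica cannot push that replica into a different state.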



Additional features

1. Watches - Clients can set a watch on a znode. A watch is triggered and removed when the znode changes.

2. Sequential Consistency - Updates from a client will be applied in the order that they were
sent.

3. Atomicity - Updates either succeed or fail. No partial results.

4. Single System Image - A client will see the same view of the service regardless of the
server that it connects to.

5. Reliability - Once an update has been applied, it will persist from that time forward until
a client overwrites the update.

6. Timeliness - The client's view of the system is guaranteed to be up-to-date within a
certain time bound.
7. Observers - To avoid hurting write performance when there are very many clients, use Observer nodes.

Observers forward write requests to the Leader like Followers do, but they then simply wait to hear the
result of the vote. Because of this, we can increase the number of Observers as much as we
like without harming the performance of votes.
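The "triggered and removed" behaviour of watches from item 1 is easy to miss, so here is a small sketch of it. This is a toy model and not the ZooKeeper client API: the class `WatchedNode`, its `get`/`set` methods and the `"changed"` event string are all hypothetical.

```python
class WatchedNode:
    """Toy node with one-shot watches (hypothetical, not the ZooKeeper API)."""

    def __init__(self, data=b""):
        self.data = data
        self._watchers = []

    def get(self, watch=None):
        # Reading the node may optionally register a watch callback.
        if watch is not None:
            self._watchers.append(watch)
        return self.data

    def set(self, data):
        self.data = data
        # Fire every pending watch once and forget it.
        fired, self._watchers = self._watchers, []
        for callback in fired:
            callback("changed")


events = []
node = WatchedNode(b"v1")
node.get(watch=events.append)
node.set(b"v2")   # triggers the watch
node.set(b"v3")   # nothing fires: the watch was already removed
print(events)
```

A client that wants to keep following the node has to set a new watch after each notification, typically when it reads the new value.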




Architecture
ZooKeeper can run in standalone mode or in replicated mode. A replicated group of servers in the same application is called a quorum, and in replicated mode, all servers in the quorum have copies of the same configuration file.
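A little arithmetic shows why the size of the voting group matters and why non-voting Observers do not slow votes down. The sketch below assumes the usual majority rule among voting servers, which this post does not spell out; the `Ensemble` class and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ensemble:
    voters: int         # leader plus followers
    observers: int = 0  # non-voting servers

    def majority(self):
        # Smallest number of voters that is more than half (assumed rule).
        return self.voters // 2 + 1

    def is_committed(self, acks):
        # A write goes through once a majority of voters has acknowledged it.
        return acks >= self.majority()

    def tolerated_failures(self):
        # Voters that can fail while a majority is still reachable.
        return self.voters - self.majority()


small = Ensemble(voters=3)
large = Ensemble(voters=5, observers=20)
print(small.majority(), large.majority(), large.tolerated_failures())
```

The observer count never appears in `majority()`, so adding Observers increases the number of servers that can serve clients without increasing the number of acknowledgements a write has to collect.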


Performance
In version 3.2, read/write performance improved by ~2x compared to the previous 3.1 release.

What can be implemented
Shared Barriers
Shared Queues
Shared Locks
Two-phase commit
etc
Documentation, tutorials and examples:
ZooKeeper documentation

