In this Sunday Read with Horton edition we take a closer look at the selection of papers about Distributed Consensus provided by Camille Fournier (Zookeeper PMC) as part of the RfP (Research for Practice) of the ACM. For Hadoop practitioners distributed consensus is best know as Apache Zookeeper, which supports most critical aspects of almost all Hadoop components. Continue reading “Sunday Read: Distributed Consensus”
ZooKeeper is in use at many different distributed systems and could probably also be of great help to your application, when it comes to distributed synchronization, and maintaining configuration. It promises to be simple, expressive, highly available, and highly reliable. To get an better understanding of what Zookeeper can do for you it’s a good idea to gain a solid understand of it’s implementation details and the consistency guarantees it makes. And it’s sure worth looking at the ZooKeeper Recipes and Solutions page.
As I just recently had the opportunity to take a closer look at ZooKeeper for the first time, this article does little more than provide the ‘Notes of a Beginner’. You’ll find here a brief overview of the consistency guarantees ZooKeeper makes, and a sample application. I decided to implement a distributed queue with locking, which I found as an example on the Solutions and Recipes page.
Imagine a web crawler who stores it’s links in a queue, so that other instances can grab link by link from the queue and crawl the web pages. In this scenario we wouldn’t want multiple instances crawling the same link, therefor we lock the link prior to performing the fetch. We don’t just remove it from the queue, because the fetch could fail. On the other hand if the instance dies unexpectedly during the fetch the lock needs to be released, so another instance can pick from there. All this can easily be done with ZooKeeper. Continue reading “My 1st ZooKeeper Recipe: Distributed Queue with Locking”