Apache ZooKeeper Essentials
Ebook, 313 pages, 2 hours

About this ebook

About This Book
  • Learn the basics of Apache ZooKeeper with a comprehensive examination of its internals and administration
  • Explore the ZooKeeper API model and learn how to develop applications using ZooKeeper in C, Java, and Python for common distributed coordination tasks
  • See how ZooKeeper is used in real-world applications and services to carry out complex distributed coordination tasks
Who This Book Is For

Whether you are a novice to ZooKeeper or already have some experience, you will be able to master the concepts of ZooKeeper and its usage with ease.

This book assumes that you have some prior knowledge of distributed systems and a working knowledge of programming in C, Java, or Python, but no experience with Apache ZooKeeper is required.

Language: English
Release date: Jan 28, 2015
ISBN: 9781784398323

    Book preview

    Apache ZooKeeper Essentials - Saurav Haloi

    Table of Contents

    Apache ZooKeeper Essentials

    Credits

    About the Author

    About the Reviewers

    www.PacktPub.com

    Support files, eBooks, discount offers, and more

    Why subscribe?

    Free access for Packt account holders

    Preface

    What this book covers

    What you need for this book

    Who this book is for

    Conventions

    Reader feedback

    Customer support

    Downloading the example code

    Errata

    Piracy

    Questions

    1. A Crash Course in Apache ZooKeeper

    Defining a distributed system

    Why coordination in a distributed system is so challenging

    Introducing Apache ZooKeeper

    Getting hands-on with Apache ZooKeeper

    Download and installation

    Downloading

    Installing

    Configuration

    Starting the ZooKeeper server

    Connecting to ZooKeeper with a Java-based shell

    Connecting to ZooKeeper with a C-based shell

    Setting up a multinode ZooKeeper cluster

    Starting the server instances

    Running multiple node modes for ZooKeeper

    Summary

    2. Understanding the Inner Workings of Apache ZooKeeper

    A top-down view of the ZooKeeper service

    The ZooKeeper data model

    Types of znodes

    The persistent znode

    The ephemeral znode

    The sequential znode

    Keeping an eye on znode changes – ZooKeeper Watches

    The ZooKeeper operations

    Watches and ZooKeeper operations

    The ZooKeeper access control lists

    The ZooKeeper stat structure

    Understanding the inner working of ZooKeeper

    The quorum mode

    Client establishment of sessions with the ZooKeeper service

    Implementation of ZooKeeper transactions

    Phase 1 – leader election

    Phase 2 – atomic broadcast

    Local storage and snapshots

    Summary

    3. Programming with Apache ZooKeeper

    Using the Java client library

    Preparing your development environment

    The first ZooKeeper program

    Implementing a Watcher interface

    Example – a cluster monitor

    The C client library

    Getting started with the C API

    Example – the znode data watcher

    Python client bindings

    A watcher implementation

    Summary

    4. Performing Common Distributed System Tasks

    ZooKeeper recipes

    Barrier

    Queue

    Lock

    Leader election

    Group membership

    Two-phase commit

    Service discovery

    Summary

    5. Administering Apache ZooKeeper

    Configuring a ZooKeeper server

    Minimum configuration

    Storage configuration

    Network configuration

    Configuring a ZooKeeper ensemble

    Configuring a quorum

    Quota and authorization

    ZooKeeper best practices

    Monitoring a ZooKeeper instance

    Four-letter words

    Java Management Extensions

    Summary

    6. Decorating ZooKeeper with Apache Curator

    Curator components

    Curator client

    Curator framework

    Curator recipes

    Curator utilities

    Curator extensions

    Exhibitor

    Summary

    7. ZooKeeper in Action

    Projects powered by ZooKeeper

    Apache BookKeeper

    Apache Hadoop

    Apache HBase

    Apache Helix

    OpenStack Nova

    Organizations powered by ZooKeeper

    Yahoo!

    Facebook

    eBay

    Twitter

    Netflix

    Zynga

    Nutanix

    VMware vSphere Storage Appliance

    Summary

    Index

    Apache ZooKeeper Essentials

    Copyright © 2015 Packt Publishing

    All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

    Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

    Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

    First published: January 2015

    Production reference: 1220115

    Published by Packt Publishing Ltd.

    Livery Place

    35 Livery Street

    Birmingham B3 2PB, UK.

    ISBN 978-1-78439-132-4

    www.packtpub.com

    Credits

    Author

    Saurav Haloi

    Reviewers

    Hanish Bansal

    Christopher Tang, PhD

    Commissioning Editor

    Ashwin Nair

    Acquisition Editor

    Richard Harvey

    Rebecca Youé

    Content Development Editor

    Ajinkya Paranjape

    Technical Editor

    Anushree Arun Tendulkar

    Copy Editors

    Karuna Narayanan

    Alfida Paiva

    Project Coordinator

    Harshal Ved

    Proofreaders

    Martin Diver

    Ameesha Green

    Indexer

    Hemangini Bari

    Production Coordinator

    Melwyn D'sa

    Cover Work

    Melwyn D'sa

    About the Author

    Saurav Haloi works as a principal software engineer at EMC in its data protection and availability division. With more than 10 years of experience in software engineering, he has also been associated with prestigious software firms such as Symantec Corporation and Tata Consultancy Services, where he worked on the design and development of complex, large-scale, multiplatform, multi-tier enterprise software systems in the storage, networking, and distributed systems domains. He has been using Apache ZooKeeper since 2011 in a variety of contexts. He graduated from the National Institute of Technology, Surathkal, India, with a bachelor's degree in computer engineering. An open source enthusiast and a hard rock and heavy metal fanatic, he lives in the city of Pune in India, which is also known as the Oxford of the East.

    I would like to thank my family for their support and encouragement throughout the writing of this book.

    It was a pleasure to work with Packt Publishing, and I would like to thank everyone associated with this book: the editors, reviewers, and project coordinators, for their valuable comments, suggestions, and assistance during the book development period. Special thanks to Ajinkya Paranjape, my content development editor, who relentlessly helped me while writing this book and patiently answered all my queries relating to the editorial processes.

    I would also like to thank the Apache ZooKeeper contributors, committers, and the whole community for developing such a fantastic piece of software and for their continuous effort in getting ZooKeeper to the shape it is in now. Kudos to all of you!

    About the Reviewers

    Hanish Bansal is a software engineer with over 3 years of experience in developing Big Data applications. He has worked with various technologies such as the Spring framework, Hibernate, Hadoop, Hive, Flume, Kafka, and Storm, with NoSQL databases including HBase, Cassandra, and MongoDB, and with search engines such as ElasticSearch. He graduated in Information Technology from Jaipur Engineering College and Research Center, Jaipur, India. He currently works in the Big Data R&D Group at Impetus Infotech Pvt. Ltd., Noida (UP). He published a white paper on how to handle data corruption in ElasticSearch, which can be read at http://bit.ly/1pQlvy5. In his spare time, he loves to travel and listen to Punjabi music.

    You can read his blog at http://hanishblogger.blogspot.in/ and follow him on Twitter at @hanishbansal786.

    I would like to thank my parents for their love, support, encouragement, and the amazing opportunities they've given me over the years.

    Christopher Tang, PhD, is a technologist and software engineer who develops scalable systems for research and analytics-oriented applications that involve rich data in biology, education, and social engagement. He was one of the founding engineers in the Adaptive Learning and Data Science team at Knewton, where Apache ZooKeeper is used with PettingZoo for distributed service discovery and configuration. He has a BS degree in biology from MIT, and received his doctorate degree from Columbia University after completing his thesis in computational protein structure recognition. He currently resides in New York City, where he works at JWPlayer and advises startups such as KnewSchool, FindMine, and Moclos.

    I'd like to extend my thanks to my family for their loving support, without which all these wonderful opportunities would not have been open to me.

    www.PacktPub.com

    Support files, eBooks, discount offers, and more

    For support files and downloads related to your book, please visit www.PacktPub.com.

    Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com, and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us for more details.

    At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

    https://www2.packtpub.com/books/subscription/packtlib

    Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

    Why subscribe?

    Fully searchable across every book published by Packt

    Copy and paste, print, and bookmark content

    On demand and accessible via a web browser

    Free access for Packt account holders

    If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.

    To my parents

    Preface

    Architecting and building a distributed system is not a trivial job, and implementing coordination for distributed applications is even harder. Coordination code is prone to errors such as race conditions and deadlocks, and such bugs are not easy to detect. Apache ZooKeeper was developed to address this problem: it spares developers from building coordination and synchronization systems from scratch. ZooKeeper is an open source service that provides high-performance, highly available coordination for distributed applications.

    Apache ZooKeeper is a centralized service that exposes a simple set of primitives on which distributed applications can build high-level services such as naming, configuration management, synchronization, and group services. ZooKeeper is designed to be easy to program, with a simple and elegant set of APIs and client bindings for many languages.
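
    To make the idea of building services on a few primitives concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the ZooKeeper client API: the class name ToyZnodeTree and its methods are hypothetical and exist only for illustration. It assumes a hierarchical namespace of nodes in which a node tied to a session (an ephemeral node) disappears when that session ends, and it shows how configuration storage and group membership both fall out of the same create/get/children operations.

```python
# Toy in-memory model of a ZooKeeper-style namespace.
# This is NOT the ZooKeeper API; every name here is hypothetical.
class ToyZnodeTree:
    def __init__(self):
        # Maps path -> (data, owning session); session is None for persistent nodes.
        self._nodes = {"/": (b"", None)}

    def create(self, path, data=b"", session=None):
        """Create a node; passing a session makes the node ephemeral."""
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._nodes:
            raise KeyError(f"parent {parent} does not exist")
        if path in self._nodes:
            raise KeyError(f"{path} already exists")
        self._nodes[path] = (data, session)

    def get(self, path):
        """Return the data stored at a node."""
        return self._nodes[path][0]

    def children(self, path):
        """Return the sorted names of the direct children of a node."""
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self._nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )

    def close_session(self, session):
        """Delete every ephemeral node owned by the given session."""
        owned = [p for p, (_, s) in self._nodes.items() if s == session]
        for p in owned:
            del self._nodes[p]


tree = ToyZnodeTree()
tree.create("/config", b"db=primary")      # configuration management
tree.create("/workers")                    # parent node for group membership
tree.create("/workers/w1", session="s1")   # each worker registers an ephemeral node
tree.create("/workers/w2", session="s2")
print(tree.children("/workers"))           # current group members
tree.close_session("s1")                   # worker w1 goes away
print(tree.children("/workers"))           # membership shrinks automatically
```

    The sketch leaves out everything that makes the real service hard to build, such as replication, ordering guarantees, watches, and access control. It only illustrates why a small set of operations on a shared tree is enough to express several coordination tasks.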

    Apache ZooKeeper Essentials takes readers through a practical journey of learning ZooKeeper and understanding its role in developing scalable and robust distributed applications. It starts with a crisp description of why building coordination services for distributed applications is hard, which motivates the need to learn ZooKeeper. The book then describes the installation and configuration of a ZooKeeper instance, after which readers get firsthand experience of using it.

    This book covers the core concepts of ZooKeeper internals, its administration, and the best practices for its usage. The ZooKeeper APIs and the data model are presented comprehensively for both beginners and experts, followed by programming with ZooKeeper. Examples of developing client applications are given in three languages: Java, C, and Python. A full chapter is dedicated to the various ZooKeeper recipes so that readers get a clear understanding of how ZooKeeper can be used to carry out common distributed system tasks.

    This book also introduces readers to two projects, Curator and Exhibitor, which ease the use of ZooKeeper in client applications and its management in production. Real-world examples of software projects that use ZooKeeper are cited so that readers understand how ZooKeeper solves real problems. This is followed by examples of organizations that use ZooKeeper in their production platforms and enterprise software systems.

    Apache ZooKeeper Essentials will help readers learn everything they need to get a firm grasp of ZooKeeper so that they can start building scalable, high-performance distributed applications with ease and confidence.

    What this book covers

    Chapter 1, A Crash Course in Apache ZooKeeper, introduces you to distributed systems and explains why getting distributed coordination right is a hard problem. It then introduces you to Apache ZooKeeper and explains how ZooKeeper solves coordination problems in distributed systems. After this, you will learn how to install and configure ZooKeeper, and get ready to start using it.

    Chapter 2, Understanding the Inner Workings of Apache ZooKeeper, discusses the architecture of ZooKeeper and introduces you to its data model and the various operations supported by it. This chapter then delves deeper into the internals of ZooKeeper so that you understand how various components of ZooKeeper function in tandem.

    Chapter 3, Programming with Apache ZooKeeper, introduces you to programming with the ZooKeeper client libraries and explains how to develop client applications for ZooKeeper in Java, C, and Python. This chapter presents ready-to-compile code for you to understand the nitty-gritty of ZooKeeper programming.

    Chapter 4, Performing Common Distributed System Tasks, discusses recipes for common distributed system tasks such as locks, queues, leader election, and so on. After going through these recipes, you will understand how ZooKeeper can be used to solve coordination problems that are often encountered while building distributed systems.

    Chapter 5, Administering Apache ZooKeeper, provides you with all the information that you need to know about the administration and configuration of ZooKeeper. It also presents the best practices of ZooKeeper usage and the various ways to monitor it.

    Chapter
