Chicago Data Summit: Apache HBase: An Introduction

Apache HBase: an introduction Todd Lipcon [email_address] @tlipcon @cloudera April 26th, 2011

Software Engineer at Committer and PMC member on Apache HBase, HDFS, MapReduce, and Thrift Previously: systems programming, operations, large scale data analysis I love data and data systems Introductions

Outline What is HBase? HBase Architecture 101 HBase vs Other Technologies Use Cases Questions

Apache HBase HBase is an open source , distributed , sorted map datastore modeled after Google’s BigTable

Open Source Apache 2.0 License Committers and contributors from diverse organizations Cloudera, Facebook, StumbleUpon, Trend Micro, etc.

Distributed Store and access data on 1-700 commodity servers Automatic failover based on Apache ZooKeeper Linear scaling of capacity and IOPS by adding servers

Sorted Map Datastore Not a relational database (very light “schema”) Tables consist of rows, each of which has a primary key (row key) Each row may have any number of columns -- like a Map<byte[], byte[]> Rows are stored in sorted order

Sorted Map Datastore (logical view as “records”) A single cell might have different values at different timestamps Different rows may have different sets of columns(table is sparse ) Useful for *-To-Many mappings Different types of data separated into different “ column families” Implicit PRIMARY KEY in RDBMS terms Data is all byte[] in HBase Row key Data cutting info: { ‘height’: ‘9ft’, ‘state’: ‘CA’ } roles: { ‘ASF’: ‘Director’, ‘Hadoop’: ‘Founder’ } tlipcon info: { ‘height’: ‘5ft7, ‘state’: ‘CA’ } roles: { ‘Hadoop’: ‘Committer’@ts=2010, ‘ Hadoop’: ‘PMC’@ts=2011, ‘ Hive’: ‘Contributor’ }

Sorted Map Datastore (physical view as “cells”) Sorted on disk by Row key, Col key, descending timestamp Milliseconds since unix epoch info Column Family roles Column Family Row key Column key Timestamp Cell value cutting roles:ASF 1273871823022 Director cutting roles:Hadoop 1183746289103 Founder tlipcon roles:Hadoop 1300062064923 PMC tlipcon roles:Hadoop 1293388212294 Committer tlipcon roles:Hive 1273616297446 Contributor Row key Column key Timestamp Cell value cutting info:height 1273516197868 9ft cutting info:state 1043871824184 CA tlipcon info:height 1273878447049 5ft7 tlipcon info:state 1273616297446 CA

Column Families Different sets of columns may have different properties and access patterns Configurable by column family: Compression (none, gzip, LZO) Version retention policies Cache priority CFs stored separately on disk: access one without wasting IO on the other.

Accessing HBase Java API (thick client) REST/HTTP Apache Thrift (any language) Hive/Pig for analytics

HBase API get(row) put(row, Map<column, value>) scan(key range, filter) increment(row, columns) … (checkAndPut, delete, etc…) MapReduce/Hive

High Level Architecture HBase HDFS ZooKeeper Java Client MapReduce Hive/Pig Thrift/REST Gateway Your Java Application

Terms and Daemons Region - A subset of a table's rows, like a range partition - Automatically sharded RegionServer (slave) Serves data for reads and writes Master Responsible for coordinating the slaves Assigns regions, detects failures of Region Servers Controls some admin functions

Cluster Architecture RegionServer HDFS HMaster RegionServer RegionServer … HMaster ZK Peer ZK Peer ZK Peer ZK Quorum Client Client finds RegionServer addresses in ZooKeeper Client reads and writes rows by directly accessing the RegionServers Master assigns regions and achieves load balancing

Cluster Deployment (big cluster) HDFS NameNode Secondary NameNode MapReduce JobTracker ZooKeeper ZooKeeper ZooKeeper HMaster HMaster RegionServer DataNode TaskTracker RegionServer DataNode TaskTracker RegionServer DataNode TaskTracker RegionServer DataNode TaskTracker RegionServer DataNode TaskTracker 3 or 5 nodes ZK HMaster with one standby 40+ slaves with HBase, HDFS, and MR slave processes

Cluster Deployment (small cluster / POC) NameNode SecondaryNameNode HMaster JobTracker ZooKeeper RegionServer DataNode TaskTracker RegionServer DataNode TaskTracker RegionServer DataNode TaskTracker RegionServer DataNode TaskTracker RegionServer DataNode TaskTracker 5+ slaves with HBase, HDFS, and MR slave processes The proverbial basket full of eggs

HBase vs just HDFS If you have neither random write nor random read, stick to HDFS! Plain HDFS/MR HBase Write pattern Append-only Random write, bulk incremental Read pattern Full table scan, partition table scan Random read, small range scan, or table scan Hive (SQL) performance Very good 4-5x slower Structured storage Do-it-yourself / TSV / SequenceFile / Avro / ? Sparse column-family data model Max data size 30+ PB ~1PB

HBase vs RDBMS RDBMS HBase Data layout Row-oriented Column-family-oriented Transactions Multi-row ACID Single row only Query language SQL get/put/scan/etc * Security Authentication/Authorization Work in progress Indexes On arbitrary columns Row-key only Max data size TBs ~1PB Read/write throughput limits 1000s queries/second Millions of queries/second

HBase vs other “NoSQL” Favors Consistency over Availability (but availability is good in practice!) Great Hadoop integration (very efficient bulk loads, MapReduce analysis) Ordered range partitions (not hash) Automatically shards/scales (just turn on more servers) Sparse column storage (not key-value)

HBase in Numbers Largest cluster : 700 nodes, ~700TB Most clusters: 5-20 nodes, 100GB-4TB Writes : 1-3ms, 1k-10k writes/sec per node Reads : 0-3ms cached, 10-30ms disk 10-40k reads / second / node from cache Cell size : 0-3MB preferred

SaaS Audit Logging Online service requires per-user audit logs Row key userid_timestamp allows efficient range-scan lookups to fetch per-user history Server-side Filter mechanism allows efficient queries MapReduce for analytic questions about user behavior

Facebook Analytics Realtime counters of URLs shared, links “liked”, impressions generated 20 billion events/day (200K events/sec) ~30 second latency from click to count Heavy use of incrementColumnValue API for consistent counters Tried MySQL, Cassandra, settled on HBase http://tiny.cloudera.com/hbase-fb-analytics

OpenTSDB Scalable time-series store and metrics collector Thousands of machines each generating hundreds of operational metrics Thousands of writes/second Web interface to fetch and display graphs of metrics over time for selected hosts http://opentsdb.net

Powered By HBase … and others

Use HBase if… You need random write, random read, or both ( but not neither ) You need to do many thousands of operations per second on multiple TB of data Your access patterns are well-known and simple

Don’t use HBase if… You only append to your dataset, and tend to read the whole thing You primarily do ad-hoc analytics (i.e. ill-defined access patterns) Your data easily fits on one beefy node

Resources Download CDH3 (http://cloudera.com/) Cloudera HBase training (1 st chapter free online) http://hbase.apache.org/ irc.freenode.net #hbase Coming soon: HBase: The Definitive Guide by Lars George

Questions? [email_address] @tlipcon Or come chat at the reception this evening

Chicago Data Summit: Apache HBase: An Introduction

More Related Content

Chicago Data Summit: Apache HBase: An Introduction

Editor's Notes