Thursday, December 5, 2013

HBase? No! Kiji + HBase!

I started using HBase about a year and a half ago. It took me some time to learn the terminology, configuration, APIs, etc.
Then I discovered Kiji and started using it. The framework is well documented, with code examples, tutorials, and a clear design, and it definitely improved my life.

Just check it out:
http://www.kiji.org/



KijiSchema

Provides a simple Java API and command-line interface for importing, managing, and retrieving data from HBase (that's the MAIN component!)
  • Set up HBase layouts using user-friendly tools including a DDL
  • Implement HBase best practices in table management
  • Use evolving Avro schema management to serialize complex data
  • Perform both short-request and batch processing on data in HBase
  • Import data from HDFS into structured HBase tables

KijiMR

KijiMR allows KijiSchema users to employ MapReduce-based techniques to develop many kinds of applications, including those using machine learning and other complex analytics. KijiMR is organized around three core MapReduce job types: Bulk Importers, Producers, and Gatherers.
  • Bulk Importers make it easy to efficiently load data into Kiji tables from a variety of formats, such as JSON or CSV files stored in HDFS.
  • Producers are entity-centric operations that use an entity’s existing data to generate new information and store it back in the entity’s row. One typical use case for producers is to generate new recommendations for a user based on the user’s history.
  • Gatherers provide flexible MapReduce computations that scan over Kiji table rows and output key-value pairs. By using different outputs and reducers, gatherers can export data in a variety of formats (such as text or Avro) or into other Kiji tables.
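
To make the producer and gatherer ideas a bit more concrete, here is a minimal plain-Java sketch. It does not use the KijiMR API at all: the in-memory map standing in for a table, the column names (last_genre, recommendation), and the class and method names are all made up for illustration. It only models the two behaviours described above: a producer reads one entity's row and writes a derived value back to that same row, while a gatherer scans all rows and emits key-value pairs that a reducer then combines.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Plain-Java model of the producer and gatherer ideas. It does not call the KijiMR API. */
final class ProducerGathererSketch {

    /** Hypothetical stand-in for a Kiji table: entity id -> (column name -> value). */
    static Map<String, Map<String, String>> demoTable() {
        Map<String, Map<String, String>> table = new TreeMap<>();
        table.put("alice", new HashMap<>(Map.of("last_genre", "jazz")));
        table.put("bob", new HashMap<>(Map.of("last_genre", "rock")));
        table.put("carol", new HashMap<>(Map.of("last_genre", "jazz")));
        return table;
    }

    /** Producer: for each entity, derive a value from its own row and store it back in that row. */
    static void runProducer(Map<String, Map<String, String>> table,
                            String outputColumn,
                            Function<Map<String, String>, String> produce) {
        for (Map<String, String> row : table.values()) {
            row.put(outputColumn, produce.apply(row));
        }
    }

    /** Gatherer with a summing reducer: emit (value, 1) for every row, then add up per key. */
    static Map<String, Integer> gatherCounts(Map<String, Map<String, String>> table, String column) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> row : table.values()) {
            String key = row.get(column);
            if (key != null) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = demoTable();
        // Producer step: a toy "recommendation" derived from each user's own history.
        runProducer(table, "recommendation", row -> "more-" + row.get("last_genre"));
        System.out.println(table);
        // Gatherer step: count users per genre across the whole table.
        System.out.println(gatherCounts(table, "last_genre"));
    }
}
```

In the real framework these steps run as MapReduce jobs over HBase-backed tables; the sketch only shows the shape of the computation.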

KijiREST

Provides a REST (Representational State Transfer) interface for Kiji, allowing applications to interact with a Kiji cluster over HTTP by issuing the four standard actions: GET, POST, PUT, and DELETE.
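
As a rough illustration of what "the four standard actions" look like from a client, the sketch below builds (but does not send) one request per HTTP verb using the JDK's java.net.http classes (Java 11+). The host, port, URL layout, row id, and JSON body are assumptions made up for this example, not taken from the KijiREST documentation, so check the real resource paths before using anything like this.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds one HTTP request per REST verb. Nothing is sent over the network. */
final class KijiRestRequestSketch {

    // Hypothetical endpoint; the real KijiREST host, port, and path layout may differ.
    static final String ROWS_URL = "http://localhost:8080/v1/instances/default/tables/users/rows";

    /** GET: read one row. */
    static HttpRequest get(String rowId) {
        return HttpRequest.newBuilder(URI.create(ROWS_URL + "/" + rowId)).GET().build();
    }

    /** POST: create a row from a JSON document (the body format is an assumption). */
    static HttpRequest post(String json) {
        return HttpRequest.newBuilder(URI.create(ROWS_URL))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** PUT: update an existing row. */
    static HttpRequest put(String rowId, String json) {
        return HttpRequest.newBuilder(URI.create(ROWS_URL + "/" + rowId))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** DELETE: remove a row. */
    static HttpRequest delete(String rowId) {
        return HttpRequest.newBuilder(URI.create(ROWS_URL + "/" + rowId)).DELETE().build();
    }

    public static void main(String[] args) {
        String body = "{\"name\":\"Alice\"}"; // made-up payload
        HttpRequest[] requests = {
            get("abc123"), post(body), put("abc123", body), delete("abc123")
        };
        for (HttpRequest request : requests) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

To actually issue a request, pass it to java.net.http.HttpClient.send with a body handler such as HttpResponse.BodyHandlers.ofString().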

KijiExpress

A set of tools designed to make defining data-processing MapReduce jobs quick and expressive, particularly for data stored in Kiji tables.

Scoring

The application of a trained model against data to produce an actionable result. This result could be a recommendation, a classification, or some other derivation.

Anyhow, no more pure HBase API. Only Kiji!