Wednesday, February 25, 2009

HIVE: Data Warehousing & Analytics on Hadoop

Hive is a system for querying and managing structured data, built on top of Hadoop. The client writes a query in the Hive query language, which is based on SQL, and Hive compiles that query into a MapReduce execution plan. The Hive data model is slightly simpler than that of a full relational database, but it appears expressive enough for most datacenter query applications. Unlike the other scalability extensions to database systems that we looked at, Hive is designed mostly for offline use rather than online use.
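To make the query-to-plan step concrete, here is a minimal sketch of how a simple aggregation query could be carried out as a map phase, a shuffle, and a reduce phase. It is not Hive's actual planner or generated code. The `visits` table, its `page` column, and the function names are all hypothetical, chosen only for illustration, and the whole thing runs in a single Python process rather than on Hadoop.

```python
from collections import defaultdict

# Hypothetical query, for illustration only (table and column are assumptions):
#   SELECT page, COUNT(*) FROM visits GROUP BY page;

def map_phase(rows):
    """Emit a (group key, 1) pair for every input row."""
    for row in rows:
        yield row["page"], 1

def shuffle(pairs):
    """Collect emitted values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the values for each key, which yields COUNT(*) per group."""
    return {key: sum(values) for key, values in groups.items()}

def run_query(rows):
    """Chain the three steps in the order a MapReduce job would run them."""
    return reduce_phase(shuffle(map_phase(rows)))

# Made-up sample rows standing in for the hypothetical visits table.
visits = [
    {"user": "a", "page": "/home"},
    {"user": "b", "page": "/docs"},
    {"user": "a", "page": "/home"},
]
result = run_query(visits)
```

The GROUP BY column becomes the map output key, and the aggregate becomes the reduce function. A real plan would also have to handle things such as joins, filters, and multi-stage jobs, which this sketch leaves out.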

1 comment:

  1. This sounds a little too much like an abstract. Anything else?
