Wednesday, February 25, 2009
HIVE: Data Warehousing & Analytics on Hadoop
HIVE is a system for querying and managing structured data built on top of Hadoop. The client describes a query in the Hive query language (based on SQL). Based on this query, Hive creates a map reduce execution plan. The Hive data model is slightly simpler than a full relational database, but appears strong enough for most datacenter query applications. Unlike the other extensions to database systems for scalability that we looked at, Hive is designed to be mostly used offline rather than online.