Wednesday, March 18, 2009

Chukwa

Chukwa is a data collection system built on top of Hadoop. It solves some particular problems in this context such as the fact that HDFS is not best suited to hold files used for monitoring; for this Chukwa uses a collector to aggregate logs and reduce the number of HDFS files generated. Chukwa is under use at Yahoo and the evaluation shows a small overhead.

No comments:

Post a Comment