Monday, April 27, 2009

BOINC: A System for Public-Resource Computing and Storage

The paper presents BOINC, a software platform similar to SETI@home for harnessing the computational resources of home users.

A common problem with this approach is giving home users an incentive to participate. The paper addresses this through "credits" for contributed computation (which may be rewarded) and by letting users select the computational topics they care most about.

An interesting discussion point is the total energy use of this approach compared to a dedicated cluster (under a single organization's control). For example, BOINC launches multiple redundant tasks to guard against malicious results, which is clearly a waste of energy. Also, these tasks can wake up processors from low-power sleep states (a global scheduler could optimize this energy use) and transfer data over long distances.
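To make the cost of redundancy concrete, here is a minimal sketch of quorum-based result validation. It is not BOINC's actual validator (real projects supply their own comparison logic, often with numerical tolerance); the function names and the exact-match comparison are assumptions for illustration.

```python
from collections import Counter


def validate_by_quorum(results, quorum=2):
    """Return the value reported by at least `quorum` replicas, else None.

    `results` is a list of hashable results returned by different hosts
    for the same work unit (hypothetical representation).
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


def redundancy_overhead(replicas):
    """Fraction of the total computation that only serves as a cross-check."""
    return (replicas - 1) / replicas
```

With three replicas per work unit, `redundancy_overhead(3)` says two thirds of the computation exists only to check the remaining third, which is the energy cost the discussion point above is about.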

Incentives Build Robustness in BitTorrent

This paper presents a good piece of software. The idea behind BitTorrent is to download file chunks in parallel from multiple peers. Chunks are requested rarest piece first to increase the overall availability of the file. Peers are found through a tracker (a server) whose address is hard-coded in the torrent file. Each peer selfishly lets other peers download from it based on reciprocal downloads; to avoid oscillations, this decision is made on a larger time scale.
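The rarest-first rule can be sketched as follows. This is an illustrative simplification, not the reference client's code: the data layout (a set of held piece indices and a dict mapping peers to their piece sets) and the function name are assumptions.

```python
from collections import Counter
import random


def rarest_first(have, peer_pieces, rng=random):
    """Pick a missing piece that the fewest known peers hold.

    have:        set of piece indices we already hold
    peer_pieces: dict mapping peer id -> set of piece indices it advertises
    Returns a piece index, or None if no peer has anything we need.
    """
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(p for p in pieces if p not in have)
    if not counts:
        return None
    rarest = min(counts.values())
    # Break ties randomly so peers do not all chase the same piece.
    candidates = sorted(p for p, c in counts.items() if c == rarest)
    return rng.choice(candidates)
```

For example, if three peers advertise `{0, 1, 2}`, `{1, 2}` and `{2, 3}` and we already hold piece 0, piece 3 is requested first because only one peer has it.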

One thing that I liked is the extensive use of randomization (returned peers are random, unchoking random peers) which has shown good results in practice.
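A minimal sketch of how reciprocity and randomization combine when choosing whom to unchoke: the fastest uploaders to us get the regular slots, and one extra slot goes to a random choked peer. The slot count, the rate bookkeeping and the function name are assumptions for illustration, not values taken from the paper.

```python
import random


def choose_unchoked(download_rates, regular_slots=3, rng=random):
    """Return the peers to unchoke for the next period.

    download_rates: dict mapping peer id -> recent download rate from that peer
    The top `regular_slots` peers are rewarded for reciprocating; one more
    peer is picked at random from the rest (the optimistic unchoke).
    """
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    unchoked = ranked[:regular_slots]
    choked = ranked[regular_slots:]
    if choked:
        unchoked.append(rng.choice(choked))
    return unchoked
```

The random slot gives newcomers with nothing to offer yet a chance to obtain their first pieces, and lets a peer discover better partners than its current ones.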

Measuring and Evaluating Large-Scale CDNs

A very interesting paper: it shows the different design choices for CDNs, along with their current deployment scale and performance. It is interesting that CDNs do not necessarily redirect to the lowest-delay servers (they use other metrics as well) and that they also use IP anycast. Another interesting aspect of the paper is that it measures the marginal gain of each datacenter, which offers new CDNs a good starting point for deploying their network (lowering the barrier to entry a bit).

Wednesday, April 15, 2009

Open Cloud/AppDrop/GoogleApp

The Open Cloud Manifesto proposes the idea of an open cloud, mainly for portability/flexibility in choosing a cloud provider and for efficiency (developers used to the API, maturing code, etc).

Unfortunately, the paper is written from a cloud user's standpoint. From a business perspective, the leading cloud providers have no incentive to open up their interfaces. The track record of successful IT businesses seems to favor very closed systems, e.g. look at Microsoft or Google (even though it appears open, it is actually very closed).

In fact, this is the approach to the cloud that Google is pursuing with Google App Engine. AppDrop could turn out to be a bad replica of the original for many reasons, including unknown tweaks/details of the Google setup.

Wednesday, April 1, 2009

Erlang

Erlang is a programming language with higher-level constructs such as processes, scheduling and memory management built into the language itself. These constructs are typically associated with operating systems. By construction, Erlang runs in a virtual-machine-like environment (supposedly it runs just as fast as unoptimized C code, but this is not an appropriate performance metric).

While I was convinced of the usefulness of having such high-level operations among the language constructs, I was not fully convinced by the language itself, which at first glance does not look very appealing. I would argue (possibly wrongly :) ) that a language with these constructs but closer in syntax to an existing programming language would be much more easily adopted.

Monday, March 30, 2009

Friday: Global Comprehension for Distributed Replay

Friday is a system for distributed debugging using deterministic replay. Friday allows distributed watchpoints and breakpoints to be placed in the replayed system. In addition, Friday lets commands written in a scripting language (Python) be associated with the watchpoints and breakpoints. These commands can modify debug variables, implement complex predicates or even call functions of the debugged application. This seems a really important feature to have, as exemplified by the simplicity with which the Chord example can be debugged.

XTrace

XTrace is a tracing system for distributed and complex applications. XTrace allows traces across multiple layers. It uses labels to recreate the trace and a conceptually centralized database. XTrace requires code instrumentation at all (traced) stack layers.

I liked XTrace and I think it can be very useful for checking where a distributed execution got stuck, as it generates an easy-to-follow tree output.
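As a rough illustration of how labels can be turned back into such a tree, the sketch below assumes each report carries a task id, an operation id and the id of the operation that caused it. The field names and report format are hypothetical and are not XTrace's actual metadata layout.

```python
from collections import defaultdict


def build_trace_tree(reports, task_id):
    """Rebuild an indented tree for one task from flat per-layer reports.

    reports: list of dicts with hypothetical keys
             "task", "op", "parent" (None for the root), "layer", "label"
    Returns a list of text lines, one per operation, indented by depth.
    """
    children = defaultdict(list)
    for r in reports:
        if r["task"] == task_id:
            children[r["parent"]].append(r)

    lines = []

    def walk(parent, depth):
        for r in children.get(parent, []):
            lines.append("  " * depth + f'{r["layer"]}: {r["label"]}')
            walk(r["op"], depth + 1)

    walk(None, 0)
    return lines
```

A request that stalls shows up as a branch that ends earlier than expected, which is what makes the tree view handy for finding where execution got stuck.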