Distributed Systems Group -- Research

Extreme Scale Cluster Architecture

Student: Tassos Argyros

A well-known trend today is that data is growing at a much higher rate than processing power. For example, the amount of data in GenBank (a genomics database) doubles every 9 months compared to the increase in processor speed (roughly doubles every 18 months by Moore's law). Where all this leads to is that in the near future we are going to need much more processing power that single CPU's can offer. A cheap solution to this problem could be clusters of commodity PC's in extremely large numbers (thousands or tenths of thousands of PC's connected together). However, little work has been done in solving the scaling problems that clusters of that size have. How should the data be send between the nodes so that congestion and latency problems do not arise? How do you deal with node and network failures that are inevitable in a system of that size? What's the most effective way to interconnect such a big network without spending the bulk of the budget in exotic switches? And last but not least, how can one program and debug such large scale applications?

On a first thought, many of these problems seem orthogonal to each other; a connection between, e.g. the structure of the network and how to program efficiently large-scale applications is not obvious. However, we believe that an integrated solution to these problems does exist: designing the whole system based on streams. Our thesis is that by using streams one can:

  • Hide the effects of network latency and significantly increase the processing efficiency.
  • Radically reduce the programming and debugging effort that is required for writing applications.
  • Engineer the network in a way that enough bandwidth exists for the flow of data without needing an over-expensive and complicated "full bisection" network that allows full bandwidth between each pair of nodes.
  • Very effectively manage failures and load imbalances.

    Our work until now has been divided into the above four areas. We have evidence to believe that this architecture is a much better way to structure the network of processing nodes and the applications that run on them compared to the conventional methods that are used today, and we are currently building a system to demonstrate that.

    Last updated on Mar. 11, 2005