Build Your Own Folding Farm
Whether or Not You Know Linux!
By: Mark J. Foster
Introduction
You know what it's like: you stare at your Folding@Home scores,
thinking how it feels like everything is moving in slow motion.
Whether your motives are to compete with other participants, to help out
your team, to contribute more to this great cause, or some mixture,
you'll start to think about adding more horsepower. As you go through
the process of installing and managing the folding client program on
multiple PCs, you'll quickly come to two key realizations:
- Adding full-blown PCs is expensive!
- Managing multiple PCs can take a lot of time
Fortunately, there is a great alternative - building up a "Folding
Farm"! For the sake of this discussion, we'll define a folding
farm as a cluster of stripped-down PCs that can primarily be managed as
a single system. While some folks call a group of regular PCs a
folding farm, and call a group of diskless PCs a "Monster Farm", we'll
concentrate exclusively on diskless systems here, so the term folding
farm will do just fine.
The essence of the diskless approach is that you'll build just one
Linux server that will provide all the support necessary for many
diskless folding "clients". While this approach can take a bit of
time to set up initially, it very quickly becomes far easier to manage
than a group of PCs. Anyone who has more than three or four systems
dedicated to the Folding@Home project should seriously consider building
their own folding farm.
The key problem with the folding farm approach is that it is most
efficient to base such systems on the free Linux operating system, yet
most folks aren't comfortable with Linux. The goal of this series
is to define a standard folding farm configuration, then to walk you
through the process of creating your own. While this does limit
flexibility to some degree, it should make it possible for just about
anyone who has the financial resources to build a folding farm, even if
they are new to Linux!
The Philosophy
The variety of possible ways to build a folding farm is essentially
infinite: there are many different fundamental approaches. With
roughly a dozen different popular Linux distributions, it's pretty clear
that one size doesn't fit all! Having said that, the only way to
make an article like this possible is to narrow the scope down to a
specific approach, and in this case, the goal is to provide the path of
least effort, while still delivering a powerful, flexible system.
As a case in point, we'll be installing the full RedHat Linux 8.0
distribution on the folding farm server. Some folks may not like
this, preferring a more customized, stripped-down approach. That's
fine, too, but the lean-and-mean approach definitely requires more
work. Since the goal of this series is to simplify the
installation process, we'll lean towards decisions that minimize the
amount of effort you'll have to make to get your own folding farm up and
running.
About Recommendations
To be sure, you can build a folding farm out of just about anything,
starting with something as modest as a 200 MHz Pentium server booting
from a Zip drive, all the way up to a multiprocessor Xeon megaserver
booting from a multi-terabyte RAID array. However, this series
will concentrate on a specific range of configurations towards the
upper-middle of the performance spectrum, for that's where you'll get
the best performance for your hard-earned dollars. Along the way,
we'll make recommendations for potential components that you might want
to buy, if you are picking up new gear to make your own folding
farm. In fact, these recommendations will get pretty specific in
a couple of cases, such as the motherboard recommended for
the client systems. While you are free to use any components that
you would like, the closer your components are to the recommendations,
the less effort you'll have to spend engineering your own solutions.
Folding Farm Architecture
The most important decision you'll make after deciding which hardware to
use is deciding upon the architecture - in other words, an overall plan
for how things are
interconnected. Since the architecture chosen has a huge impact on
how you'll configure the system, this article necessarily focuses on a
specific configuration, including the following:
- A server, containing a hard disk and two independent 10/100-BaseT
Ethernet ports. This server will run the RedHat Linux 8.0
distribution. Since it has the only hard disk in the folding farm,
all the CPUs will read and write from this single disk drive - no other
mass storage will be required (other than temporarily connecting a
CD-ROM to the server during installation).
- Dual networks. This article assumes that you already have
an Ethernet-based network at home, and have a broadband connection to
the Internet that is accessible across the existing network. As
part of the process of creating the folding farm, you'll be building a
second specialized network that is used solely to communicate between
the folding farm server and the folding farm clients. This network
will have its own Ethernet switch, and will be completely distinct from
the existing network.
- N client systems.
The assumption made here is that you'll be using either bare-bones
diskless systems, or bare motherboards, with no local mass storage and
minimal RAM. All of the client systems will be located in the same
physical space -- in other words, we won't be trying to boot clients
across the Internet, etc. However, later on in the series, we'll
look at using IEEE 802.11b wireless network bridging to connect folding
clients in different areas within the same building.
- Specific motherboards and Athlon CPUs. Since most
motherboards require special effort to boot from the network, this
series will focus on specific model numbers. Initially, we'll
concentrate on the ECS K7S5A, and later on, we'll look at using the ECS
K7VMM+. Additionally, while some folks will object, the reality
is that many independent benchmarks have shown that Athlon XP
processors are the most efficient choice for the Folding@Home
project. Consequently, motherboards that support Athlons will be
the primary focus.
- Linux Terminal Server Project, or LTSP. There are many
possible "lightweight" Linux distributions that can be used for the
clients. One of the best is LTSP, already designed for running
netbooted systems. As a result, we'll concentrate exclusively on
using LTSP for the clients.
- WINE. Though the point is somewhat controversial, it appears that
running the Windows Folding@Home client program may currently be
slightly more efficient than running the Linux folding code. One
goal of this project will be to eventually enable the use of the Windows
folding client program under Linux, thanks to WINE (Wine Is Not an
Emulator).
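To make the dual-network idea a bit more concrete, here's a minimal Python sketch of an address plan for the private farm network. The two address ranges are made-up examples (nothing in this article dictates them), and `plan_farm` is a hypothetical helper, not part of RedHat Linux or LTSP. It simply checks that the farm network doesn't overlap the home network, then hands the first usable address to the server's second Ethernet port and the next N addresses to the clients.

```python
import ipaddress
import itertools

# Assumed example ranges for illustration only; substitute your own.
HOME_NET = ipaddress.ip_network("192.168.1.0/24")  # existing home network
FARM_NET = ipaddress.ip_network("192.168.0.0/24")  # private farm network


def plan_farm(num_clients, home_net=HOME_NET, farm_net=FARM_NET):
    """Return an address plan for one server and num_clients diskless clients."""
    if home_net.overlaps(farm_net):
        raise ValueError("farm network must be distinct from the home network")

    # One address for the server's farm-side port, plus one per client.
    needed = num_clients + 1
    hosts = list(itertools.islice(farm_net.hosts(), needed))
    if len(hosts) < needed:
        raise ValueError("farm network is too small for this many clients")

    return {
        "server": hosts[0],
        "clients": hosts[1:],
        # The parts list calls for a switch with at least N + 2 ports.
        "switch_ports": num_clients + 2,
    }


if __name__ == "__main__":
    plan = plan_farm(4)
    print("server:", plan["server"])
    for number, address in enumerate(plan["clients"], start=1):
        print(f"client {number}: {address}")
    print("switch ports needed:", plan["switch_ports"])
```

Keeping the plan in one small function makes it easy to re-run when you add clients, and the overlap check catches the most common mistake: accidentally reusing the home network's address range on the farm side.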
Power and Heat
Before getting into the details of building up a farm, it's very
important to think about the two toughest issues you'll face (other than
cost!): heat and power consumption. When starting off, you'll
naturally want to cram as many motherboards into the available space as
you can, but be careful! As with most computer systems today, the
primary factor limiting compute density is thermal dissipation; that is,
getting rid of the incredible amount of heat that multiple CPUs can
generate. In general, two factors will help to solve this: more
airspace between units, and more airflow into and out of the farm, both
of which naturally conflict with the goal of packing everything as
tightly as possible.
A related concern is raising the ambient temperature. If the
first goal is restated as preventing the folding farm units from
overheating, the second goal is to keep the room from overheating!
With a large farm, you may actually need to plan for external ducting to
exhaust the heat. Even with small farms, you may find that just a
few CPUs will noticeably warm up a room, depending on airflow.
The third important factor to consider is power consumption. Make
sure that the area you plan on building the farm in has sufficient A.C.
power to be able to run all your equipment. Having popped circuit
breakers repeatedly at home and at work, I've learned how important this
issue can be. Depending on the specific client hardware you'll be
running, plan on 150W to 200W per client.
With all these factors in mind, you may well find that the best way to
construct your folding farm is to distribute the clients in different
locations, thereby helping to spread out the power consumption and heat.
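If you'd like to rough out the numbers before you start stacking motherboards, a few lines of Python will do. The sketch below assumes a 120 V, 15 A household circuit loaded to no more than 80% of its rating. Those three figures are assumptions for illustration, not something this article specifies, so check your own breaker panel and local electrical code. The only numbers taken from the text above are the 150W to 200W per-client estimates, and the server and Ethernet switch aren't counted.

```python
import math

# Physical conversion factor: 1 watt is about 3.412 BTU per hour.
WATTS_TO_BTU_PER_HOUR = 3.412


def max_clients_per_circuit(watts_per_client, volts=120.0,
                            breaker_amps=15.0, derating=0.8):
    """Estimate how many clients one circuit can carry.

    volts, breaker_amps and derating are assumed defaults, not
    recommendations; use the values for your own wiring.
    """
    usable_watts = volts * breaker_amps * derating
    return math.floor(usable_watts / watts_per_client)


def heat_output_btu_per_hour(num_clients, watts_per_client):
    """Essentially all the power the clients draw ends up as heat in the room."""
    return num_clients * watts_per_client * WATTS_TO_BTU_PER_HOUR


if __name__ == "__main__":
    for watts in (150, 200):
        print(f"{watts} W per client: "
              f"{max_clients_per_circuit(watts)} clients per circuit")
    print(f"8 clients at 200 W: "
          f"{heat_output_btu_per_hour(8, 200):.0f} BTU/h")
```

Under these assumptions, one circuit carries 7 clients at 200W each or 9 at 150W, and eight 200W clients put out roughly 5,460 BTU/h, which is a useful number to have in hand when you think about room cooling.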
Physical Layout of Your Folding Farm
While it is necessary to focus on a specific system architecture, when
it comes to the physical layout of your own folding farm, the sky's the
limit. Here's where you can really let your creativity flow!
Some of the many approaches that folks have used to build their farms
include:
For even more alternatives, please check out the outstanding Monsters and
Monster Farms webpage. Many thanks to the webmaster of that
site for such a great collection of ideas!
Selecting the Client Hardware
Determining which hardware you'll use for the folding clients is a lot
of fun, and is actually more involved than it may appear at first
glance. Most folks' goal will be to maximize the performance per
dollar of investment, while satisfying their personal subjective
criteria. What kinds of criteria might apply? Absolute
minimum cost may well be your highest priority. However, you may
also care about noise, for instance. If so, then you may want the
best bang-per-buck given quiet components. Alternatively, space may
be a primary consideration for you, in which case you'll want to use
smaller motherboards and power supplies.
Let's say that you're concerned about noise, and you want your folding
farm to be as silent as possible. To do that, you'll need to pick
quiet power supplies and CPU coolers. That's the way I see things,
so I use ultra-quiet 350W power supplies from Robanton, along with the
Arctic Cooling Super Silent Pro TC CPU cooler.
Once the motherboard, power supply and CPU cooler have been chosen, the
next component to select is memory. While there is insufficient
benchmark information available to provide detailed guidance, standard
128MB PC2100 CL2.5 DDR DIMMs appear to be a fine choice for memory in a
folding farm application.
To pick the CPU, add up the costs of the other components that you've
selected. In this case, let's imagine that the cost of the
motherboard is $53, the power supply is $35, the CPU cooler is $16, and
the memory is $21. That means that the total costs other than CPU
work out to $125.
But wait a minute! Are there any other costs associated directly
with each client? Consider also the mounting methods you'll
use. Perhaps you'll be spending $5/client on a shelf and/or
mounting hardware. In addition, what about the Ethernet switch and
cabling? Let's say that that's another $10/client. If you
are buying a new server for the farm, you'll want to include a chunk of
those costs, too. If your server was $500 and you're building 16
clients, then that's another ~$30/client.
Add up all of these "fixed" costs to determine how much you are really
paying for each client system - in this case, $170. Given this,
which CPU will provide the best performance/$? Easy enough!
For each available CPU, divide the CPU speed (approximated here by the
model rating, hence the quoted "MHz") by the sum of the fixed
costs plus the CPU cost. Here's an example:
| PROCESSOR | CPU PRICE | FIXED COSTS | TOTAL COST | "MHz"/$ |
|---|---|---|---|---|
| Athlon XP 3000+ | $589 | $170 | $759 | 3.95 |
| Athlon XP 2800+ | $389 | $170 | $559 | 5.01 |
| Athlon XP 2600+ | $232 | $170 | $402 | 6.47 |
| Athlon XP 2500+ | $179 | $170 | $349 | 7.16 |
| Athlon XP 2400+ | $135 | $170 | $305 | 7.87 |
| Athlon XP 2200+ | $102 | $170 | $272 | 8.09 |
| Athlon XP 2100+ | $78 | $170 | $248 | 8.47 |
| Athlon XP 2000+ | $69 | $170 | $239 | 8.37 |
| Athlon XP 1900+ | $63 | $170 | $233 | 8.15 |
| Athlon XP 1800+ | $55 | $170 | $225 | 8.00 |
| Athlon XP 1700+ | $48 | $170 | $218 | 7.80 |
As this table shows, the Athlon XP 2100+ CPU provides the highest
performance per dollar invested, given the other components that were
selected for the example. If you were to repeat the same exercise
with less expensive components, then a slower CPU would deliver the best
bang per buck. Regardless of what your priorities are, it's worth
taking the time to go through this exercise yourself, since prices
change daily!
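Since prices move around so much, it may be handy to let a short script redo the arithmetic for you. Here's a minimal Python sketch of the calculation described above: it divides each CPU's model rating by the total cost (CPU price plus fixed costs) and sorts the results. The prices are copied from the example table; the names `CPU_PRICES` and `rank_cpus` are just illustrative, and you'd swap in your own current prices and fixed costs.

```python
# Model rating -> CPU price in dollars, copied from the example table.
CPU_PRICES = {
    3000: 589, 2800: 389, 2600: 232, 2500: 179, 2400: 135, 2200: 102,
    2100: 78, 2000: 69, 1900: 63, 1800: 55, 1700: 48,
}


def rank_cpus(cpu_prices, fixed_cost):
    """Return (rating, total_cost, rating_per_dollar) tuples, best value first."""
    rows = []
    for rating, price in cpu_prices.items():
        total_cost = price + fixed_cost
        rows.append((rating, total_cost, rating / total_cost))
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for rating, total_cost, value in rank_cpus(CPU_PRICES, fixed_cost=170):
        print(f'Athlon XP {rating}+  ${total_cost:>4}  {value:.2f} "MHz"/$')
```

With the $170 fixed cost from the example, the Athlon XP 2100+ comes out on top at about 8.47. Drop the fixed cost to $100 and the same prices put the Athlon XP 2000+ in first place, which shows how cheaper supporting components push the sweet spot toward slower CPUs.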
Bill of Materials
Before we get started, you'll want to round up the various components
that you'll need, if for no other reason than to start creating a
budget for the project. Here's a list of the components you'll
need to build up a folding farm:
- Server system:
  - CPU: Athlon 1600+ or higher recommended for new purchases
  - 256-512 MB DDR RAM
  - >10 GB HDD (>= 20 GB Recommended)
  - Dual 10/100-BaseT Fast Ethernet Adapters (an on-board Ethernet is OK)
- Client systems:
  - N CPUs (Athlon XP 1800+ and up Recommended)
  - N ECS K7S5A or K7VMM+ Motherboards
  - N 128 MB SD-RAM DIMMs (DDR Recommended)
  - N CPU Coolers
  - N Power Supplies
  - N Video boards, cheapest available (on-board video is OK)
- 10/100-BaseT Ethernet Switch, with at least (N + 2) Ethernet ports
- (N + 2) CAT5 Ethernet cables
- Power Strip(s)
- RedHat Linux 8.0 CDs (Discs 1-3)
- LTSP Linux Distribution, including the following components:
- The Folding@Home client program:
- Physical mounting plan & mounting hardware
- Sufficient A.C. Power
- Sufficient Cooling Capacity
- Configuration information about your existing home network on-hand, including:
  - TCP/IP addresses of the home gateway and DNS server(s)
  - Addresses of the other equipment on your home network
In addition, you'll need the following accessories, at least
temporarily:
- Floppy disk drive and cabling for reflashing the motherboard's
BIOS (USB floppy recommended)
- CD-ROM drive and cabling for the initial installation of Linux
- Screwdriver or metal pen to use as a temporary power switch,
until the BIOS has been reflashed.
- Lots of caffeine!
Credits
[All this will get reorganized later, but it's important that we give
credit where credit is due right at the start! Many thanks go to
Jason Rabel, author of the excellent article FAH Diskless Farm. In
many ways, this article primarily restates
what Jason's already said. Thanks, Jason!]