Category Archives: Side Projects

FOLDING(IN)THECLOUD – AWS’s New C4 Instances

Late last year, we began running tests to see how efficient Amazon’s AWS cloud services are at protein folding, the computationally heavy physical simulations used in a variety of medical research. We discovered a remarkable result – under certain conditions, AWS is cheaper than the tiny additional cost of electricity you’d consume running the simulations at home.

Given this discovery, we were eager to continue folding (and we still are – follow us on Twitter to learn about our findings on Google’s Compute platform and Microsoft’s Azure). We were especially excited about Amazon’s announcement of the c4 instance type, the successor to the c3 (the instance type that performed best in our previous tests). These machines offer a new, faster processor which we hoped would translate into even more folding efficiency.

A Minor Update

After collecting the data, however, we were underwhelmed – the c4 instances barely edged out their c3 counterparts in most cases. Other benchmarks (not protein-folding specific) have found a larger performance increase, making our results even more surprising.

[Chart: performance of each c3 machine compared with its c4 counterpart, with the values that went into calculating folding per dollar.]

Folding points per dollar = Folding points per hour / Dollars per hour

See our original post to learn about our methodology.

Many Possible Explanations

There are many reasons the new c4 instances might not have delivered the performance boost we hoped for. Maybe folding@home is limited by memory more than by CPU (this seems unlikely, given that our original test showed the memory-optimized r3 instances underperforming the c3). Maybe folding@home is not taking full advantage of the c4's 36 vCPUs. Or perhaps a more statistically rigorous test would show larger gains.

Are you a folding@home aficionado, or do you live and breathe cloud computing? We'd love to hear your thoughts - did you expect larger improvements? Why do you think we saw the results we did? Leave a comment below, or tweet us!

FOLDING(IN)THECLOUD – Folding@Home’s Sweet Spot on Amazon’s AWS

At Maple Avenue Labs, we depend on the magic of cloud computing, the growing industry that makes it possible to rent, rather than buy, computer infrastructure. It powers our first product, huelab.me, and allows us to quickly and cheaply process photos to create personalized infographics / abstract works of art.

Earlier this year, we started investigating the effectiveness of using cloud computing for good by renting cloud computers to run folding@home. folding@home is a “distributed computing” project – anyone from across the globe can donate their computer’s processing power to run physical simulations of protein folding. By simulating how proteins take shape, folding@home helps researchers understand how diseases act and design drugs to fight them.

A Remarkable Result

huelab.me runs in Amazon’s cloud, AWS. AWS offers a variety of different computers, and we asked – which is the most cost effective for protein folding? We spent weeks running tests across different computer types and discovered an amazing fact. AWS offers a processor so efficient at protein folding that it is cheaper than the cost of the additional electricity your home computer consumes in order to run folding.

That’s so remarkable it’s worth repeating.

When a computer isn’t doing anything it consumes a certain amount of electricity.

When it is doing computation, it consumes slightly more electricity.

That tiny extra bit of electricity has some cost.

We measured how much folding our home computer could accomplish with one dollar’s worth of extra electricity. With that same dollar, you could instead rent processing power from Amazon and achieve more protein folding.

Does this mean you should shut down the folding@home program on your home computer? Absolutely not.

First, we were paying New York City electricity rates, some of the highest in the country (about $0.30 / kWh). Second, we were running folding@home on a 2009 Mac Mini – a 2011 MacBook Pro accomplished more folding per dollar than our Amazon machine.

folding@home is not expensive, and the 2009 Mac Mini is not inefficient. While folding, the Mini consumed only about half as much electricity as a 60W light bulb. All this tells us is that cloud computing is – for some applications – insanely cheap.

How to Contribute

So if you’re not contributing to folding@home yet, you should. It’s quick and easy to get started. With just a few clicks you’ll be contributing to the fight against cancer, Alzheimer’s, Parkinson’s, and more. Go to folding.stanford.edu to learn more – or, if you use the Chrome web browser, you can contribute at folding.stanford.edu/nacl/ without installing any software.

After running these tests, we rented AWS machines for hundreds of hours of additional folding. We have completed more than 1,000 work units worth more than 10,000,000 points. Follow our team’s continued folding progress.

Future Work

Amazon recently announced a new generation of machine types. These are not yet available, but we expect them to perform even better than the current machines. When they become available, we will run tests on these new machines and update our results.

[In a follow-up post, we ran these same tests on Amazon’s newest machine type, the c4, and saw only modest improvements. See the performance of c4.8xlarge against the original machines below, in the results section.]

In addition, we plan to run similar tests on Microsoft’s Azure platform and compare the efficiency of AWS machines to Azure.

We will update our results when those tests are complete – follow our Twitter account to stay updated.

Running folding@home on AWS

Already an AWS user, and want to use AWS to contribute to folding@home? It’s easy!

  1. Launch your instance.
    • We found c3.8xlarge to be the most efficient instance type, and we used spot requests to minimize cost.
    • We used the default Amazon Linux image, Amazon Linux AMI 2014.09.1 (HVM). If you’re running on an instance with graphics cards, be sure to use an image with graphics drivers already installed so folding@home can use these extra cores.
    • Remember to add your instance to a security group which allows incoming connections from your IP so you can SSH into the machine (and later, connect to it using FAHControl to track its folding progress).
  2. SSH into your instance.
  3. Download and install the folding@home client.
    • wget --no-check-certificate https://fah.stanford.edu/file-releases/public/release/fahclient/centos-5.3-64bit/v7.4/fahclient-7.4.4-1.x86_64.rpm
    • sudo rpm -i --nodeps fahclient-7.4.4-1.x86_64.rpm
    • We found that this will usually result in a message “Starting fahclient … FAIL” but that the client will succeed on restart below.
  4. Edit the folding@home config file to fold with full power, credit your user and team, and allow connections from your machine.
    • sudo vim /etc/fahclient/config.xml
    • Edit the following lines:
    • <power v='full'/>
    • <user v='YOUR_USERNAME'/>
    • Add the following line to credit your folding@home team:
    • <team v='YOUR_TEAM_NUM' />
    • Add the following lines to allow you to remotely connect using FAHControl:
    • <allow v='IP_ADDRESS_YOU_WILL_CONNECT_FROM' />
    • <password v='PASSWORD_YOU_WILL_ENTER_IN_FAHCONTROL' />
  5. Restart your client to pick up the new settings.
    • sudo /etc/init.d/FAHClient restart
  6. To track your machine’s progress, launch FAHControl and connect to your instance.
    1. Get your instance’s public IP from the EC2 dashboard.
    2. In FAHControl, click the “+ Add” button.
    3. Enter your machine’s IP and the password you entered in the config file above.

Now you’re running folding@home on AWS! To be a good citizen of the folding@home community, try to always hit “Finish” in FAHControl and allow all running units to complete their work before you terminate the instance – terminating an in-progress unit makes it tough for folding@home to know what work is currently being computed and what work needs to be re-assigned.
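Step 4 above can also be scripted rather than done by hand in vim. The following is a minimal sketch, not part of the original setup: it assumes the installed config file has a `<config>` root element, and the `existing` text and the `apply_settings` helper are hypothetical illustrations. It updates an option if it is already present and adds it otherwise. Note that Python's default XML parser drops any comments in the file.

```python
import xml.etree.ElementTree as ET


def apply_settings(config_xml, settings):
    """Set <tag v='value'/> for each setting, adding the element if missing."""
    root = ET.fromstring(config_xml)
    for tag, value in settings.items():
        element = root.find(tag)
        if element is None:
            element = ET.SubElement(root, tag)
        element.set("v", value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical starting config; your installed file may contain other elements.
existing = "<config><power v='medium'/><user v='Anonymous'/></config>"

updated = apply_settings(
    existing,
    {
        "power": "full",
        "user": "YOUR_USERNAME",
        "team": "YOUR_TEAM_NUM",
        "allow": "IP_ADDRESS_YOU_WILL_CONNECT_FROM",
        "password": "PASSWORD_YOU_WILL_ENTER_IN_FAHCONTROL",
    },
)
print(updated)
```

The result would still need to be written back to /etc/fahclient/config.xml with sudo, followed by the restart in step 5.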

Methodology and Results

We ran folding@home on most of Amazon’s available machine types. On each machine type, we ran folding@home for 40 hours to get a good sample of performance. We calculated the average folding achieved per hour and divided by the cost per hour in order to calculate folding per dollar.

We used spot requests for all our AWS machines. We set our bid price high enough to prevent our tests from being terminated before 40 hours, then calculated hourly cost as the average price paid over this 40 hour period.
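As an illustration of the averaging step, here is a minimal sketch that computes a time-weighted average spot price over the test window. The post does not say whether the average was time-weighted or taken per billed hour, so the weighting is an assumption, and the price history below is made up for the example.

```python
def average_hourly_cost(price_changes, window_hours=40.0):
    """Time-weighted average price over the window.

    price_changes is a list of (hour_offset, dollars_per_hour) pairs,
    sorted by hour_offset, with the first entry at offset 0.
    """
    total_dollars = 0.0
    for i, (start, price) in enumerate(price_changes):
        # Each price applies until the next change or the end of the window.
        if i + 1 < len(price_changes):
            end = price_changes[i + 1][0]
        else:
            end = window_hours
        end = min(end, window_hours)
        if end > start:
            total_dollars += (end - start) * price
    return total_dollars / window_hours


# Hypothetical spot price history for one 40 hour run.
changes = [(0.0, 0.30), (10.0, 0.40), (30.0, 0.20)]
average_price = average_hourly_cost(changes)
print(f"average price: ${average_price:.3f} per hour")
```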

On two local personal computers, a 2009 Mac Mini and a 2011 MacBook Pro, we allowed the machines to sit idle and measured their power consumption using an electricity usage monitor. We then ran folding@home for 40 hours on each machine, again measuring their power consumption. We calculated the hourly “cost” of running folding as (power usage while folding – power usage while idle) * the price we pay per kilowatt-hour of electricity.
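The home-computer cost calculation can be sketched as follows. The wattages are hypothetical placeholders, not our measured values; only the $0.30 / kWh rate comes from this post.

```python
def marginal_cost_per_hour(folding_watts, idle_watts, dollars_per_kwh):
    """Cost of the extra electricity used by folding for one hour."""
    extra_kilowatts = (folding_watts - idle_watts) / 1000.0
    return extra_kilowatts * dollars_per_kwh


# Hypothetical readings from an electricity usage monitor.
FOLDING_WATTS = 30.0
IDLE_WATTS = 12.0
NYC_RATE = 0.30  # dollars per kWh, the rate quoted in this post

hourly_cost = marginal_cost_per_hour(FOLDING_WATTS, IDLE_WATTS, NYC_RATE)
print(f"extra electricity: ${hourly_cost:.4f} per hour")
```

Only the difference between the two readings is charged to folding, since the idle draw would have been paid for anyway.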

[Chart: summary of our results for each machine, with the values that went into calculating folding per dollar.]

Folding points per dollar = Folding points per hour / Dollars per hour
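A minimal sketch of this calculation, applied to a couple of machines so they can be ranked. The machine names and figures are hypothetical and are not our measured results.

```python
def points_per_dollar(points_per_hour, dollars_per_hour):
    """Folding points per dollar = points per hour / dollars per hour."""
    if dollars_per_hour <= 0:
        raise ValueError("dollars_per_hour must be positive")
    return points_per_hour / dollars_per_hour


# Hypothetical (points per hour, dollars per hour) for two machines.
machines = {
    "machine_a": (10000.0, 0.25),
    "machine_b": (3000.0, 0.125),
}

efficiency = {
    name: points_per_dollar(points, dollars)
    for name, (points, dollars) in machines.items()
}
best = max(efficiency, key=efficiency.get)
print(best, efficiency[best])
```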

AWS Machine Types

We chose machine types to cover the broadest possible range. At the same time, we avoided spending resources on machine types known to be inferior, so that we could commit more testing to the most efficient ones.

With this in mind, we did not test the entire range of memory-optimized (r3) instances – after observing that c3.8xlarge was at least as good as r3.8xlarge, we surmised that each of the r3 instance types would be no better than the corresponding c3 instance type.

We also tested on the smaller instance types (for example, t2.micro) but these types were unable to finish folding a work unit before that work unit expired.

In a follow-up post, we ran these same tests on Amazon's newest machine type, the c4, and saw only modest improvements.

Folding Points

To compare the folding achieved across different machines, we needed a method for quantifying how much folding a processor achieved. Luckily, folding@home provides a points system with a goal of "keeping points in alignment with the scientific value of the results".

We found these points to be relatively stable for a given machine type, indicating they are not subject to wild fluctuations between different work units, at different times of the day, etc. The graph below shows points earned each hour by the c3.8xlarge instance over its 40 hour test run. It demonstrates the stability of the points values.

In addition, we found points to be more or less constant over longer periods of time – when we launched c3.8xlarge again months later we observed similar values.
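One simple way to put a number on this kind of stability is the coefficient of variation of the hourly points (standard deviation divided by mean). This is a sketch of that idea, not the analysis we ran, and the hourly values below are invented for illustration.

```python
import statistics


def coefficient_of_variation(hourly_points):
    """Sample standard deviation of hourly points relative to their mean."""
    mean = statistics.mean(hourly_points)
    return statistics.stdev(hourly_points) / mean


# Hypothetical points earned in five consecutive hours.
hourly_points = [100.0, 102.0, 98.0, 101.0, 99.0]
cv = coefficient_of_variation(hourly_points)
print(f"hour-to-hour variation: {cv:.1%} of the mean")
```

A small value means the points earned in any one hour are close to the average, which is what makes a single points-per-hour figure meaningful for a machine type.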

Bonus

folding@home offers bonus points to users "who rapidly and reliably" complete work. All our tests used a logged-in user who had already completed the minimum requirements to receive bonus points.

Because bonus points are awarded based on the speed with which results are returned, they may have given an extra advantage to high-performance systems. A system that processed twice as fast could have earned more than twice as many points: it would complete twice as many work units and would also earn more bonus points on each one.
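To make the "more than twice" effect concrete, here is a sketch based on our understanding of folding@home's published quick return bonus formula, in which a work unit's points are its base points multiplied by max(1, sqrt(k * deadline / elapsed time)). Treat the formula as an assumption and check it against the project's own points documentation; the base points, k, and deadline below are made-up values.

```python
import math


def work_unit_points(base_points, k, deadline_days, elapsed_days):
    """Points for one work unit under the assumed quick return bonus formula."""
    bonus = max(1.0, math.sqrt(k * deadline_days / elapsed_days))
    return base_points * bonus


def points_per_day(base_points, k, deadline_days, elapsed_days):
    """Daily points for a machine that folds such units back to back."""
    per_unit = work_unit_points(base_points, k, deadline_days, elapsed_days)
    return per_unit / elapsed_days


# Hypothetical work unit: 1000 base points, k = 0.75, 6 day deadline.
slow = points_per_day(1000.0, 0.75, 6.0, elapsed_days=1.0)
fast = points_per_day(1000.0, 0.75, 6.0, elapsed_days=0.5)
ratio = fast / slow
print(f"twice as fast earns {ratio:.2f}x the points")
```

Under this formula, halving the time per work unit doubles throughput and also multiplies the per-unit bonus by the square root of two, so the faster machine earns roughly 2.83 times the points rather than 2 times.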

Eliminating Edge Values

We ran the folding@home client for an hour before we began collecting data in order to eliminate any fluctuations associated with startup.

In addition, since points earned are not reported until a job is complete, we continued running the machines beyond our 40 hour test period until all jobs completed. We did not use these extra hours in our data analysis; instead, we interpolated the number of points the machine earned during its 40 hour window.
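The interpolation can be sketched as pro-rating each work unit's points by the fraction of its run time that fell inside the test window. Linear pro-rating is an assumption made for this example, and the work units below are hypothetical.

```python
def points_in_window(work_units, window_start=0.0, window_end=40.0):
    """Sum points, crediting each unit only for its time inside the window.

    work_units is a list of (start_hour, end_hour, points) tuples.
    """
    total = 0.0
    for start, end, points in work_units:
        duration = end - start
        overlap = min(end, window_end) - max(start, window_start)
        if duration > 0 and overlap > 0:
            total += points * overlap / duration
    return total


# Hypothetical run: the last unit finishes 5 hours after the window closes.
units = [
    (0.0, 15.0, 3000.0),
    (15.0, 35.0, 4000.0),
    (35.0, 45.0, 2000.0),
]
window_points = points_in_window(units)
print(window_points / 40.0, "points per hour")
```

Here the final work unit spends half of its 10 hours inside the window, so half of its points count toward the 40 hour total.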