Compute with infinite capacity from within your IDE
Imagine being able to write code to process a large dataset and then, at the click of a button, run that code on a compute cluster, without worrying about setting up anything: job submission to a grid, resource allocation, compilation, and so on.
Now stop imagining, because this is reality today. It works through your NetBeans IDE (Integrated Development Environment) and Amazon’s Elastic MapReduce. Leveraging the ability of this Amazon web service to provision Hadoop (the open source implementation of Google’s MapReduce parallel programming framework) compute clusters of any size on the fly, Karmasphere offers an amazing NetBeans plugin.
After installing the plugin, developers can add their Amazon credentials within the IDE, write their code, and run parallel computations on as big a cluster as they desire (or, to be realistic, as big as their credit card limit allows). The plugin takes care of communicating with the Amazon Elastic MapReduce API and submitting the code for execution, while Amazon sets up the cluster in minutes and pulls the desired data from S3 storage.
The most impressive thing for me in this story is the abstraction of all the layers that, a few years back, required an expert (see MPI and C) in order to perform high performance computing. The second impressive thing is the democratization of access to large computing resources. Big corporations certainly have infrastructure and software in place that allow business analysts and researchers to write and execute code on big clusters without worrying about the details of cluster setup. But how about those outside those corporations? And how about those who don’t want to code in C or C++?
Through Hadoop’s Streaming option, though, anyone who can write a Perl script can operate on a large dataset on a compute cluster by following the highly abstract Map/Reduce programming model. With a laptop, an internet connection, and a credit card, a bioinformatics researcher anywhere in the world can write code and execute it without worrying about resource constraints, worrying only about his or her application logic.
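To give a feeling for how little code the Streaming model asks for, here is a minimal sketch of a mapper and a reducer. It is written in Python rather than Perl purely for illustration, and it relies on the Streaming convention that each stage reads lines from standard input and writes tab-separated key/value lines to standard output. The token-counting task, the function names (`map_lines`, `reduce_records`, `run_locally`) and the sample input are all assumptions made up for this example. They do not come from Karmasphere, Amazon or Hadoop.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'token<TAB>1' record for every whitespace-separated token."""
    for line in lines:
        for token in line.split():
            yield f"{token.lower()}\t1"


def reduce_records(records):
    """Reducer: sum the counts of consecutive records that share a key.

    Assumes the records arrive sorted by key, which is what the
    shuffle/sort step between map and reduce provides.
    """
    parsed = (record.rstrip("\n").split("\t", 1) for record in records)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


def run_locally(lines):
    """Simulate map -> sort -> reduce in a single process, for testing on a laptop."""
    return list(reduce_records(sorted(map_lines(lines))))


if __name__ == "__main__" and len(sys.argv) > 1 and sys.argv[1] in ("map", "reduce"):
    # Hypothetical command-line use: 'script.py map' or 'script.py reduce',
    # each reading from stdin and writing to stdout.
    stage = map_lines if sys.argv[1] == "map" else reduce_records
    for out in stage(sys.stdin):
        print(out)
```

Calling `run_locally(["ACGT acgt", "TTGA acgt"])` exercises the whole pipeline in memory before any money is spent on a cluster. On a real cluster the same script would be handed to Hadoop Streaming as the mapper and reducer commands, and the framework would take care of splitting the input, sorting by key and distributing the work.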