[tahoe-dev] MapReduce over Tahoe

Aaron Cordova aaron at cordovas.org
Thu Aug 27 17:12:50 PDT 2009


All,

I've written a plugin for Hadoop's MapReduce implementation that  
allows MapReduce jobs to be run over data stored in Tahoe. I've tested  
it using machines from Amazon's Elastic Compute Cloud (EC2), and am
currently gathering performance information.

I'm hoping to highlight security issues of public cloud computing  
services and spark new thinking about solutions. Tahoe enables the  
creation of discrete domains for compute consumers, making it easier  
for multiple departments within an organization or multiple  
organizations on a community cloud to do massive data analysis and  
share results.

More details and code can be found at
http://hadoop-lafs.googlecode.com . The code will be licensed under the
Apache open source license, and hopefully will be included in the Hadoop
distribution.

Help, suggestions, comments and questions are welcome.

- Aaron Cordova


