Finding connected components using Hadoop/MapReduce

Updated: 2023-02-26 15:58:59

I wrote a blog post about this:

http://codingwiththomas.blogspot.de/2011/04/graph-exploration-with-hadoop-mapreduce.html

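However, MapReduce isn't a good fit for this kind of graph analysis, because an iterative algorithm has to run as a chain of separate jobs, each of which reads and writes the whole graph again. BSP (bulk synchronous parallel) is a better choice: Apache Hama provides a good graph API on top of Hadoop HDFS.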

I've implemented a connected components algorithm (mindist search) with MapReduce here:

https://github.com/thomasjungblut/tjungblut-graph/tree/master/src/de/jungblut/graph/mapreduce
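
To give a rough idea of what such a job does, here is a minimal in-memory sketch of minimum-label propagation written as repeated map and reduce rounds. It is not the code from the repository above. The function names (`map_phase`, `reduce_phase`, `connected_components`) are made up for illustration, and it assumes an undirected graph given as a symmetric adjacency dict in which every vertex appears as a key.

```python
from collections import defaultdict


def map_phase(records):
    """Re-emit each vertex record and send its current label to every neighbour."""
    for vertex, (label, neighbours) in records.items():
        # Pass the graph structure along so the reducer can rebuild the record.
        yield vertex, ("vertex", label, neighbours)
        for neighbour in neighbours:
            yield neighbour, ("label", label)


def reduce_phase(grouped):
    """Keep the smallest label seen per vertex and report whether anything changed."""
    records = {}
    changed = False
    for vertex, values in grouped.items():
        label, neighbours, candidates = None, [], []
        for value in values:
            if value[0] == "vertex":
                label, neighbours = value[1], value[2]
            else:
                candidates.append(value[1])
        new_label = min([label] + candidates)
        changed = changed or new_label != label
        records[vertex] = (new_label, neighbours)
    return records, changed


def connected_components(adjacency):
    """Run map/reduce rounds until no label changes (assumes symmetric adjacency)."""
    # Every vertex starts out labelled with its own id.
    records = {v: (v, list(ns)) for v, ns in adjacency.items()}
    while True:
        grouped = defaultdict(list)  # stands in for the shuffle step
        for key, value in map_phase(records):
            grouped[key].append(value)
        records, changed = reduce_phase(grouped)
        if not changed:
            return {v: label for v, (label, _) in records.items()}


graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
print(connected_components(graph))
```

Each pass of the `while` loop corresponds to one full MapReduce job. On a real cluster the `changed` flag would typically be a job counter checked by the driver, and the whole graph is shuffled and written out on every round.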

A BSP version for Apache Hama can also be found here:

https://github.com/thomasjungblut/tjungblut-graph/blob/master/src/de/jungblut/graph/bsp/MindistSearch.java
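
For comparison, the same idea in a vertex-centric BSP style can be sketched as below. This is a plain Python simulation, not the Hama API and not the linked `MindistSearch.java`. The name `bsp_min_label` is hypothetical, and the graph is again assumed to be an undirected, symmetric adjacency dict.

```python
from collections import defaultdict


def bsp_min_label(adjacency):
    """Simulate supersteps: vertices exchange labels and stay idle without messages."""
    label = {v: v for v in adjacency}

    # First superstep: every vertex sends its own id to all neighbours.
    inbox = defaultdict(list)
    for vertex, neighbours in adjacency.items():
        for neighbour in neighbours:
            inbox[neighbour].append(label[vertex])

    # Later supersteps: only vertices that received messages do any work.
    while inbox:  # no messages in flight means every vertex has halted
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            smallest = min(messages)
            if smallest < label[vertex]:
                label[vertex] = smallest
                # Tell the neighbours only when the label actually improved.
                for neighbour in adjacency[vertex]:
                    outbox[neighbour].append(smallest)
        inbox = outbox  # stands in for the barrier synchronisation
    return label


graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
print(bsp_min_label(graph))
```

The graph stays in memory between supersteps and only vertices whose label changed send messages, so nothing is re-read or rewritten per iteration as it is in the MapReduce rounds.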

The BSP implementation is less difficult than the MapReduce one, and it is at least 10 times faster. If you're interested, check out the latest version in TRUNK and visit our mailing list.

http://hama.apache.org/

http://apache.org/hama/mail-lists.html