且构网


Mapping Cassandra database tables with Spark and RDDs

Updated: 2023-11-15 20:28:10

Assuming you have already created a valid SparkContext named sparkContext, added the spark-cassandra-connector dependency to your project, and configured your Spark application to talk to your Cassandra cluster (see the docs for that), you can load the data into an RDD like this:

import com.datastax.spark.connector._
val data = sparkContext.cassandraTable("foo", "appusage").select("appid", "cpuusage")
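For context, the setup that the snippet above takes for granted might look roughly like the sketch below. It is only an illustration under stated assumptions: the application name, the local master, the contact point 127.0.0.1, and the column types (appid as text, cpuusage as double) are not given in the original and will differ in a real deployment.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // adds cassandraTable to SparkContext

object LoadAppUsage {
  def main(args: Array[String]): Unit = {
    // Assumed settings: a local master and a Cassandra node on localhost.
    val conf = new SparkConf()
      .setAppName("load-appusage")
      .setMaster("local[*]")
      .set("spark.cassandra.connection.host", "127.0.0.1")
    val sparkContext = new SparkContext(conf)

    // Same call as above: read only the two columns of interest from foo.appusage.
    val data = sparkContext
      .cassandraTable("foo", "appusage")
      .select("appid", "cpuusage")

    // Assumed column types: appid is text, cpuusage is a double.
    val usage = data.map(row => (row.getString("appid"), row.getDouble("cpuusage")))

    // Print a small sample to confirm that the mapping works.
    usage.take(10).foreach(println)

    sparkContext.stop()
  }
}
```

Each element of data is a CassandraRow, so the typed getters are one way to turn rows into plain Scala values before further processing.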

In Java, the idea is the same, but it requires a bit more plumbing, described here.