Updated: 2023-11-15 20:28:10
Assuming that you have a valid SparkContext named sparkContext, have added the spark-cassandra-connector dependency to your project, and have configured your Spark application to talk to your Cassandra cluster (see the docs for that), you can load the data into an RDD like this:
import com.datastax.spark.connector._  // brings cassandraTable into scope on SparkContext

val data = sparkContext.cassandraTable("foo", "appusage").select("appid", "cpuusage")
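Once the rows are loaded, a typical next step is an aggregation such as average CPU usage per app. Below is a minimal sketch of that step using plain Scala collections to stand in for the RDD (so it runs without a cluster); the table name, columns, and sample values are assumptions for illustration, and the groupBy/map chain translates directly to the corresponding RDD operations.

```scala
object AppUsageSketch {
  // Hypothetical sample of (appid, cpuusage) pairs, mirroring the selected columns.
  val sample: Seq[(String, Double)] =
    Seq(("mail", 10.0), ("mail", 30.0), ("browser", 50.0))

  // Average cpuusage per appid, computed with the same shape of
  // transformation you would apply to the RDD.
  def avgCpuByApp(rows: Seq[(String, Double)]): Map[String, Double] =
    rows.groupBy { case (app, _) => app }.map { case (app, xs) =>
      app -> xs.map(_._2).sum / xs.size
    }
}
```

On the real RDD you would express the same computation with `map` and a reduce-by-key style aggregation rather than an in-memory `groupBy`.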
In Java, the idea is the same, but it requires a bit more plumbing, described here.