Where clause not working in a Spark SQL DataFrame

Updated: 2023-10-23 15:59:28

As you can see in your schema, zip is of type String, so the value you compare it against should be written as a quoted string literal. Your query should look something like this:

sqlContext.sql("select lat, lng from census where zip = '00650'").show()

Update:

If you are using Spark 2, you can do it like this:

import sparkSession.sqlContext.implicits._

val dataFrame = Seq(("10.023", "75.0125", "00650"),("12.0246", "76.4586", "00650"), ("10.023", "75.0125", "00651")).toDF("lat","lng", "zip")

dataFrame.printSchema()

dataFrame.select("*").where(dataFrame("zip") === "00650").show()

// registerTempTable is deprecated since Spark 2.0; createOrReplaceTempView replaces it
dataFrame.createOrReplaceTempView("census")

sparkSession.sqlContext.sql("SELECT lat, lng FROM census WHERE zip = '00650'").show()

Output:

root
 |-- lat: string (nullable = true)
 |-- lng: string (nullable = true)
 |-- zip: string (nullable = true)

+-------+-------+-----+
|    lat|    lng|  zip|
+-------+-------+-----+
| 10.023|75.0125|00650|
|12.0246|76.4586|00650|
+-------+-------+-----+

+-------+-------+
|    lat|    lng|
+-------+-------+
| 10.023|75.0125|
|12.0246|76.4586|
+-------+-------+
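
The Spark 2 snippet above assumes a `sparkSession` value is already in scope. For readers who want to try it outside a shell, here is a minimal self-contained sketch of the same two filters. The object name `ZipFilterSketch`, the helper `filterByZip`, the app name, and the `local[*]` master are illustrative assumptions, not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ZipFilterSketch {
  // Hypothetical helper: keeps rows whose string-typed zip column equals the given value.
  def filterByZip(df: DataFrame, zip: String): DataFrame =
    df.where(df("zip") === zip)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for this illustration.
    val spark = SparkSession.builder()
      .appName("zip-filter-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Same sample rows as in the answer; every column is a String.
    val census = Seq(
      ("10.023", "75.0125", "00650"),
      ("12.0246", "76.4586", "00650"),
      ("10.023", "75.0125", "00651")
    ).toDF("lat", "lng", "zip")

    // DataFrame API: compare against a String, matching the column type.
    filterByZip(census, "00650").show()

    // SQL: quote the literal so it is compared as a string.
    census.createOrReplaceTempView("census")
    spark.sql("SELECT lat, lng FROM census WHERE zip = '00650'").show()

    spark.stop()
  }
}
```

Keeping the comparison value a `String` in both forms mirrors the schema printed above, where `zip` is reported as `string`.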