
How to validate a Spark SQL expression without executing it?

Updated: 2022-11-03 14:33:50

SparkSqlParser

Spark SQL uses SparkSqlParser as the parser for Spark SQL expressions.

You can access SparkSqlParser using SparkSession (and SessionState) as follows:

val spark: SparkSession = ...
val parser = spark.sessionState.sqlParser

scala> parser.parseExpression("select * from table")
res1: org.apache.spark.sql.catalyst.expressions.Expression = ('select * 'from) AS table#0
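As the output above suggests, parseExpression only parses the text into a Catalyst Expression; nothing is executed. A minimal sketch of turning that into a yes/no check, assuming an active SparkSession named spark (isValidExpression is a hypothetical helper, not part of the Spark API):

```scala
import scala.util.Try

import org.apache.spark.sql.SparkSession

// Hypothetical helper: returns true if the text parses as a Catalyst expression.
// Only parsing happens here -- no query is analyzed or executed.
def isValidExpression(spark: SparkSession, exprText: String): Boolean =
  Try(spark.sessionState.sqlParser.parseExpression(exprText)).isSuccess
```

Note that parseExpression is lenient: as the example above shows, even "select * from table" parses into an (odd) expression rather than failing.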

TIP: Enable INFO logging level for the org.apache.spark.sql.execution.SparkSqlParser logger to see what happens inside.

That alone won't give you a bullet-proof shield against incorrect SQL expressions, so I think the sql method is a better fit:

sql(sqlText: String): DataFrame — Executes a SQL query using Spark, returning the result as a DataFrame. The dialect used for SQL parsing can be configured with 'spark.sql.dialect'.

See it in action below.

scala> parser.parseExpression("hello world")
res5: org.apache.spark.sql.catalyst.expressions.Expression = 'hello AS world#2

scala> spark.sql("hello world")
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'hello' expecting {'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD'}(line 1, pos 0)

== SQL ==
hello world
^^^

  at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:217)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:114)
  at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:68)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
  ... 49 elided
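As the stack trace shows, SparkSession.sql runs the text through parsePlan before anything else, and an invalid statement already fails there with a ParseException. If you only want the validation without building a DataFrame, you can call parsePlan directly and catch the failure — a minimal sketch, assuming an active SparkSession named spark (isParsable is a hypothetical helper, not part of the Spark API):

```scala
import scala.util.Try

import org.apache.spark.sql.SparkSession

// Hypothetical helper: parse the statement into a logical plan without executing it.
// "SELECT * FROM t" succeeds; "hello world" fails with a ParseException.
def isParsable(spark: SparkSession, sqlText: String): Boolean =
  Try(spark.sessionState.sqlParser.parsePlan(sqlText)).isSuccess
```

Keep in mind that parsePlan only checks syntax; resolving table and column names happens later, during analysis, so a syntactically valid query against a missing table still parses successfully.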