且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

Groovy:验证JSON字符串

更新时间:2023-08-26 12:25:28

JsonParser 接口实现(JsonParserCharArray是默认接口).这些解析器逐字符检查char,当前字符是什么以及它代表哪种令牌类型.如果您查看

JsonSlurper class uses JsonParser interface implementations (with JsonParserCharArray being a default one). Those parsers check char by char what is the current character and what kind of token type it represents. If you take a look at JsonParserCharArray.decodeJsonObject() method at line 139 you will see that if parser sees } character, it breaks the loop and finishes decoding JSON object and ignores anything that exists after }.

这就是为什么如果将任何无法识别的字符放在JSON对象的前面,JsonSlurper会引发异常.但是,如果您在}之后用任何不正确的字符结尾JSON字符串,它将通过,因为解析器甚至都不会考虑这些字符.

That's why if you put any unrecognizable character(s) in front of your JSON object, JsonSlurper will throw an exception. But if you end your JSON string with any incorrect characters after }, it will pass, because parser does not even take those characters into account.

您可能会考虑使用JsonOutput.prettyPrint(String json)方法,该方法在尝试打印的JSON方面有更多限制(它使用JsonLexer以流方式读取JSON令牌).如果您这样做:

You may consider using JsonOutput.prettyPrint(String json) method that is more restrict if it comes to JSON it tries to print (it uses JsonLexer to read JSON tokens in a streaming fashion). If you do:

def jsonString = '{"name": "John", "data": [{"id": 1},{"id": 2}]}...'

JsonOutput.prettyPrint(jsonString)

它将引发类似以下的异常:

it will throw an exception like:

Exception in thread "main" groovy.json.JsonException: Lexing failed on line: 1, column: 48, while reading '.', no possible valid JSON value or punctuation could be recognized.
    at groovy.json.JsonLexer.nextToken(JsonLexer.java:83)
    at groovy.json.JsonLexer.hasNext(JsonLexer.java:233)
    at groovy.json.JsonOutput.prettyPrint(JsonOutput.java:501)
    at groovy.json.JsonOutput$prettyPrint.call(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
    at app.JsonTest.main(JsonTest.groovy:13)

但是,如果我们传递有效的JSON文档,例如:

But if we pass a valid JSON document like:

def jsonString = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'

JsonOutput.prettyPrint(jsonString)

它将成功通过.

好处是,您不需要任何其他依赖关系即可验证JSON.

The good thing is that you don't need any additional dependency to validate your JSON.

我进行了更多调查,并使用3种不同的解决方案进行了测试:

I did some more investigation and run tests with 3 different solutions:

  • JsonOutput.prettyJson(String json)
  • JsonSlurper.parseText(String json)
  • ObjectMapper.readValue(String json, Class<> type)(需要添加jackson-databind:2.9.3依赖项)
  • JsonOutput.prettyJson(String json)
  • JsonSlurper.parseText(String json)
  • ObjectMapper.readValue(String json, Class<> type) (it requires adding jackson-databind:2.9.3 dependency)

我已使用以下JSON作为输入:

I have used following JSONs as an input:

def json1 = '{"name": "John", "data": [{"id": 1},{"id": 2},]}'
def json2 = '{"name": "John", "data": [{"id": 1},{"id": 2}],}'
def json3 = '{"name": "John", "data": [{"id": 1},{"id": 2}]},'
def json4 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}... abc'
def json5 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'

预期结果是前4个JSON验证失败,只有第5个正确.为了进行测试,我创建了这个Groovy脚本:

Expected result is that first 4 JSONs fail validation and only 5th one is correct. To test it out I have created this Groovy script:

@Grab(group='com.fasterxml.jackson.core', module='jackson-databind', version='2.9.3')

import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.DeserializationFeature

def json1 = '{"name": "John", "data": [{"id": 1},{"id": 2},]}'
def json2 = '{"name": "John", "data": [{"id": 1},{"id": 2}],}'
def json3 = '{"name": "John", "data": [{"id": 1},{"id": 2}]},'
def json4 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}... abc'
def json5 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'

def test1 = { String json ->
    try {
        JsonOutput.prettyPrint(json)
        return "VALID"
    } catch (ignored) {
        return "INVALID"
    }
}

def test2 = { String json ->
    try {
        new JsonSlurper().parseText(json)
        return "VALID"
    } catch (ignored) {
        return "INVALID"
    }
}

ObjectMapper mapper = new ObjectMapper()
mapper.configure(DeserializationFeature.FAIL_ON_TRAILING_TOKENS, true)

def test3 = { String json ->
    try {
        mapper.readValue(json, Map)
        return "VALID"
    } catch (ignored) {
        return "INVALID"
    }
}

def jsons = [json1, json2, json3, json4, json5]
def tests = ['JsonOutput': test1, 'JsonSlurper': test2, 'ObjectMapper': test3]

def result = tests.collectEntries { name, test ->
    [(name): jsons.collect { json ->
        [json: json, status: test(json)]
    }]
}

result.each {
    println "${it.key}:"
    it.value.each {
        println " ${it.status}: ${it.json}"
    }
    println ""
}

结果如下:

JsonOutput:
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}

JsonSlurper:
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}

ObjectMapper:
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}

您可以看到获胜者是Jackson的ObjectMapper.readValue()方法.重要的是-它可以与jackson-databind> = 2.9.0一起使用.在此版本中,他们引入了DeserializationFeature.FAIL_ON_TRAILING_TOKENS,它使JSON解析器按预期方式工作.如果我们不像上面的脚本那样将此配置功能设置为true,则ObjectMapper会产生错误的结果:

As you can see the winner is Jackson's ObjectMapper.readValue() method. What's important - it works with jackson-databind >= 2.9.0. In this version they introduced DeserializationFeature.FAIL_ON_TRAILING_TOKENS which makes JSON parser working as expected. If we wont set this configuration feature to true as in the above script, ObjectMapper produces incorrect result:

ObjectMapper:
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}

我很惊讶Groovy的标准库在此测试中失败.幸运的是,可以使用jackson-databind:2.9.x依赖项来完成.希望对您有所帮助.

I was surprised that Groovy's standard library fails in this test. Luckily it can be done with jackson-databind:2.9.x dependency. Hope it helps.