Converting an RDD to a JSON object

Updated: 2023-10-22 13:53:16

I have an RDD of type RDD[(String, List[String])].

Example:

(FRUIT, List(Apple,Banana,Mango))
(VEGETABLE, List(Potato,Tomato))

I want to convert this RDD into a JSON object like the one below.

{
  "categories": [
    {
      "name": "FRUIT",
      "nodes": [
        {
          "name": "Apple",
          "isInTopList": false
        },
        {
          "name": "Banana",
          "isInTopList": false
        },
        {
          "name": "Mango",
          "isInTopList": false
        }
      ]
    },
    {
      "name": "VEGETABLE",
      "nodes": [
        {
          "name": "POTATO",
          "isInTopList": false
        },
        {
          "name": "TOMATO",
          "isInTopList": false
        }
      ]
    }
  ]
}

Please suggest the best possible way to do it.

NOTE: "isInTopList": false is constant and must be present on every item in the JSON object.

First, I used the following code to reproduce the scenario you described:

val sampleArray = Array(
  ("FRUIT", List("Apple", "Banana", "Mango")),
  ("VEGETABLE", List("Potato", "Tomato")))

val sampleRdd = sc.parallelize(sampleArray)
sampleRdd.foreach(println) // Print each element (on a cluster this goes to the executors' stdout)

Next, I use the json4s Scala library to convert this RDD into the JSON structure you requested:

import org.json4s.native.JsonMethods._
import org.json4s.JsonDSL.WithDouble._

val json = "categories" -> sampleRdd.collect().toList.map {
  case (name, nodes) =>
    ("name", name) ~
    ("nodes", nodes.map { node =>
      ("name", node)
    })
}

println(compact(render(json))) // Printing the rendered JSON

The result is:

{"categories":[{"name":"FRUIT","nodes":[{"name":"Apple"},{"name":"Banana"},{"name":"Mango"}]},{"name":"VEGETABLE","nodes":[{"name":"Potato"},{"name":"Tomato"}]}]}
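
Note that this result does not include the "isInTopList": false field that the question requires on every node. A minimal sketch of one way to add it with the same json4s DSL follows. It assumes json4s-native is on the classpath and that the data has already been collected to the driver; the object name CategoriesJson and the helper toJson are hypothetical names used only for illustration, and a plain Seq stands in for the collected RDD so the snippet does not need a SparkContext.

```scala
import org.json4s.JsonDSL._
import org.json4s.native.JsonMethods._

// Hypothetical helper object, not part of the original answer.
object CategoriesJson {

  // Builds the compact JSON string from already-collected (category, items) pairs.
  def toJson(categories: Seq[(String, List[String])]): String = {
    val json =
      "categories" -> categories.toList.map { case (name, nodes) =>
        ("name" -> name) ~
          ("nodes" -> nodes.map { node =>
            // Each node carries the constant flag requested in the question.
            ("name" -> node) ~ ("isInTopList" -> false)
          })
      }
    compact(render(json))
  }

  def main(args: Array[String]): Unit = {
    // Same sample data as above, held in a local Seq instead of an RDD.
    val sample = Seq(
      "FRUIT" -> List("Apple", "Banana", "Mango"),
      "VEGETABLE" -> List("Potato", "Tomato"))
    println(toJson(sample))
  }
}
```

With the RDD from the example, the call would be CategoriesJson.toJson(sampleRdd.collect().toSeq). Because collect() pulls every record to the driver, this approach only suits results small enough to fit in driver memory. Replacing compact with pretty from the same JsonMethods object gives indented output like the layout shown in the question.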