且构网

Sharing the things programmers deal with in development...

Need a distinct count of multiple fields joined from another collection using a MongoDB aggregation query

Updated: 2023-11-09 20:57:46

This should do the trick. I tested it on your input set and deliberately added some duplicate values, such as NYC showing up in more than one DESTINATION, to make sure they get de-duplicated (i.e. a distinct count, as asked for). For fun, comment out all the stages, then uncomment them one at a time from the top down to see the effect of each stage of the pipeline.

var id = "1";

c=db.foo.aggregate([
// Find a thing:
{$match: {"_id" : id}}

// Do the lookup into the objects collection:
,{$lookup: {"from" : "foo2",
            "localField" : "objectsIds",
            "foreignField" : "_id",
            "as" : "objectResults"}}

// OK, so we've got a bunch of extra material now.  Let's
// get down to just the metaDataMap:
,{$project: {x: "$objectResults.metaDataMap"}}
,{$unwind: "$x"}
,{$project: {"_id":0}}

// Use $objectToArray to get all the field names dynamically:
// Replace the old x with new x (don't need the old one):
,{$project: {x: {$objectToArray: "$x"}}}
,{$unwind: "$x"}

// Collect unique field names.  Interesting note: the values
// here are ARRAYS, not scalars, so $push is creating an
// array of arrays:
,{$group: {_id: "$x.k", tmp: {$push: "$x.v"}}}

// Almost there!  We have to turn the array of array (of string)
// into a single array which we'll subsequently dedupe.  We will
// overwrite the old tmp with a new one, too:
,{$addFields: {tmp: {$reduce: {
    input: "$tmp",
    initialValue: [],
    in: {$concatArrays: ["$$value", "$$this"]}
}}}}

// Now just unwind and regroup using the addToSet operator
// to dedupe the list:
,{$unwind: "$tmp"}
,{$group: {_id: "$_id", uniqueVals: {$addToSet: "$tmp"}}}

// Add size for good measure:
,{$addFields: {size: {"$size":"$uniqueVals"}} }
]);
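
To check what the pipeline is expected to return without a running server, the same steps can be mirrored in plain JavaScript. This is only a rough sketch: the sample documents are hypothetical (the original input set is not reproduced here), the ORIGIN field and the helper name `distinctCounts` are made up for illustration, and it assumes every `metaDataMap` value is an array of strings, as the pipeline comments describe.

```javascript
// Hypothetical sample data; only the field names objectsIds and metaDataMap,
// plus the DESTINATION / NYC example, come from the answer above.
const foo = [
  { _id: "1", objectsIds: ["a", "b", "c"] },
];
const foo2 = [
  { _id: "a", metaDataMap: { DESTINATION: ["NYC", "LAX"], ORIGIN: ["SFO"] } },
  { _id: "b", metaDataMap: { DESTINATION: ["NYC", "BOS"], ORIGIN: ["SFO", "SEA"] } },
  { _id: "c", metaDataMap: { DESTINATION: ["LAX"] } },
  { _id: "z", metaDataMap: { DESTINATION: ["MIA"] } }, // not referenced by foo "1"
];

// Plain-JS mirror of the aggregation stages (not a MongoDB API).
function distinctCounts(fooDocs, foo2Docs, id) {
  // $match: keep only the requested document.
  const matched = fooDocs.filter((d) => d._id === id);

  // $lookup + $project + first $unwind: one metaDataMap per joined object.
  const maps = matched.flatMap((d) =>
    foo2Docs
      .filter((o) => d.objectsIds.includes(o._id))
      .map((o) => o.metaDataMap)
  );

  // $objectToArray + second $unwind: one {k, v} pair per field.
  const pairs = maps.flatMap((m) =>
    Object.entries(m).map(([k, v]) => ({ k, v }))
  );

  // $group/$push, $reduce/$concatArrays, $unwind and $group/$addToSet
  // collapse into a single Set per field name here.
  const byKey = new Map();
  for (const { k, v } of pairs) {
    if (!byKey.has(k)) byKey.set(k, new Set());
    for (const item of v) byKey.get(k).add(item);
  }

  // Final $addFields: attach the distinct count.
  return [...byKey].map(([k, set]) => ({
    _id: k,
    uniqueVals: [...set],
    size: set.size,
  }));
}

const result = distinctCounts(foo, foo2, "1");
console.log(result);
// DESTINATION -> 3 distinct values (NYC, LAX, BOS); ORIGIN -> 2 (SFO, SEA)
```

Against a real server, the order of the output documents and of the values inside `uniqueVals` is not guaranteed, so compare sizes or sorted values rather than exact arrays.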